Incremental data import from local databases

Incremental data fetch is a method used to import new or modified data from a source. Zoho DataPrep helps you import incremental data from the following local databases using Zoho Databridge:
  1. MySQL
  2. MS SQL Server
  3. Oracle
  4. PostgreSQL
  5. MariaDB
  6. Pervasive SQL
  7. Sybase
  8. DB2
  9. Exasol
  10. SQLite
  11. Actian Vector
  12. Greenplum
  13. Denodo
  14. Progress OpenEdge
  15. YugabyteDB
  16. Microsoft Access
  17. SAP HANA
  18. Connect using JDBC URL
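The last option, Connect using JDBC URL, supplies the same connection details (driver, host, port, and database name) in a single string instead of separate fields. As a rough illustration of how such a URL encodes those parameters (the URL below is a hypothetical example, not a DataPrep internal):

```python
from urllib.parse import urlparse

def parse_jdbc_url(jdbc_url: str) -> dict:
    """Split a JDBC-style URL into driver, host, port, and database name.

    Real JDBC drivers each define their own URL format, so treat this
    as a sketch of the common host/port/database shape only.
    """
    # Strip the "jdbc:" prefix so urlparse can handle the remainder.
    plain = jdbc_url.removeprefix("jdbc:")
    parts = urlparse(plain)
    return {
        "driver": parts.scheme,               # e.g. "mysql"
        "host": parts.hostname,               # e.g. "localhost"
        "port": parts.port,                   # e.g. 3306
        "database": parts.path.lstrip("/"),   # e.g. "sales"
    }

print(parse_jdbc_url("jdbc:mysql://localhost:3306/sales"))
```

The same host, port, and database values you would otherwise type into the individual fields in steps 7 and 8 are recoverable from the URL.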

To start with the import

1. Open an existing pipeline or create a pipeline from the Home Page, Pipelines tab or Workspaces tab and click the Add data option.
Info: You can also click the Import data icon at the top of the pipeline builder to bring data from multiple sources into the pipeline.


2. In the next screen, choose the required database or click the Databases category from the left pane. 



3. Select New connection from the Connection drop down. If you have existing connections, you can choose the required connection from the Connection drop down. 

4. Give your connection a name under the Connection name section.

5. Zoho Databridge is a tool that facilitates importing data from local databases. Databridge is mandatory for importing data from a local network.
Note: If this is the first time you are downloading Databridge, see how to install it here.



6. Once you have installed Databridge on your machine, select your Databridge from the Databridge drop-down. 
Note: Select the Databridge that is installed in the same network as the database you want to import the data from.

7. Select your Database type and enter the Database server host name and Port number.

8. Enter your Database name and provide the username and password if authentication is required.



9. Save your database configuration and connect to the database using Connect.
Note: The connection configuration will be automatically saved for importing from the database in the future. Credentials are securely encrypted and stored.


10. Select the tables that need to be imported and click the Import button.
Warning: The incremental fetch option is not available when the data is imported from databases using a query.

11. Once you have completed importing data, the Pipeline builder page will open, from where you can start applying transforms. You can also right-click the stage and choose the Prepare data option to prepare your data in the DataPrep Studio page. Click here to know more about the transforms.



12. Once you are done creating your data flow and applying necessary transforms in your stages, you can right-click a stage and add a destination to complete your data flow.
Note: After adding a destination to the pipeline, try executing your pipeline with a manual run at first. Once you make sure the manual run works, you can set up a schedule to automate the pipeline. Learn about the different types of runs here.

Import configuration for local databases


If you import data from a local database using the Select tables option, you can configure how to import and fetch incremental data from your local database using the Import configuration options below.




How to import data from source? Select the way you would like to import your data from the drop-down:
  1. Import all data
  2. Incremental data fetch
  3. Do not import data

Import all data

This option will import all available data for every run.



Incremental data fetch


Incremental data import is a method used to import new or modified records in a specific data interval.
Important: This incremental fetch option is not available when the data is imported from databases using a query.

 




Only modified and new data

To import the modified and new data incrementally from the last imported time, select the Only modified and new data option from the drop-down.

Fetch based on: You can enter the date-time column name based on which the data must be sorted and imported.


Use the previously imported data if no new data is available: 

During incremental import,

  1. If the checkbox is checked: When there is no new data in the source, the last fetched data will be imported again.
  2. If the checkbox is unchecked: When there is no new data in the source, the import will fail and no data will be imported. This will, in turn, cause the entire pipeline job to fail.
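The behavior above can be sketched as a simple filter on the datetime column you chose under Fetch based on. The table, column, and bookkeeping below are hypothetical (DataPrep manages this internally), but the underlying idea is a query for rows modified after the last imported timestamp:

```python
import sqlite3

# Hypothetical source table with a "modified_at" datetime column.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, modified_at TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "2024-01-01 10:00:00"), (2, "2024-01-02 10:00:00")],
)

def incremental_fetch(conn, last_import_ts, previous_batch,
                      reuse_previous=True):
    """Fetch only rows modified after the last import.

    reuse_previous mirrors the 'Use the previously imported data if
    no new data is available' checkbox: when no new rows exist, return
    the last fetched batch again instead of failing the import.
    """
    rows = conn.execute(
        "SELECT id, modified_at FROM orders WHERE modified_at > ?"
        " ORDER BY modified_at",
        (last_import_ts,),
    ).fetchall()
    if rows:
        return rows
    if reuse_previous:
        return previous_batch  # checkbox checked: re-import last batch
    raise RuntimeError("No new data; import fails")  # checkbox unchecked

# First run picks up both rows; the second run finds nothing new
# and falls back to the previously imported batch.
batch = incremental_fetch(conn, "2023-12-31 00:00:00", [])
batch = incremental_fetch(conn, "2024-01-02 10:00:00", batch)
print(batch)
```

Each run only has to scan rows newer than the stored cutoff, which is why incremental fetch is cheaper than re-importing the full table on every run.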


Do not import data

The data is imported only once. On subsequent runs, the transforms are applied to the same data and the results are exported.



Note: 
When you set up incremental data fetch using a datetime column:
 
1. The database timezone (where the data is stored) is used to identify new or updated records.
2. The schedule timezone only decides when the job runs; it does not affect which data is fetched.
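In practice this means a run scheduled in one timezone must compare timestamps in the database's timezone. A minimal sketch, assuming the database stores timestamps in UTC while the schedule runs in another zone (both zones here are hypothetical examples):

```python
from datetime import datetime
from zoneinfo import ZoneInfo

# Hypothetical setup: the schedule runs in IST, the database stores UTC.
schedule_tz = ZoneInfo("Asia/Kolkata")
db_tz = ZoneInfo("UTC")

# The schedule fires at 09:00 local time...
run_time = datetime(2024, 6, 1, 9, 0, tzinfo=schedule_tz)

# ...but the incremental cutoff must be expressed in the database's
# timezone, since that clock decides which records count as new.
cutoff_in_db_tz = run_time.astimezone(db_tz)
print(cutoff_in_db_tz)  # 2024-06-01 03:30:00+00:00
```

Changing the schedule timezone shifts only when the job fires; the cutoff comparison always happens on the database's clock.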