Incremental data import from cloud databases

Incremental data fetch is a method used to import new or modified data from a source. Zoho DataPrep helps you import incremental data from the following cloud databases using the import configuration. 
  1. Amazon RDS - MySQL
  2. Amazon RDS - MS SQL Server
  3. Amazon RDS - Oracle
  4. Amazon RDS - PostgreSQL
  5. Amazon RDS - MariaDB
  6. Amazon RDS - Amazon Aurora MySQL
  7. Amazon RDS - Amazon Aurora PostgreSQL
  8. Amazon Redshift
  9. Amazon Athena
  10. Microsoft Azure - MySQL
  11. Microsoft Azure - PostgreSQL
  12. Microsoft Azure - MariaDB
  13. Microsoft Azure - SQL Database
  14. Microsoft Azure - SQL Data Warehouse
  15. Google Cloud SQL - MySQL
  16. Google Cloud SQL - PostgreSQL
  17. Snowflake
  18. Oracle Cloud
  19. IBM Cloud - DB2
  20. Heroku PostgreSQL
  21. Rackspace Cloud - MySQL
  22. Rackspace Cloud - MariaDB
  23. Panoply
  24. MySQL
  25. MS SQL Server
  26. Oracle
  27. PostgreSQL
  28. MariaDB
  29. MemSQL
  30. DB2

To start with the import


1. Create a pipeline or open an existing pipeline from the Home Page, Pipelines tab, or Workspaces tab and click the Add data option. You can also click the Import data option under the Workspaces tab to import data.  
Info: You can also click the Import data icon at the top of the pipeline builder to bring data from multiple sources into the pipeline.


 

2. Select the Cloud databases category from the left pane and choose the required cloud database. You can also search cloud databases in the search box.



3. Choose your Database service name and Database type.

4. Enter your Database server host.        

5. Enter your Database name, username, and password if authentication is required. 

6. You can also select the Use SSL checkbox if your database server has been set up to serve encrypted data through SSL.



Note: The Connection name must be unique for each connection.

7. Click Connect.

8. Select the table to be imported and click Import to begin importing data from your cloud database service.




Note: The connection configuration will be saved for importing data in the future. Credentials are securely encrypted and stored.

Warning: The incremental fetch option is not available when data is imported from databases using a query.

9. Once you complete importing data, the Pipeline builder page opens and you can start applying transforms. You can also right-click the stage and choose the Prepare data option to prepare your data using various transforms in the DataPrep Studio page. Click here to know more about the transforms. 



10. Once you are done creating your data flow and applying necessary transforms in your stages, you can right-click a stage and add a destination to complete your data flow.
Note: After adding a destination to the pipeline, you can first try executing your pipeline using a manual run. Once you make sure the manual run works, you can set up a schedule to automate the pipeline. Learn about the different types of runs here.

While configuring the Schedule, Backfill, Manual reload, Webhooks, or Zoho Flow, the import configuration must be set up for all sources. Without the import configuration, the run cannot be saved. Click here to know more about how to set up the import configuration.

11. After configuring a run, a pipeline job will be created at run time. You can view the status of a job with granular details in the Job summary.
Click here to know more about the job summary.

Import configuration for cloud database

If you import data from a cloud database using the Select tables option, you can configure how to import and fetch incremental data from your cloud database using the Import configuration options below.

You can select the Click here link to set the import configuration.

Below is a snapshot from the schedule configuration.

 

How to import data from source? Select the way you would like to import your data from the drop-down: Import all data, Incremental data fetch, or Do not import data.

Import all data

This option will import all available data for every run.



Incremental data fetch


Incremental data import is a method used to import new or modified records in a specific data interval.

Warning: The incremental fetch option is not available when data is imported from databases using a query.

 




Only modified and new data

To import modified and new data incrementally in a specific data interval, select the Only modified and new data option from the drop-down.

Fetch based on: Enter the name of the date-time column based on which the data must be sorted and imported.

Use the previously imported data if no new data is available: 

During incremental import,

  1. If the checkbox is checked: When there is no new data in the source, the last fetched data will be imported again.
  2. If the checkbox is unchecked: When there is no new data in the source, the import will fail and no files will be imported. This will, in turn, cause the entire pipeline job to fail.
Note: Even if only one source in the pipeline has no new data and this option is unchecked, the entire pipeline job will fail.
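The effect of this checkbox can be sketched in Python. This is a hypothetical illustration of the behavior described above, not DataPrep's actual implementation; the function and parameter names are invented:

```python
def incremental_fetch(source_rows, last_fetch_time, reuse_previous, previous_batch):
    """Sketch of the 'Only modified and new data' behavior.

    source_rows: list of (modified_at, record) tuples from the source table.
    last_fetch_time: upper bound of the previous run's interval.
    reuse_previous: state of the 'Use the previously imported data' checkbox.
    previous_batch: the records fetched in the last successful run.
    """
    # Only records modified after the last fetch count as new.
    new_rows = [rec for modified_at, rec in source_rows
                if modified_at > last_fetch_time]
    if new_rows:
        return new_rows
    if reuse_previous:
        # Checkbox checked: re-import the last fetched data.
        return previous_batch
    # Checkbox unchecked: the import fails, which fails the pipeline job.
    raise RuntimeError("No new data available in the source")
```

With `reuse_previous=False` and no rows newer than `last_fetch_time`, the call raises, mirroring the pipeline job failure described above.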

Do not import data

The data is imported only once. In subsequent runs, the transforms are applied to the same data and the result is exported.


How incremental sync works

You can configure how to import and fetch incremental data from your source using the Import configuration option. Incremental data import is a method used to import new or modified records in a specific data interval.

In incremental data fetch, when the pipeline is run, new or modified data in the table is fetched based on the date-time column. For example, suppose the import configuration is set to fetch only modified or new data. In that case, the table's new or modified data within the specified time interval in the cloud database will be imported. If no new data is available, the pipeline will either fail without importing any data or re-import the previously fetched data, depending on whether the Use the previously imported data if no new data is available checkbox in the import configuration is selected. In subsequent intervals, any new or modified data within those intervals will be fetched accordingly.
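Conceptually, each run translates into a range query over the date-time column. Below is a minimal sketch using an in-memory SQLite table; the table and column names ("orders", "modified_at") are invented for illustration and are not part of DataPrep:

```python
import sqlite3

# Hypothetical source table with a date-time column used for incremental fetch.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER, modified_at TEXT)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?)",
    [(1, "2024-01-01 10:00:00"),
     (2, "2024-01-02 09:30:00"),
     (3, "2024-01-03 12:15:00")],
)

def fetch_interval(conn, start, end):
    # Each run imports only the rows modified within [start, end).
    sql = "SELECT id FROM orders WHERE modified_at >= ? AND modified_at < ?"
    return [row[0] for row in conn.execute(sql, (start, end))]

print(fetch_interval(conn, "2024-01-02 00:00:00", "2024-01-03 00:00:00"))  # prints [2]
```

A subsequent run would use the next interval as its bounds, picking up only rows modified since the previous run.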


Note:
When you set up incremental data fetch using a datetime column:
 
1. The database timezone (where the data is stored) is used to identify new or updated records.
2. The schedule timezone only decides when the job runs; it does not affect which data is fetched.
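For example, assuming the database stores timestamps in UTC while the schedule is configured in IST (UTC+5:30) — both timezones chosen purely for illustration — only the database timezone matters for deciding which records are new:

```python
from datetime import datetime, timedelta, timezone

# Assumed for illustration: database timezone is UTC, schedule timezone is IST.
db_tz = timezone.utc
schedule_tz = timezone(timedelta(hours=5, minutes=30))

# The schedule timezone only decides when the job runs...
run_fires_at = datetime(2024, 6, 1, 9, 0, tzinfo=schedule_tz)

# ...but the cutoff used to identify new or updated records is evaluated
# in the database timezone.
cutoff_in_db_tz = run_fires_at.astimezone(db_tz)
print(cutoff_in_db_tz.isoformat())  # prints 2024-06-01T03:30:00+00:00
```

A job scheduled for 9:00 AM IST therefore compares the date-time column against 3:30 AM in the database's (UTC) clock, not 9:00 AM.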


SEE ALSO
How to add a new pipeline?
What other cloud database options are available in Zoho DataPrep?
How to schedule a pipeline?
How to import data from saved data connections?