The Schedule option is used to run your pipeline at regular intervals: import data from your sources, apply the transforms, and export to the destinations. You can also import incremental data from sources such as cloud storage, CRM, Creator, etc. while scheduling. Incremental data import is a method used to import only the new or modified records after the previous sync.
Click here to learn how to create a pipeline in the first place.
2. Click the Add data option to bring in your data from the required data source. Learn more about the 50+ data sources.
After data is imported, you will be redirected to the pipeline builder, where you can see your data source and a stage linked to it.
Stages are nodes created for processing data while applying data flow transforms. Every dataset imported from your data source has a stage created for it by default.
3. You can right-click the stage and apply the data flow transforms.
4. You can also choose Prepare Data to open the DataPrep Studio page and explore more data transforms. Click here to know the various elements in the pipeline builder.
5. Once you are done creating your data flow and applying necessary transforms in your stages, you can right-click a stage and add a destination to complete your data flow.
Note: Please add at least one destination to your pipeline to mark it as ready.
A data destination is the place you want to export your data to. It can be a cloud database, or a business application like Zoho CRM, Zoho Analytics, etc. You can choose your preferred destination from the 50+ data destinations in DataPrep to export your prepared data.
After adding a destination, try executing your pipeline using the Run button. Each run is saved as a job. When a pipeline run is executed, the data fetched from your data sources is prepared using the series of transforms you have applied in each of the stages, and then exported to your destination. This complete process is captured in the job history. Once you ensure the manual run works, you can set up a schedule to automate the pipeline.
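To make the run-as-a-job model concrete, here is a minimal Python sketch of the flow described above. Names such as fetch() and export() are illustrative stand-ins, not DataPrep APIs:

```python
# Illustrative sketch of "each run is saved as a job" (not DataPrep's internals).
from dataclasses import dataclass, field
from datetime import datetime, timezone

def fetch(source):                 # stand-in for the import step
    return list(source)

def export(destination, rows):     # stand-in for the export step
    destination.extend(rows)

@dataclass
class Job:
    started_at: datetime
    status: str = "Running"
    messages: list = field(default_factory=list)

def run_pipeline(source, transforms, destinations, job_history):
    """One run: import, apply the stage transforms in series, export."""
    job = Job(started_at=datetime.now(timezone.utc))
    job_history.append(job)        # every run is captured in the job history
    try:
        rows = fetch(source)
        for transform in transforms:           # each stage's transforms
            rows = [transform(row) for row in rows]
        for dest in destinations:              # export to every destination
            export(dest, rows)
        job.status = "Success"
    except Exception as exc:
        job.status = "Failure"
        job.messages.append(str(exc))
    return job

history, out = [], []
run_pipeline([" alice ", " bob "], [str.strip, str.title], [out], history)
print(out, history[0].status)      # ['Alice', 'Bob'] Success
```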
You can schedule your pipeline using the Schedule option in the pipeline builder.
1. Select a Repeat method (hourly, daily, weekly, monthly) and set the frequency using the Perform every dropdown. The options in the Perform every dropdown change with the Repeat method.
If the Repeat method is set to Hourly, you can specify the interval of the schedule from 1 to 24 hours using the Perform every option.
Past the hour - This option allows you to start the schedule at a specific minute (1st to 59th) within the hour. The data interval of the first schedule spans from the previous run time up to the current one, the second schedule extends from the current interval to the next, and subsequent schedules follow this pattern.
Example: If the current time is 15th May 2024, 10:30 A.M. and you want the schedule to start at the 10th minute of every hour, set the Perform every option to 1 hour and the Past the hour option to 10 minutes. The schedule will run at the 10th minute of every hour. The first schedule will cover 10:10 A.M. to 11:10 A.M., the second schedule will cover 11:10 A.M. to 12:10 P.M., and so on.
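For illustration, the interval arithmetic in this example can be sketched in a few lines of Python (a simplified model, not DataPrep's scheduler):

```python
# Sketch of the hourly "past the hour" arithmetic from the example above.
from datetime import datetime, timedelta

def hourly_runs(start, perform_every_hours, past_the_hour, count):
    """Yield (data_interval_start, data_interval_end) for each scheduled run."""
    # The first run lands on the next occurrence of the chosen minute.
    first = start.replace(minute=past_the_hour, second=0, microsecond=0)
    if first <= start:
        first += timedelta(hours=1)
    step = timedelta(hours=perform_every_hours)
    for i in range(count):
        end = first + i * step
        yield end - step, end      # interval spans previous run to current run

now = datetime(2024, 5, 15, 10, 30)
for begin, end in hourly_runs(now, 1, 10, 2):
    print(begin.strftime("%I:%M %p"), "->", end.strftime("%I:%M %p"))
# 10:10 AM -> 11:10 AM
# 11:10 AM -> 12:10 PM
```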
When the Repeat method is set to Daily, you can configure the schedule to recur every 1 to 31 days, specifying the exact hour and minute for each run. The data interval of the first schedule spans from the previous run time up to the current one, the second schedule extends from the current interval to the next, and subsequent schedules follow this pattern.
When the Repeat method is set to Weekly, you can likewise choose the interval in weeks, the day of the week, and the time.
Example: If the current time is 4th May 2024, Saturday, 7:30 P.M. and you want the schedule to recur at 2-week intervals on Sunday at 12:30 P.M., set the Repeat option to Weekly, the Every option to 2 weeks, Sun, and the Perform at option to 12 hrs and 30 mins. The schedule will run every 2 weeks on Sunday at 12:30 P.M.
The data interval of the first schedule will be from 21st April, Sunday, 12:30 P.M. to 5th May, Sunday, 12:30 P.M., the second schedule will run from 5th May, Sunday, 12:30 P.M. to the Sunday 2 weeks later, and so on.
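A similar sketch, with the same caveat that this is only a simplified model, reproduces the weekly intervals from this example:

```python
# Sketch of the weekly example: every 2 weeks on Sunday at 12:30 P.M.
from datetime import datetime, timedelta

def weekly_intervals(now, every_weeks, weekday, hour, minute, count):
    """Yield (start, end) data intervals; weekday: Monday=0 ... Sunday=6."""
    run = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    run += timedelta(days=(weekday - run.weekday()) % 7)  # next chosen weekday
    if run <= now:
        run += timedelta(weeks=1)
    step = timedelta(weeks=every_weeks)
    for i in range(count):
        end = run + i * step
        yield end - step, end      # first interval reaches back one full step

now = datetime(2024, 5, 4, 19, 30)            # Sat, 4 May 2024, 7:30 P.M.
for begin, end in weekly_intervals(now, 2, 6, 12, 30, 2):
    print(begin.strftime("%d %b %I:%M %p"), "->", end.strftime("%d %b %I:%M %p"))
# 21 Apr 12:30 PM -> 05 May 12:30 PM
# 05 May 12:30 PM -> 19 May 12:30 PM
```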
When the Repeat method is set to Monthly, you can similarly choose the interval in months, the date (or day order), and the time.
Example: If the current time is 31st Jan 2024, 11:30 P.M. and you want the schedule to recur on the 1st of every month at 12:30 A.M., set the Repeat option to Monthly and the Every option to 1 month, choose Date, set the Day order option to From beginning, choose the day as 1st, and set the Perform at option to 0 hrs and 30 mins. The schedule will then run on the 1st of every month at 12:30 A.M.
The first schedule will cover 1st Jan, 12:30 A.M. to 1st Feb, 12:30 A.M., the second schedule will run from 1st Feb, 12:30 A.M. to the 1st of the next month, 12:30 A.M., and so on.
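The monthly intervals from this example can be sketched the same way (again a simplified model; a production scheduler must also handle dates such as the 31st that don't exist in every month):

```python
# Sketch of the monthly example: 1st of every month at 12:30 A.M.
from datetime import datetime

def monthly_runs(first_year, first_month, day, hour, minute, count):
    """Return (start, end) intervals for a run on a fixed day of each month."""
    intervals = []
    y, m = first_year, first_month
    prev = datetime(y, m, day, hour, minute)
    for _ in range(count):
        m += 1
        if m > 12:                 # roll over to January of the next year
            y, m = y + 1, 1
        nxt = datetime(y, m, day, hour, minute)
        intervals.append((prev, nxt))
        prev = nxt
    return intervals

for begin, end in monthly_runs(2024, 1, 1, 0, 30, 2):
    print(begin.strftime("%d %b %I:%M %p"), "->", end.strftime("%d %b %I:%M %p"))
# 01 Jan 12:30 AM -> 01 Feb 12:30 AM
# 01 Feb 12:30 AM -> 01 Mar 12:30 AM
```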
2. Select the time zone (GMT) in which you want to import new data found in the source. By default, your local time zone will be selected.
3. While configuring the schedule, the import configuration must be set up for all the sources. The schedule cannot be saved without setting up the import configuration.
Select the Click here link to set the import configuration.
The import configuration differs from source to source.
How to import data from source? - Select how you would like to import your data from the drop-down: Import all data, Incremental file fetch, or Do not import data.
You can also set a batch size to limit the number of files fetched in each schedule. For example, if your data source contains 5 files that match the file pattern and you set the batch size to 10, then the 5 files will be fetched in the first schedule and exported as a single file based on the created or modified time in the source.
In the second schedule, if 4 new files matching the same file pattern are added or modified in the source, then all 9 files will be fetched and exported as a single file based on the created or modified time in the source.
In the third schedule, if 11 new files are added and the data source now contains 20 files that match the file pattern, then only the first 10 files will be fetched and exported as a single file based on the created or modified time in the source, since the batch size is set to 10. The same 10 files will be fetched from the source in the upcoming schedules as well.
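To make the batch-size behavior concrete, here is a small Python simulation of the three schedules above (an illustrative model; fetch_batch() is a hypothetical name, not a DataPrep API):

```python
# Simulating the "Import all data" batch-size behavior described above.
def fetch_batch(files, batch_size):
    """Fetch matching files, oldest first by created/modified time, capped at batch_size."""
    ordered = sorted(files, key=lambda f: f["modified"])
    return ordered[:batch_size]

source = [{"name": f"file{i}.csv", "modified": i} for i in range(1, 6)]
print(len(fetch_batch(source, 10)))   # schedule 1: all 5 files fetched

source += [{"name": f"file{i}.csv", "modified": i} for i in range(6, 10)]
print(len(fetch_batch(source, 10)))   # schedule 2: all 9 files fetched again

source += [{"name": f"file{i}.csv", "modified": i} for i in range(10, 21)]
print(len(fetch_batch(source, 10)))   # schedule 3: capped at the first 10 of 20
```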
You can configure how to import and fetch incremental data from your source using the Import configuration option. You can import incremental data from sources like cloud storage, Zoho Creator, Zoho CRM, Zoho Bigin, Salesforce, cloud databases, local databases, FTP, and local files from the local network.
In incremental file fetch, the data interval of the first schedule spans from the previous run time up to the current one, the second schedule extends from the current interval to the next, and subsequent schedules follow this pattern.
Import configuration for incremental file fetch
The import configuration for incremental fetch differs from source to source. Below is an example of the import configuration for incremental fetch in cloud storage, FTP, and local file systems.
Use the previously imported file if no new file is available:
When there are no new files in the source during incremental import,
If the checkbox is checked: The last fetched files will be imported again.
If the checkbox is unchecked: The import will be skipped and no files will be imported.
Which file to import? You can choose to import All files, Newest file, or Oldest file using this option.
In the third schedule, if 13 new files that match the file pattern are added to the source, then only the first 10 files will be fetched and exported as a single file based on the created time in the source, since the batch size is set to 10. The same logic applies during incremental fetch in the upcoming schedules.
The data is imported only once. In subsequent schedules, the rules are applied to the same data and it is exported again.
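The selection logic of these incremental fetch options can be sketched as follows (illustrative only; incremental_fetch() and its parameters are hypothetical names, not DataPrep APIs):

```python
# Sketch of the incremental file fetch options: which file to import,
# batch size, and the "use previously imported file" checkbox.
def incremental_fetch(files, last_sync, choice, batch_size,
                      reuse_previous, previous_batch):
    """Return the files to import for one scheduled incremental run."""
    new = sorted((f for f in files if f["modified"] > last_sync),
                 key=lambda f: f["modified"])
    if not new:                       # no new/modified files this interval
        return previous_batch if reuse_previous else []
    if choice == "Newest file":
        return [new[-1]]
    if choice == "Oldest file":
        return [new[0]]
    return new[:batch_size]           # "All files", capped at the batch size

files = [{"name": f"f{i}.csv", "modified": i} for i in range(1, 14)]
picked = incremental_fetch(files, last_sync=0, choice="All files",
                           batch_size=10, reuse_previous=True, previous_batch=[])
print(len(picked))                    # 10 of the 13 new files, per the batch size
# Next run finds nothing new, so the previous batch is imported again:
print(incremental_fetch(files, 13, "All files", 10, True, picked) is picked)  # True
```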
4. Click Save to save the schedule import configuration for your pipeline.
Stop export if data quality drops below 100%: Use this toggle if you would like to stop the export whenever data quality drops below 100 percent.
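As a rough sketch, the toggle behaves like the following check; treating quality as the share of error-free rows is an assumption here, not DataPrep's documented metric:

```python
# Hypothetical model of the quality gate, assuming quality = % of valid rows.
def should_export(total_rows, invalid_rows, stop_below_100=True):
    quality = 100 * (total_rows - invalid_rows) / total_rows if total_rows else 100
    return quality == 100 or not stop_below_100

print(should_export(1000, 0))    # True: quality is 100%, export proceeds
print(should_export(1000, 3))    # False: quality dropped, export is stopped
```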
Note: This option will be visible only if you have added more than one destination in your pipeline.
To rearrange the order of your export destinations:
1. Click the Order exports toggle.
5. After you configure the schedule, click Save to activate it. This will start running the pipeline on schedule.
Each scheduled run is saved as a job. When a pipeline is scheduled, the data will be fetched from your data sources, prepared using the series of transforms you have applied in each of the stages, and then data will be exported to your destination at regular intervals. This complete process is captured in the job history.
6. To go to the jobs list of a particular pipeline, click the ellipsis icon in the pipeline builder and select the Job history menu to check the job status of your pipeline.
7. Click the required job ID on the Job history page to navigate to the Job summary of that job.
The Job summary shows the history of a job executed in a pipeline flow. Click here to know more.
You can view the status of the schedule on the Job summary page. There are three different statuses for a scheduled job in DataPrep: Running, Success, or Failure.
If a job fails, you can identify whether the error occurred at the import stage, the transform stage, or with the destination and target matching on the Job summary page. You can hover over the failed entity to view the error details and fix them to proceed with the export.
In the Overview tab, you can view the status of the run along with details such as the user who ran the pipeline, storage used, total rows processed, and the start time, end time, and duration of the run. Click here to know more.
In the Stages tab, you can view the details of each pipeline stage, such as Import, Transform and Export. Click here to know more.
In the Output tab, you see the list of all exported data. You can also download the outputs if needed. Click here to know more.
8. When the schedule is completed, the data prepared in your pipeline will be exported to the configured destinations.
Info: You can also view the status of your schedules later on the Jobs page.
Note: If you make any further changes to the pipeline, the changes are saved as a draft version. Choose the Draft option and mark your pipeline as ready for the changes to reflect in the schedule.
After you set your schedule, you can choose to Pause schedule or Resume schedule, Edit schedule, or Remove schedule using the Schedule Active option in the pipeline builder.
When you edit and save a schedule, the next job will cover the data interval from the last run time of the schedule to the next scheduled run.
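In other words, the edited schedule resumes without a gap. A tiny sketch of the rule (illustrative only):

```python
# After an edit, the next data interval spans last run time -> next scheduled run.
from datetime import datetime

def next_interval_after_edit(last_run, next_scheduled):
    return (last_run, next_scheduled)   # no data is skipped between runs

last_run = datetime(2024, 5, 5, 12, 30)
next_scheduled = datetime(2024, 5, 12, 9, 0)   # hypothetical new run time
print(next_interval_after_edit(last_run, next_scheduled))
```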
SEE ALSO
Export destinations in DataPrep
If you'd like a personalized walk-through of our data preparation tool, please request a demo and we'll be happy to show you how to get the best out of Zoho DataPrep.