1. What is Zoho DataPrep?
Zoho DataPrep is an advanced self-service data preparation tool that helps organizations model, cleanse, prepare, enrich and organize large volumes of data from multiple data sources to serve data analytics and data warehousing with exceptional data quality, all without the need for any coding.
2. Can I get a quick walk through session of Zoho DataPrep?
Yes, please request a personalized demo by mailing us at support@zohodataprep.com. Meanwhile, you can watch the getting started video from here.
3. What's new with Zoho DataPrep 2.0?
We have recently launched the 2.0 version of DataPrep, making it easier than before to build an end-to-end pipeline and have complete control over data quality as well as data movement with our visual pipeline builder. The pipeline builder lets you bring in data from multiple data sources, perform various transformations, including data blending, and export to multiple destinations in a single pipeline. The pipeline can be scheduled as a whole to make orchestration, pipeline management, and monitoring much easier. However, this requires users to map their existing datasets from DataPrep 1.0 to pipelines so that we can group the datasets together and show them as pipelines. Here's a detailed view of what 2.0 offers.
4. How can a user move to the latest 2.0 version of Zoho DataPrep?
You can move to the 2.0 version of DataPrep using the Try new version option on the top bar. To migrate your data from 1.0 to 2.0, you can follow the steps from our Migration guide. Here's a quick video on how to migrate data from 1.0 to 2.0.
5. How do I use Zoho DataPrep features right inside Zoho Analytics?
You can use the Zoho DataPrep add-on available within Zoho Analytics. You can access the Zoho DataPrep add-on using the Prepare data option while importing data, or you can use the More option and select Prepare data to clean data that is already present in Zoho Analytics.
6. How do I prepare existing data in Zoho Analytics using the Zoho DataPrep Add-on?
You can click the More option from the top menu and select the Prepare data option to clean the data that is already present in Zoho Analytics. You can also move data from Zoho Analytics to Zoho DataPrep through DataPrep's import flow and utilize the full capability of Zoho DataPrep.
7. Can I import data from Zoho DataPrep to an existing table in Zoho Analytics?
You cannot import data from DataPrep directly into an existing table within Zoho Analytics. However, you can add Zoho Analytics as a destination in Zoho DataPrep and push data to an existing table in Zoho Analytics by running the pipeline.
8. What connectors are supported in Zoho DataPrep?
Zoho DataPrep currently supports the connectors below, and we are working to add more:
- Zoho Analytics
- Zoho CRM
- Salesforce
- Zoho Bigin
- Zoho Creator
9. Can I import data from Zoho DataPrep into Zoho Analytics and vice versa?
Yes. You can seamlessly import your data from Zoho DataPrep to Zoho Analytics by using the Zoho DataPrep connector in the Import your Data section in Zoho Analytics. Similarly, you can use the Zoho Analytics connector in Zoho DataPrep to import and export your data from DataPrep.
10. How secure is the data I upload to Zoho DataPrep?
We do not access any data that you upload to Zoho DataPrep. All data is encrypted at our data centers. We only collect basic information on how you use the product and which features are used most, so that we can enhance and improve them. We assure you that we do not share this information externally and use it only for internal evaluation.
11. How many pipeline schedules are allowed at a time in Zoho DataPrep?
You can run any number of pipeline schedules at a time in Zoho DataPrep by adding any of our available destinations and configuring the schedule details and frequency. However, there is a limit on the rows processed based on your subscription.
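As a rough illustration of what a schedule captures, here is a minimal sketch in Python. The structure and field names are hypothetical; DataPrep configures all of this through its UI, not through code.

```python
from dataclasses import dataclass

# Hypothetical shape of a pipeline schedule: a destination plus timing
# details. Not DataPrep's actual configuration format.
@dataclass
class PipelineSchedule:
    pipeline: str     # the pipeline to run
    destination: str  # any supported destination, e.g. "Zoho Analytics"
    frequency: str    # e.g. "hourly", "daily", "weekly"

# Any number of schedules can be active at once; only the number of rows
# processed is capped by your subscription.
schedules = [
    PipelineSchedule("Leads pipeline", "Zoho Analytics", "daily"),
    PipelineSchedule("Orders pipeline", "Zoho CRM", "hourly"),
]
```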
12. What is the maximum number of columns that I can create within a stage?
In Zoho DataPrep, up to 400 columns are supported within a single stage. This allows you to prepare large volumes of data with ease.
13. Can I import files of any size in Zoho DataPrep?
You can import JSON and XML files of a maximum size of 20 MB and import other supported files up to a size of 100 MB. The supported file types are CSV, TSV, JSON, HTML, XLS, XLSX and XML.
14. How many files and tables can I import in Zoho DataPrep at a time?
We support the import of up to 10 files or tables at a time in Zoho DataPrep.
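To make the limits from the last two questions concrete, here is a hedged pre-check you could run locally before importing; the thresholds mirror the FAQ above, while the helper itself is illustrative and not part of DataPrep.

```python
import os

SUPPORTED = {".csv", ".tsv", ".json", ".html", ".xls", ".xlsx", ".xml"}
MB = 1024 * 1024
MAX_FILES = 10  # up to 10 files or tables per import

def check_import(paths):
    """Raise if the batch of files would exceed DataPrep's import limits."""
    if len(paths) > MAX_FILES:
        raise ValueError(f"Import at most {MAX_FILES} files or tables at a time")
    for path in paths:
        ext = os.path.splitext(path)[1].lower()
        if ext not in SUPPORTED:
            raise ValueError(f"Unsupported file type: {path}")
        # JSON and XML are capped at 20 MB; other supported types at 100 MB.
        limit = 20 * MB if ext in {".json", ".xml"} else 100 * MB
        if os.path.getsize(path) > limit:
            raise ValueError(f"{path} exceeds the {limit // MB} MB limit")

# Example (assumes the files exist locally):
# check_import(["leads.csv", "contacts.json"])
```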
15. How many rows can be processed every month in Zoho DataPrep?
The number of rows that can be processed each month depends on your subscription; see the editions and pricing below for the limits that apply to your plan.
16. What are the different editions of Zoho DataPrep and what is the price?
The various editions of Zoho DataPrep and their pricing are available here.
17. What should be done if the export fails due to dataset quality issues such as invalid values in the data?
If the export fails due to errors such as the ones below, the invalid values in your data must be fixed.
1. Export was aborted since the dataset quality is below the configured minimum quality.
2. Export has failed due to invalid data present in your dataset which is not accepted by the destination.

Follow the steps to find the invalid values in the data:
1. Navigate to the Job summary -> Output tab.

2. Click the Data quality symbol to view the invalid columns.

3. After identifying the invalid columns, click the Edit Pipeline option on the Job Summary page and navigate to the Studio page. Use the appropriate transforms to correct the invalid values, then rerun the pipeline.
4. Sometimes the sample might not contain any invalid values, yet the full data does. In this case, go to the last data preparation stage, click the edit icon beside Sample Strategy in the right-hand pane, select Erroneous sample, and click Apply.
This will surface invalid values in your sample data. Fix these invalid values and retry the job.
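For intuition, here is a conceptual sketch of the quality gate described above: an export is aborted when the share of valid rows falls below the configured minimum quality. The validity check and numbers are illustrative, not DataPrep's API.

```python
def quality_percent(rows, is_valid):
    """Percentage of rows that pass the validity check."""
    if not rows:
        return 100.0
    return 100.0 * sum(1 for row in rows if is_valid(row)) / len(rows)

def can_export(rows, is_valid, minimum_quality=100.0):
    """Mirror the gate: export only if quality meets the configured minimum."""
    return quality_percent(rows, is_valid) >= minimum_quality

rows = [{"email": "a@example.com"}, {"email": "not-an-email"}]
print(can_export(rows, lambda r: "@" in r["email"]))  # False: 50% < 100%
```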
18. Can we convert a .txt file to .csv using Zoho DataPrep?
While direct conversion isn’t available, you can import a text file, prepare the data, and then export it as a CSV file.
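Outside DataPrep, the equivalent flow can be sketched in a few lines of Python; the pipe delimiter and file names here are assumptions about the input.

```python
import csv

# Read delimited text, lightly clean it, and write it back out as CSV.
with open("input.txt", newline="") as src, \
     open("output.csv", "w", newline="") as dst:
    reader = csv.reader(src, delimiter="|")  # assumed delimiter
    writer = csv.writer(dst)
    for row in reader:
        writer.writerow([cell.strip() for cell in row])
```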
19. Are there any limitations on the number of records that can be exported via schedule run to Cloud Sources?
There are no record count limitations, but batch size limits will apply during the export. Click here to know more about batch size limits.
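Conceptually, batching means the records are split into fixed-size chunks and pushed to the destination one chunk at a time. The sketch below illustrates the idea with a placeholder batch size; the actual limits vary by destination, as documented in the link above.

```python
def batches(records, batch_size):
    """Yield fixed-size chunks of the record list."""
    for start in range(0, len(records), batch_size):
        yield records[start:start + batch_size]

records = list(range(25_000))
for batch in batches(records, batch_size=10_000):  # placeholder size
    pass  # each chunk would be pushed to the destination in one request
# -> 3 batches: 10,000 + 10,000 + 5,000 records
```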
20. How to re-authenticate a connection?
In Zoho DataPrep, scheduled or other runs may fail when re-authentication is needed. Re-authentication is the act of taking over a schedule from a user who is no longer part of the organization or is inactive. A schedule needs to be re-authenticated if any of the following conditions are true:
- The user who configured the schedule has been removed from the organization
- The user who configured the schedule is inactive
- The connection used is either deleted or no longer shared with the user
If editing a schedule fails with the error below, follow these steps to re-authenticate your schedule. You can click the Re-authenticate option from either of the following places:
1. In the pipeline builder, right-click the data source icon and choose the Datasource details option, or click the data source icon once to view the data source details.
2. In the pipeline builder, open the Studio page of the required dataset, click the Ruleset icon, and then click the Data source configuration icon.
In both places, the Re-authenticate option appears on the Data source details when the source has failed, either because the schedule creator is deactivated or removed from the organization, or because the connection used is deleted or no longer shared with the user.
On the Data source details, click the Re-authenticate option to re-authenticate the connection. Once re-authenticated, you can proceed to schedule your data seamlessly.
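The conditions above boil down to a simple check. Here is a hedged sketch of that logic; the Schedule and Connection shapes are invented for illustration and are not DataPrep internals.

```python
from dataclasses import dataclass

@dataclass
class Connection:
    deleted: bool
    shared_with_current_user: bool

@dataclass
class Schedule:
    creator_in_org: bool
    creator_active: bool
    connection: Connection

def needs_reauthentication(s: Schedule) -> bool:
    """True if any of the documented re-authentication conditions hold."""
    return (
        not s.creator_in_org        # creator removed from the organization
        or not s.creator_active     # creator deactivated
        or s.connection.deleted     # connection deleted
        or not s.connection.shared_with_current_user  # no longer shared
    )

conn = Connection(deleted=False, shared_with_current_user=False)
print(needs_reauthentication(Schedule(True, True, conn)))  # True: not shared
```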
21. When does the stage show "Cached" status in the job summary?
When a transform stage shows Cached in the job summary, it means that the stage did not reprocess any data because there were no changes in its rules or in any of its dependent upstream nodes.
For example, consider the below pipeline.

Below is the schedule import configuration for the example pipeline.

The first manual run (test run) will show a "Cached" status for the child datasets of Leads 2023 and Leads 2024, as they are already processed during the append transformation. So, when a manual run with existing data is performed, those stages are reused and marked as "Cached", while the other transform stages are marked as "Success".

However, if you re-run the pipeline (a manual run with existing data, or any other run with the import configuration set to "Do not import data") with no new rules added, the stage status will be "Cached". Since there are no changes in the rules or input data, every stage except the Google Drive import is cached.
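The caching rule reduces to this: a transform stage re-runs only if its own rules changed or something upstream changed. Below is a minimal model of that decision, with illustrative flags rather than DataPrep internals.

```python
def transform_stage_status(rules_changed: bool, upstream_changed: bool) -> str:
    """A stage reprocesses only when its rules or any upstream node changed."""
    if rules_changed or upstream_changed:
        return "Success"  # stage re-runs on the data
    return "Cached"       # previous output is reused

# Re-running the example pipeline with no new rules and no new import:
print(transform_stage_status(rules_changed=False, upstream_changed=False))  # Cached
```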

22. When does the stage show "Not run" status in the job summary?
The stages in the job summary show "Not Run" status in the following cases:
- Manual runs with existing data
- Runs where the import configuration is set to "Do not import data"
- Sectional runs
In these cases, the pipeline uses previously imported data instead of fetching new data from the source. As a result, the corresponding import stages are marked as "Not Run".
Example:
Let’s say you have two datasets, Leads 2023 and Leads 2024, coming from different sources. You import them, combine (append) them, and add a destination.

1. If you do a manual run with existing data, both import stages will show "Not Run", since the data isn’t fetched again.

2. Later, if you add a rule only in the Leads 2024 dataset and do a sectional run, only the Leads 2024 transform stage will run and show "Success". The import and export stages may still show "Not Run" because data wasn't re-imported or pushed again.
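Putting the two cases together, the import-stage status follows directly from the run type. The sketch below mirrors the run types named in this answer; the strings are illustrative, not an actual API.

```python
def import_stage_status(run_type: str) -> str:
    """Runs that reuse previously imported data mark import stages "Not Run"."""
    reuses_existing_data = run_type in {
        "manual run with existing data",
        "do not import data",
        "sectional run",
    }
    return "Not Run" if reuses_existing_data else "Success"

print(import_stage_status("manual run with existing data"))  # Not Run
print(import_stage_status("scheduled run with import"))      # Success
```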