You can enter the URL to a CSV, TSV, JSON, HTML, XLS, XLSX or XML file. DataPrep also supports URL imports that require authentication.
To import files from URLs
1. Create a pipeline or open an existing pipeline from the Home Page, Pipelines tab or Workspaces tab and click the Add data option. You can also click the Import data option under the Workspaces tab to import data.
Info: You can also click the Import data icon at the top of the pipeline builder and bring data from multiple sources into the pipeline.
2. Click URL from the Choose your data source section.
3. Enter the Dataset name and the request URL to fetch your data.
You can also click the Try sample link to import the sample dataset and see how it works.
4. Enter the necessary parameters and headers.
For example, you can declare query parameters and request headers that specify details such as the data type, content format, and language.
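As a rough illustration of how these settings map onto an HTTP request, here is a minimal Python sketch; the endpoint, parameter names, and header values below are hypothetical examples, not values required by DataPrep.

    import requests

    # Hypothetical endpoint and values; substitute the details given by your API provider.
    url = "https://api.example.com/v1/records"
    params = {"format": "json"}                 # a query parameter declared under Parameters
    headers = {
        "Content-Type": "application/json",     # content format
        "Accept-Language": "en-US",             # language
    }

    response = requests.get(url, params=params, headers=headers)
    print(response.status_code)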
5. Select one of the pagination options if you want to import your data page-wise.
Note: The None option is selected by default.
Pagination
i) Page Number - Select this option if your data is stored on multiple pages and you want to import them in the same order. You need to provide the following details when using this option:
- Page number parameter: The page parameter name provided by your API/URL provider.
- Initial value: The initial page number from which you want to start importing.
- Number of requests: The number of requests to execute in sequence.
For example, if the Page number parameter is Page and the Number of requests is 10, the URL is executed 10 times, fetching the 10 pages in sequence (see the sketch after this section).
ii) Offset and limit - Select this option if you have a large dataset and want to import the data in batches by providing the start position and size of each batch. You need to provide the following details when using this option:
- Offset parameter: The offset parameter name given by your API provider.
- Initial value: The initial value of the offset.
- Number of requests: The number of requests to execute in sequence.
- Limit parameter: The limit parameter name given by your API provider.
- Limit value: The number of records to be fetched from the offset.
From the second request onwards, the offset value is the previous offset value plus the limit. For example, if the Offset parameter name is Offset, the Limit parameter name is Limit, the Limit value is 100, and the Number of requests is 10, then 10 requests are executed, each fetching 100 records.
iii) Next page URL - Select this option if you have your data in pages where each page has the URL for the next page. You need to provide the following details when using this option:
- URL property path: The property name that contains the URL of the next page.
For example, if the URL property path is /next_page_URL, the next page is fetched from the JSON property /next_page_URL. Execution continues until the /next_page_URL property is empty or null.
Note:
1. The Next Page URL pagination is applicable only for JSON and XML files.
2. These pagination options can be used in Zoho DataPrep only if your API provider permits you to do so.
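The sketch below shows, in simplified form, how the three pagination options translate into a sequence of requests. It is a hypothetical Python example; the endpoint and the parameter names Page, Offset, Limit, and next_page_URL are placeholders for whatever your API provider actually uses.

    import requests

    url = "https://api.example.com/v1/records"   # hypothetical endpoint

    # Page number: parameter name "Page", initial value 1, number of requests 10.
    for page in range(1, 11):
        requests.get(url, params={"Page": page})

    # Offset and limit: initial offset 0, limit 100, number of requests 10.
    # From the second request onwards, the offset is the previous offset plus the limit.
    limit = 100
    for i in range(10):
        requests.get(url, params={"Offset": i * limit, "Limit": limit})

    # Next page URL: follow the /next_page_URL property until it is empty or null.
    next_url = url
    while next_url:
        next_url = requests.get(next_url).json().get("next_page_URL")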
6. Select one of the authorization types (illustrated in the sketch after the Note below):
- No Auth - The URL you provide doesn't require any authorization.
- Basic Auth - The URL requires a 'Username' and 'Password' to authorize access and import data.
- OAuth2.0 - The URL is secured using the OAuth 2.0 authentication technique. It requires the 'Client ID', 'Client Secret', 'Access Token', and 'Refresh Token' to authorize the URL and allow import.
Note: The Auth configuration is saved for importing from the URL in the future. Credentials are securely encrypted and stored.
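For reference, the sketch below shows roughly what each authorization type corresponds to at the HTTP level. It is a hypothetical Python example; DataPrep sends the credentials for you once the connection is configured, and the endpoint and credential values are placeholders.

    import requests

    url = "https://api.example.com/v1/records"   # hypothetical endpoint

    # No Auth: the URL is fetched without any credentials.
    requests.get(url)

    # Basic Auth: the username and password are sent with each request.
    requests.get(url, auth=("my_username", "my_password"))

    # OAuth2.0: an access token (issued using the client ID and client secret,
    # and renewed with the refresh token) is sent as a Bearer header.
    requests.get(url, headers={"Authorization": "Bearer <access_token>"})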
7. Give a name to your connection in the Connection name text box.
8. Click Authenticate to fetch the data from the URL.
9. Once all files are successfully imported, you will be taken to the Pipeline builder page where you can start applying transforms. You can also right-click the stage and choose the Prepare data option to prepare your data in the DataPrep Studio page.
10. Once you are done creating your data flow and applying the necessary transforms in your stages, you can right-click a stage and add a destination to complete your data flow.
Note: After adding a destination to the pipeline, try executing your pipeline using a manual run first. Once you make sure the manual run works, you can then set up a schedule to automate the pipeline. Learn about the different types of runs here.
To edit the URL connection
DataPrep saves your data connections to avoid the hassle of keying in the credentials every time you need to connect to a data source or destination. You can always edit a saved data connection and update it with new parameters or credentials using the Edit connection option.
1. Click Saved data connections from the Choose a data source box while creating a new dataset.
2. You can manage your saved data connections right from the data import screen. Click the ellipsis (3 dots) icon to share, edit, view the connection overview, or remove the connection.
3. Click the Edit connection option to update the saved connection with new parameters or credentials.
4. You can also edit the pagination options under the Pagination tab if you want to import your data page-wise:
- Page number
- Offset and limit
- Next page URL
5. Click the Update button to update the connection.