Zoho DataPrep allows you to export data to Heroku PostgreSQL, a managed cloud database service built on PostgreSQL (also known as Postgres), a free and open-source relational database management system.
To export data to Heroku PostgreSQL
1. Open an existing pipeline or create a pipeline from the Home Page, Pipelines tab, or Workspaces tab. You can bring your data from 50+ sources.
2. On the Pipeline Builder page, once you have completed creating your data flow and applied the necessary transforms in your stages, right-click a stage and select the Add Destination option.
3. Select PostgreSQL from the Cloud databases category or search for it in the search box.
Note: If you have already added a PostgreSQL connection earlier, you can simply select the existing connection under the Saved connections section and proceed with exporting.
4. If your data contains columns with personal data, you can choose to include or exclude some or all of them in the PII columns section.
You can also apply one of the following security methods to protect your data columns:
A. Data masking
Data masking replaces the original values with 'x' characters to protect personal information.
B. Data Tokenization
Data tokenization replaces each distinct value in your data with a random token. Because each distinct value always maps to the same token, the output retains the statistical structure of the original data.
C. None
You can select None if you do not want to use any security method.
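The difference between the two security methods can be sketched in a few lines of Python. This is an illustrative approximation of the behavior described above, not DataPrep's actual implementation: masking hides every character, while tokenization keeps repeated values consistent.

```python
import secrets

def mask(value: str) -> str:
    """Data masking: hide every character with 'x' (illustrative)."""
    return "x" * len(value)

def make_tokenizer():
    """Data tokenization: each distinct value maps to one random token,
    so counts and groupings in the output mirror the original data."""
    tokens = {}
    def tokenize(value: str) -> str:
        if value not in tokens:
            tokens[value] = secrets.token_hex(4)
        return tokens[value]
    return tokenize

emails = ["a@example.com", "b@example.com", "a@example.com"]
tokenize = make_tokenizer()
masked = [mask(e) for e in emails]
tokenized = [tokenize(e) for e in emails]
# Repeated originals produce repeated tokens, e.g. tokenized[0] == tokenized[2]
```

Note how the masked output destroys all structure, while the tokenized output still shows that the first and third values were identical.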
5. Click Next and select Heroku PostgreSQL in the Database service name dropdown.
6. Enter the Endpoint, Port, Database name, Username and Password to authenticate the database connection.
7. You can also select the Use SSL checkbox if your database server has been set up to serve encrypted data through SSL.
8. Enter a unique name for your connection under Connection name and click Connect.
Note: The connection configuration will be saved for exporting data in the future. Credentials are securely encrypted and stored.
Note: If you face trouble connecting to your database, please make sure Zoho DataPrep's IP addresses are whitelisted in your application to export data to cloud databases.
Click here to learn about Zoho DataPrep IP addresses.
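The connection details collected in steps 5-8 correspond to the fields of a standard PostgreSQL connection URI. The sketch below builds such a URI for illustration; the hostname, database, and credentials are hypothetical placeholders, and `sslmode=require` is the URI-level equivalent of the Use SSL checkbox (Heroku Postgres requires SSL connections).

```python
from urllib.parse import quote

def build_postgres_uri(endpoint, port, database, username, password, use_ssl=True):
    """Build a libpq-style connection URI from the same fields DataPrep
    asks for: Endpoint, Port, Database name, Username, and Password."""
    uri = (
        f"postgresql://{quote(username)}:{quote(password, safe='')}"
        f"@{endpoint}:{port}/{database}"
    )
    if use_ssl:
        # Heroku Postgres rejects unencrypted connections
        uri += "?sslmode=require"
    return uri

# Hypothetical example values, not real credentials:
uri = build_postgres_uri(
    "ec2-1-2-3-4.compute-1.amazonaws.com", 5432, "mydb", "user", "p@ss"
)
```

Percent-encoding the username and password matters here: characters like `@` in a password would otherwise break the URI.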
9. Once you have successfully connected to your Heroku PostgreSQL account, you can choose how and where to export the data.
10. Choose Existing table if you want to export data to an existing table, and select one from the list of tables available in the database. If you select the Existing table option, there are two ways to add the new rows to the table.
- If the new rows are to be added to the table, choose Append.
- If the newly added rows are to replace the existing rows, select Overwrite from the dropdown.
11. If you want to create a new table and export data, select the New table option, enter the Schema name and Table name, and choose how to add the new rows to the table.
Note: Schema name is not a mandatory field.
- If the new rows are to be added to the table, choose Append.
- If the newly added rows are to replace the existing rows, select Overwrite from the dropdown.
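Conceptually, the two write modes map onto familiar SQL: Append behaves like plain INSERTs that keep existing rows, while Overwrite clears the table before inserting. The sketch below generates the equivalent statements under those assumed semantics; it illustrates the idea and is not DataPrep's actual export internals.

```python
def export_sql(schema, table, columns, mode="append"):
    """Sketch of the SQL behind the two write modes (assumed semantics):
    append    -> INSERT new rows, keeping the existing ones
    overwrite -> remove existing rows first, then INSERT"""
    target = f'"{schema}"."{table}"' if schema else f'"{table}"'
    cols = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join(["%s"] * len(columns))
    stmts = []
    if mode == "overwrite":
        stmts.append(f"TRUNCATE TABLE {target};")
    stmts.append(f"INSERT INTO {target} ({cols}) VALUES ({placeholders});")
    return stmts

# Schema name is optional, mirroring the note above:
stmts = export_sql("public", "orders", ["id", "amount"], mode="overwrite")
```

With `mode="append"` only the INSERT is emitted; with `mode="overwrite"` a TRUNCATE runs first so the new rows replace the existing ones.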
Note: For schedule and backfill runs, the first export will be done to a new table and subsequent exports will be done to the existing table; this option will be used to add the new rows to the existing table.
12. Click Save. Now that you have added a destination, you may want to try executing your pipeline using a manual run first. Once you make sure the manual run works, you can set up a schedule to automate the pipeline. Learn about the different types of runs here.
Info: Each run is saved as a job. When a pipeline run is executed, the data fetched from your data sources is prepared using the series of transforms you have applied in each of the stages, and then exported to your destination. This complete process is captured in the Jobs page.
13. If the manual run succeeds without any errors, your data will be exported successfully. If you are exporting data to an existing table in your cloud database and the manual run fails with the target match error below, you can fix it by completing the target matching steps.
Target matching is a useful feature in DataPrep that prevents export failures caused by data model mismatches.
Note: Target matching will be applied even if you export data to a new table and automate the pipeline using the Schedule run option. Only during the first scheduled run will the table be treated as a new table. In subsequent exports, it will be treated as an existing table and target matching will be applied.
Target matching during export to cloud databases
Target matching happens before the data is exported to the destination, and prevents export failures caused by data model mismatches. Using target matching, you can set the required cloud database table as the target and align the source dataset columns to match your target table. This ensures seamless export of high-quality data to the cloud databases.
Note: A target matching failure is not an export failure. Target matching happens before the data is actually exported to the destination, so schema or data model errors that could cause the export to fail are caught beforehand.
When target match check fails
1. If the target match check fails during export, go to the DataPrep Studio page, click the target matching icon at the top right corner, and choose the Show target option. The target's data model is displayed above the existing source dataset. The columns in the source dataset are automatically aligned to match the columns in the target dataset, where matches are found.
Target matching displays icons and suggestions on the matched and unmatched columns. You can click these suggestions to quickly make changes so an existing column matches the target column. To make it easier to fix the errors, the target table in your cloud database is attached as a target to your data. You can view the mapping of your data to the table in the DataPrep Studio page, along with errors wherever there is a mismatch. Hover over the error icons to understand each issue and click them to resolve each error.
Note: All columns are displayed in the grid by default. However, you can filter them down to the required set by clicking the All columns link.
2. Click the View summary link to view a summary of the target match errors. The summary shows the different model match errors and the number of columns associated with each error. You can click the required error columns and click Apply to filter those specific error columns.
Target match error summary
- The Target match errors section shows the errors and the number of columns associated with each error.
- The section at the top lists the error categories along with the number of errors in each category.
- You can click them to filter errors related to each category in the panel.
- In the default view, all columns are displayed. However, you can click any error category and get a closer look at the columns or view the error columns alone by selecting the Show only errors checkbox.
- Your filter selection in the Target match error summary will also be applied on the grid in the DataPrep Studio page.
Target matching errors
The errors in target matching are explained below:
- Unmatched columns : This option shows all the unmatched columns in the source and target.
Note:
- The non-mandatory columns in the target can either be matched with a source column if available or ignored.
- The columns in the source that are missing in the target need to be matched or removed to proceed with exporting.
When using the unmatched columns option, you can toggle the Show only mandatory columns option to see if there are any mandatory columns (set as mandatory in the target) and include them. You can also fix only the mandatory columns and proceed with exporting.
- Data type mismatch : This option displays the columns from the source having data types that do not match the columns in the target.
- Data format mismatch : This option displays columns from the source having date, datetime and time formats that differ from those in the target.
- Constraint mismatch : This option displays the columns that do not match the data type constraints of the columns in the target. To know how to add constraints for a column, click here.
- Mandatory column mismatch: This option displays the columns that are set as mandatory in the target but are not set as mandatory in your source.
Note: The mandatory columns cannot be exported to the destination unless they are matched and set as mandatory. You can click the icon above the column to set it as mandatory. You can also use the Set as mandatory (not null) check box under the Change data type transform to set a column as mandatory.
- Data size overflow warnings : This option filters the columns with data exceeding the maximum size allowed in the target.
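The error categories above amount to a schema diff between the source dataset and the target table. The sketch below bundles three of them (unmatched columns, data type mismatch, mandatory column mismatch) into a single check; the dict-based schema shape is a hypothetical structure chosen for illustration, not DataPrep's internal model.

```python
def target_match_check(source, target):
    """Compare a source schema against a target table schema and bucket
    the differences like the target-match error categories above.
    Each schema is a dict: column -> {"type": str, "mandatory": bool}."""
    errors = {"unmatched": [], "type_mismatch": [], "mandatory_mismatch": []}
    for col, spec in source.items():
        if col not in target:
            errors["unmatched"].append(col)           # present in source only
        elif spec["type"] != target[col]["type"]:
            errors["type_mismatch"].append(col)       # data type differs
        elif target[col]["mandatory"] and not spec["mandatory"]:
            errors["mandatory_mismatch"].append(col)  # must be not-null in target
    for col in target:
        if col not in source:
            errors["unmatched"].append(col)           # present in target only
    return errors

source = {"id": {"type": "int", "mandatory": True},
          "name": {"type": "text", "mandatory": False},
          "extra": {"type": "text", "mandatory": False}}
target = {"id": {"type": "int", "mandatory": True},
          "name": {"type": "text", "mandatory": True}}
errors = target_match_check(source, target)
```

Here `extra` is flagged as unmatched (it exists only in the source) and `name` as a mandatory column mismatch (mandatory in the target but not in the source), mirroring how the error summary groups columns by category.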
3. After fixing the errors, you can go to the Pipeline builder page and run your pipeline to export your data.
Once you make sure the manual run works, you can set up a schedule to automate the pipeline. Learn about the different types of runs here.