How to export data to Amazon S3?

Export data to Amazon S3



Zoho DataPrep supports exporting data to the Amazon S3 cloud storage service. Amazon Simple Storage Service (Amazon S3) from AWS provides object storage through a web service interface.

Important: Before connecting your Amazon S3 data with Zoho DataPrep, you need to enable the following permissions in your Amazon S3 account:
1. ListAllMyBuckets
2. GetBucketLocation
3. ListBucket
4. GetObject
Please refer to Amazon S3 help pages to provide these permissions. 
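The four permissions above correspond to an IAM policy along these lines. This is only an illustrative sketch: the bucket name my-export-bucket is a placeholder, and your actual policy should follow the Amazon S3 help pages referenced above.

```python
import json

# Illustrative minimal IAM policy covering the four permissions listed above.
# "my-export-bucket" is a placeholder; substitute your own bucket name.
policy = {
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "AccountLevel",
            "Effect": "Allow",
            "Action": ["s3:ListAllMyBuckets"],
            "Resource": "*",
        },
        {
            "Sid": "BucketLevel",
            "Effect": "Allow",
            "Action": ["s3:GetBucketLocation", "s3:ListBucket"],
            "Resource": "arn:aws:s3:::my-export-bucket",
        },
        {
            "Sid": "ObjectLevel",
            "Effect": "Allow",
            "Action": ["s3:GetObject"],
            "Resource": "arn:aws:s3:::my-export-bucket/*",
        },
    ],
}

print(json.dumps(policy, indent=2))
```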

  To export data to Amazon S3

1. Open an existing pipeline or create a pipeline from the Home page, Pipelines tab, or Workspaces tab. You can bring your data from 50+ sources.

2. On the Pipeline Builder page, once you have created your data flow and applied the necessary transforms in your stages, right-click a stage and select the Add Destination option.


3. Choose the Cloud storage category and select the Amazon S3 option, or type Amazon S3 in the search box.



Note: If you have already added an Amazon S3 connection earlier, you can simply select the existing connection under the Saved connections section and proceed with exporting. To learn more about Saved connections, click here.

4. If your data has columns with personal data, you can choose to include or exclude some or all of them in the PII columns section.

You can also apply the necessary security methods below to protect your personal data columns:

A. Data masking

     Data masking replaces the original data with 'x' characters to protect personal information.

 

B. Data tokenization

     Data tokenization replaces each distinct value in your data with a random value, so the output retains the statistical structure of the original data while hiding the actual values.

 

C. None

     You can select None if you do not want to use any security method.
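To make the difference between the two security methods concrete, here is a minimal sketch. The mask and tokenize functions below are hypothetical illustrations of the ideas, not DataPrep's actual implementation.

```python
import secrets

def mask(value: str) -> str:
    # Data masking: hide every character of the original value with 'x'.
    return "x" * len(value)

def tokenize(values):
    # Data tokenization: map each distinct value to one random token,
    # so repeated values stay equal and the column's statistical
    # structure (distinct-value counts) is preserved.
    tokens = {}
    out = []
    for v in values:
        if v not in tokens:
            tokens[v] = secrets.token_hex(8)
        out.append(tokens[v])
    return out

emails = ["a@example.com", "b@example.com", "a@example.com"]
print([mask(e) for e in emails])
print(tokenize(emails))
```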

 




5. Click Next. If you have already added a connection, click the existing connection and start exporting data.



Note: Click the Add new link to add a new Amazon S3 account. You can create as many connections as required.

6. You will need to authenticate when connecting to Amazon S3 for the first time.

7. Provide the Connection name, Access key, and Secret key.



8. Click the Authenticate Amazon S3 button to authenticate your account with your credentials. 
Note: The connection configuration will be saved for accessing data in Amazon S3 in the future. Credentials are securely encrypted and stored.
9. Select the File format in which the data should be exported.

10. Enter the Bucket name you want to export your data to.

 

11. Provide the Folder path where you want to export your data. If you choose to store the files in the bucket without any folder, you can leave this field empty.
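In S3, a folder path is simply a key prefix for the exported object. A small sketch of how the final object key could be assembled; build_key is a hypothetical helper, not a DataPrep or AWS API.

```python
def build_key(folder_path: str, file_name: str) -> str:
    # S3 has no real folders: the "folder path" is just a key prefix.
    # An empty folder path stores the file at the bucket root.
    prefix = folder_path.strip("/")
    return f"{prefix}/{file_name}" if prefix else file_name

print(build_key("exports/2024", "sales.csv"))  # exports/2024/sales.csv
print(build_key("", "sales.csv"))              # sales.csv
```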

12. Select the required File export option. You can choose Replace or update file with a new version at the destination, or Add as new file during export.

Note: The file will be updated as a new version or replaced, based on the versioning configuration of your Amazon S3 bucket.

13. Select the preferred Amazon S3 connection from the Connection name dropdown.

14. You can also click the Advanced options link for the following options:
  • You can encode the file using a character encoding method such as UTF-8 using the File encoding option.
  • You can use a Row Separator such as UNIX (Line Feed) or MAC (Carriage Return) to ensure the line endings are translated properly.
  • You can mark where the content of a text field begins and ends using a Text qualifier such as Single Quote (') or Double Quote (").
  • You can split the data into fields using a Delimiter such as a Comma (,), Semicolon (;), or a Space.
  • Enable the Compress as a .zip file toggle to export your file as a .zip file.
  • You can also encrypt the file and protect it with a password using the Encrypt file and protect using password option.
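These advanced options map onto standard CSV-writing parameters. A sketch using Python's csv and zipfile modules, assuming a Semicolon delimiter, Double Quote text qualifier, UNIX (Line Feed) row separator, and UTF-8 encoding; the file name export.csv is a placeholder.

```python
import csv
import io
import zipfile

rows = [["name", "city"], ["Ana", "Lisbon; PT"]]

buf = io.StringIO()
writer = csv.writer(
    buf,
    delimiter=";",           # Delimiter: Semicolon (;)
    quotechar='"',           # Text qualifier: Double Quote (")
    quoting=csv.QUOTE_MINIMAL,
    lineterminator="\n",     # Row separator: UNIX (Line Feed)
)
writer.writerows(rows)

data = buf.getvalue().encode("utf-8")  # File encoding: UTF-8

# Compress as a .zip file. (Note: the stdlib zipfile module cannot
# produce password-protected archives, so encryption is not shown.)
zbuf = io.BytesIO()
with zipfile.ZipFile(zbuf, "w", zipfile.ZIP_DEFLATED) as zf:
    zf.writestr("export.csv", data)

print(buf.getvalue())  # name;city / Ana;"Lisbon; PT"
```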


15. Click Export.

16. Now that you have added a destination, you may want to try executing your pipeline using a manual run at first. Once you make sure the manual run works, you can set up a schedule to automate the pipeline. Learn about the different types of runs here.

Info: Each run is saved as a job. When a pipeline run is executed, the data fetched from your data sources is prepared using the series of transforms you have applied in each stage, and then the data is exported to your destination. This complete process is captured in the Jobs page.