How to import data from Google Drive into Zoho DataPrep?

Import data from Google Drive



Zoho DataPrep supports importing data from Google Drive, a file storage and synchronization service developed by Google. Google Drive allows users to store files in the cloud, synchronize files across devices, and share files.

 To import data from Google Drive

1. Open an existing pipeline or create a pipeline from the Home Page, Pipelines tab, or Workspaces tab and click the Add data option.
Info: You can also click the Import data icon at the top of the pipeline builder to bring data from multiple sources into the pipeline.




2. Choose the Cloud storage category from the left pane and click the Google Drive icon to import from Google Drive. You can also search for the required cloud storage services in the search box.
 


Note: If you have already added a Google Drive connection, click the Saved connections category in the left pane and proceed to import. To learn more about Saved connections, click here.

3. If you have already added a connection, click the existing connection and start importing data.


Note: Click the Add new link to add a new Google Drive account. You can create as many Google Drive connections as required.

 

4. Authenticate your Google account. You will need to authorize DataPrep to access your files when you do this for the first time.



Note: The connection configuration is saved for future imports from Google Drive. Credentials are securely encrypted and stored.

 

5. Select the files you want to import and click the Import button. You can also select files from the Shared with me or Shared drives tabs, or use the Advanced selection option to import files that match a specific pattern. Click here to know more.

Shared with me - This tab shows the files that are shared with you individually.
Shared drives - A shared drive in Google Drive is a shared space where teams can store, access, and collaborate on files together. Files shared with your team are displayed here.


 

 
Note: The Shared drives and Shared with me tabs will appear only if your account has access to any Shared Drives or files shared with you.

6. If it is an HTML, XLS, or XLSX file, you can click the Preview option to view a sample of the data.

 

7. If the file is password-protected, enter the password and click the right arrow.

 

 

8. Click the Import button.



9. Once you have completed importing data, the Pipeline builder page opens, and you can start applying transforms. You can also right-click the stage and choose the Prepare data option, and prepare your data in the DataPrep Studio page. Click here to know more about the transforms.




10. Once you are done creating your data flow and applying necessary transforms in your stages, you can right-click a stage and add a destination to complete your data flow.

Note: After adding a destination to the pipeline, try executing the pipeline with a manual run first. Once the manual run works, you can set up a schedule to automate the pipeline. Learn about the different types of runs here.

 

Advanced selection

To import files using Advanced selection:

1. Click the Advanced selection link.


Advanced selection helps you perform dynamic file selection based on regex. This can be used for getting new or incremental data from your Google Drive folder. The newly added or modified file that matches the file pattern during the specific data interval will be taken from your Google Drive folder. Click here to know more about incremental fetch.

Important: Advanced selection is not limited to incremental fetch. You can also use this option to bulk import files that match a file pattern.

Note: Supported formats are CSV, TSV, JSON, XML, TXT, XLS, and XLSX. You can also import files in zipped format. For Zip files, only one file is supported, so make sure only one file is compressed in the .zip file.

Note: When you import files in bulk, files in the trash are also imported unless they have been permanently deleted. To avoid this, make sure the files are removed from the trash.



2. Provide the following details:

  • Choose folder: Choose the folder you want to import data from.

  • Folder path: The folder path where you want to search for files. Eg: 2023/
    If the files are stored in the drive without any folder, you can leave this field empty.
    Alert: The Folder path field is a directory path and does not support regex patterns.
    Info: Folder path is case-sensitive.

  • Include subfolders: Select the Include subfolders checkbox if you want to search subfolders as well. Click here to know the limitations of this option.

  • File pattern: The pattern used to match the file names in the folder. This supports regex-type matching. You can also use the pattern ".*" to match any file in the specified path.
    Info: File pattern is case-sensitive.

    Note: The file pattern match is a simple regex-type match. For example, to fetch files with names such as Sales_2022.csv, Sales_2023.csv, and Sales_2024.csv, you can input the pattern Sales_.*

    Similarly, to fetch files such as PublicData1.csv, PublicData2.csv, and PublicData3.csv, use Public.*

    If you want to import a single file, specify the pattern using the exact file name.
    Eg: leads_jan_2022.*

  • File password: Enter the password if the file is password-protected.

  • Merge files and import: This will merge all the files that match the specified pattern and import them as a single dataset.
    Idea: You can use this option to merge files during the import itself, without having to perform unions post-import.
    Info: This option can merge a maximum of 5 files at a time.
    Eg: If your Google Drive account has 10 files, the first 5 will be merged into one dataset and imported. During the next reload, the remaining 5 files will be merged and imported.
    Similarly, if your Google Drive account has 8 files, the first 5 will be merged and fetched first, followed by the next 3.
    Note: If this checkbox is unchecked, only 1 file will be fetched at a time.

  • File type: Choose the required file format. The available formats are CSV, TSV, JSON, XLS, XLSX, XML, and TXT.
  • Sheet pattern: This option is available for the XLS and XLSX formats only. The pattern used to match the sheet names in the file. This supports regex-type matching. You can also use the pattern ".*" to match any sheet in the file.
    Info: Sheet pattern is case-sensitive.

    Note: The sheet pattern match is also a simple regex-based match. For example, to fetch sheets with names such as Sales_2022, Sales_2023, and Sales_2024, you can input the pattern Sales_.*

    Similarly, to fetch sheets such as PublicData1, PublicData2, and PublicData3, use Public.*

    If you want to import a single sheet, specify the pattern using the exact sheet name.
    Eg: Leads_Jan_2022.*

  • Sheet password: This option is available for XLS and XLSX formats only. Enter the password if the sheet is password-protected.
  • Merge sheets and import: This will merge all the sheets that match the pattern specified and import them as a single dataset.

    Idea: You can use this option to merge sheets during the import itself, without having to perform unions post-import.
    Note: If this checkbox is unchecked, only 1 sheet will be fetched at a time.

3. Click the Import button.
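
The file pattern and Merge files and import options described above can be sketched in Python. This is only an illustration and not DataPrep's actual implementation: the helper names select_files and merge_batches are hypothetical, and the sketch assumes the pattern must match the whole file name (re.fullmatch), which the article does not state. It shows case-sensitive matching and how 8 matching files would be split into a batch of 5 followed by a batch of 3.

```python
import re
from typing import Iterator, List


def select_files(names: List[str], file_pattern: str) -> List[str]:
    """Return the names that match file_pattern, case-sensitively.

    Assumption: the pattern has to match the whole file name.
    """
    regex = re.compile(file_pattern)  # no re.IGNORECASE, so case matters
    return [name for name in names if regex.fullmatch(name)]


def merge_batches(names: List[str], batch_size: int = 5) -> Iterator[List[str]]:
    """Yield groups of at most batch_size files, one group per import or reload."""
    for start in range(0, len(names), batch_size):
        yield names[start:start + batch_size]


drive_files = [
    "Sales_2022.csv", "Sales_2023.csv", "Sales_2024.csv",
    "sales_2021.csv",  # lowercase "s", so Sales_.* does not match it
    "PublicData1.csv", "leads_jan_2022.csv",
]

sales = select_files(drive_files, r"Sales_.*")
single = select_files(drive_files, r"leads_jan_2022.*")

# Eight matching files are merged as a batch of 5, then a batch of 3.
eight_files = [f"Sales_{n}.csv" for n in range(8)]
batch_sizes = [len(batch) for batch in merge_batches(eight_files)]

print(sales)
print(single)
print(batch_sizes)
```

Testing a pattern against a list of sample file names in this way, before entering it in the File pattern field, can help catch case mismatches early.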




File parsing

File Parsing is the process of interpreting and structuring a file during import so that the data is correctly organized into rows and columns for processing.

In File Parsing, there are two options: Auto Parsing and Custom Parsing.

  1. Auto Parsing - Automatically detects the file structure (delimiter, headers, encoding, etc.) and formats the data accordingly.
  2. Custom Parsing - Allows you to manually configure how the file should be read, giving full control over delimiters, headers, encoding, and other settings.

Custom Parsing includes the following options:

  • File encoding: Specify the character encoding, such as UTF-8, that should be used to read the file.

  • Text qualifiers: You can specify the characters that indicate the beginning and end of a text field, such as Single Quote (') or Double Quote (").

  • Delimiter: You can separate or split the data using a delimiter such as Comma (,), Semicolon (;), Space, Tab, or Pipe (|). You also have a Custom delimiter option to define your own separator.

  • Skip initial rows: Skip parsing a specified number of rows at the beginning of the file.

  • Comment character: Specifies the first character of a commented row. Commented rows will be skipped during import.

  • Escape character: Specifies the character used to escape delimiters or quotes so they are treated as plain text. Available options include Double Quote ("), Backslash (\), Pipe (|), Caret (^), and Tilde (~).

  • Trim spaces automatically: Removes leading and trailing whitespace from all columns during data import.

  • Data contains header: Specify the row number that should be used as the column header.
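
To make the Custom Parsing options more concrete, here is a minimal Python sketch that applies comparable settings with the standard csv module. It is an analogy, not DataPrep's parser: the function parse_text and its parameter names are hypothetical, the header row is assumed to be counted (1-based) after skipped and commented rows are removed, and the sketch splits the file on line breaks, so quoted fields that span multiple lines are not handled.

```python
import csv
from typing import Dict, List


def parse_text(
    raw: bytes,
    encoding: str = "utf-8",       # File encoding
    delimiter: str = ",",          # Delimiter
    text_qualifier: str = '"',     # Text qualifier
    escape_char: str = "\\",       # Escape character
    skip_initial_rows: int = 0,    # Skip initial rows
    comment_char: str = "#",       # Comment character
    trim_spaces: bool = True,      # Trim spaces automatically
    header_row: int = 1,           # Data contains header (1-based, assumed)
) -> List[Dict[str, str]]:
    """Parse delimited text into a list of row dictionaries."""
    lines = raw.decode(encoding).splitlines()[skip_initial_rows:]
    # Drop blank lines and rows that start with the comment character.
    lines = [ln for ln in lines if ln and not ln.startswith(comment_char)]
    rows = list(
        csv.reader(
            lines,
            delimiter=delimiter,
            quotechar=text_qualifier,
            escapechar=escape_char,
        )
    )
    if trim_spaces:
        rows = [[cell.strip() for cell in row] for row in rows]
    header = rows[header_row - 1]
    return [dict(zip(header, row)) for row in rows[header_row:]]


sample = (
    b"Exported report\n"        # removed by skip_initial_rows=1
    b"# generated nightly\n"    # removed as a commented row
    b"name;region\n"
    b"'Acme; Inc';  West \n"    # the quoted field keeps its semicolon
    b"Globex;East\n"
)

records = parse_text(
    sample, delimiter=";", text_qualifier="'", skip_initial_rows=1
)
print(records)
```

The single-quote text qualifier is what keeps "Acme; Inc" in one column even though the delimiter is a semicolon.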

File parsing options

The table below covers the file import options for cloud storage services.



Limitations

 
1. Using the Include subfolders option, you can fetch files only from a single subfolder or from all the subfolders of the entire My Drive, Shared with me, or Shared drives folders. You cannot fetch files from all the subfolders within a specific folder.


2. To fetch files from a single subfolder: Enter the exact path of the subfolder in the folder path. For example, 2023/jan/. Fill in the required details. The files that match the mentioned file pattern will be fetched from the specified folder.



3. To fetch files from all subfolders of Google Drive: Leave the folder path empty and select the Include subfolders checkbox. Fill in the required details. The files that match the mentioned file pattern will be fetched from all the subfolders of the entire My Drive, Shared with me, or Shared drives folders.
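
The limitations above can be summarized as a small Python sketch of which files fall in scope for a given combination of folder path and Include subfolders. The function files_in_scope and the sample paths are hypothetical. The sketch also assumes that when a folder path is provided, only files directly inside that folder are searched, even if Include subfolders is selected, which is one reading of the first limitation rather than confirmed behavior.

```python
from typing import List


def files_in_scope(
    paths: List[str], folder_path: str, include_subfolders: bool
) -> List[str]:
    """Return the file paths that would be searched for a pattern match."""
    if folder_path:
        # Exact, case-sensitive folder match; nested subfolders are excluded
        # (assumed reading of the limitation described above).
        prefix = folder_path.rstrip("/") + "/"
        return [
            p for p in paths
            if p.startswith(prefix) and "/" not in p[len(prefix):]
        ]
    if include_subfolders:
        # Empty folder path plus Include subfolders: the entire drive.
        return list(paths)
    # Empty folder path without subfolders: files stored outside any folder.
    return [p for p in paths if "/" not in p]


drive = [
    "root.csv",
    "2023/summary.csv",
    "2023/jan/sales.csv",
    "2023/feb/sales.csv",
    "2023/Jan/other.csv",  # different case, so a different folder
]

single_subfolder = files_in_scope(drive, "2023/jan/", False)
whole_drive = files_in_scope(drive, "", True)
specific_folder = files_in_scope(drive, "2023/", True)

print(single_subfolder)
print(whole_drive)
print(specific_folder)
```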

SEE ALSO

How to add a new pipeline?

How to import data from cloud databases?

How to import data from saved data connections?

How to schedule a pipeline?        

What other cloud storage options are available in Zoho DataPrep?  

How to export data to Google Drive?