How to import data from SharePoint into Zoho DataPrep?

Import data from SharePoint



  Zoho DataPrep supports importing data from SharePoint, a cloud storage service that allows users to store, organize, share, and access information from any device.


 To import data from SharePoint

1. Open an existing pipeline or create a pipeline from the Home Page, Pipelines tab or Workspaces tab and click the Add data option.

Info: You can also click the Import data icon at the top of the pipeline builder and bring data from multiple sources into the pipeline.



2. Choose the Cloud storage category from the left pane and click SharePoint. You can also search for SharePoint in the search box.

 

Note: If you have already added a SharePoint connection, click the Saved connections category from the left pane and proceed to import. To learn more about Saved connections, click here.

 

3. If you have already added a connection, click the existing connection and start importing data.

 

 

Note: Click the Add new link to add a new SharePoint account. You can create as many SharePoint connections as required.

 

4. Authenticate your SharePoint account. You will need to authorize DataPrep to access your files when you do this for the first time.


Note: If you have authenticated the same Microsoft account for OneDrive and SharePoint, logging out from OneDrive will also log you out from SharePoint, as both services are linked to the same Microsoft account. Make sure you are logged in to the corresponding Microsoft account.



 

Note: The connection configuration will be saved for future imports from SharePoint. Credentials are securely encrypted and stored.

 

5. Choose the required site and drive. Select the files you want to import and click the Import button. You can also use the Advanced selection option to import files that match a specific pattern. Click here to know more. 


 

6. If it is an HTML, XLS, or XLSX file, you can click the Preview option to view a sample of the data.

7. If the file is password protected, enter the password and click the right arrow.


8. Click Import.
Note: Supported formats are CSV, TSV, JSON, XML, TXT, XLS, and XLSX. You can also import files in zipped format.

Note: A ZIP file can contain only one file. Make sure only one file is compressed within the .zip file.


 

9. Once you have completed importing data, the Pipeline builder page opens and you can start applying transforms. You can also right-click the stage, choose the Prepare data option, and prepare your data in the DataPrep Studio page. Click here to know more about the transforms.




10. Once you are done creating your data flow and applying necessary transforms in your stages, you can right-click a stage and add a destination to complete your data flow.
Note: After adding a destination to the pipeline, try executing your pipeline with a manual run first. Once you have confirmed that the manual run works, you can set up a schedule to automate the pipeline. Learn about the different types of runs here.

Advanced selection

To import data using Advanced selection,

1. Click the Advanced selection link.


Advanced selection helps you perform dynamic file selection based on regex. This can be used for getting new or incremental data from your SharePoint site. The newly added or modified file that matches the file pattern during a specific data interval will be taken from your SharePoint site. Click here to know more about incremental fetch.


Important: Advanced selection is not limited to incremental fetch. You can also use this option for bulk import of files based on the file pattern.

Note: Supported formats are CSV, TSV, JSON, XML, TXT, XLS, and XLSX. You can also import files in zipped format.

2. Provide the below details:

  • Choose site : Choose the site you want to import data from.

  • Choose drive : Choose the drive in your site from where you want to import data.

  • Folder path : The folder path where you want to search for files. Eg. 2023/

    If the files are stored in the site without any folder, you can leave this field empty.

    Alert: The Folder path field takes a directory path and does not support regex patterns.

    Info: Folder path is case-sensitive.
  • Include subfolders : You can also select the Include subfolders checkbox if you want to include subfolders while searching for a file.

  • File pattern : The pattern used to match the file names in the site. This supports regex type matching. You can also use the pattern, ".*" to match any file in the path specified.
    Info: File pattern is case-sensitive.

    Note: The file pattern match is a simple regex type match. For example, to fetch files with file names such as Sales_2022.csv, Sales_2023.csv, Sales_2024.csv, you can input the pattern Sales_.*

Similarly to fetch files such as PublicData1.csv , PublicData2.csv , PublicData3.csv , use Public.*

If you want to import a single file, then specify the pattern using the exact file name.
Eg: leads_jan_2022.*
  • File password : Enter the password if the file is password protected.

  • Merge files and import : This will merge all the files that match the pattern specified and import them as a single dataset.
    Important: You can use this option to merge files together during the import itself, without having to perform unions post import.
    Info: This option can merge a maximum of 5 files at a time.
    Note: If this checkbox is unchecked, only one file will be fetched at a time.

    Eg. If your SharePoint account has 10 files, the first 5 will be merged into one dataset and imported. During the next reload, the remaining 5 files will be merged and imported.

    Similarly, if your SharePoint account has 8 files, the first 5 will be merged and fetched first, followed by the next 3.

  • File type : Choose the required file format. The available formats are CSV, TSV, JSON, XLS, XLSX, XML, and TXT.
  • Sheet pattern : This option is available for the XLS and XLSX formats only. The pattern used to match the sheet names in the file. This supports regex type matching. You can also use the pattern ".*" to match any sheet in the file.

    Note: The sheet pattern match is also a simple regex-based match. For example, to fetch sheets with names such as Sales_2022, Sales_2023, Sales_2024, you can input the pattern Sales_.*

    Similarly, to fetch sheets such as PublicData1, PublicData2, PublicData3, use Public.*

    If you want to import a single sheet, then specify the pattern using the exact sheet name.
    Eg: Leads_Jan_2022.*

    Info: Sheet pattern is case-sensitive.

  • Sheet password : This option is available for XLS and XLSX formats only. Enter the password if the sheet is password protected.
  • Merge sheets and import : This will merge all the sheets that match the pattern specified and import them as a single dataset.

    Important: You can use this option to merge sheets together during the import itself, without having to perform unions post import.

    Note: If this checkbox is unchecked, only one sheet will be fetched at a time.

3. Click the Import button.
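
The File pattern and Merge files and import options described above can be illustrated with a minimal Python sketch. This is not DataPrep's implementation. It assumes that the pattern must match the whole file name and that files are taken in the order listed, neither of which is stated above. The names match_files and merge_batches are hypothetical helpers used only for illustration.

```python
import re
from typing import Iterator, List

MAX_MERGE = 5  # merge limit stated in the Merge files and import option


def match_files(names: List[str], pattern: str) -> List[str]:
    """Return the names that match the pattern (case-sensitive).

    Assumption: the whole file name must match the pattern.
    """
    compiled = re.compile(pattern)
    return [name for name in names if compiled.fullmatch(name)]


def merge_batches(names: List[str], size: int = MAX_MERGE) -> Iterator[List[str]]:
    """Yield groups of at most `size` files, one group per import or reload.

    Assumption: files are grouped in the order they are listed.
    """
    for start in range(0, len(names), size):
        yield names[start:start + size]


files = ["Sales_2022.csv", "Sales_2023.csv", "sales_2024.csv", "PublicData1.csv"]
print(match_files(files, r"Sales_.*"))  # ['Sales_2022.csv', 'Sales_2023.csv']

eight = [f"Sales_{i}.csv" for i in range(8)]
print([len(batch) for batch in merge_batches(eight)])  # [5, 3]
```

In this sketch, sales_2024.csv is skipped because matching is case-sensitive, and eight matching files are split into a group of 5 followed by a group of 3, as in the example given for the merge option.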



Note: Supported formats are CSV, TSV, JSON, XML, TXT, XLS, and XLSX. You can also import files in zipped format.

In the case of ZIP files, only one file is supported. Make sure there is only one file compressed in the .zip file.
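
If you want to check a ZIP file before uploading it to SharePoint, a short script along the following lines may help. It is a sketch that uses Python's standard zipfile module and is independent of DataPrep. The helper names single_member and build_zip are hypothetical, and build_zip only creates an in-memory archive for the demonstration.

```python
import io
import zipfile
from typing import Dict


def single_member(archive: zipfile.ZipFile) -> str:
    """Return the name of the only file in the archive, or raise ValueError."""
    # Directory entries are ignored; only real files are counted.
    members = [info.filename for info in archive.infolist() if not info.is_dir()]
    if len(members) != 1:
        raise ValueError(f"expected exactly 1 file, found {len(members)}")
    return members[0]


def build_zip(files: Dict[str, str]) -> zipfile.ZipFile:
    """Build an in-memory archive from {name: text} for demonstration."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as writer:
        for name, text in files.items():
            writer.writestr(name, text)
    buffer.seek(0)
    return zipfile.ZipFile(buffer)


print(single_member(build_zip({"Sales_2023.csv": "id,amount\n1,10\n"})))

try:
    single_member(build_zip({"a.csv": "x", "b.csv": "y"}))
except ValueError as error:
    print("Not importable as-is:", error)
```

For a file on disk, you could pass zipfile.ZipFile("path/to/file.zip") to single_member instead of the in-memory archive.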
 

File parsing

File Parsing is the process of interpreting and structuring a file during import so that the data is correctly organized into rows and columns for processing.

In File Parsing, there are two options: Auto Parsing and Custom Parsing.

  1. Auto Parsing - Automatically detects the file structure (delimiter, headers, encoding, etc.) and formats the data accordingly.
  2. Custom Parsing - Allows you to manually configure how the file should be read, giving full control over delimiters, headers, encoding, and other settings.

Custom Parsing includes the following options:

  • File encoding: Specify the character encoding, such as UTF-8, that should be used to read the file.

  • Text qualifiers: You can specify the characters that indicate the beginning and end of a text field, such as Single Quote (') or Double Quote (").

  • Delimiter: You can separate or split the data using a delimiter such as Comma (,), Semicolon (;), Space, Tab, or Pipe (|). You also have a Custom delimiter option to define your own separator.

  • Skip initial rows: Skip parsing a specified number of rows at the beginning of the file.

  • Comment character: Specifies the first character of a commented row. Commented rows will be skipped during import.

  • Escape character: Specifies the character used to escape delimiters or quotes so they are treated as plain text. Available options include Double Quote ("), Backslash (\), Pipe (|), Caret (^), and Tilde (~).

  • Trim spaces automatically: Removes leading and trailing whitespace from all columns during data import.

  • Data contains header: Specify the row number that should be used as the column header.
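
To see how the Custom Parsing options above work together, here is a rough Python equivalent built on the standard csv module. It is only an illustration, not how DataPrep parses files. The function parse_text and its parameter names are hypothetical, the first row that remains after skipping is assumed to be the header, and quoted fields that span several lines are not handled.

```python
import csv
from typing import Dict, List, Optional


def parse_text(
    raw: bytes,
    encoding: str = "utf-8",       # File encoding
    delimiter: str = ",",          # Delimiter
    qualifier: str = '"',          # Text qualifier
    skip_rows: int = 0,            # Skip initial rows
    comment: str = "#",            # Comment character
    escape: Optional[str] = None,  # Escape character
    trim: bool = True,             # Trim spaces automatically
) -> List[Dict[str, str]]:
    """Parse delimited text into a list of {column: value} rows."""
    lines = raw.decode(encoding).splitlines()[skip_rows:]
    # Rows that start with the comment character are skipped.
    lines = [line for line in lines if not line.startswith(comment)]
    reader = csv.reader(lines, delimiter=delimiter, quotechar=qualifier,
                        escapechar=escape)
    rows = [[cell.strip() if trim else cell for cell in row]
            for row in reader if row]
    # Assumption: the first remaining row is the header.
    header, body = rows[0], rows[1:]
    return [dict(zip(header, row)) for row in body]


sample = b"exported by tool\n# comment row\nid|name\n1|'Smith|Jones'\n2| Lee \n"
print(parse_text(sample, delimiter="|", qualifier="'", skip_rows=1))
# [{'id': '1', 'name': 'Smith|Jones'}, {'id': '2', 'name': 'Lee'}]
```

In the sample, the first row is skipped, the comment row is ignored, the pipe inside the single-quoted value is kept as text, and the spaces around Lee are trimmed.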

File parsing options

The table below covers the file import options for cloud storage services.



SEE ALSO

How to add a new pipeline?

How to import data from cloud databases?

How to import data from saved data connections?

How to schedule a pipeline?        

What other cloud storage options are available in Zoho DataPrep?  

How to export data to SharePoint?