Zoho DataPrep Studio





An overview of data preparation in the Studio page is covered in the following sections:
  • Data distribution
  • Data quality
  • Intelligent suggestions
  • Search and filter
  • Topbar


The image above displays the Studio page and all the sections associated with it.
You can get a page tour by clicking the Page tour option from the Help menu in the topbar. The page tour gives you a complete walkthrough of the Studio page and makes it easy to understand.

Data distribution

In DataPrep, a histogram is a graphical representation of data distribution and the range of values present in a column. You can spot outliers and anomalies in the data using it. Selecting a bar or a section of the histogram filters the data within the range.
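
As a rough illustration of the idea, and not DataPrep's actual implementation, the sketch below groups numeric values into equal-width bins and then keeps only the values that fall in a selected bin, which is conceptually what selecting a bar does. The function names, the bin count, and the sample values are assumptions made for this example.

```python
def build_histogram(values, bin_count=4):
    """Group numeric values into equal-width bins; returns (low, high, count) tuples."""
    low, high = min(values), max(values)
    width = (high - low) / bin_count or 1  # fall back to 1 when all values are equal
    counts = [0] * bin_count
    for v in values:
        # The maximum value would land one past the end, so clamp it into the last bin.
        index = min(int((v - low) / width), bin_count - 1)
        counts[index] += 1
    return [(low + i * width, low + (i + 1) * width, counts[i]) for i in range(bin_count)]


def filter_by_bin(values, bin_low, bin_high, is_last_bin=False):
    """Keep values inside the selected bin (upper edge inclusive only for the last bin)."""
    if is_last_bin:
        return [v for v in values if bin_low <= v <= bin_high]
    return [v for v in values if bin_low <= v < bin_high]


ages = [21, 23, 25, 31, 34, 36, 38, 95]  # 95 is a likely outlier
bins = build_histogram(ages)
for low, high, count in bins:
    print(f"{low:.1f} to {high:.1f}: {'#' * count}")

# "Selecting" the last bar isolates the value that sits far from the rest.
last_low, last_high, _ = bins[-1]
outlier_rows = filter_by_bin(ages, last_low, last_high, is_last_bin=True)
```

In this sample, seven of the eight values fall in the first bin and a single value sits alone in the last one, which is the kind of gap that makes an outlier easy to spot visually.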



A detailed version of this histogram is also present under Column details, which appears at the bottom when a column is selected.  



You can edit a value in the histogram to change it throughout the entire column. You can also sort the values using the sort icon.

You can also click the search icon and search values in the histogram using one of the conditions below:
  1. contains
  2. doesn't contain
  3. is
  4. is not
  5. begins with
  6. doesn't begin with
  7. ends with
  8. doesn't end with
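
The eight conditions above behave like simple string tests. The following is a minimal sketch of how they could be expressed, assuming case-sensitive matching (the article does not say whether DataPrep's search is case-sensitive); the `CONDITIONS` table and `search_values` function are hypothetical names used only for illustration.

```python
# Each condition name from the list above, mapped to a plain string test.
CONDITIONS = {
    "contains": lambda value, term: term in value,
    "doesn't contain": lambda value, term: term not in value,
    "is": lambda value, term: value == term,
    "is not": lambda value, term: value != term,
    "begins with": lambda value, term: value.startswith(term),
    "doesn't begin with": lambda value, term: not value.startswith(term),
    "ends with": lambda value, term: value.endswith(term),
    "doesn't end with": lambda value, term: not value.endswith(term),
}


def search_values(values, condition, term):
    """Return the values that satisfy the named condition (case-sensitive here)."""
    matches = CONDITIONS[condition]
    return [v for v in values if matches(v, term)]


cities = ["Chennai", "Austin", "Chicago", "Utrecht"]
print(search_values(cities, "begins with", "Ch"))     # ['Chennai', 'Chicago']
print(search_values(cities, "doesn't contain", "i"))  # ['Utrecht']
```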

Data quality

DataPrep offers numerous options to measure and improve the quality of your data. Data quality can be assessed from the following areas of the Studio page:
  • Data quality bar
  • Column details section
  • Dataset details section  

Data quality bar

A data quality bar represents the quality of data in each column. It splits data quality into valid data, invalid data, and missing values, based on the data type of the column. Green represents valid data, red represents invalid data, and grey represents missing values.
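
To make the three categories concrete, here is a minimal sketch of how cells in a column might be classified and summarized. It assumes that a blank or absent cell counts as missing and that a cell is valid when it parses as the column's data type; these rules and the function names are assumptions for illustration, not DataPrep's documented logic.

```python
def classify(value, parser):
    """Label one cell as 'missing', 'valid', or 'invalid' for the column's data type."""
    if value is None or str(value).strip() == "":
        return "missing"
    try:
        parser(value)
        return "valid"
    except (TypeError, ValueError):
        return "invalid"


def quality_summary(column, parser):
    """Return the percentage of valid, invalid, and missing cells in a column."""
    counts = {"valid": 0, "invalid": 0, "missing": 0}
    for value in column:
        counts[classify(value, parser)] += 1
    total = len(column) or 1  # avoid dividing by zero for an empty column
    return {label: round(100 * n / total, 1) for label, n in counts.items()}


quantity = ["12", "7", "", "abc", "30", None, "4", "19"]  # column typed as integer
summary = quality_summary(quantity, int)
print(summary)  # {'valid': 62.5, 'invalid': 12.5, 'missing': 25.0}

# Clicking the red (invalid) section corresponds to showing only these cells:
invalid_rows = [v for v in quantity if classify(v, int) == "invalid"]
```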

When you click a section of the bar, DataPrep filters the dataset to the corresponding rows so that you can easily deal with invalid or missing values.
  1. Hover over the data quality bar to get a quick look at the data quality of a column.

  2. You can also click the Show for all columns option to view the data quality of all individual columns.

Column details section

The Column details section shows a data summary of each column: its data type, the number of unique values, and the number of missing, invalid, and valid entries.
Note: The first 100 rows of the dataset are processed to suggest a data type.
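
The note above says that the suggested data type comes from a sample of rows rather than the whole column. The sketch below shows one simple way such a suggestion could work, assuming a "first type that fits every non-empty sampled value" rule. DataPrep's actual detection logic is not described here, so the candidate types, the date format, and the `suggest_type` function are all assumptions.

```python
from datetime import datetime


def suggest_type(column, sample_size=100):
    """Suggest a type name from the non-empty values in the first `sample_size` rows."""
    sample = [v for v in column[:sample_size] if v not in (None, "")]
    if not sample:
        return "text"  # nothing to infer from
    candidates = [
        ("integer", int),
        ("decimal", float),
        ("date", lambda v: datetime.strptime(v, "%Y-%m-%d")),
    ]
    for name, parser in candidates:
        try:
            for value in sample:
                parser(value)
            return name  # every sampled value parsed as this type
        except ValueError:
            continue
    return "text"  # fallback when no stricter type fits


print(suggest_type(["10", "25", "", "7"]))         # integer
print(suggest_type(["2024-01-05", "2024-02-11"]))  # date
print(suggest_type(["10", "n/a", "7"]))            # text
```

Under this assumption, a value that appears after the sampled rows and does not fit the suggested type has no influence on the suggestion; it would simply be counted as invalid for that column.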
  1. The Column details are shown in the bottom panel whenever a column is selected.

  2. This section has a detailed version of the histogram present at the top of each column. You can edit a value in the histogram to change it throughout the entire column. You can also sort the values using the sort icon.

  3. You can also click the search icon and search values in the histogram using one of the conditions below:
    1. contains
    2. doesn't contain
    3. is
    4. is not
    5. begins with
    6. doesn't begin with
    7. ends with
    8. doesn't end with

  4. You can also click the Show more details link to see an expanded view of the details of the selected column. Various aspects of the column, such as statistics, outliers, unique values, and data patterns, are displayed in this view.

  5. You can also choose which widgets are shown in the Show more details page from the context menu options next to the column name.


Dataset details section

Dataset details reveal data quality for the entire dataset using a donut chart. The overall figure is derived from the collective quality of the individual columns.
This section appears when a dataset first loads onto the data preparation screen, and whenever no column is selected.

Dataset details display the following information. 
  • Sample rows
  • Sample strategy (includes Random, Erroneous, Column based, and Initial data samples)
  • Total rows
  • Number of columns
  • Number of data types in the dataset 
  • Overall dataset data quality as a donut chart.
The donut chart splits data into a percentage of valid data, invalid data, and missing values. Click on the sections of the donut chart to selectively view valid, invalid, and missing values in your dataset.  


Sample strategy

Working on a sample instead of the entire dataset speeds up the transformations you perform. The sample is drawn from the entire data using one of several strategies. The Initial sample strategy is used when the dataset is imported for the first time. You can change the strategy at any point during the data preparation process. Click the edit icon in the dataset details panel to change the sample strategy.


The different sample strategies available are: 
  • Initial sample: Generated from the first 5 MB data of the imported file.
  • Random sample: Randomly selected rows from the imported file. 
  • Erroneous sample: Rows containing invalid or missing entries. 
  • Column based sample: Generated based on the distinct values from the selected column.   
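
The four strategies above can be sketched on a small list of rows. This is only an approximation under stated assumptions: a row limit stands in for the 5 MB cutoff, the column based sample takes one row per distinct value, and the `is_bad` check is a hypothetical stand-in for DataPrep's own validity rules. None of the function names come from DataPrep.

```python
import random


def initial_sample(rows, limit):
    """First `limit` rows (the article describes a 5 MB cutoff; a row count stands in here)."""
    return rows[:limit]


def random_sample(rows, limit, seed=0):
    """Randomly chosen rows; the seed only makes this example repeatable."""
    return random.Random(seed).sample(rows, min(limit, len(rows)))


def erroneous_sample(rows, is_bad):
    """Rows with at least one invalid or missing entry, as judged by `is_bad`."""
    return [row for row in rows if any(is_bad(col, val) for col, val in row.items())]


def column_based_sample(rows, column):
    """One row for each distinct value in the chosen column (first occurrence wins)."""
    seen = {}
    for row in rows:
        seen.setdefault(row[column], row)
    return list(seen.values())


def is_bad(column, value):
    """Hypothetical check: empty means missing; 'qty' must be made of digits."""
    if value == "":
        return True
    return column == "qty" and not value.isdigit()


rows = [
    {"city": "Chennai", "qty": "12"},
    {"city": "Austin", "qty": ""},
    {"city": "Chennai", "qty": "7"},
    {"city": "Utrecht", "qty": "abc"},
]
print(erroneous_sample(rows, is_bad))       # the Austin and Utrecht rows
print(column_based_sample(rows, "city"))    # one row each for Chennai, Austin, Utrecht
```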

Intelligent suggestions

DataPrep suggests transforms based on the imported data to make data preparation more effective. Suggestions are shown when one or more columns are selected, and also when a filter is applied.
  1. When you click one of the suggested transforms, you will be taken to the Studio panel with a live preview of the transformation to be applied to your data.
  2. You may choose to edit the options and conditions in the operation bar before applying the suggested operation.
      

Search and filter

Perform search operations and filter data using the Search and filter box. The Search and filter box is exploratory in nature: it lets you filter your dataset without applying the filters as a rule.
However, you can choose to apply them as a rule by either keeping or deleting the filtered rows.
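
The difference between an exploratory filter and a filter applied as a rule can be sketched as follows. This is an illustration only: the function names and the sample data are assumptions, and the sketch is not meant to reflect how DataPrep stores rules internally.

```python
def filter_view(rows, predicate):
    """Exploratory filter: returns matching rows and leaves `rows` unchanged."""
    return [row for row in rows if predicate(row)]


def apply_rule(rows, predicate, action):
    """Apply the filter as a rule: 'keep' retains matches, 'delete' drops them."""
    if action == "keep":
        return [row for row in rows if predicate(row)]
    if action == "delete":
        return [row for row in rows if not predicate(row)]
    raise ValueError(f"unknown action: {action}")


orders = [
    {"id": 1, "status": "shipped"},
    {"id": 2, "status": "cancelled"},
    {"id": 3, "status": "shipped"},
]
is_cancelled = lambda row: "cancel" in row["status"]  # a 'contains' search

preview = filter_view(orders, is_cancelled)           # browse only; orders still has 3 rows
kept = apply_rule(orders, is_cancelled, "keep")       # only id 2 remains
deleted = apply_rule(orders, is_cancelled, "delete")  # ids 1 and 3 remain
```

Keeping and deleting are complements of each other: every row ends up in exactly one of the two results.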

You can also select one of the default filter options using the filter icon from the Search and filter box:
  1. Filter rows with valid values
  2. Filter rows with invalid values
  3. Filter rows with missing values
  4. Filter rows with missing or invalid values
If you want to filter data based on custom filter conditions, you can use the Advanced filters option. 



To search data

  1. To search data, enter the value in the Search and filter box. The keyword is added as a chip below the box, using the default condition 'contains'.
  2. You can also select the chip and edit the searched keyword and the condition at any time.


  3. Once the search result appears, you can choose to Keep filtered rows or Delete filtered rows using the Actions dropdown.
All your searched keywords will be automatically included when you open the Advanced filters pane.


To filter data

  1. To filter the dataset by other means, click the histogram, the data quality bar, or the donut chart.

  2. When you filter, a chip appears below the Search and filter box. You can select the chip and edit the searched keyword and the associated condition at any time.
  3. You can apply multiple filters; a chip is added for each filter.
  4. All filters present are automatically included when you open the Advanced filters pane, including the ones that you add while the pane is open. You can also edit the filters in the Advanced filters pane. Advanced filters are covered in the next section.
  5. The conditions available in the filter are:
    1. Contains (default)
    2. Doesn't contain
    3. Starts with
    4. Doesn't start with
    5. Ends with
    6. Doesn't end with
    7. Is
    8. Is not
    9. Matches regex
  6. After filtering, you can apply either the Keep filtered rows or the Delete filtered rows rule using the Actions dropdown that appears next to the chip.
  7. You can also edit the filters added using the Edit link.
  8. To remove a particular filter, click the close icon that appears when you hover over the chip.
  9. To remove all filters at once, click the Clear link.

Advanced filters

The Advanced filters option allows you to filter data based on custom conditions applied over one or more columns. Advanced filters are exploratory in nature: they let you filter your dataset without applying the filters as a rule.
However, you can choose to apply them as a rule by either keeping or deleting the filtered rows.

1. All filters present will be automatically included when you open the Advanced filters pane.
2. This includes the ones that you add while the pane is open.
3. You can also edit the filters in the Advanced filters pane.

To apply advanced filters

1. Choose the Advanced filters option from the Search and filter box. The Advanced filters pane slides open.
     


2. Click the  icon to add columns to the filters. You can also reorder the filters using the drag and drop method.

3. When you add more than one filter to the Filters section, a logical operator (AND or OR) appears next to the filters. Click the operator to toggle it between AND and OR.



4. Using the logical operators, you can combine the conditions and control the order in which they are evaluated. The final expression is displayed in the Criteria expression box. You can click Edit to alter the default expression, using logical operators and parentheses to specify which condition should be evaluated first. Click Save after making the required changes.

For example, in the expression ((1 OR 2) AND (3 OR 4)), the condition (1 OR 2) is evaluated first, followed by the condition (3 OR 4). Finally, because the two are joined by the AND operator, a row passes the filter only when both conditions are true.

5. You can use the Clear button to remove all the filters.

6. For every filter added, you can select one of the following options from the drop-down:
  1. Actual: This option lets you filter rows based on the actual values in the column.
  2. Data quality: This option lets you filter rows based on the quality of data in the column.
  3. Patterns: This option helps you filter rows based on the data patterns in the selected column.
  4. Seasonal: This option helps you filter rows based on the seasonal parameters such as quarter, month, week, etc.
  5. Outliers: This option allows you to filter rows based on the outliers present in the data of the selected column. 
The filter options are displayed based on the datatype of the column added for the filter. Click here to know more about the filter options.

7. Click the Filter button. The number of filters added will be shown in the chip that appears above the data grid. 

8. Advanced filters filter the data without applying any rule. When you're satisfied with the filtered data, you can apply either the Keep filtered rows or the Delete filtered rows rule using the Actions dropdown.
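
The criteria expression from step 4, ((1 OR 2) AND (3 OR 4)), can be read as ordinary Boolean logic over the numbered filters. The sketch below evaluates that expression for a few rows. The four filters, the column names, and the data are hypothetical and exist only to illustrate how the grouping works.

```python
def evaluate_criteria(row, filters):
    """Evaluate ((1 OR 2) AND (3 OR 4)) for one row; `filters` maps numbers to predicates."""
    first_group = filters[1](row) or filters[2](row)   # (1 OR 2)
    second_group = filters[3](row) or filters[4](row)  # (3 OR 4)
    return first_group and second_group                # both groups must be true


# Hypothetical filters, numbered as they would appear in the Filters section.
filters = {
    1: lambda r: r["country"] == "India",
    2: lambda r: r["country"] == "USA",
    3: lambda r: r["amount"] > 1000,
    4: lambda r: r["priority"] == "high",
}

rows = [
    {"country": "India", "amount": 1500, "priority": "low"},    # filters 1 and 3 match
    {"country": "USA", "amount": 200, "priority": "low"},       # only filter 2 matches
    {"country": "France", "amount": 5000, "priority": "high"},  # neither 1 nor 2 matches
]
matching = [r for r in rows if evaluate_criteria(r, filters)]
print(matching)  # only the first row satisfies both groups
```

In Python, `and` binds more tightly than `or`, so removing the parentheses would change which rows match. The article does not state DataPrep's default precedence, which is a good reason to add parentheses explicitly in the Criteria expression box whenever AND and OR are mixed.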


Topbar

The topbar in the Studio page has a dataset switcher on the left-hand side and a menu bar on the right. Click the dropdown next to the dataset name in the topbar to switch to other datasets. This is particularly useful for instantly resuming data preparation activities in other datasets.

You can also use this dropdown to take a look at the data quality of other datasets. This helps you prioritize data preparation activities in the workspace.




