Zoho DataPrep Studio





An overview of data preparation in the Studio page is covered in the following sections:
  • Data distribution
  • Data quality
  • Intelligent suggestions
  • Search and filter
  • Topbar


The image above displays the Studio page and all the sections associated with it.
You can take a page tour by clicking the Page tour option in the Help menu in the topbar. The page tour gives you a complete walkthrough of the Studio page.

Data distribution

In DataPrep, a histogram is a graphical representation of data distribution and the range of values present in a column.  You can spot outliers and anomalies in the data using it. Selecting a bar or a section of the histogram filters the data within the range.
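
To make the idea concrete, here is a minimal Python sketch of how a histogram groups values into ranges and how selecting a bar narrows the data to that range. It is an illustration only, not DataPrep's implementation; the function names, the fixed bin width, and the sample data are all assumptions made for this example.

```python
from collections import Counter

def build_histogram(values, bin_width):
    """Count numeric values per bin; each bin covers [start, start + bin_width)."""
    counts = Counter()
    for v in values:
        start = (v // bin_width) * bin_width
        counts[start] += 1
    return dict(sorted(counts.items()))

def filter_by_bar(rows, column, start, bin_width):
    """Keep only the rows whose value falls inside the selected bar's range."""
    return [r for r in rows if start <= r[column] < start + bin_width]

# Hypothetical sample data: one numeric column named "age".
rows = [{"age": a} for a in [21, 23, 25, 34, 38, 41, 97]]

histogram = build_histogram([r["age"] for r in rows], bin_width=10)
selected = filter_by_bar(rows, "age", start=30, bin_width=10)

print(histogram)  # the isolated bar at 90 hints at an outlier
print(selected)   # rows inside the 30-39 bar
```

The isolated bar far from the others is the kind of pattern that makes outliers easy to spot in a histogram.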



A detailed version of this histogram is also present under Column details, which appears at the bottom when a column is selected.  



You can edit a value in the histogram to change it throughout the column. You can also sort the values using the sort icon.

You can also click the search icon and search values in the histogram using one of the conditions below:
  1. contains
  2. doesn't contain
  3. is
  4. is not
  5. begins with
  6. doesn't begin with
  7. ends with
  8. doesn't end with

Data quality

DataPrep offers several ways to measure and improve the quality of your data. Data quality can be assessed from the following areas on the data preparation page:
  • Data quality bar
  • Column details section
  • Dataset details section  

Data quality bar

A data quality bar represents the quality of data in each column. It splits the data into valid data, invalid data, and missing values, based on the data type of the column. Green represents valid data, red represents invalid data, and grey represents missing values.
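
The split into valid, invalid, and missing values can be sketched in Python as follows. This is a simplified illustration, not DataPrep's logic: it assumes that a value is "valid" when it can be parsed as the column's data type, and that empty strings and None count as missing. The function names are hypothetical.

```python
def classify(value, parser):
    """Return 'missing', 'valid', or 'invalid' for a single cell."""
    if value is None or str(value).strip() == "":
        return "missing"
    try:
        parser(value)  # assumption: a parseable value counts as valid
        return "valid"
    except (TypeError, ValueError):
        return "invalid"

def quality_summary(values, parser):
    """Return the percentage of valid, invalid, and missing cells in a column."""
    counts = {"valid": 0, "invalid": 0, "missing": 0}
    for v in values:
        counts[classify(v, parser)] += 1
    total = len(values) or 1
    return {k: round(100 * n / total, 1) for k, n in counts.items()}

# Hypothetical column whose data type is assumed to be an integer.
column = ["12", "7", "abc", "", None, "30", "4.5", "8"]
summary = quality_summary(column, int)
print(summary)
```

The three percentages correspond to the green, red, and grey portions of the bar.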

When you click on a section of the bar, DataPrep filters the dataset to the corresponding rows so that you can easily deal with invalid or missing values.
  1. Hover over the data quality bar to get a quick look at the data quality of a column.

  2. You can also click the Show for all columns option to view the data quality of every column.

Column details section

The Column details section shows a data summary of each column with its data type, the number of unique values in the column, and the number of missing, invalid, and valid entries.
Note: The first 100 rows of the dataset are processed to suggest a data type.
  1. The Column details are shown in the bottom panel whenever a column is selected.

  2. This section has a detailed version of the histogram present at the top of each column. You can edit a value in the histogram to change it throughout the column. You can also sort the values using the sort icon.

  3. You can also click the search icon and search values in the histogram using one of the conditions below:
    1. contains
    2. doesn't contain
    3. is
    4. is not
    5. begins with
    6. doesn't begin with
    7. ends with
    8. doesn't end with

  4. You can also click the Show more details link to see an expanded view of the details of the selected column. Aspects of the column such as statistics, outliers, unique values, and data patterns are displayed in this section.

Outliers and Anomaly detection

  1. Zoho DataPrep uses machine learning techniques to identify anomalies in the data. This lets users deal with anomalies in the data pipeline. Users can decide to keep or remove anomalies from their dataset. Here's a quick GIF to demonstrate DataPrep's outlier detection capabilities. 



  2. You can also choose which widgets are shown on the Show more details page from the context menu options next to the column name.
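
This article does not describe which machine learning techniques DataPrep uses. To illustrate the general idea of flagging outliers and then choosing to keep or remove them, the Python sketch below uses the common interquartile range (IQR) rule as a stand-in. The function name and the 1.5 multiplier are assumptions made for this example and are not taken from DataPrep.

```python
import statistics

def iqr_outliers(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (a common rule of thumb)."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]

# Hypothetical numeric column.
values = [10, 12, 11, 13, 12, 11, 95]
outliers = iqr_outliers(values)

# The user then decides whether to keep or remove the flagged values.
cleaned = [v for v in values if v not in outliers]
print(outliers, cleaned)
```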


Dataset details section

Dataset details show the data quality of the entire dataset using a donut chart. The number is derived from the collective quality of the individual columns. You see this section when a dataset first loads on the data preparation screen, and whenever no column is selected.

Dataset details display the following information. 
  • Sample rows
  • Sample strategy (includes Random, Erroneous, Column based, and Initial data samples)
  • Total rows
  • Number of columns
  • Number of data types in the dataset 
  • Overall dataset data quality as a donut chart.
The donut chart splits data into a percentage of valid data, invalid data, and missing values. Click on the sections of the donut chart to selectively view valid, invalid, and missing values in your dataset.  


Sample strategy

Working on a sample speeds up the transformations you perform. The sample is drawn from the entire dataset using one of several strategies. The Initial sample strategy is used when the dataset is imported for the first time. You can change the strategy at any point during data preparation by clicking the edit icon in the Dataset details panel.


The different sample strategies available are: 
  • Initial sample: Generated from the first 5 MB data of the imported file.
  • Random sample: Randomly selected rows from the imported file. 
  • Erroneous sample: Rows containing invalid or missing entries. 
  • Column based sample: Generated based on the distinct values from the selected column.   
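
The four strategies can be pictured with the simplified Python sketch below. It is not DataPrep's implementation and makes several assumptions: the sample size is expressed as a row count rather than the 5 MB limit mentioned above, a None value stands in for any missing or invalid entry, and the column based sample is interpreted as one row per distinct value. All function names are hypothetical.

```python
import random

def initial_sample(rows, limit):
    """Take rows from the start of the data (row count used instead of a size in MB)."""
    return rows[:limit]

def random_sample(rows, limit, seed=0):
    """Pick rows at random; the seed only makes the example repeatable."""
    rng = random.Random(seed)
    return rng.sample(rows, min(limit, len(rows)))

def erroneous_sample(rows, is_bad, limit):
    """Take rows that contain at least one invalid or missing entry."""
    return [r for r in rows if any(is_bad(v) for v in r.values())][:limit]

def column_based_sample(rows, column, per_value=1):
    """Take up to per_value rows for each distinct value of the chosen column."""
    taken, sample = {}, []
    for r in rows:
        key = r[column]
        if taken.get(key, 0) < per_value:
            taken[key] = taken.get(key, 0) + 1
            sample.append(r)
    return sample

# Hypothetical dataset.
rows = [
    {"city": "Pune", "amount": 10},
    {"city": "Pune", "amount": None},
    {"city": "Oslo", "amount": 7},
    {"city": "Lima", "amount": None},
    {"city": "Oslo", "amount": 3},
]

initial = initial_sample(rows, 2)
rnd = random_sample(rows, 3)
erroneous = erroneous_sample(rows, lambda v: v is None, 10)
by_city = column_based_sample(rows, "city")
```

Each function returns a different view of the same data, which is why changing the strategy can surface rows that the initial sample did not include.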

Intelligent suggestions

DataPrep suggests transforms based on the imported data to make data preparation more effective. Suggestions are shown when one or more columns are selected, and also when a filter is applied.
  1. When you click one of the suggested transforms, you are taken to the Studio panel with a live preview of the transformation to be applied to your data.
  2. You can edit the options and conditions in the operation bar before applying the suggested operation.

Search and filter

Perform search operations and filter data using the Search and filter box. The Search and filter box is exploratory in nature and lets you filter your dataset without applying the filter as a rule. However, you can choose to apply it as a rule by either keeping or deleting the filtered rows.

You can also select one of the default filter options using the filter icon from the Search and filter box:
  1. Filter rows with valid values - Only rows with valid data are displayed 
  2. Filter rows with invalid values - Only rows with invalid data are displayed
  3. Filter rows with missing values - Only rows with missing data are displayed
  4. Filter rows with missing or invalid values - Only rows with missing or invalid data are displayed



If you want to filter data based on custom filter conditions, you can use the Advanced option that appears once you filter data. 



To search data

  1. To search data, enter the value in the Search and filter box. The searched keyword is added as a chip below the box with the default condition 'contains'.
  2. You can select the chip and edit the searched keyword and the condition at any time.
  3. Once the search result appears, you can choose to Keep or Delete the filtered rows. The resultant dataset is displayed with or without the filtered rows based on the selection.
All your searched keywords are automatically included when you open the Advanced filters pane.


To filter data

  1. To filter the dataset by other means, click on the histogram, the data quality bar, or the donut chart.
  2. When you filter, a chip appears below the Search and filter box. You can select the chip and edit the keyword and the associated condition at any time.
  3. You can apply multiple filters; a chip is added for each filter.
  4. All existing filters are automatically included when you open the Advanced filters pane, including the ones that you add while the pane is open. You can also edit the filters in the Advanced filters pane. The next section covers Advanced filters.
  5. The conditions available in the filter are:
    1. Contains (default)
    2. Doesn't contain
    3. Starts with
    4. Doesn't start with
    5. Ends with
    6. Doesn't end with
    7. Is
    8. Is not
    9. Matches regex
  6. After filtering, you can apply either the Keep rows or the Delete rows rule that appears next to the chip.
  7. You can also edit the added filters using the Edit link.
  8. To remove a particular filter, click the close icon that appears when you hover over the chip.
  9. To remove all filters at once, click the Clear link.
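
The filter conditions and the Keep rows and Delete rows rules can be sketched in Python as follows. This is an illustration under stated assumptions, not DataPrep's behavior: matching is case-sensitive, multiple chips are combined so that a row must satisfy all of them, and "Matches regex" is modeled with a search anywhere in the value. The names CONDITIONS and apply_chips are hypothetical.

```python
import re

# One predicate per condition; each takes the cell text and the keyword.
CONDITIONS = {
    "contains": lambda cell, kw: kw in cell,
    "doesn't contain": lambda cell, kw: kw not in cell,
    "starts with": lambda cell, kw: cell.startswith(kw),
    "doesn't start with": lambda cell, kw: not cell.startswith(kw),
    "ends with": lambda cell, kw: cell.endswith(kw),
    "doesn't end with": lambda cell, kw: not cell.endswith(kw),
    "is": lambda cell, kw: cell == kw,
    "is not": lambda cell, kw: cell != kw,
    "matches regex": lambda cell, kw: re.search(kw, cell) is not None,
}

def apply_chips(rows, chips, action="keep"):
    """chips is a list of (column, condition, keyword) tuples.

    Assumption: a row is 'filtered' when it satisfies every chip.
    'keep' returns the filtered rows; 'delete' returns the remaining rows.
    """
    def matches(row):
        return all(CONDITIONS[cond](str(row[col]), kw) for col, cond, kw in chips)

    if action == "keep":
        return [r for r in rows if matches(r)]
    return [r for r in rows if not matches(r)]

# Hypothetical dataset and chips.
rows = [
    {"email": "ann@zoho.com"},
    {"email": "bob@example.org"},
    {"email": "cy@zoho.com"},
]
chips = [("email", "ends with", "zoho.com")]

kept = apply_chips(rows, chips, action="keep")
deleted = apply_chips(rows, chips, action="delete")
narrowed = apply_chips(rows, chips + [("email", "matches regex", r"^a")])
```

Until one of the two rules is applied, the filter only changes what is displayed, which matches the exploratory nature of the Search and filter box.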

Advanced filters

The Advanced filters option allows you to filter data based on custom conditions applied over one or more columns. Advanced filters are exploratory in nature and let you filter your dataset without applying the filters as a rule. However, you can choose to apply them as a rule by either keeping or deleting the filtered rows. The Advanced link appears once you filter data. Click here to know more about filtering data.

1. All filters present will be automatically included when you open the Advanced filters pane.
2. This also includes the ones that you add while the pane is open.
3. You can also edit the filters in the Advanced filters pane.

To apply advanced filters

1. Click on the donut chart, the histogram, or one of the default filter options to filter data. You can also right-click on a column and select the  Click here to learn more about filtering data.

2. The Advanced link appears above the data grid. Click the Advanced link and the Advanced filters pane slides open.

3. Click the  icon to add columns to the filters. You can also reorder the filters by dragging and dropping them.

4. When you add more than one filter to the Filters section, the logical operator AND or OR appears next to the filters. Click the operator to toggle between AND and OR.

5. Using the logical operators, you can combine the conditions and set their order of precedence. The final expression is displayed in the Criteria expression box. Click Edit to alter the default expression, using logical operators and parentheses to specify which condition should be evaluated first. Click Save after making the required changes.

For example, in the expression ((1 OR 2) AND (3 OR 4)), the condition (1 OR 2) is evaluated first, followed by the condition (3 OR 4). Because the two are joined by the AND operator, a row passes the filter only when both conditions are true.

6. You can use the Clear button to remove all the filters.

7. For every filter added, you can select one of the following options from the drop-down:
  1. Actual: This option lets you filter rows based on the actual values in the column.
  2. Data quality: This option lets you filter rows based on the quality of data in the column.
  3. Patterns: This option lets you filter rows based on the data patterns in the selected column.
  4. Seasonal: This option lets you filter rows based on seasonal parameters such as quarter, month, and week.
  5. Outliers: This option lets you filter rows based on the outliers present in the data of the selected column.
The filter options are displayed based on the data type of the column added to the filter. Click here to know more about the filter options.

8. Click the Filter button. The number of filters added is shown in the chip that appears above the data grid.

9. Advanced filters filter the data without applying any rule. Once you're satisfied with your filtered data, you can apply either the Keep rows or the Delete rows rule.
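
To show how a criteria expression such as ((1 OR 2) AND (3 OR 4)) resolves for a single row, here is a small Python sketch. It is not DataPrep's evaluator. It assumes each number refers to a filter whose result for the row is already known, and that operators outside parentheses are applied from left to right. The function name and the sample results are hypothetical.

```python
import re

def evaluate(expression, results):
    """Evaluate e.g. '((1 OR 2) AND (3 OR 4))' for one row.

    results maps each filter number to True or False for that row.
    Assumption: without parentheses, operators apply left to right.
    """
    tokens = re.findall(r"\d+|AND|OR|\(|\)", expression)
    pos = 0

    def parse_expr():
        nonlocal pos
        value = parse_term()
        while pos < len(tokens) and tokens[pos] in ("AND", "OR"):
            op = tokens[pos]
            pos += 1
            right = parse_term()
            value = (value and right) if op == "AND" else (value or right)
        return value

    def parse_term():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        if token == "(":
            value = parse_expr()
            pos += 1  # skip the closing ")"
            return value
        return results[int(token)]

    return parse_expr()

expression = "((1 OR 2) AND (3 OR 4))"

# Row A satisfies filters 2 and 4, so both groups are true.
row_a = evaluate(expression, {1: False, 2: True, 3: False, 4: True})

# Row B satisfies only filter 1, so the second group is false.
row_b = evaluate(expression, {1: True, 2: False, 3: False, 4: False})

print(row_a, row_b)
```

Moving the parentheses changes which rows pass, which is why the Criteria expression box lets you edit the grouping.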


Topbar

The topbar on the Studio page has a dataset switcher on the left-hand side and a menu bar on the right. Click the dropdown next to the dataset name in the topbar to switch to other datasets. This is particularly useful for resuming data preparation in other datasets instantly.

You can also use this dropdown to take a look at the data quality of other datasets. This helps you prioritize data preparation activities in the workspace.



