Create buckets - Text




You can bucket the values in a text column by defining conditions.

Consider a situation where you need to assign people to specific teams. Here, each team name is a bucket, and the conditions decide which names are added to it.

Name                    Team
Sena, Stephen, Alfy     Sales
Weber, Tandy, Hallsy    Marketing
Jared                   Engineering
Jerrie, Karim           Support


To bucket this data, you can define conditions such as:

If value in (Sena, Stephen, Alfy), add to the bucket: Sales
If value in (Weber, Tandy, Hallsy), add to the bucket: Marketing
If value in (Jerrie, Karim), add to the bucket: Support
If value in (Jared), add to the bucket: Engineering
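The four "in" conditions above amount to a membership lookup. The following is a minimal Python sketch of that idea, not the product's implementation; the names IN_RULES and NAME_TO_BUCKET are hypothetical, and exact, case-sensitive matching is an assumption.

```python
# Each entry mirrors one "If value in (...), add to the bucket: ..." rule.
# IN_RULES and NAME_TO_BUCKET are hypothetical names used for illustration.
IN_RULES = {
    "Sales": ["Sena", "Stephen", "Alfy"],
    "Marketing": ["Weber", "Tandy", "Hallsy"],
    "Support": ["Jerrie", "Karim"],
    "Engineering": ["Jared"],
}

# Invert the rules into a name -> bucket lookup (assumes exact matching).
NAME_TO_BUCKET = {
    name: bucket for bucket, names in IN_RULES.items() for name in names
}

print(NAME_TO_BUCKET["Tandy"])  # Marketing
print(NAME_TO_BUCKET["Jared"])  # Engineering
```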

Besides the "in" condition, you can use the following conditions:
  1. not in
  2. contains
  3. doesn't contain
  4. begins with
  5. doesn't begin with
  6. ends with
  7. doesn't end with
  8. regex
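To make the meaning of each condition concrete, here is a rough Python sketch that maps the condition names to simple predicates. The CONDITIONS table and the matches helper are hypothetical. Case-sensitive comparison, and treating "regex" as a search anywhere in the value, are assumptions; the product may behave differently.

```python
import re

# Hypothetical predicate table keyed by the condition names listed above.
# For "in" and "not in", arg is a collection of values; otherwise arg is a string.
CONDITIONS = {
    "in": lambda value, arg: value in arg,
    "not in": lambda value, arg: value not in arg,
    "contains": lambda value, arg: arg in value,
    "doesn't contain": lambda value, arg: arg not in value,
    "begins with": lambda value, arg: value.startswith(arg),
    "doesn't begin with": lambda value, arg: not value.startswith(arg),
    "ends with": lambda value, arg: value.endswith(arg),
    "doesn't end with": lambda value, arg: not value.endswith(arg),
    # Assumption: the pattern may match anywhere in the value.
    "regex": lambda value, arg: re.search(arg, value) is not None,
}

def matches(value, condition, arg):
    """Return True if value satisfies the named condition for the given argument."""
    return CONDITIONS[condition](value, arg)

print(matches("Stephen", "begins with", "Ste"))  # True
print(matches("Jared", "in", {"Sena", "Alfy"}))  # False
print(matches("Karim", "regex", r"^K.*m$"))      # True
```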

To apply bucketing in a text column:
  1. Right-click the text column in the Studio page.
  2. Select the Create Buckets option from the context menu.
  3. In the Studio panel, enter a name for the new column under Base column name.
  4. Enter the label for the bucket in the Bucket label field.
  5. Values in the selected column that don't match any condition are given a separate label, such as NA. Enter this label under the Label for unmatched values option.
  6. Enter the conditions as required and click Save (refer to the sample conditions above).
  7. A preview of the resulting column is shown next to the selected column in the data grid.
  8. Click Apply.
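The effect of these steps can be sketched in Python as follows: a new column is produced next to the source column, and values that match no condition receive the unmatched label. This is only an illustration. The create_buckets function, its parameters, the first-matching-rule-wins order, and the name "Priya" are assumptions that do not come from the product.

```python
def create_buckets(rows, source, new_column, rules, unmatched_label="NA"):
    """Return copies of rows with new_column set to each row's bucket label."""
    result = []
    for row in rows:
        label = unmatched_label  # used when no rule matches
        for values, bucket in rules:
            if row[source] in values:
                label = bucket  # assumption: the first matching rule wins
                break
        result.append({**row, new_column: label})
    return result

# "Priya" is a hypothetical value that matches no rule.
rows = [{"Name": "Sena"}, {"Name": "Jared"}, {"Name": "Priya"}]
rules = [
    ({"Sena", "Stephen", "Alfy"}, "Sales"),
    ({"Jared"}, "Engineering"),
]

bucketed = create_buckets(rows, "Name", "Team", rules)
for row in bucketed:
    print(row)
```

The source rows are left unchanged, which loosely mirrors how the preview shows the new column beside the original before you click Apply.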



To apply filters

To filter rows as part of this transform, use the Filters tab.

1. Click the  Filters  tab.

2. Click the icon and add the required columns to the Filters section. You can also reorder the filters by dragging and dropping them.

3. For every column added, you can select one of the following options from the drop-down:
  1. Actual: Filters rows based on the actual values in the column.
  2. Data quality: Filters rows based on the quality of data in the column.
  3. Patterns: Filters rows based on the data patterns in the selected column.
  4. Outliers: Filters rows based on the outliers present in the data of the selected column.
Note: The filter options displayed depend on the data type of the column added to the filter.

4. When you add more than one filter to the Filters section, the logical operator AND or OR appears next to the filters. Click the operator to toggle between AND and OR.
  1. Using the logical operators, you can combine the conditions and set their order of precedence. The final expression is displayed in the Criteria expression box. Click Edit to alter the default expression, using logical operators and parentheses to specify which condition is evaluated first. Click Save after making the required changes.
  2. For example, in the expression ((1 OR 2) AND (3 OR 4)), the condition (1 OR 2) is evaluated first and the condition (3 OR 4) next. Because the two results are joined by the AND operator, the filter applies only when both are true.
5. In the next section, you can drill down to choose specific values based on the filter option selected for each filter.



For example, in the above screenshot, the  Data quality  option is selected for the All columns filter in the  Filters  section. Based on the selection, further options to filter specific values are displayed in the  All columns (Data quality)  section.

6. You can choose to include or exclude the selected items in the last section.

7. To remove all the filters, click the Clear button.

8. A live preview of the filter transform is shown as you make changes. 

9. Click the  Apply  button to apply the transform along with the filters.
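The criteria expression from step 4 can be illustrated with a small Python sketch. The four numbered conditions, the sample rows, and the passes helper are all hypothetical. The sketch only shows how parentheses change which rows pass; it does not show how the product evaluates filters internally.

```python
def passes(row, conditions, combine):
    """Evaluate each numbered condition on the row, then combine the results."""
    results = {number: test(row) for number, test in conditions.items()}
    return combine(results)

# Hypothetical filters numbered 1 to 4.
conditions = {
    1: lambda r: r["Team"] == "Sales",
    2: lambda r: r["Team"] == "Marketing",
    3: lambda r: r["Name"].startswith("S"),
    4: lambda r: r["Name"].startswith("T"),
}

def grouped(c):
    # ((1 OR 2) AND (3 OR 4))
    return (c[1] or c[2]) and (c[3] or c[4])

def regrouped(c):
    # (1 OR (2 AND 3) OR 4): same conditions, different precedence
    return c[1] or (c[2] and c[3]) or c[4]

rows = [
    {"Name": "Sena", "Team": "Sales"},
    {"Name": "Tandy", "Team": "Marketing"},
    {"Name": "Alfy", "Team": "Sales"},
    {"Name": "Jared", "Team": "Engineering"},
]

kept_grouped = [r["Name"] for r in rows if passes(r, conditions, grouped)]
kept_regrouped = [r["Name"] for r in rows if passes(r, conditions, regrouped)]
print(kept_grouped)    # ['Sena', 'Tandy']
print(kept_regrouped)  # ['Sena', 'Tandy', 'Alfy']
```

Alfy passes only the regrouped expression, because condition 1 alone is enough there, while the original grouping also requires condition 3 or 4.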

To sort data

Under the Sort tab, you can sort data in ascending or descending order based on any column. Choose the column in the Sort by column drop-down, and then choose the sort order.

Info
You can use this sort option only as part of the transform, not as a standalone function. To sort data without applying another transform, use the Sort transform.
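As a rough analogy, sorting by one column in ascending or descending order looks like this in Python. The sort_rows helper and the sample rows are hypothetical and stand in for the Sort by column and order choices.

```python
rows = [
    {"Name": "Sena", "Team": "Sales"},
    {"Name": "Jared", "Team": "Engineering"},
    {"Name": "Weber", "Team": "Marketing"},
]

def sort_rows(rows, column, descending=False):
    """Return a new list of rows ordered by the given column."""
    return sorted(rows, key=lambda row: row[column], reverse=descending)

ascending = [r["Team"] for r in sort_rows(rows, "Team")]
descending = [r["Team"] for r in sort_rows(rows, "Team", descending=True)]
print(ascending)   # ['Engineering', 'Marketing', 'Sales']
print(descending)  # ['Sales', 'Marketing', 'Engineering']
```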