You can mark a column that contains PII (Personally Identifiable Information) or personal data, or ePHI (Electronic Protected Health Information) data, using the Mark PII and ePHI data transform. You can also apply security methods to the marked columns to protect your data, and choose whether to include or exclude these columns during export.
To mark columns with PII data or personal data
1. Click the Transform menu and select the Mark PII and ePHI data option.
Info: You can also right-click on a column and select the Mark PII and ePHI data option from the context menu.
2. Add personal data columns to the Mark columns with personal data section.
3. Click Apply to mark the selected columns as personal data.
To mark columns with ePHI data or health data
1. Click the Transform menu and select the Mark PII and ePHI data option.
Info: You can also right-click on a column and select the Mark PII and ePHI data option from the context menu.
2. Add columns containing health data under the Mark columns with ePHI data section.
3. Click Apply to mark the selected columns as ePHI data columns.
To protect personal data or ePHI data during export
1. Click the Export now option from the Export menu in the DataPrep Studio page.
2. In the side pane, choose the destination to which you want to export the data. This example uses Files as the destination.
3. Choose the columns with personal data or ePHI data to be included during export using the corresponding check boxes.
Note: Columns not marked as personal data or ePHI data are included by default.
4. Choose the required security method from the drop-down to protect your personal data and click Next.
The drop-down offers three options for the personal data or ePHI data columns: Data masking, Data tokenization, and None. The first two are security measures that protect sensitive data such as Personally Identifiable Information (PII).
Security measures to protect personal data or ePHI data
1. Data masking
Data masking replaces the original data with 'x' characters to protect personal information.
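2. Data tokenization
Data tokenization replaces each distinct value in your data with a random value. Because every occurrence of a given value receives the same replacement, the output remains statistically identical to the original data (for example, value frequencies are preserved).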
3. None
You can select None if you do not want to use any security method.
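The difference between the two security measures can be illustrated with a minimal Python sketch. This is not DataPrep's implementation: the function names mask_value and tokenize_column are hypothetical, and the sketch assumes that masking replaces every character with 'x' and that tokens are random hexadecimal strings.

```python
import secrets
from collections import Counter
from typing import Dict, List


def mask_value(value: str) -> str:
    """Hide a value by replacing each character with 'x' (assumed rule)."""
    return "x" * len(value)


def tokenize_column(values: List[str]) -> List[str]:
    """Replace each distinct value with its own random token.

    Repeated values reuse the same token, so value frequencies
    in the output match those in the input.
    """
    tokens: Dict[str, str] = {}
    used = set()
    result = []
    for value in values:
        if value not in tokens:
            token = secrets.token_hex(4)
            # Regenerate on the rare chance of a collision so that
            # distinct values never share a token.
            while token in used:
                token = secrets.token_hex(4)
            used.add(token)
            tokens[value] = token
        result.append(tokens[value])
    return result


emails = ["ann@example.com", "bob@example.com", "ann@example.com"]
masked = [mask_value(e) for e in emails]
tokenized = tokenize_column(emails)

# Masking discards the content entirely; tokenization keeps
# the pattern of repeats while hiding the actual values.
print(masked)
print(tokenized)
print(sorted(Counter(tokenized).values()) == sorted(Counter(emails).values()))
```

In this sketch, masked output no longer distinguishes one value from another of the same length, while tokenized output still lets you count, group, or join on the column without exposing the original values.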