To set up an OCR model, follow these 4 steps:
Step 3: Verify the model summary, train and test model
Training data is the primary dataset used to train a model, so that it can interpret the input correctly and make accurate decisions based on it. This ensures that the model performs the way it's intended to. For the OCR model, you need to add a set of images or PDFs that share the same layout and tag the values to be extracted from them, so that the model knows which values it is required to extract. You can check the guidelines section for more details.
After adding the fields and uploading the required images or PDFs, you need to tag each field to its corresponding value in every training file that you've uploaded. This helps the model learn which values must be extracted. All the text that can be extracted from a training file appears highlighted, and you can then select the specific text portions that matter for your business needs.
Note:
- In case you've missed adding any fields, you can click + Add New Field under the Fields section on the right, without going back to the Add Fields page.
- You can click the horizontal ellipsis beside a field name to edit/delete the field or to remove the field values in case you've incorrectly added any fields.
After adding the training data, you can review the model details, such as Model Name, Model Type, type of Training Data, and the Number of training data (images or PDFs) added. If you need to make any modifications, you can go back and make them. Otherwise, you can proceed to train the model.
Before you can actually use your OCR model in your application, you have to train it to perform the way you want.
After the training is complete, you can view the status of the model (trained, failed, or draft), the model type, the dates on which it was created and last updated, and the other details described below.
Under this section, you can view the current version of your model, the type of training data, and the field names whose values need to be extracted.
In this section, you can view the number of versions the model has, what version the model is currently running on, the model creation date, the number of fields added, and the count of trained data.
In this section, you can view the App Name, Form Name, and the Field Names in which the model is deployed. You can also filter by environment to check which environment a model is deployed in.
After training, you can test the model's reliability before deploying it in any of your applications. This ensures that the model identifies and extracts the required values correctly.
After you train your model, you need to publish it to make it available for deployment in your applications. Your users can then use your model and start extracting values from their images.
Retraining the model with additional images or PDFs of similar or slightly varying layouts helps it identify and extract the required values more accurately. Retraining in this way lets you tune the model to your specific business needs.
Note:
- Deleting a model that is deployed in any of your applications will remove its deployment in those applications. This action cannot be undone.
- After deletion, the added fields (model input and output fields) will remain in the form in which the respective model is deployed. The past data from the OCR model will remain as long as the respective fields are not deleted from the form.
- You cannot delete a model's version that is currently being used. Instead, you can switch versions and then delete that model version.
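The version rule in the last note can be pictured with a small sketch. This is only an illustration of the rule as described here, not the product's implementation; the `OcrModel` class and its method names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class OcrModel:
    """Hypothetical stand-in for a model that has several versions."""
    versions: set
    active_version: int

    def switch_version(self, version: int) -> None:
        # Only an existing version can become the one in use.
        if version not in self.versions:
            raise ValueError(f"version {version} does not exist")
        self.active_version = version

    def delete_version(self, version: int) -> None:
        # The version currently in use cannot be deleted.
        if version == self.active_version:
            raise RuntimeError("switch to another version before deleting this one")
        self.versions.remove(version)


model = OcrModel(versions={1, 2}, active_version=2)
model.switch_version(1)   # stop using version 2 first
model.delete_version(2)   # now version 2 can be removed
```

Calling `delete_version` on the version in use raises an error, which mirrors the requirement to switch versions before deleting.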
After you train and test your model, you can publish it so that your users can start extracting the required values in your applications. In case you don’t want your users to use the model, you can delete the model.
Note:
- Currently, you can add image and file upload fields as the source field. Therefore, only image or file upload type fields available in your form will be listed for source field selection.
- If there is no image or file upload field available in the chosen form, you will need to first create one in order to deploy the OCR model.
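As a rough illustration of the source field rule above, the sketch below filters a form's fields down to the ones that could be listed for source field selection. The field type labels (`"image"`, `"file_upload"`) and the function name are assumptions made for this example, not identifiers from the product.

```python
# Hypothetical type labels for the two field types that can act as a source field.
SOURCE_FIELD_TYPES = {"image", "file_upload"}


def eligible_source_fields(form_fields: dict) -> list:
    """Return the names of fields whose type allows them to be a source field."""
    return [name for name, field_type in form_fields.items()
            if field_type in SOURCE_FIELD_TYPES]


invoice_form = {
    "Invoice_Scan": "image",
    "Vendor_Name": "text",
    "Attachment": "file_upload",
    "Total": "number",
}
print(eligible_source_fields(invoice_form))
```

If the returned list is empty, the form has no suitable field yet, and one needs to be created before the model can be deployed.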
Note:
- Extracted field refers to the field in which the extracted values will be displayed in the live mode of your app.
- If you've chosen number as the field type, it is inclusive of decimal, percentage, and currency field values.
- If you've chosen text as the field type, it is inclusive of single-line and multi-line field values.
- If you've chosen date as the field type, only date values as per the chosen format will be displayed.
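The field type notes above can be read as a set of acceptance rules for extracted values. The sketch below shows one way such rules might behave; it is an assumption-laden illustration, not how the OCR model actually processes values. The function name, the type labels, and the default date format are all hypothetical.

```python
import re
from datetime import datetime
from decimal import Decimal, InvalidOperation


def coerce_extracted_value(raw: str, field_type: str, date_format: str = "%d-%m-%Y"):
    """Convert an extracted string for the chosen field type, or return None if it doesn't fit."""
    text = raw.strip()
    if field_type == "text":
        # Text covers both single-line and multi-line values, so keep it as is.
        return text
    if field_type == "number":
        # Number covers decimal, percentage, and currency values:
        # drop everything except digits, the decimal point, and a minus sign.
        cleaned = re.sub(r"[^\d.\-]", "", text)
        try:
            return Decimal(cleaned)
        except InvalidOperation:
            return None
    if field_type == "date":
        # Only values that match the chosen format are accepted.
        try:
            return datetime.strptime(text, date_format).date()
        except ValueError:
            return None
    raise ValueError(f"unknown field type: {field_type}")


print(coerce_extracted_value("$1,234.50", "number"))
print(coerce_extracted_value("05-03-2024", "date"))
print(coerce_extracted_value("2024/03/05", "date"))
```

In this sketch, a date written in a different format than the chosen one yields no value, which matches the note that only dates in the chosen format are displayed.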
You can now access your app in live mode and upload the required images or PDFs in the source field. The OCR model will try to identify the values in the uploaded input, and the extracted values will be displayed in the extracted field.