Bring your physical data into your digital processes
Intelligent character recognition or ICR is an optical recognition technology that can recognize, capture, and digitize data available in any physical form using Artificial Intelligence.
In Zoho CRM, ICR is built as part of Zia Vision—the AI-powered image detection and validation tool.
In this document, you'll learn about:
Why do you need ICR in CRM?
From business cards to product tags, registration forms to invoices, or proof of ID to shipping labels, your operations rely on printed data from a variety of physical media. Yet it's important that your agents keep pace with your business and not spend their valuable time and effort manually converting printed text into usable data.
- Let's say you're a busy sales rep and have met with a few high-value prospects at a customer conference and exchanged business cards. When you return to the office, you have to create a record for each business card you obtained, filling in details like the person's name, phone number, website, address, and so on.
- Or imagine a post-production workflow where a warehouse manager has to record information like product names, SKU numbers, dates of manufacture, and so on from product labels attached to packaged products and enter it all into Zoho CRM.
In both scenarios, the actual work of lead nurturing and stock management starts only after the data is in the system, yet the agents have to spend most of their time entering that data.
Thus, to help simplify record creation, enhance business productivity, and ease form fatigue, Zia Vision includes ICR functionality.
Choose the ICR technique that suits your business: Types of ICR in Zoho CRM
Your business works with a range of physical data forms, from predictable, standard layouts like ID proofs to widely varying ones like business cards. To accommodate these different needs, ICR in Zoho CRM offers two extraction methods:
- Template-driven
- Zero-shot field prompting
Template-driven extraction is a method in which you train Zia with a few samples.
- You upload sample images of the data in the same orientation (the same template with different data values, so that Zia can learn the layout).
- You manually localize values in the uploaded image and associate them with CRM fields from the chosen module and layout.
By doing so, Zia will know where in the image to fetch the values from and which CRM field in the selected layout to associate them with.
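The idea of a template can be pictured as a mapping from CRM fields to image regions. The following is a minimal sketch under that assumption, not Zia's actual implementation: the `Box` type, the `apply_template` helper, the coordinates, and the field names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Box:
    """Axis-aligned box in pixel coordinates."""
    left: float
    top: float
    right: float
    bottom: float

    def contains_center_of(self, other: "Box") -> bool:
        cx = (other.left + other.right) / 2
        cy = (other.top + other.bottom) / 2
        return self.left <= cx <= self.right and self.top <= cy <= self.bottom


def apply_template(template, words):
    """For each field, join the words whose centers fall inside its region."""
    record = {}
    for field_name, region in template.items():
        hits = [(box.left, text) for text, box in words
                if region.contains_center_of(box)]
        # Read the matched words left to right.
        record[field_name] = " ".join(text for _, text in sorted(hits))
    return record


# Hypothetical template learned from a sample ID card.
template = {
    "Full Name": Box(100, 40, 400, 80),
    "ID Number": Box(100, 100, 400, 140),
}
# Hypothetical recognized words from a new image with the same layout.
words = [
    ("Name:", Box(10, 45, 80, 75)),
    ("Jane", Box(110, 45, 170, 75)),
    ("Doe", Box(180, 45, 230, 75)),
    ("ID:", Box(10, 105, 60, 135)),
    ("X12345", Box(110, 105, 220, 135)),
]
record = apply_template(template, words)
print(record)  # {'Full Name': 'Jane Doe', 'ID Number': 'X12345'}
```

The labels "Name:" and "ID:" fall outside both regions, so only the values are picked up. This is also why a new image needs the same layout as the samples: the regions are fixed positions.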
For example:
One-click insurance
Insurance businesses often involve time-consuming processes. To streamline them, Coverly, a tech-savvy insurance provider, came up with one-click insurance requests, whereby customers submit government-authorized IDs to create instant policies. Using ICR, insurance agents working on these requests can use these images to create new records in just a few simple steps.
Completing vendor background checks
Zylker is an online education platform that enables domain experts to register and conduct courses on their platforms. To enhance their user experience, the platform onboards experts via webforms and requires only a few basic details, like their names and email addresses. With just these bits of information, Zylker can't authorize experts; however, by also requiring them to submit pictures of their IDs, Zylker can extract their data and enrich their records with the information they need.
Here's a look at how you can create a record from an image:
Like any discriminative AI model, Zia Vision requires preliminary training with image formats and orientations via sample images.
To train Zia
- Go to Setup > Zia > Vision > Intelligent Character Recognition.
- In the ICR page, click Create New Rule to create a training set for a module.
- In the Create New Rule page, do the following:
- Provide a rule name.
- Choose the desired module and the layout.
- From the type of extraction options, choose Template-driven. This will require you to train Zia with sample images before you extract data in real time.
- In the Field to store the input image field, choose the field in which you'd like to store the image used for the extraction.
- Click Next. This will take you to the training interface named Upload Image(s).

- In the Upload Image page, click Upload Image(s).

- In the following Upload Image for Extraction screen, add your image either by dragging and dropping it from your file manager or locating it by browsing.
- To browse for the file, click the Browse button and select the file from your device. Then click Extract.

- You'll arrive at Zia's training studio. Zia will recognize the characters in your image and mark all the valid characters as regions of interest using bounding boxes.
- Zia recognizes every word in the image, including field names and field values.
- If what Zia recognizes is accurate, you can click on the box and associate it with the field name from the layout.
- If not, you can also drag the margins to cover the characters and associate them as a field's value.
- After you've associated the localized characters with fields, click Train.
- Click Save to save the rule.
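Dragging the margins to cover a multi-word value amounts to drawing one box around several word boxes. Here is a minimal sketch of that geometry, assuming boxes are `(left, top, right, bottom)` tuples in pixels. The `enclosing_box` helper and the sample coordinates are hypothetical and only illustrate the idea.

```python
def enclosing_box(boxes, padding=0):
    """Smallest (left, top, right, bottom) box covering every input box,
    grown by `padding` pixels on each side."""
    if not boxes:
        raise ValueError("need at least one box")
    left = min(b[0] for b in boxes) - padding
    top = min(b[1] for b in boxes) - padding
    right = max(b[2] for b in boxes) + padding
    bottom = max(b[3] for b in boxes) + padding
    return (left, top, right, bottom)


# Hypothetical word boxes for a two-word value.
street_words = [(110, 200, 180, 230), (190, 202, 300, 232)]
region = enclosing_box(street_words, padding=4)
print(region)  # (106, 196, 304, 236)
```

A little padding keeps ascenders, descenders, and slightly shifted scans inside the region.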
Once you have trained Zia for the template-driven extraction, you can start creating and enriching records in that layout and module.
Accuracy in text-to-digital transformation is based on training and consistent model usage. In addition to the training set you upload, Zia trains itself from the images you upload to create records.
To ensure accurate extraction, make sure you've met the following prerequisites:
- You've created ICR rules under Zia Vision.
- The uploaded image should have the same layout as the sample.
- The image should have a card-based orientation.
Here are the guidelines for image tagging and data extraction:
If your image complies with the prerequisites and ICR guidelines mentioned above, Zia can give you a perfect match.
Zero-shot field prompting
Zero-shot field prompting is an advanced method of data extraction that requires no training whatsoever. You tell Zia which fields to look for in the input image (field prompting), and Zia identifies and extracts that data without requiring any human intervention.
How does it work?
In the template-driven extraction method, you have to localize values in the images and train Zia to associate them with the CRM fields. In the field prompting method, Zia uses its VLM-based model to do that for you.
Here's a quick snapshot of VLMs:
Vision-Language Model (VLM)
A Vision-Language Model (VLM) is a multi-modal generative AI system that accepts images and text as inputs and produces text outputs such as descriptions, labels, or extracted fields.
Architecture Overview:
A typical VLM integrates two core components:
- Vision Encoder - A neural network that converts an image into high-dimensional visual embeddings.
- Large Language Model (LLM) - A text-generation model that processes text tokens and produces outputs based on multi-modal context.
These components are connected by a projection layer, so the LLM can interpret visual features as part of its input sequence.
The vision encoder converts the image into patch embeddings, while the LLM converts the prompt into text embeddings. These visual and textual embeddings are projected into a compatible space allowing the LLM to integrate visual context with the prompt. The LLM then performs standard next-token generation to produce outputs, guided by explicit instructions provided by the user.
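The data flow described above can be illustrated with toy numbers. This is a minimal sketch only, assuming random stand-ins for the encoder outputs and a single linear projection. The dimensions and the `matvec` helper are hypothetical and say nothing about how Zia's model is actually built.

```python
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


random.seed(0)
VISION_DIM, TEXT_DIM = 4, 3  # toy embedding sizes

# Stand-ins for encoder outputs: 2 image patches and 3 prompt tokens.
patch_embeddings = [[random.random() for _ in range(VISION_DIM)] for _ in range(2)]
text_embeddings = [[random.random() for _ in range(TEXT_DIM)] for _ in range(3)]

# Projection layer: maps each patch embedding into the text embedding space.
projection = [[random.random() for _ in range(VISION_DIM)] for _ in range(TEXT_DIM)]
visual_tokens = [matvec(projection, patch) for patch in patch_embeddings]

# The LLM sees one sequence: projected visual tokens followed by prompt tokens.
llm_input = visual_tokens + text_embeddings
print(len(llm_input), len(llm_input[0]))  # 5 3
```

After projection, every element of the sequence has the same width, which is what lets the language model attend over image patches and prompt tokens together.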
In the case of Zia's ICR, by choosing the CRM fields in the ICR rules, you are providing static prompts for Zia's VLM. So, whenever an image is uploaded in the given module, Zia uses the fields given as prompts to identify the relevant values in the input image, extract them as digital data, and render outputs by associating the right values with those CRM fields.
As a user, all you have to do is validate the accuracy of the associations and decide whether to retain them. If there are inconsistencies in the data (say, you have O positive in your CRM picklist and your customer wrote O+), Zia will still identify the value and let you fix it right after extraction.
This is called zero-shot field prompting: you give static prompts and extract without any prior training.
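To make the notion of a static prompt concrete, here is a minimal sketch of how a list of selected fields could be turned into a fixed instruction, and how a model reply could be filtered back to those fields. The prompt wording, the JSON reply format, and both helper functions are assumptions for illustration. Zia's actual prompt and response formats are not documented here.

```python
import json


def build_field_prompt(fields):
    """Turn the fields selected in a rule into one reusable instruction."""
    if not fields:
        raise ValueError("select at least one field")
    field_list = "\n".join(f"- {name}" for name in fields)
    return (
        "Extract the following fields from the image.\n"
        f"{field_list}\n"
        "Answer with a JSON object keyed by field name; "
        "use null when a field is absent."
    )


def parse_model_reply(reply, fields):
    """Keep only the requested fields from a JSON reply."""
    data = json.loads(reply)
    return {name: data.get(name) for name in fields}


rule_fields = ["First Name", "Last Name", "Email", "Blood Group"]
prompt = build_field_prompt(rule_fields)

# Hypothetical model reply for one uploaded form.
reply = ('{"First Name": "Jane", "Last Name": "Doe", "Email": null, '
         '"Blood Group": "O+", "Extra": "ignored"}')
extracted = parse_model_reply(reply, rule_fields)
print(extracted)
```

The prompt is built once per rule and reused for every image, which is why no per-template training is needed.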
Capabilities of ICR using VLM:
Extraction using the field prompting method can work with the following:
- Formats: JPG, JPEG, and PNG
- Orientations: Predictable geometric shapes and custom shapes, such as die-cut cards.
- Script types: Handwritten text, legible printed text, and dot-printed text.
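If you collect images before uploading them, a quick pre-check on the file extension can catch unsupported formats early. This is a minimal sketch on the client side. The `is_supported_image` helper is hypothetical and checks only the file name, not the file contents.

```python
from pathlib import Path

# Formats listed above for field prompting.
SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def is_supported_image(filename: str) -> bool:
    """True when the file extension is one of the supported formats."""
    return Path(filename).suffix.lower() in SUPPORTED_EXTENSIONS


print(is_supported_image("business_card.JPG"))  # True
print(is_supported_image("invoice.pdf"))        # False
```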
Configuring rules for field prompts
To enable ICR capture for a module, you need to have the ICR rules in place.
To configure ICR rules for field prompting:
- Go to Setup > Zia > Vision > Intelligent Character Recognition.
- In the ICR page, click Create New Rule to create a training set for a module.
- In the Create New Rule page, do the following:
- Provide a rule name.
- Choose the desired module and layout.
- Choose the type of extraction as Field prompting.
- Select all the fields that you need to extract values for from the input image.
- Optionally, choose a field in which to store the input image for future reference.
- Click Proceed. Your ICR rule for that module and layout is ready. Now, your agents can start scanning images to digitize data.
To learn how to create records using ICR, see the Creating records section.
Business scenarios
Reaching out to tradeshow prospects
One of the main benefits of attending tradeshows is attracting new leads, and one quick way to get acquainted is to exchange business cards. Using ICR, reps can quickly create lead records and start qualifying and nurturing them without spending time entering details into records manually.
Digitizing patient registrations
Patient registrations are still collected as handwritten forms in many hospitals. Although these hospitals might have robust process automation, manually transferring lengthy form data into the system and then into the process interrupts the flow of work. Using ICR, the front-desk executive can capture, upload, and create the patient record in an instant.
Working with ICR
Creating records
To create a record from an image
- Go to the desired module and click on the dropdown available next to Create record.
Remember: You must have configured an ICR rule for this option to be available.
- In the Create Record from Image page, drag and drop the image from which you mean to extract text. You can also click Browse to source the image from your device.
- Click Proceed. Zia will start extracting the data.
- Once the extraction is done, Zia will have associated the field values from the image. This applies to both extraction techniques.

- Based on the accuracy of the extraction, you can choose to keep Zia's value associations. Remember, Zia will extract all characters in an image, which may include data that doesn't correspond to a field in your layout. You can unselect them if you don't need them.
- If you have used zero-shot field prompting for that module:
- If there are data inconsistencies (say, the data in the image is USA, and the data in your layout is United States of America), Zia will let you fix the inconsistencies.
- You can choose the right value from the picklist values.
- If you are an admin or a user with permission to create or edit the record's layout, you can also add a new value right there.
- Click Proceed.
- If this image is not satisfactory, you can try a new image.
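The inconsistency handling in the steps above (USA versus United States of America) can be pictured as matching an extracted value against the picklist and flagging anything that is not an exact match for review. This is a minimal sketch of that idea, not Zia's matching logic. The alias table, the similarity cutoff, and the `reconcile` helper are all assumptions.

```python
import difflib

# Hypothetical alias table; a real deployment would maintain its own.
ALIASES = {"o+": "O positive", "usa": "United States of America"}


def reconcile(extracted, picklist, aliases=ALIASES):
    """Return (picklist_value_or_None, needs_review)."""
    cleaned = extracted.strip()
    # 1. Exact match, ignoring case: nothing to review.
    for option in picklist:
        if option.lower() == cleaned.lower():
            return option, False
    # 2. Known alias: suggest the picklist value, but ask the user to confirm.
    alias = aliases.get(cleaned.lower())
    if alias in picklist:
        return alias, True
    # 3. Fuzzy match on spelling: also needs confirmation.
    close = difflib.get_close_matches(cleaned, picklist, n=1, cutoff=0.6)
    if close:
        return close[0], True
    # 4. No candidate: the user picks a value or adds a new one.
    return None, True


countries = ["India", "United States of America"]
print(reconcile("india", countries))  # ('India', False)
print(reconcile("USA", countries))    # ('United States of America', True)
```

Returning a review flag instead of silently rewriting the value mirrors the workflow above, where the user confirms or corrects each association before proceeding.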
Update existing records
In addition to creating records from images, you can also enrich your existing records using images uploaded in your records.
To enrich existing records
- Go to the record you want to enrich and click on the image you'd like to refer to for enrichment.
- Click on the Enrich from Image button in the top-right corner of the preview.
- Zia will have isolated characters in the image. Validate the enriched and existing data and click Proceed.
- After associating the values, you'll be taken to the Edit record page, with the enriched fields highlighted for quick reference.
- Click Save to record the new changes.
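Highlighting enriched fields comes down to comparing the existing record with the extracted values. Here is a minimal sketch of that comparison, assuming records are plain dictionaries and that empty extracted values are ignored. The `enrich` helper and the field names are hypothetical.

```python
def enrich(existing, extracted):
    """Merge extracted values into a copy of the record.

    Returns (merged_record, changed_fields) so the caller can
    highlight what was added or updated.
    """
    merged = dict(existing)
    changed = []
    for field, value in extracted.items():
        if value in (None, ""):
            continue  # nothing was extracted for this field
        if existing.get(field) != value:
            merged[field] = value
            changed.append(field)
    return merged, changed


existing = {"Last Name": "Doe", "Phone": ""}
extracted = {"Last Name": "Doe", "Phone": "555-0100", "Email": None}
merged, changed = enrich(existing, extracted)
print(merged)   # {'Last Name': 'Doe', 'Phone': '555-0100'}
print(changed)  # ['Phone']
```

The original record is left untouched until the merged copy is saved, which matches the validate-then-save flow in the steps above.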
In conclusion, ICR in Zoho CRM helps you achieve the following:
- Save time and effort.
- Improve agility and productivity.
- Take care of the operational logistics so that agents can focus on processes.
- Alleviate fatigue caused by constant gaze shifting between data sources and screens.
Although ICR offers two types of extraction, using the right one for the right activity will bring efficiency to the process.
Use template-driven extraction:
- If your data source always comes in a standard, constant format. Once trained on that standard structure, Zia can process the rest of the incoming records with higher accuracy.
- If you need to extract images as well as characters from the source image.
Use zero-shot field prompting:
- If the format and orientation of your source image differ with each upload.
- If your source image has a range of script formats, including handwritten notes.
- If you have many impromptu extraction needs.
Limits and limitations: A summary
- As part of its ICR training extraction methods, Zia can recognize human faces, but not products, objects, or animals.
- Template-driven extraction works only with images with a card-type orientation.
- Values extracted from images can only be associated with fields in the specified layout. It's not currently possible to associate them with subform fields.
Limitations of zero-shot field prompting:
- Zero-shot field prompting is not currently equipped to capture images.
- Extracting data from a table and storing it in a subform in the layout is not possible.
Points to remember
- You must create ICR rules to start creating records.
- You can create up to three rules for an organization.
- One module can have only one rule.
- You can upload up to two images per module layout.
- At any point, you can edit, update, or delete the rule.
- For accurate association with the template-driven extraction method, we recommend that your bounding box encompass the entire area containing your target characters without cutting off any edges.
- You can zoom the image in and out for legibility in association.
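The rule limits above can be summarized as a small checklist. This is a minimal sketch for planning purposes only, assuming you track your rules as a list of module names. The `check_new_rule` helper is hypothetical and is not part of Zoho CRM.

```python
MAX_RULES_PER_ORG = 3      # up to three rules for an organization
MAX_IMAGES_PER_LAYOUT = 2  # up to two images per module layout


def check_new_rule(existing_modules, module, image_count):
    """Return a list of problems; an empty list means the rule fits the limits."""
    problems = []
    if len(existing_modules) >= MAX_RULES_PER_ORG:
        problems.append(f"organization already has {MAX_RULES_PER_ORG} rules")
    if module in existing_modules:
        problems.append(f"module {module!r} already has a rule")
    if image_count > MAX_IMAGES_PER_LAYOUT:
        problems.append(f"at most {MAX_IMAGES_PER_LAYOUT} images per module layout")
    return problems


print(check_new_rule(["Leads"], "Contacts", image_count=2))  # []
print(check_new_rule(["Leads"], "Leads", image_count=3))
```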