I recently asked Zoho for help using regex in Zoho Mail custom filters and was told it was NOT supported.
This was surprising (and frustrating), as regex in Zoho Mail certainly works, although it does have some quirks.*
To encourage others, here are 3 regex examples I created in mail custom filters that have worked fine for me.
Spoiler: I'm not a coder, so these examples took me a lot of searching (thanks to others on this site) and tinkering to get the job done. If anyone has improvements on these regex examples, I'm very happy for you to offer them.
Regex examples
1) Purpose: to find all JPEG, JPG and PNG image URLs in email content
regex = "(https?:\/\/[^\s'\"]+?\.(?:jpg|jpeg|png))";
2) Purpose: to find all image URLs in email content
regex = '<img[^>]*src="(https?:[^\s,">]*)"';
3) Purpose: to remove everything else from the email content, leaving only the URLs matched in examples 1 and 2
regex = "\A[\s\S]+?\|\||\|\|[\s\S]+?\||\|\|[\s\S]+?\Z";
Regex explanation and script examples:
1) regex = "(https?:\/\/[^\s'\"]+?\.(?:jpg|jpeg|png))";
(...) outer brackets allow match to be captured by variable $1
https? ? makes the previous char optional, so matches http or https
: literal colon
\/\/ '\' escapes each '/', so \/\/ matches '//'
[^...] ^ = don't match chars in square brackets
\s = white space
' = single quote
\" = escaped double quote — matches double quote
(If I had used single quotes, instead of double, around the whole regex,
I wouldn't have needed to escape the double quote here.)
So, [^\s'\"] means match any char EXCEPT white space, single or double quotes
+ + = match one or more of the chars in square brackets
? ? = match the LEAST number of chars (lazy match)
\. = escaped dot matches an actual dot (without escape, the dot would match any char except newline)
(?:...) ?: = this bracketed group will not be captured by any variable
jpg|jpeg|png the pipe means 'OR' so this group matches jpg OR jpeg OR png
Script Example
foundUrls = mailContent.replaceAll(regex,"||$1||");
Here the $1 variable captures the regex match between (...), so "||$1||" means 'replace each match with a double pipe, the match, and a double pipe', i.e. place a double pipe on each side of each matched URL.
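For anyone who wants to try the pattern outside Zoho first, here is a rough Python sketch of the same wrap-in-pipes step. It is only a stand-in for the Deluge call: Python's re module is a different regex engine, it writes the captured group as \1 rather than $1, and the sample text and variable names are made up for illustration.

```python
import re

# Example 1 pattern, unchanged. Python does not need / or " escaped,
# but the extra backslashes are harmless.
regex = r"""(https?:\/\/[^\s'\"]+?\.(?:jpg|jpeg|png))"""

# Made-up sample standing in for the email content.
mail_content = 'Hi <img src="https://example.com/a.jpg"> and http://example.com/b.png here'

# Python spells the first captured group \1 where the Deluge script uses $1.
found_urls = re.sub(regex, r"||\1||", mail_content)
print(found_urls)
```

Each image URL in the sample ends up with a double pipe on either side, and everything else is left where it was.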
2) regex = '<img[^>]*src="(https?:[^\s,">]*)"';
'....' using single quotes means it's not necessary to escape the double quotes inside the regex
<img matches literal <img
[^>] match any character EXCEPT >
* match the chars in the square brackets zero or multiple times
src=" match literal src="
(...) the match inside the brackets is saved to variable $1
https? match http or https (? means previous char is optional)
: literal colon
[^\s,">]* match any character EXCEPT white space, comma, double quote, chevron — zero or multiple times
" the match must finish with a double quote
NB: the captured variable $1 will only contain the URL from http up to, but not including the final double quote
Script Example
foundUrls = mailContent.replaceAll(regex,"||$1||");
Here, the 'long' URL (<img.. src..."http...) is replaced by the 'short' URL (http....) and surrounded by double pipes
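A small Python sketch of this replacement, again only as a way to check the pattern outside Zoho. The HTML snippet is invented, and \1 is Python's spelling of $1.

```python
import re

# Example 2 pattern, unchanged.
regex = r'<img[^>]*src="(https?:[^\s,">]*)"'

# Made-up HTML standing in for the email content.
mail_content = '<p><img class="x" src="https://example.com/pic.gif" alt="pic"></p>'

# Everything from <img up to the closing quote of src is swapped for ||URL||.
found_urls = re.sub(regex, r"||\1||", mail_content)
print(found_urls)
```

Note that whatever follows the src attribute inside the tag (here alt="pic">) is not part of the match, so it stays in the text until the clean-up in example 3.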
3) regex = "\A[\s\S]+?\|\||\|\|[\s\S]+?\||\|\|[\s\S]+?\Z";
The regex has three options for a match, each separated by a pipe meaning 'OR'
First option:
\A = Start Of File (i.e. the match must begin at the very start of the text)
[\s\S] matches any white space (\s) or any non-white space (\S), i.e. any character at all, including newlines
+ + = match one or more of the chars in square brackets
? ? = match the LEAST number of characters necessary (lazy match)
\|\| '\' escapes the pipes so matches '||' (without escape the pipes would be read as 'OR')
| unescaped pipe = OR (i.e. match either first OR second parts of the regex)
Second option:
\|\| escaped pipes, so matches '||'
[\s\S]+? match any space or non-space one or more times, but use the LEAST chars necessary
\| escaped pipe, so matches '|' (NB: when this group is deleted it will leave one pipe between URLs)
| unescaped pipe = 'OR' (ie match second OR third parts of the regex)
Third option:
\|\| escaped pipes, matches '||'
[\s\S]+? match any space or non-space one or more times, but use the LEAST chars necessary
\Z = End Of File (i.e. search to the end of the file)
Script Example
cleanUrls = foundUrls.replaceAll(regex,"");
Thus
Option 1: StartOfFile up to and including '||' gets deleted
Option 2: '||' to '|' inclusive gets deleted (leaving one pipe between URLs)
Option 3: '||' to EndOfFile gets deleted
Replacing the single pipes with commas to form a list of URLs is accomplished with:
urls = cleanUrls.toList("|");
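The three options and the final split can be checked with a short Python sketch. It is a stand-in under assumptions, not the Deluge script itself: str.split plays the part of toList, and the input string is a made-up example of what foundUrls might look like after example 1 or 2.

```python
import re

# Example 3 pattern, unchanged.
regex = r"\A[\s\S]+?\|\||\|\|[\s\S]+?\||\|\|[\s\S]+?\Z"

# Made-up text in which two URLs are already wrapped in double pipes.
found_urls = ("Hello ||https://example.com/a.jpg|| some text\n"
              "||https://example.com/b.png|| bye")

# Option 1 removes 'Hello ||', option 2 removes '|| some text\n|',
# option 3 removes '|| bye'.
clean_urls = re.sub(regex, "", found_urls)

# Python's split stands in for Deluge's toList here.
urls = clean_urls.split("|")
print(urls)
```

The sample deliberately has other text before the first URL and after the last one, which is the situation the three options are written for.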
NB: These scripts use pipes to bracket and separate URLs. It's therefore prudent, BEFORE applying the scripts, to remove any existing pipes from the selected text:
regex = '\|';
cleanMailContent = mailContent.replaceAll(regex,"");
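To see why the pipes should come out first, here is a hedged Python sketch that chains the steps in that order. The helper extract_image_urls, its strip_pipes switch and the sample email are all hypothetical, made up for this illustration; only the two patterns come from the examples above.

```python
import re

IMG_SRC = r'<img[^>]*src="(https?:[^\s,">]*)"'             # example 2
CLEANUP = r"\A[\s\S]+?\|\||\|\|[\s\S]+?\||\|\|[\s\S]+?\Z"  # example 3


def extract_image_urls(mail_content, strip_pipes=True):
    """Hypothetical helper: strip pipes, wrap URLs in ||...||, delete the rest."""
    if strip_pipes:
        mail_content = re.sub(r"\|", "", mail_content)  # remove existing pipes first
    found = re.sub(IMG_SRC, r"||\1||", mail_content)     # wrap each image URL
    return re.sub(CLEANUP, "", found)                    # drop everything else


# Made-up email body that already contains two stray pipes.
mail = ('Prices | offers <img src="https://example.com/a.jpg" alt="a"> '
        'more | text <img src="https://example.com/b.png"> end')

clean = extract_image_urls(mail)
broken = extract_image_urls(mail, strip_pipes=False)
print(clean.split("|"))
print(broken)
```

With the pipes removed first, both URLs come through. With the stray pipes left in, the second option of the clean-up regex stops at the wrong pipe and the second URL is deleted along with the surrounding text.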
* Quirks of Zoho Mail Deluge
While the scripting language is similar to JavaScript, it is not identical, and some regex features do not work.
In particular, I've not had success using 'lookahead' or 'lookbehind' operations in Zoho Mail custom filters.
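Where a lookbehind would normally be used, a captured group can often do the same job, which is the approach example 2 already takes. The Python sketch below only shows that the two styles pick out the same URL in an engine that supports both. It says nothing about what Zoho itself accepts, and the sample tag is made up.

```python
import re

# Made-up tag for illustration.
html = '<img src="https://example.com/a.jpg" alt="a">'

# Lookbehind style: assert that src=" comes before the URL without matching it.
with_lookbehind = re.findall(r'(?<=src=")https?:[^\s,">]*', html)

# Capture-group style: match src=" as ordinary text and keep only the bracketed part.
with_capture = re.findall(r'src="(https?:[^\s,">]*)', html)

print(with_lookbehind, with_capture)
```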