Email validation using MATCH (RegEx)

Hi All,
I tried to validate email IDs using RegEx (using MATCH). The best I could come up with so far is as follows:

Expression:
^[a-zA-Z0-9!#$%&\’+/=?^_{|}~-][a-zA-Z0-9!.#$%&\’*+/=?^_{|}~-][email protected][a-zA-Z0-9-]+{[?.]+[a-zA-Z0-9-]+}$

The above expression DOES NOT cater for the following email formats: (There could be more that I haven’t thought of)
[email protected]
[email protected]
[email protected]
[email protected]

Please share if you can optimise the above RegEx expression, and/or let me know if you have a way to deal with the email formats mentioned above.

I had an earlier post about Match (RegEx).

Thanks

Mathew

I have made a little progress since yesterday. I am also attaching the test program (C10 app) for anybody to try.

Revised Expression:
^[a-zA-Z0-9!#$%&\’+/=?^_{|}~-][a-zA-Z0-9!#$%&\’*+/=?^_{|}~-]+{[?.][a-zA-Z0-9!#$%&\’+/=?^_{|}~-]+}*@[a-zA-Z0-9-]+{[?.][a-zA-Z0-9-]+}*$

My test application looks like this:

MatchExpression.zip (765.8 KB)

Regards

Mathew

I just use the following. It is not as complicated as yours, so it may not catch everything, but it works for my needs.

Notice I use UPPER to eliminate all the a-zA-Z duplication you have.

  IF ~MATCH(UPPER(wrkEmail),'^[-A-Z0-9.!#$%&''*+-/=?^_`{{|}~][email protected]{{[-A-Z0-9.]+\.}+[A-Z][A-Z]+$',Match:Regular)

Thanks Kevin for the UPPER function suggestion. I hadn’t realised that I was already calling MATCH with Match:Regular + Match:NoCase, as this test app was written a few years ago. Therefore the a-zA-Z case difference is already being ignored.

Took the last bit from your suggestion.

Latest Expression:
^[A-Z0-9!#$%&\’+/=?^_{|}~-][A-Z0-9!#$%&\’*+/=?^_{|}~-]+{[?.][A-Z0-9!#$%&\’+/=?^_{|}~-]+}*@[A-Z0-9-]+{[?.][A-Z0-9-]+}*[A-Z0-9]$

This validates all the emails I have in my list.

Thanks

I have been using https://emailregex.com/ for a while for this, and it is working nicely.

See the big title about half way down the page. General Email Regex (RFC 5322 Official Standard)

Regards

Richard Bryce
www.nsitau.com

I am using this regex string (the Latest Expression):
^[A-Z0-9!#$%&\’+/=?^{|}~-][A-Z0-9!#$%&\’*+/=?^{|}~-]+{[?.][A-Z0-9!#$%&\’+/=?^_{|}~-]+}@[A-Z0-9-]+{[?.][A-Z0-9-]+}[A-Z0-9]$

but it reports [email protected] as invalid (only 1 character, A, before the dot),
while [email protected] is OK (two characters, AA, before the dot).

I am not an expert in Regex so could anyone give me the correct syntax, please?

I tried this but it doesn’t work:
^[A-Z0-9!#$%&'*+/=?^_'{|}~-]+(?:\.[A-Z0-9!#$%&'*+/=?^_'{|}~-]+)*@(?:[A-Z0-9](?:[A-Z0-9-]*[A-Z0-9])?\.)+[A-Z0-9](?:[A-Z0-9-]*[A-Z0-9])?$

You will have enough time to solve the issue on the vacations :slight_smile:

any idea how to fix it?

Match:NoCase … Therefore a-zA-Z is already being ignored.

I’m pretty certain Match:NoCase only applies outside of [sets] so [A-Za-z] is required to match both upper and lower case letters. The easier way for NoCase is to UPPER(email) as Kevin had so you only need sets with [A-Z]

I wonder about this part: {[?.][A-Z0-9-]+}

You’re saying an email can have a [Question Mark or Period] once, then [Letters/Numbers/Dash] 1 or more times?
An email can use a ? in the Domain instead of a Period?

Maybe [?.] would be more correct as \. (backslash-period, an escaped period) to indicate a literal period?

I thought domains always ended with a period followed by 2+ letters or numbers, like .com or .co.uk: \.[A-Z0-9][A-Z0-9]+
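
To make the difference between the set [?.] and an escaped period concrete, here is a small sketch. It uses Python's re module rather than Clarion's MATCH (my assumption, only because it is easy to run); Python groups with ( ) where the expressions in this thread use { }, so the syntax is not a drop-in replacement. The pattern names and the sample domains are made up for the illustration.

```python
import re

# Hypothetical patterns for the domain part only (the text after the @).
# Loose: the separator is the set [?.], so "?" is accepted as well as ".".
DOMAIN_LOOSE = re.compile(r"[A-Z0-9-]+([?.][A-Z0-9-]+)*")
# Strict: the separator is an escaped period, so only "." is accepted.
DOMAIN_STRICT = re.compile(r"[A-Z0-9-]+(\.[A-Z0-9-]+)*")


def domain_ok(pattern, domain):
    """Return True when the whole upper-cased domain matches the pattern."""
    return pattern.fullmatch(domain.upper()) is not None


if __name__ == "__main__":
    for d in ("example.com", "example?com"):
        print(d, domain_ok(DOMAIN_LOOSE, d), domain_ok(DOMAIN_STRICT, d))
```

With the set, a domain written with a question mark slips through; with the escaped period it is rejected.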

Carl,
Thank you for your answer.

I am talking about the MatchExpression.zip example (attached in this thread) and the RegEx expression used there. The problem is with the Latest Expression (see above):

it reports [email protected] as invalid (only 1 character, A, before the dot),
while [email protected] is OK (two characters, AA, before the dot).

And of course, I am using MATCH(UPPER(wrkEmail)…

So I am not sure what is wrong with the regex above that stops it validating an email like [email protected]

My question is why you have [?.]

The single-character problem is easy to see if you look at your expression split into parts like below. Line 2 requires 1 character from the set, and line 3 requires “+” (1 or more) from the next set, so together you require at least 2 characters before any period (or question mark).

1. ^
2. [A-Z0-9!#$%&\’+/=?^{|}~-]
3. [A-Z0-9!#$%&\’+/=?^{|}~-*]+
4. {[?.][A-Z0-9!#$%&\’+/=?^_{|}~-]+}
5. @
6. [A-Z0-9-]+
7. {[?.][A-Z0-9-]+}
8. [A-Z0-9]
9. $
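
The effect of lines 2 and 3 above can be reproduced with a cut-down pattern. This is only a sketch in Python's re module, not Clarion MATCH syntax (Python groups with ( ) instead of { }), and the sets are reduced to letters and digits to keep it readable. The pattern names and the sample local parts are hypothetical.

```python
import re

# Simplified stand-in for the part before the @: letters and digits only.
# TWO_STEP mirrors lines 2 and 3: one character, then one or more characters.
TWO_STEP = re.compile(r"[A-Z0-9][A-Z0-9]+(\.[A-Z0-9]+)*")
# ONE_STEP merges them into a single "one or more" set.
ONE_STEP = re.compile(r"[A-Z0-9]+(\.[A-Z0-9]+)*")


def local_ok(pattern, local_part):
    """Return True when the whole upper-cased local part matches the pattern."""
    return pattern.fullmatch(local_part.upper()) is not None


if __name__ == "__main__":
    for name in ("a.smith", "aa.smith"):
        print(name, local_ok(TWO_STEP, name), local_ok(ONE_STEP, name))
```

The two-step form rejects a single character before the period, while the merged form accepts it, which matches the behaviour described in the question.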

I do not have the Email spec memorized and this is a complicated expression, maybe too complicated. Are you trying to prevent invalid use of the period like … @. [email protected] ? I would do that separately. This is untested:

IF MATCH(Eml,'^\.|\.$',Match:Regular) THEN    ! No leading or trailing period
    Message('Cannot begin or end with a period')
...
P=STRPOS(Eml,'\.\.|\.@|@\.')  ! No "..", ".@" or "@."
IF P THEN 
  MESSAGE('Invalid Period at position ' & P & ': ' & SUB(Eml,P,2))
…

Maybe then your expression before the @ can be simpler.

I would suggest adding a TEXT control to your Window so you can work on the separate lines more easily (like my 1 to 9 above). PROP:LineCount and PROP:Line can be used to get the lines and concatenate them into a single string.

In sets the hyphen should be first, so write [A-Z0-9-] as [-A-Z0-9].
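
The reason hyphen placement matters is that a hyphen between two characters inside a set defines a range. Here is a short sketch using Python's re module (an assumption on my part for easy testing; I have not confirmed that the Clarion engine treats every case identically). The names RANGE_SET and LITERAL_SET are made up for the example.

```python
import re

# "+-/" inside a set is a range from "+" (0x2B) to "/" (0x2F),
# which also covers "," "-" and ".".
RANGE_SET = re.compile(r"[+-/]")
# With the hyphen first it is a literal, so only "-", "+" and "/" match.
LITERAL_SET = re.compile(r"[-+/]")


def in_set(pattern, ch):
    """Return True when the single character ch is a member of the set."""
    return pattern.fullmatch(ch) is not None


if __name__ == "__main__":
    for ch in "+,-./":
        print(repr(ch), in_set(RANGE_SET, ch), in_set(LITERAL_SET, ch))
```

A set that contains +-/ in the middle therefore quietly accepts commas and periods as well, which may or may not be what was intended.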

Below worked for me, except it does not catch the … [email protected] that I would do separately. I added Underscore. I am not sure why you have ? or . so I changed [?.] to backslash-period. Is [email protected] with no .xx suffix valid?

^[-A-Z0-9!#$%&\’+/=?^{|}~_][-A-Z0-9!#$%&\’*+/=?^{|}~_.]*@[-A-Z0-9]+{\.[-A-Z0-9]+}[A-Z0-9]$

^
[-A-Z0-9!#$%&\’+/=?^{|}~_]
[-A-Z0-9!#$%&\’*+/=?^{|}~_.]*
@
[-A-Z0-9]+
{\.[-A-Z0-9]+}
[A-Z0-9]
$

If this will be used to tell a user Message(‘Email entry is invalid’) I would split the checking into parts so the messages make it obvious where the problem is, and you don’t create a puzzle for the user or tech support.

Tell them they have a space at position X. You could just quietly take out spaces but I wouldn’t.
Check for a single @
Check for invalid periods
Validate before the @
Validate after the @
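
The staged checks listed above could be sketched as follows. This is Python rather than Clarion, purely to show the order of the checks, and every name in it (check_email, LOCAL_RE, DOMAIN_RE, BAD_PERIOD_RE) is hypothetical. The character sets are assumptions, and the domain pattern assumes a suffix of 2+ letters is required, which is one of the open questions in this thread.

```python
import re

# Assumed, deliberately simple character sets; adjust them to your own rules.
LOCAL_RE = re.compile(r"[-A-Z0-9!#$%&'*+/=?^_`{|}~.]+")
DOMAIN_RE = re.compile(r"[-A-Z0-9]+(\.[-A-Z0-9]+)*\.[A-Z][A-Z]+")
# Leading period, trailing period, "..", ".@" or "@."
BAD_PERIOD_RE = re.compile(r"^\.|\.$|\.\.|\.@|@\.")


def check_email(email):
    """Return '' when the address passes, else a message naming the problem."""
    eml = email.upper()
    space = eml.find(" ")
    if space >= 0:
        return f"Space at position {space + 1}"
    if eml.count("@") != 1:
        return "Must contain exactly one @"
    bad = BAD_PERIOD_RE.search(eml)
    if bad:
        return f"Invalid period at position {bad.start() + 1}"
    local, domain = eml.split("@")
    if not LOCAL_RE.fullmatch(local):
        return "Invalid character before the @"
    if not DOMAIN_RE.fullmatch(domain):
        return "Invalid domain after the @"
    return ""


if __name__ == "__main__":
    for sample in ("john.smith@example.com", "john..smith@example.com"):
        print(sample, "->", check_email(sample) or "OK")
```

Because each check returns its own message, the user is told which part to fix instead of just "invalid", and the two regular expressions stay short enough to read.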

:grinning: It’s not that I chose it. This was the original regex from the attached example. Honestly speaking, I have no idea how regex works and just used the example in my project.

Thank you for your detailed answer. I will follow it and try to figure it out.