Match function is not working to my requirement

Hello Clarioneers,
I am still using Clarion 6.3.

I want to use the MATCH procedure to check first names in my software. The regular expression provided by a third party for this field is [A-Za-z][A-Za-z'-]*. This expression validates perfectly in several online regular expression testers, but I am not having much luck with it in Clarion.

The expression as I understand is:

  1. It can only start with a letter (A-Za-z)
  2. It can only contain letters (A-Za-z), "-" and "'".

My attempt was this:

MATCH(LOC:FirstName,'[^A-Za-z][A-Za-z-'']',MATCH:REGULAR)

OR
basically, what I want is:

  1. It must start with a letter (a-z)
  2. Must only contain letters (a-z), single quote "'", and "-"

Thanks

Mathew

Just a quick guess…
MATCH( CLIP(LOC:FirstName) , '[A-Z][A-Z-'']*', MATCH:Regular + Match:NoCase)

You had started with [^A-Za-z] which means anything other than a letter
The ^ is a NOT

Also you were only looking for TWO Characters, as you had dropped the * after the second one

So my string says:
Look for something that starts with a Letter,
and then is made up of only Letters or a Dash or a single quote

Note to add a single quote to a CW string you write two single quotes in a row
LikeThis = 'Don''t Do That'

You might also want to allow for Spaces
so something like

'[A-Z][A-Z-<32>]*' as <32> is a CHR(32) which is a space

If you don’t want to allow embedded spaces, then you should probably CLIP() the string, to remove trailing spaces
Or write the match string to force the last character to be something other than a space

I think he wants the ^ before the first character class - you don’t want to have the regex match somewhere other than at the beginning of the string.

And as the variable is Firstname, it’s likely that he doesn’t expect it to be more than one word.

What he had was close, I think. It should have been:

MATCH(LOC:FirstName,'^[A-Za-z][A-Za-z-'']',MATCH:REGULAR)

Ah, yeah that’s true, a ^ is different from a [^(stuff)]
The ^ means start of line in one context
and
Not in this list of characters in the other
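
To make the two meanings of ^ concrete, here is a minimal sketch of a standalone test program. The Report variable and the sample strings are made up for illustration, and I have not run this on 6.3. Going by the explanation above, the negated set should report a match on "3abc" (it contains a non-letter), while the anchored set should only match when the first character is a letter.

```clarion
  PROGRAM
  MAP
  END

Report  STRING(300)              ! hypothetical buffer, one result per line

  CODE
  ! [^a-z] is a negated set: any character that is NOT a-z
  Report = 'negated set on "3abc": ' & MATCH('3abc', '[^a-z]', Match:Regular)
  ! ^[a-z] is an anchored set: the string must START with a-z
  Report = CLIP(Report) & '|anchored set on "3abc": ' & MATCH('3abc', '^[a-z]', Match:Regular)
  Report = CLIP(Report) & '|anchored set on "abc3": ' & MATCH('abc3', '^[a-z]', Match:Regular)
  MESSAGE(Report)                ! a | in MESSAGE text starts a new line
```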

Thanks Ben and Mark,

First of all, sorry for the War and Peace to follow. But, I thought that is required to explain myself better.

I have tried the pattern you suggested ( ^[A-Za-z][A-Za-z-''] ) without any success. In Clarion 6.3 the application crashed (this could be a bug): with this pattern, if I call MATCH twice in a row, the application crashes. In Clarion 8 it at least did not crash, but it did not give the desired results either.

Scenario 1: The pattern that actually works: [^a-z][^a-z]*
[^a-z] - matches anything outside a-z as the first character
[^a-z]* - the rest of the data can be anything outside a-z
If the combination of the above two classes returns true, then the data is invalid.

Scenario 2: The pattern that I intend to get working: [^A-Za-z][^A-Za-z-<39>]*
[^a-z] - matches anything outside a-z in the first character position.
[^a-z-<39>]* - the rest of the data can be anything outside a-z, the - character and the single quote. According to the Clarion help, the - can be at the start or at the end of a set, e.g. [-a-z] or [a-z-].

If the combination of the above two classes returns true, then the data passed is considered invalid.


Data [ Desired MATCH result with [^A-Za-z][^A-Za-z-<39>]* ]

firstname [ Pass ]
3firstname [ Fail ]
first-name [ Pass ]
firstname- [ Pass ]
first1name [ Fail ]
firstname3 [ Fail ]
firstname'- [ Pass ]
-firstname [ Fail ]
'firstname [ Fail ]

I am providing a link to the test app I’ve created in Clarion 8, in case anybody would like to give this a try and help me get this working.

Regards

Mathew

Your first working pattern, [^a-z][^a-z]*, could be rewritten as [^a-z]+
The * means any number of times
The + means at least once.
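
As a rough illustration of that equivalence, the sketch below runs both spellings against the same two sample strings. The Report variable and the sample data are hypothetical, and this is untested. If the two patterns really are interchangeable, each pair of lines should show the same result.

```clarion
  PROGRAM
  MAP
  END

Report  STRING(300)              ! hypothetical buffer, one result per line

  CODE
  ! Both patterns ask for one or more characters outside a-z
  Report = '[^a-z][^a-z]* on "123": ' & MATCH('123', '[^a-z][^a-z]*', Match:Regular)
  Report = CLIP(Report) & '|[^a-z]+ on "123": ' & MATCH('123', '[^a-z]+', Match:Regular)
  Report = CLIP(Report) & '|[^a-z][^a-z]* on "abc": ' & MATCH('abc', '[^a-z][^a-z]*', Match:Regular)
  Report = CLIP(Report) & '|[^a-z]+ on "abc": ' & MATCH('abc', '[^a-z]+', Match:Regular)
  MESSAGE(Report)                ! a | in MESSAGE text starts a new line
```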

You pointed out that a dash can be the last character in a set,
but then you placed a <39> after it, thereby breaking the rule.

Also remember you can use Match:Regular+Match:NoCase vs. getting into (slightly) more complex RegEx

OK, I looked at your attached program (even though I find it hard to work with .apps)

Why do you reverse the return values from ValidateData ?

I Found this easier to understand.

  IF MATCH( CLIP(LOC:Firstname),CLIP(LOC:Pattern), Match:Regular )
       ?Box1{prop:fill} = COLOR:GREEN
  ELSE ?Box1{prop:fill} = COLOR:RED
  END

I think the problem you have is that you’re thinking that the match must match the whole string
For instance, say you had this: MATCH( CurrString, '[a-z]', MATCH:Regular)

At first glance, you’d think this would only match a string that was one character long and was a lower case letter.
Well, no…
It will match any string, that has a lower case letter anywhere inside of it.
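
A small sketch of that behaviour, assuming MATCH searches anywhere in the string unless the pattern is anchored. The Report variable and the sample strings are made up, and this is untested. The unanchored pattern should report a match on "123x456", while the version wrapped in ^ and $ should only accept the one-letter string.

```clarion
  PROGRAM
  MAP
  END

Report  STRING(300)              ! hypothetical buffer, one result per line

  CODE
  ! Unanchored: a lower case letter anywhere in the string is enough
  Report = 'unanchored [a-z] on "123x456": ' & MATCH('123x456', '[a-z]', Match:Regular)
  ! Anchored at both ends: the whole string must be exactly one letter
  Report = CLIP(Report) & '|anchored ^[a-z]$ on "123x456": ' & MATCH('123x456', '^[a-z]$', Match:Regular)
  Report = CLIP(Report) & '|anchored ^[a-z]$ on "x": ' & MATCH('x', '^[a-z]$', Match:Regular)
  MESSAGE(Report)                ! a | in MESSAGE text starts a new line
```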

So… I think what you really have in mind is a pattern like this:

^[a-z][a-z'-]*$

Let me break that down…
The first and last symbols: ^ means start of string, and $ means end of string
So now only those things that match the pattern between the ^$ will now return as a match

Then we have [a-z] – any letter
Then we have [a-z'-]* – any number of letters, single quotes or dashes

Note that <39> is a notation for CHR(39), but it only applies to string literals in source code, not to values entered at runtime like the pattern in your program.

So… the final answer

  MATCH( CLIP(IsAName), '^[a-z][a-z''-]*$', MATCH:Regular + Match:NoCase)

And don’t forget to fix the reversed return values in your ValidateData function
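
For anyone who wants to check the pattern outside the .app, here is a sketch of a hand-coded test that runs it against the sample data from earlier in the thread. The Names queue, Report buffer and loop counter are hypothetical names, and I have not run this on 6.3. Since the pattern lives in a string literal here, the single quote inside the set is doubled.

```clarion
  PROGRAM
  MAP
  END

Pattern   STRING('^[a-z][a-z''-]*$')     ! '' in a literal is one single quote
Names     QUEUE                          ! hypothetical queue of test data
Text        STRING(20)
          END
Report    STRING(500)                    ! hypothetical buffer for the results
i         LONG

  CODE
  Names.Text = 'firstname'     ; ADD(Names)
  Names.Text = '3firstname'    ; ADD(Names)
  Names.Text = 'first-name'    ; ADD(Names)
  Names.Text = 'firstname-'    ; ADD(Names)
  Names.Text = 'first1name'    ; ADD(Names)
  Names.Text = 'firstname3'    ; ADD(Names)
  Names.Text = 'firstname''-'  ; ADD(Names)
  Names.Text = '-firstname'    ; ADD(Names)
  Names.Text = '''firstname'   ; ADD(Names)

  LOOP i = 1 TO RECORDS(Names)
    GET(Names, i)
    ! CLIP removes the padding spaces so $ lines up with the last real character
    Report = CLIP(Report) & CLIP(Names.Text) & ': ' & |
             CHOOSE(MATCH(CLIP(Names.Text), CLIP(Pattern), Match:Regular + Match:NoCase), 'Pass', 'Fail') & '|'
  END
  MESSAGE(Report)                        ! a | in MESSAGE text starts a new line
```

If the pattern behaves as described, the output should line up with the Pass/Fail column in the desired-results list.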


You are a genius, Mark… That worked a treat for me…

Thanks a ton…

You’re too kind…

Now that the primary issue has been solved…
I noticed in your app, that you had a function with arguments - excellent
However, the way that it was written could be improved.

In the template prompts, you wrote

Prototype:  (string,string),long
Parameters: (pData,pPattern)

Which means that you’ll have:

   MAP
      ValidateData(string,string),long
   END

ValidateData PROCEDURE(pData,pPattern)

Ignoring any data types and label name choices
I recommend changing it to:

Prototype:  (string pData, string pPattern),long
Parameters: (string pData, string pPattern)

or even…

Prototype:  (string pData, string pPattern),long
Parameters: (string pData, string pPattern) !,long

Your code will become more readable.

This is especially true, because normally you’re looking at MAP or the top of the Procedure, but not both at once.
Additionally, Intellisense (code completion) only shows what’s in the MAP (plus certain nearby comments), so having the labels for the arguments there helps you discover how to make the call.

I also noticed that you’re using a LONG data type, to handle a TRUE/FALSE situation
I recommend using a BOOL for TRUE/FALSE values.
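
Putting both suggestions together, a hand-coded version might look something like the sketch below. This is only an outline under my assumptions, not the code from the app: the test call and its sample name are made up, and the procedure returns TRUE when the data matches the pattern (no reversed return values).

```clarion
  PROGRAM
  MAP
    ValidateData(STRING pData, STRING pPattern),BOOL
  END

  CODE
  ! Hypothetical test call; '' in a literal is one single quote
  IF ValidateData('Mathew', '^[A-Za-z][A-Za-z''-]*$')
    MESSAGE('valid')
  ELSE
    MESSAGE('invalid')
  END

ValidateData PROCEDURE(STRING pData, STRING pPattern) !,BOOL
  CODE
  ! TRUE means the data matches the pattern
  RETURN MATCH(CLIP(pData), CLIP(pPattern), Match:Regular)
```

With the labels in the MAP, code completion can show pData and pPattern at the call site.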


Hi Mark,
Thanks for pointing out the issue in my coding style. “some old habits die hard”… I will keep your suggestions in mind for future stuff.

As far as the LONG return value is concerned: I had this procedure using StrPos at some point, and when I changed the implementation to use MATCH, I did not change the return type to BOOL.

Cheers