Following on from Carl’s post, I have found that appending a null ('<0>') to the end of a string works for both the data and the regex.
You do have to be careful with nulls, though: strPos treats a null as the end of the string, so a null is no good anywhere other than at the very end.
What I have done is write a wrapper for strPos called StrPosUnclipped:
StrPosUnclipped PROCEDURE (*string pText,*string pRegex) !,LONG
  CODE
  return strpos(choose(size(pText) >0 and pText[size(pText)]  =' ',pText &'<0>',pText), |
                choose(size(pRegex)>0 and pRegex[size(pRegex)]=' ',pRegex&'<0>',pRegex) )
If an argument has no trailing spaces it is passed to strPos() as normal, but where it ends in one or more spaces a null ('<0>') character is appended before it is passed to strPos(). The check is made separately for the text and for the regex.
Often when testing you want to pass by value so that you can pass literals. As I am working in the IDE I cannot readily use overloading, so I have written StrPosUnclippedByVal, which simply accepts value parameters for the text and regex and calls the version with parameters passed by reference. You will want to avoid this version except where you need to pass literals rather than a field.
StrPosUnclippedByVal PROCEDURE (string pText,string pRegex) !,LONG
  CODE
  return strPosUnclipped(pText,pRegex)
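As a rough sketch of when each wrapper applies: the demo procedure, variable names and sample values below are made up for illustration, and it assumes StrPosUnclipped and StrPosUnclippedByVal are prototyped as above.

```clarion
! Hypothetical demo procedure; names and sample values are assumptions
DemoStrPosUnclipped PROCEDURE
SampleText     string('Smith  Jones')   ! sized to its initial value
SamplePattern  string('h +J')           ! 'h', one or more spaces, then 'J'
MatchPos       long
  CODE
  ! Variables and fields: call the by-reference version directly
  MatchPos = StrPosUnclipped(SampleText, SamplePattern)
  ! Literals: go through the by-value wrapper instead
  MatchPos = StrPosUnclippedByVal('Smith  Jones', 'h +J')
```

One thing to keep in mind with the by-reference version: a fixed-size STRING variable is padded with spaces, so any padding becomes part of what is passed to strPos().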
So now, with the new StrPosUnclipped function, we can implement StrPosAndLen without jumping through hoops doing character substitution:
StrPosAndLen PROCEDURE (*string pText,*string pRegex,*long pMinLen,*long pMaxLen)
! returns position and minimum and maximum length of string matched against regex
x           long,auto
max         long,auto
stPos       long                      ! start position (return value)
dollarEnded string(size(pRegex)+1)    ! room for pRegex plus a trailing '$'
regex       &string
  CODE
  pMinLen = 0; pMaxLen = 0                            ! initialise/clear
  if size(pText) = 0 or size(pRegex) = 0 then return 0.
  stPos = strPosUnclipped(pText, pRegex)              ! get start position
  if ~stPos or stPos > size(pText) then return 0.     ! no match
  if stPos = size(pText)                              ! single char match
    pMinLen = 1
    pMaxLen = 1
    return stPos
  end
  ! use a regex that ends in an unescaped '$' for the length tests
  if pRegex[size(pRegex)] = '$' and sub(pRegex,size(pRegex)-1,1) <> '\'
    regex &= pRegex
  else
    dollarEnded = pRegex & '$'
    regex &= dollarEnded
  end
  max = size(pText) - stPos                           ! max increment size
  loop x = 0 to max                                   ! grow the slice one character at a time
    if strPosUnclipped(pText[stPos : stPos+x],regex)
      pMaxLen = x + 1
      if pMinLen = 0 then pMinLen = pMaxLen.
    elsif pMaxLen                                     ! first non-match after a match: stop
      break
    end
  end
  return stPos                                        ! return starting position
Note that the minimum and maximum lengths are returned in the passed parameters (pMinLen and pMaxLen), while the start position is the return value (same as with strPos).
pMaxLen is handy because you often want the “leftmost longest” match. This code, using strPosUnclipped(), is much better (and simpler) than the earlier versions that tried to use a substitute character for spaces. So kudos to Carl for suggesting adding a null on the end of a string.
Final comment: pText and pRegex are passed here by reference (for speed). As with StrPosUnclippedByVal(), we can add a wrapper for easy testing or for use where we want to pass literals:
StrPosAndLenByVal PROCEDURE (string pText,string pRegex,*long pMinLen,*long pMaxLen)
  CODE
  return StrPosAndLen(pText,pRegex, pMinLen, pMaxLen)
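To show how the return value and pMaxLen fit together, here is a minimal sketch that pulls out the “leftmost longest” match. The demo procedure name, the variables and the sample data are hypothetical, and it assumes StrPosAndLenByVal is prototyped to return a LONG like the other wrappers.

```clarion
! Hypothetical demo procedure; names and sample data are assumptions
DemoStrPosAndLen PROCEDURE
MatchPos  long
MinLen    long
MaxLen    long
  CODE
  ! literals are passed, so the by-value wrapper is used
  MatchPos = StrPosAndLenByVal('order 12345 shipped', '[0-9]+', MinLen, MaxLen)
  if MatchPos
    ! the longest match starts at MatchPos and is MaxLen characters long
    message('Longest match: ' & sub('order 12345 shipped', MatchPos, MaxLen))
  end
```

A zero return value means no match, in which case MinLen and MaxLen are both left at 0.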