StrPos() String Clipping - How to Match a Trailing Space?

Following on from Carl's post, I have found that appending a null ('<0>') to the end of a string stops StrPos() from clipping its trailing spaces, and that this works for both the data and the regex.

You do have to be careful with nulls: StrPos() treats a null as the end of the string, so a null is no good anywhere other than at the very end.

I have written a wrapper for StrPos() called StrPosUnclipped:

StrPosUnclipped      PROCEDURE  (*string pText,*string pRegex) !,LONG

  CODE
  return strpos(choose(size(pText) >0 and pText [size(pText)] =' ',pText &'<0>',pText), |
                choose(size(pRegex)>0 and pRegex[size(pRegex)]=' ',pRegex&'<0>',pRegex) )

Each argument is handled independently. If it does not end in a space it is passed to StrPos() unchanged; if it ends in one or more spaces, a null ('<0>') is appended before it is passed to StrPos().
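As a rough illustration of how the wrapper might be called, here is a minimal sketch. The variable names (txt, rx, pos) and the sample values are made up for this example, and I have left out expected results because they depend on how your Clarion version handles the regex:

txt   string('alpha beta ')      ! hypothetical sample data, ends in a space
rx    string('beta ')            ! hypothetical regex, ends in a space
pos   long

  CODE
  ! Both strings end in a space, so the wrapper appends '<0>' to each
  ! before calling StrPos(), which keeps the trailing spaces significant.
  pos = StrPosUnclipped(txt, rx)

  ! For comparison, a plain call where StrPos() clips the trailing spaces
  pos = strpos(txt, rx)

The arguments have to be variables here, because StrPosUnclipped takes both parameters by reference.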

When testing, you often want to pass by value so that you can use literals. As I am working in the IDE I cannot readily use overloading, so I have written StrPosUnclippedByVal, which simply accepts value parameters for the text and regex and calls the by-reference version. It copies both strings on every call, so use it only where you need literals rather than fields.

StrPosUnclippedByVal PROCEDURE  (string pText,string pRegex) !,LONG
  CODE
  return strposUnclipped(pText,pRegex)

So now, with this new StrPosUnclipped function, we can implement StrPosAndLen without jumping through hoops doing character substitution:

StrPosAndLen         PROCEDURE  (*string pText,*string pRegex,*long pMinLen,*long pMaxLen)
! returns position and minimum and maximum length of string matched against regex
x     long,auto
max   long,auto
stPos long              ! start position (return value)
dollarEnded string(size(pRegex)+1)                          
regex &string 

  CODE
  pMinLen = 0; pMaxLen = 0            ! initialise/clear
  if size(pText) = 0 or size(pRegex) = 0 then return 0.

  stPos = strPosUnclipped(pText, pRegex)           ! get start position
  if ~stPos or stPos > size(pText) then return 0.  ! no match
  if stPos = size(pText)        ! single char match
    pMinLen = 1
    pMaxLen = 1
    return stPos
  end
 
  ! anchor the regex with an unescaped '$' so each candidate substring must match right to its end
  if pRegex[size(pRegex)] = '$' and sub(pRegex,size(pRegex)-1,1) <> '\' 
    regex &= pRegex
  else
    dollarEnded = pRegex & '$'
    regex &= dollarEnded
  end

  max = size(pText) - stPos     ! max increment size
  ! grow the candidate substring one character at a time, recording the shortest and longest matches
  loop x = 0 to max 
    if strPosUnclipped(pText[stPos : stPos+x],regex)  
       pMaxLen = x + 1
       if pMinLen = 0 then pMinLen = pMaxLen. 
    elsif pMaxLen
       break
    end
  end
  return stPos  ! return starting position

Note the minimum and maximum lengths are in the passed parameters (pMinLen and pMaxLen) while the start position is the returned value (same as with strPos).
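A minimal sketch of using the returned position together with pMaxLen to pull out the longest match. The names (txt, rx, pos, minLen, maxLen, found) and the sample values are invented for illustration and are not part of the procedures above:

txt     string('id: 12345 ok')   ! hypothetical sample data
rx      string('[0-9]+')         ! hypothetical regex: one or more digits
pos     long
minLen  long
maxLen  long
found   string(20)

  CODE
  pos = StrPosAndLen(txt, rx, minLen, maxLen)
  if pos
    found = txt[pos : pos + maxLen - 1]    ! slice out the longest match starting at pos
  end

Using minLen in place of maxLen in the slice would give the shortest match from the same starting position.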

pMaxLen is handy because you often want the "leftmost longest" match. This code using StrPosUnclipped() is much better (and simpler) than the earlier versions that tried to use a substitute character for spaces. So kudos to Carl for suggesting adding a null on the end of a string.

Final comment: pText and pRegex are passed here by reference (for speed). As with StrPosUnclippedByVal(), we can add a by-value wrapper for easy testing, or for use where we want to pass literals:

StrPosAndLenByVal    PROCEDURE  (string pText,string pRegex,*long pMinLen,*long pMaxLen)
  CODE
  return StrPosAndLen(pText,pRegex, pMinLen, pMaxLen)
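For quick testing with literals, a call might look something like the sketch below. The variable names and the sample text and regex are only assumptions for the example; MESSAGE() is used just to show the three values:

pos     long
minLen  long
maxLen  long

  CODE
  ! literals are fine here because the text and regex are passed by value
  pos = StrPosAndLenByVal('one  two', ' +', minLen, maxLen)
  message('pos=' & pos & ' min=' & minLen & ' max=' & maxLen)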