StringTheory RegEx help finding Date in Text

I’ve got some garbage data created by carbon life forms which is not consistent from day to day. Sometimes they write additional narrative in a CSV column which is supposed to contain a date.

I’m not at all conversant with regex.

I tried using StringTheory findMatch with either of these as the regex string

'\d{{1,2}\/\d{{1,2}\/\d{{2,4}'

'\b(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)\s+\d{{1,2}(?:st|nd|rd|th)?(?:,\s*)?\s+\d{{4}\b'

and neither extracts the date submitted as either Sep 10, 2025 or 9/10/2025.

In stringTheory.clw I can see that it’s calling strPos with the search string being the pRegEx parameter.

A simple text regex gr[ae]y to find “grey” or “gray” works. But I haven’t been able to get one of the date formats to work.

Claude kindly wrote me a parsing procedure in straight Clarion code that works well, but I’m curious as to what specific regex would have worked using the StringTheory method.

Hi Jane

there are lots of different versions of regex and Clarion’s implementation (used in both StrPos and Match) is a bit idiosyncratic and limited compared to many others.

so if you look in the Clarion help for strPos, or at

you will see what the valid options are - and you can see your syntax is not supported.

a while ago there was a discussion about quantifiers like your {1,2} and I wrote a function to expand them to valid Clarion syntax. see:

but in your case you might just write [0-9][0-9]? to indicate one or two digits.

so your first might be something like

'[0-9][0-9]?/[0-9][0-9]?/[0-9][0-9][0-9]?[0-9]?'

or

ExpandRegex('[0-9]{1:2}/[0-9]{1:2}/[0-9]{2:4}')
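For illustration only, here is a minimal sketch of the first pattern used with plain STRPOS. The sample text and the variable names (InputText, StartPos, EndPos) are made up for the example, and it assumes the pattern above is accepted as written. STRPOS only reports where a match starts, so the sketch walks forward over digits and slashes to find where the date ends; that walk is a simplification, not part of the regex.

```clarion
  PROGRAM
  MAP
  END

! Hypothetical names and sample data, for illustration only
InputText  STRING(255)
StartPos   LONG
EndPos     LONG

  CODE
  InputText = 'date changed to 9/22/2025 because of relapse'

  ! No \d and no interval quantifiers here, so the digits are spelled out
  StartPos = STRPOS(InputText, '[0-9][0-9]?/[0-9][0-9]?/[0-9][0-9][0-9]?[0-9]?')
  IF StartPos
    ! STRPOS gives the start only: extend while the next char is a digit or a slash
    EndPos = StartPos
    LOOP
      IF EndPos >= LEN(CLIP(InputText)) THEN BREAK.
      IF ~INSTRING(InputText[EndPos + 1], '0123456789/', 1, 1) THEN BREAK.
      EndPos += 1
    END
    MESSAGE('Found: ' & InputText[StartPos : EndPos])
  ELSE
    MESSAGE('No date found')
  END
```

The extracted slice is still just text, so it would need validating (or deformatting) before being trusted as a date.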

I am not sure about your second regex with the month names - maybe you need something like

January|Jan|February|Feb|… and so on

Others may know Clarion’s regex better than me but TBH it might be easier to just write straight code - and in that vein I would be interested to see what Claude wrote for you.

cheers

Geoff R

just out of interest I asked Claude to write code for your second regex with the month names and it was certainly quick to write it - I would have to study the code to see if it actually would work or not. Claude said:

I’ve created Clarion code that matches your date regex pattern. The code includes two procedures:

  1. MatchDatePattern - Returns TRUE if the text contains a date matching the pattern
  2. ExtractDateFromText - Extracts and returns the first matching date string from the text

The code matches:

  • Month names (full or abbreviated): January/Jan, February/Feb, etc.
  • Day numbers (1-31) with 1-2 digits
  • Optional ordinal suffixes (st, nd, rd, th)
  • Optional comma after the day
  • 4-digit year

Example matches:

  • “January 15, 2024”
  • “Jan 1st 2024”
  • “December 31st, 2023”
  • “Mar 3rd 2024”

The code manually parses the text since Clarion doesn’t have built-in regex support. It scans through the input string character by character, looking for the date pattern components in sequence.

I thought the line “since Clarion doesn’t have built-in regex support” was a bit harsh - perhaps it could say Clarion has limited built-in regex support.

anyway if you are interested to see Claude’s suggested code - it is published at:

https://claude.ai/public/artifacts/01d790d6-b750-48e2-a14a-65f5a7c0e7c4

is that the same as what it did for you Jane?

I wonder how wrapping something like kokke/tiny-regex-c (a small portable regex library in C, on GitHub) would perform vs Clarion’s MATCH().

Thanks for the replies, guys.

This is a link to my chat going back and forth trying to get working code:
https://claude.ai/share/fbd585f2-3dbf-4db9-9aa6-bd8ca184ceb5

I was compiling each new iteration of the generated code and complaining about what didn’t work.

For those who don’t have an account, I tried saving the chat as a PDF. Not well formatted, but readable. Final piece was sending “him” a few lines from Clarion help explaining the parameter order of INSTRING.
“Perseverance furthers!”

I’ll attach that chat PDF along with the final version of the code.

dateRegex.pdf (106.5 KB)
regexCode.txt (7.7 KB)

With thanks to Bruce and to John Hickey for pushing me ankle-deep into the pond :wink:

You’ll hate me for saying this Jane, but I believe you’re falling into an X Y Problem trap here.
The problem is that the format of the date is not necessarily consistent with what you were hoping for.
Your solution is to detect the issue, and presumably complain?
But, a better approach is to simply accept the date format as it is, and deal with it as it is.

You’ll be aware that the web server already does this (using StringTheory). You can enter a date in any format, and it corrects it to be in the desired format. Most date entries are unambiguous, and for the small overlap that are ambiguous (dd/mm vs mm/dd where d <= 12) a suitable default can be used (or deduced from other dates).

See https://www.capesoft.com/docs/StringTheory3/StringTheory.htm#sdfDeformatDate , DeformatDate method. This deformats the date (regardless of format) falling back on the picture only if the date text is ambiguous.

In other words, your X problem is “import the date”. This method lets you do that.
The Y problem you went with was “detect the wrong date format” - which led you to RegEx.
(In my opinion, most problems that lead to a REGEX solution are Y problems. Regex is useful for reporting an error - but the error doesn’t need to be reported, it needs to be fixed.)

I can provide a contracting service for this regex problem if interested.

PM if interested, with your requirements, sample data and deadline.

Thanks, Richard, but my current solution is “good enough”.

NO, Obi Wan! I could never hate you!

But I don’t have an XY problem. Some of my best friends have XY chromosomes. Really! Hairy and prone to foibles though they may be… :smiling_face_with_sunglasses:

Actually, I do have an X Y problem here.

But I don’t think mine is the X Y problem that you think it is.

I’m not having problems deformatting different date formats. I can actually throw most valid date formats as strings to Mr. SQL and have him happily CAST them to DATE types. It isn’t “import the date”.

My problem is that I have an internal NetTalk app for managing follow-up of our patients after they’ve been seen in hospital or in an emergency room visit.

Some of the external data comes from rock-solid hospital systems, produced by an IT department that’s been solid for years, and is automatically ingested into my app’s data tables by SQL Server Integration Services.

Some of the data must be manually uploaded from spreadsheets produced by nurses (not ours), which from day to day may have unexpected changes. (And, not relevant to this issue but emblematic, the sheets are Excel with multiple tabs, not CSV; because trying to make my life simpler is not their primary concern.)

Although I’ve had our nurses tell their nurses that columns need to hold the data they’re supposed to hold, and that other stuff should go into a Comments field, and that the format needs to be consistent from day to day, they’re nurses. They really want to communicate.

So when our nurses try to upload an Excel spreadsheet to the NetTalk app and the dischargeDate field contains a date (in whatever format), all is well with the world.

But when the field contains “date changed to 9/22/2025 because of relapse” my normal parsing fails.

So yes, an X Y problem.

My X problem is that I need to require data consistency from non-computer people who don’t work for our organization and who are trying to be helpful. And each time I complain about one data issue they fix that and then create a new one.

So instead of continuing to fight the good fight and clean up the source, my Y problem amounts to trying to do a bit of defensive programming. Not to detect the wrong format and complain (I’m plenty good at that), but to try to deal with what we’re given.

So the code I asked for was to determine whether there’s a date within a string; and if so, to extract the date.

Because I’m a lover, not a fighter… :roll_eyes:


Hi Jane

I can see you have coaxed a solution from Claude but really the code it has generated is pretty woeful.

I was really impressed with your perseverance telling it/him about the instring parameters and was also impressed with Claude’s debugging skills, adding in debugging information and getting you to run it and report back. Still the end result leaves a lot to be desired. I have heard John H. say he doesn’t bother looking at the generated code - as long as it works - and this does seem to be the direction with vibe programming but I do shudder thinking of all the (shall we say) “less than ideal” code this will result in.

If we start at the first line of code in ExtractDate we see

TempText = CLIP(pText)

where tempText is String(255). So that clip does nothing apart from slow things down. And why bother with the assign at all - all it does is restrict the input to 255 chars - what if the date is after that? I have railed for years against people using unnecessary clips and this code is just filled with them:

  MonthStr = CLIP(SUB(pDateStr, 1, Slash1 - 1))
  DayStr = CLIP(SUB(pDateStr, Slash1 + 1, Slash2 - Slash1 - 1))
  YearStr = CLIP(SUB(pDateStr, Slash2 + 1, LEN(CLIP(pDateStr)) - Slash2))

those three fields are all strings so again clip doesn’t achieve anything. Then take this example where RESULT is string(50):

      Result = CLIP(Tokens[i]) & ' ' & CLIP(NextToken) & ', ' & CLIP(YearToken)
      RETURN Result

again a wasted clip - in order to return the clipped result it would be better to

return CLIP(Tokens[i]) & ' ' & CLIP(NextToken) & ', ' & CLIP(YearToken)
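To illustrate why CLIP is wasted when the result is assigned to a fixed-length STRING, here is a minimal sketch (the variable name Fixed is made up for the example). The assignment pads the value back out to the declared size, so the trailing spaces that CLIP removed are put straight back.

```clarion
  PROGRAM
  MAP
  END

! Hypothetical variable, for illustration only
Fixed  STRING(20)

  CODE
  Fixed = CLIP('abc   ')   ! CLIP trims to 'abc', then the assignment pads to 20 chars again

  ! LEN of a STRING variable is its declared size; CLIP is only useful at the point of use
  MESSAGE('LEN(Fixed)=' & LEN(Fixed) & '   LEN(CLIP(Fixed))=' & LEN(CLIP(Fixed)))
```

Here LEN(Fixed) is 20 and LEN(CLIP(Fixed)) is 3, which is why clipping only helps where the value is actually consumed, such as in a concatenation or a RETURN.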

I’m sure AI will get better and will one day write reasonable Clarion code but it seems to be nowhere near it yet, at least in my experience.

My experience thus far is that it would be easier to write the code from scratch than waste my life correcting infuriating code! YMMV

Anyway seeing the original question mentioned StringTheory I thought I would knock up some code for comparison using ST. To me it is much cleaner and more efficient. If you have time could you please test this and confirm it works as desired? The only real differences are that I chose to return the Clarion date (as a long), or zero where no valid date is found, and was more flexible about the separators for the component parts of a date.

Anyway see how you go

cheers

Geoff R

ExtractDate          PROCEDURE  (STRING pText)

! Extracts dates in formats:
!   mm/dd/yyyy  mm-dd-yyyy
!   Mar 5, 2024 or Mar 5 2024 or Mar 5th 2024
!   March 21 2025 or March 21, 2025 or March 21st 2025
!
! Returns Clarion date
!
! (c) 13th October 2025, Geoffrey Robinson and released under MIT license
!
st    stringTheory
x     long,auto
month long
day   long
year  long
  CODE
  st.setValue(pText, st:clip)
  st.splitIntoWords(,,,false) ! false=don't use smart words
  loop x = 1 to st.records()
    if ~month
      month = st.getLine(x)
      if month >= 1 and month <= 12 
        ! valid numeric month
      else
        case lower(st.getLine(x))                 ! check month names
        of 'jan' orof 'january'   ; month = 1
        of 'feb' orof 'february'  ; month = 2
        of 'mar' orof 'march'     ; month = 3
        of 'apr' orof 'april'     ; month = 4
        of 'may'                  ; month = 5
        of 'jun' orof 'june'      ; month = 6
        of 'jul' orof 'july'      ; month = 7
        of 'aug' orof 'august'    ; month = 8
        of 'sep' orof 'september' ; month = 9
        of 'oct' orof 'october'   ; month = 10
        of 'nov' orof 'november'  ; month = 11
        of 'dec' orof 'december'  ; month = 12
        else                      ; month = 0     ! invalid month name
        end
      end
    elsif ~day
      st.setValueFromLine(x)
      if st.isAllDigits()
        day = st.getValue()
      else
        st.lower()
        if st.endsWith('st') or st.endsWith('nd') or st.endsWith('rd') or st.endsWith('th')
          st.adjustLength(-2)
          day = st.getValue()
        end
      end   
      if day < 1 or day > 31  ! invalid day number
        day = 0
        month = 0
        x -= 1  ! go back a token
      end
    elsif ~year
      year = st.getLine(x)
      if year < 1900 or year > 2100  ! some year validation
        year = 0
        month = 0
        day = 0
        x -= 2  ! go back two tokens
      else
        return date(month,day,year)  ! date found
      end
    end
  end

  return 0 ! no valid date found

#Edit 1: tweaked code with invalid year to go back two tokens
#Edit 2: ditto for invalid day - go back one token

Thanks, Geoff. I’ll look at your version later today.

A few items.

  1. My original brief flirtation with using StringTheory was to try to use the built-in regex matching documented for the FindMatch method. With my limited (er, virtually nil) knowledge of regex I wasn’t able to make that work for my date formats. Is that a viable approach?

  2. The arbitrary 255 character limit of TempText doesn’t bother me in this context. If someone is really going to write more than that into what is supposed to be a pure date field, they deserve the suffering that will ensue!

  3. Sad to say, I think that sometimes the pursuit of la vitesse cannot be justified. I spent some years writing assembler, including projects using 68705 microcontrollers that had 112 bytes of RAM (including the stack) and 3K of EEPROM, where saving a byte by using a branch rather than a jump instruction brought shivers of joy.
    I ran the testing code that Claude supplied (calling the ExtractDate procedure once with each of three valid date formats and once with no valid date) wrapped in a 5000X loop. The 5000 iterations of the four test calls ran in less than a half second.
    Considering that the particular spreadsheets where I need this bullet-proofing will have 10 to 100 patient records, and that this is less a labor-of-love than just one more in a string of projects that I need to get working, additional time spent optimizing for speed and/or elegance is difficult to justify on my timesheet.

  4. There are many things in life that I could find infuriating. But it’s too much bother :wink:

But I’ll play with your code. Thanks for taking the time! And please let me know about regex with FindMatch, if you have insight.

jf

So I ran your code and Claude’s code in this loop, Geoff

    _start   = clock()
    loop _x = 1 to 5000
      InputText = 'The meeting is scheduled for 03/15/2024 at 2pm'
      ExtractedDate = ExtractDate(InputText)
      _firstResult = clock()
      InputText = 'Event date is Mar 5, 2024 or thereabouts'
      ExtractedDate = ExtractDate(InputText)
      _secondResult = clock()
    
      InputText = 'The deadline is March 21 2025 at midnight'
      ExtractedDate = ExtractDate(InputText)   
      _thirdResult = clock()
    
      InputText = 'No date in this string at all'
      ExtractedDate = ExtractDate(InputText)    
    end ! loop
    
    
    _end = clock()
    
    message('first=' & _firstResult - _start & '  second='&_secondResult - _firstResult &'   third=' &_thirdResult-_secondResult & '   end='&_end - _thirdResult & '   _start='&_start&'   _end='&_end & '   ticks='&_end-_start)
      return

Claude’s times in clock ticks for five runs of the loop were
56,66,60,62,71

Your code’s were
24,23,23,24,25,24

BTW, Geoff, he/she/it was happy to remove the extraneous TempText variable.


thanks for the tests Jane - did both ways give the same results?

I was thinking about the code in my sleep last night and realised there is a much simpler way to do my code without needing to backtrack on invalid year or day: Simply scan for the year first then check the two preceding tokens:

ExtractDate          PROCEDURE  (*STRING pText)

! Extracts dates in formats:
!   mm/dd/yyyy  mm-dd-yyyy
!   Mar 5, 2024 or Mar 5 2024 or Mar 5th 2024
!   March 21 2025 or March 21, 2025 or March 21st 2025
!
! Returns Clarion date
!
! (c) 13-14 October 2025, Geoffrey Robinson and released under MIT license
!
st    stringTheory,static,thread
x     long,auto
month long
day   long
year  long
  CODE
  st.setValue(pText, st:clip)
  st.splitIntoWords(,,,false) ! false=don't use smart words
  loop x = 3 to st.records() 
    year = st.getLine(x)
    if year < 1900 or year > 2100 then cycle. ! invalid year

    day = st.getLine(x-1)
    if ~day
      st.setValueFromLine(x-1)
      st.lower()
      if st.endsWith('st') or st.endsWith('nd') or st.endsWith('rd') or st.endsWith('th')
        st.adjustLength(-2)
        day = st.getValue()
      end
    end
    if day < 1 or day > 31 then cycle. ! invalid day number

    month = st.getLine(x-2)
    if ~month 
      case lower(st.getLine(x-2))           ! check month names
      of 'jan' orof 'january'   ; month = 1
      of 'feb' orof 'february'  ; month = 2
      of 'mar' orof 'march'     ; month = 3
      of 'apr' orof 'april'     ; month = 4
      of 'may'                  ; month = 5
      of 'jun' orof 'june'      ; month = 6
      of 'jul' orof 'july'      ; month = 7
      of 'aug' orof 'august'    ; month = 8
      of 'sep' orof 'september' ; month = 9
      of 'oct' orof 'october'   ; month = 10
      of 'nov' orof 'november'  ; month = 11
      of 'dec' orof 'december'  ; month = 12 
      end
    end
    if month >= 1 and month <= 12
      return date(month,day,year)  ! date found
    end
  end

  return 0 ! no valid date found

As you can see that code is way simpler and more elegant. And yes, quicker. I realise you are not worried about times as they are already “good enough” but could you please rerun those tests with this new version and probably remove (comment out) the three clock() assigns in the loop in case they may be distorting things if they are relatively expensive. IOW just have the _start and _end either side of the loop.

Regarding your “brief flirtation with using StringTheory” I would encourage you to instead use ST all the time as a default. The more you use it the more you will be a convert. You can see above how it can drastically simplify code to get a lot done in just a few lines of code. In this case by splitting the tokens into a queue we are able to address each one directly. Anyway the less code you have to write the less you have to maintain and the fewer bugs you are likely to have… just sayin’ :grinning_face:

“And please let me know about regex with FindMatch, if you have insight.”

Ah yes that was my first message back at the top of the thread - in essence Clarion’s regex (as used in st.FindMatch) is somewhat limited but I gave a couple of suggestions. Also that earlier link I gave had Rick M. mention PCRE (Perl Compatible Regular Expressions) that he uses - that is way more sophisticated than Clarion’s Match or StrPos.

You could go down the regex path, but often it tends to lead to complex almost indecipherable regex’s. Mind you, AI seems quite good at constructing and explaining them so maybe it is not as bad as it used to be - but I always remember Jamie Zawinski’s famous quote:

Some people, when confronted with a problem, think “I know, I’ll use regular expressions.” Now they have two problems.

anyway cheers for now

Geoff R
#Edit 1 - I noticed you were passing the string via a variable so I just changed the prototype to pass by reference ie. *STRING rather than STRING
#Edit 2 - tweaked month code a little
#Edit 3 - added “static,thread” to st declaration

Geoff, I use String Theory all the time in every app I write.

My “brief flirtation” was with trying to make the FindMatch method work with my limited regex skills.


sorry, preaching to the converted!


Thanks for sharing that, Jane … I’ve been having similar “conversations” with CoPilot (in Deep Think mode), trying to get a PowerShell script working in Task Scheduler.

It’s an intriguing way of working - I’m left with a mix of awe and frustration; the former a result of being impressed with how well it can interpret my needs - and then frustration at the circles it leads one into!

John M


Well said!
Here’s the current chat leading up to version 29 of “his” code.
https://claude.ai/share/fbd585f2-3dbf-4db9-9aa6-bd8ca184ceb5

The LIBXL library I’m using outputs dates in an iso format, which I didn’t take into account in my first run at the code.
So I added that as a request.

Then “he” started adding/fixing things and breaking working things.

Then fixed the things “he” broke and broke something else.

And on and on.

I’m a cat person. (Although I’m not personally owned by one.)
This feels like arguing with a puppy that just wants to be cuddly and helpful and obsequious.

For those not having an account (or if the link above isn’t current), below are screenshots.


So I asked ChatGPT what was leading Claude to go down the rabbit hole.
Its response makes sense to me, and it aligns partly with how I use AI myself. I tend to work in VS Code with GitHub Copilot when dealing with Clarion code — that way the context is always available to the model, and it avoids losing track of the current file state. Of course, that comes with a monthly subscription, but the benefit is that it keeps edits consistent and avoids exactly the sort of back-and-forth drift you see in Claude’s session below.

:backhand_index_pointing_down:


:puzzle_piece: Why this happens

What you’re seeing in the screenshots is not unusual — it’s a symptom of Claude (and other LLMs) losing code context and regenerating inconsistently.
Here’s what typically goes wrong:

  1. No persistent memory of the current code.
    Each time you say “fix these two things,” the model tries to reconstruct what it thinks the current version looks like from the chat history, but it doesn’t actually have the source file open or synchronized.
    So when it “fixes” one part, it can unintentionally:
      • Reintroduce an old variable (TempText)
      • Duplicate a procedure (GetMonthNumber)
      • Or revert logic you had just corrected.
  2. Ambiguous incremental editing.
    Instructions like “remove the duplicate” or “you didn’t fix it” depend on the model correctly remembering what’s in the latest version.
    Once that mental state drifts (and it often does after 5–6 turns), it applies the wrong patch.
  3. Overwriting by summarization.
    Claude often paraphrases or “summarizes” what the code should look like instead of diffing against your actual version, leading to fragments disappearing or duplicated blocks reappearing.

:light_bulb: Recommendations to avoid this “rabbit hole”

  1. Always paste the full current version.
    Each time you ask for a fix, start with:

“Here is the current version of my code. Please apply your changes directly to this version.”
That gives the model an explicit baseline to edit.

  2. Be precise with your change request.
    Instead of “fix both issues,” say:

“Remove any reference to TempText and make sure ExtractDate calls both ExtractNumericDate and ExtractTextDate in sequence.”
Avoid conversational shorthand like “fix it again” or “you didn’t fix it.”

  3. Ask for full rewrites periodically.
    After several incremental fixes, say:

“Please rewrite the entire file cleanly with all current corrections integrated.”
This resets any context drift and produces a coherent final version.

  4. Use line-anchored diffs when possible.
    You can say:

“In the function ExtractDate (around line 42), remove the TempText assignment.”
This helps the model apply a targeted change rather than guessing.

  5. Do not rely on it remembering previous code.
    Treat every edit as stateless: assume it forgot the exact code and you must re-supply it.
  6. Summarize the change outcome you expect.
    End your request with something like:

“After your change, there should be no references to TempText, and ExtractDate should call both extraction methods in that order.”
That gives the model a testable goal.

  7. Use smaller, atomic changes.
    Make one modification per turn if possible. Combining multiple changes (“remove duplicates, change parameters, and add ISO parsing”) increases the chance of regression.

Geoff,

I tried your new version.
Clock ticks were 31,31,31,30,30
Clock ticks for original version were 22,23,22,24,23

I’ll paste what I’m compiling

st                      StringTheory
                    map
ExtractDate           PROCEDURE  (*STRING pText),LONG
ExtractDateV1           PROCEDURE  (STRING pText),LONG
                    end
InputText               STRING(255)
ExtractedDate           STRING(50)
_start                  TIME
_end                    TIME
_x                      LONG


  CODE
! Start of "Add additional DebugHook statements"
! [Priority 5000]

! End of "Add additional DebugHook statements"
! Start of "Processed Code"
! [Priority 5000]

    _start   = clock()
    loop _x = 1 to 5000
      InputText = 'The meeting is scheduled for 03/15/2024 at 2pm'
      ExtractedDate = ExtractDate(InputText)
      InputText = 'Event date is Mar 5, 2024 or thereabouts'
      ExtractedDate = ExtractDate(InputText)
    
      InputText = 'The deadline is March 21 2025 at midnight'
      ExtractedDate = ExtractDate(InputText)      
      InputText = 'No date in this string at all'
      ExtractedDate = ExtractDate(InputText)    
    end ! loop
    _end = clock()
    
    message('New version  _start='&_start&'   _end='&_end & '   ticks='&_end-_start)
    
    
    _start   = clock()
    _end = ''
    loop _x = 1 to 5000
      InputText = 'The meeting is scheduled for 03/15/2024 at 2pm'
      ExtractedDate = ExtractDateV1(InputText)
      InputText = 'Event date is Mar 5, 2024 or thereabouts'
      ExtractedDate = ExtractDateV1(InputText)
    
      InputText = 'The deadline is March 21 2025 at midnight'
      ExtractedDate = ExtractDateV1(InputText)      
      InputText = 'No date in this string at all'
      ExtractedDate = ExtractDateV1(InputText)    
    end ! loop
    _end = clock()
    
    message('Old version  _start='&_start&'   _end='&_end & '   ticks='&_end-_start)    
    
    return

ExtractDate         PROCEDURE  (*STRING pText)

! Extracts dates in formats:
!   mm/dd/yyyy  mm-dd-yyyy
!   Mar 5, 2024 or Mar 5 2024 or Mar 5th 2024
!   March 21 2025 or March 21, 2025 or March 21st 2025
!
! Returns Clarion date
!
! (c) 13-14 October 2025, Geoffrey Robinson and released under MIT license
!
st                    stringTheory
x                     long,auto
month                 long
day                   long
year                  long
  CODE
    st.setValue(pText, st:clip)
    st.splitIntoWords(,,,false) ! false=don't use smart words
    loop x = 3 to st.records() 
      year = st.getLine(x)
      if year < 1900 or year > 2100 then cycle. ! invalid year

      day = st.getLine(x-1)
      if ~day
        st.setValueFromLine(x-1)
        st.lower()
        if st.endsWith('st') or st.endsWith('nd') or st.endsWith('rd') or st.endsWith('th')
          st.adjustLength(-2)
          day = st.getValue()
        end
      end
      if day < 1 or day > 31 then cycle. ! invalid day number

      month = st.getLine(x-2)
      if ~month 
        case lower(st.getLine(x-2))           ! check month names
          of 'jan' orof 'january'   ; month = 1
          of 'feb' orof 'february'  ; month = 2
          of 'mar' orof 'march'     ; month = 3
          of 'apr' orof 'april'     ; month = 4
          of 'may'                  ; month = 5
          of 'jun' orof 'june'      ; month = 6
          of 'jul' orof 'july'      ; month = 7
          of 'aug' orof 'august'    ; month = 8
          of 'sep' orof 'september' ; month = 9
          of 'oct' orof 'october'   ; month = 10
          of 'nov' orof 'november'  ; month = 11
          of 'dec' orof 'december'  ; month = 12 
        end
      end
      if month >= 1 and month <= 12
        return date(month,day,year)  ! date found
      end
    end

    return 0 ! no valid date found




ExtractDateV1         PROCEDURE  (STRING pText)

! Extracts dates in formats:
!   mm/dd/yyyy  mm-dd-yyyy
!   Mar 5, 2024 or Mar 5 2024 or Mar 5th 2024
!   March 21 2025 or March 21, 2025 or March 21st 2025
!
! Returns Clarion date
!
! (c) 13th October 2025, Geoffrey Robinson and released under MIT license
!
!st                    stringTheory
x                     long,auto
month                 long
day                   long
year                  long
  CODE
    st.start()
    st.setValue(pText, st:clip)
    st.splitIntoWords(,,,false) ! false=don't use smart words
    loop x = 1 to st.records()
      if ~month
        month = st.getLine(x)
        if month >= 1 and month <= 12 
          ! valid numeric month
        else
          case lower(st.getLine(x))                 ! check month names
            of 'jan' orof 'january'   ; month = 1
            of 'feb' orof 'february'  ; month = 2
            of 'mar' orof 'march'     ; month = 3
            of 'apr' orof 'april'     ; month = 4
            of 'may'                  ; month = 5
            of 'jun' orof 'june'      ; month = 6
            of 'jul' orof 'july'      ; month = 7
            of 'aug' orof 'august'    ; month = 8
            of 'sep' orof 'september' ; month = 9
            of 'oct' orof 'october'   ; month = 10
            of 'nov' orof 'november'  ; month = 11
            of 'dec' orof 'december'  ; month = 12
            else                      ; month = 0     ! invalid month name
          end
        end
      elsif ~day
        st.setValueFromLine(x)
        if st.isAllDigits()
          day = st.getValue()
        else
          st.lower()
          if st.endsWith('st') or st.endsWith('nd') or st.endsWith('rd') or st.endsWith('th')
            st.adjustLength(-2)
            day = st.getValue()
          end
        end   
        if day < 1 or day > 31  ! invalid day number
          day = 0
          month = 0
          x -= 1  ! go back a token
        end
      elsif ~year
        year = st.getLine(x)
        if year < 1900 or year > 2100  ! some year validation
          year = 0
          month = 0
          day = 0
          x -= 2  ! go back two tokens
        else
          return date(month,day,year)  ! date found
        end
      end
    end

    return 0 ! no valid date found