Another lexicographically sortable unique id generator

Using randomly generated unique ids has a lot of advantages over table based autonumbered fields.

The most common option is to use UUIDs/GUIDs, which are 128 bits long but commonly stored as a 36 bytes string.

Another option available for Clarion programmers is to use StringTheory guids, which are 16 bytes strings with uppercase letters and numbers.

I find two disadvantages of random ids:

  • There is no way to know in what order the records were added, which may be useful in many situations
  • They cause index fragmentation in SQL data bases

A solution to this exists for UUIDs/GUIDs with UUIDv7, there is a Clarion implementation in this site. But I prefer StringTheory guids, as they are more compact and easier to use in manual sql queries.

I had the idea to use Clarion standard date and time to create a big number that expreses the current date and time in hundreths for seconds since december 28 1800, that is, a Clarion standard date/time (similar to Unix time), encode it using base 36 to get the first 8 bytes of the id, and then fill the rest of the id with random base 36 digits.

I uploaded an implementation and sample project to github (tested in C11). To add it to your application, copy MakeSGuid.inc and MakeSGuid.clw to your app’s folder or to accessory\libsrc\win, and add INCLUDE('MakeSGuid.inc'),ONCE inside the global map.

Some comments about the code:

!!! <summary>
!!! Creates a lexicographically sortable unique id with uppercase letters and digits. Example: 90DKZP0B6WYLIFYZ
!!! </summary>
!!! <param name="pLength">The length of the id. Must be between 8 and 32, default is 16</param>
!!! <param name="pDate">Optional, used only when creating an id for old data</param>
!!! <param name="pTime">Optional, used only when creating an id for old data</param>
MakeSGuid           PROCEDURE(LONG pLength = 16,LONG pDate = 0,LONG pTime = 0)!,STRING
systemTime            LIKE(_SYSTEMTIME),AUTO           !System date/time
hundredthsPerDay      GROUP;LONG(8640000);LONG;END     !24*60*60*100 (one day in hundredths of a second)
dayInt64              LIKE(INT64),OVER(hundredthsPerDay) !As int64
base36Constant        GROUP;LONG(36);LONG;END          !Base 36
base36Int64           LIKE(INT64),OVER(base36Constant) !As int64
timestampInt64        LIKE(INT64),AUTO                 !Date/time in hundredths of a second since Dec 28, 1800
timeInt64             LIKE(INT64),AUTO                 !Time value
modResult             LIKE(INT64),AUTO                 !Modulo result
result                STRING(32),AUTO                  !Generated ID
position              LONG,AUTO                        !String position
base36Digits          STRING('0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ') !Base 36 encoding table
timestampDigits       EQUATE(8)                        !Base 36 digits for date/time (valid until 2694)
randomBytes           STRING(24),AUTO                  !Crypto random bytes
randomByteArray       BYTE,DIM(SIZE(randomBytes)),OVER(randomBytes) !As byte array
  CODE
  IF pLength < timestampDigits THEN pLength = timestampDigits. !Minimum length check
  IF pLength > SIZE(result) THEN pLength = SIZE(result).!Maximum length check
  IF NOT pDate
    GetLocalTime(systemTime)                            !Get system time
    pDate = DATE(systemTime.wMonth,systemTime.wDay,systemTime.wYear)  !Convert to Clarion date
    pTime = systemTime.wHour * 360000 + systemTime.wMinute * 6000 + systemTime.wSecond * 100 + systemTime.wMilliseconds * .10 + 1 !Convert to hundredths of a second
  END
  i64Assign(timestampInt64,pDate)                       !timestampInt64 = pDate
  i64Mult(timestampInt64,dayInt64,timestampInt64)       !timestampInt64 *= day
  i64Assign(timeInt64,pTime) ; i64Add(timestampInt64,timeInt64,timestampInt64) !timestampInt64 += pTime
  LOOP position = timestampDigits TO 1 BY -1            !Convert to base 36 (reverse order)
    i64Mod(timestampInt64,base36Int64,modResult)        !modResult = timestampInt64 % 36
    result[position] = base36Digits[ modResult.lo + 1 ] !Get encoded digit
    i64Div(timestampInt64,base36Int64,timestampInt64)   !timestampInt64 /= 36
  END
  IF pLength > timestampDigits
    BCryptGenRandom(0,randomBytes,pLength - timestampDigits,BCRYPT_USE_SYSTEM_PREFERRED_RNG) !Get random bytes
    LOOP position = timestampDigits + 1 TO pLength      !Add random digits
      result[position] = base36Digits[ ( randomByteArray[ position - timestampDigits ] % 36 ) + 1 ] !Convert to base 36 (slight bias acceptable)
    END
  END
  RETURN result[1 : pLength]                            !Return clipped ID

In my laptop (modern i9) it creates one million guids in 1.1 seconds. The initial version used DECIMAL for the big number operations, changing to int64 increased the speed about 50 times. Adding AUTO to the local variables reduced the time in half.

Clarion 11 i64.inc doesn’t include the prototype for i64Mod, but I found it exists in ClaRUN.dll. Copying the prototype from i64Div worked. svapifnc.inc doesn’t include GetLocalTime, but the prototype for GetSystemTime also worked. Both protoypes are in the module’s MAP.

Initially I interrupted the first loop using IF i64Is0(dt64) and did some string operations to left pad the id with zeros. But it turns out that not breaking the loop does the same, with less code and probably faster.

For many applications, it may be enough to use a length of 12 (8 bytes for date/time and 4 bytes of random), but in my tests of creating one million queue records in a few seconds, there were some duplicates. Using the default length of 16 worked reliably.

Using Myid = MakeSGuid(8,Mydate,Mytime) is a way to encode date and time in a single sortable string field. Not very practical for databases, as it’s not easily readable, but may be useful for some in memory queue based algorithms.

I hope this is useful to somebody.

Carlos

Edit June 15, 2025: Fixed collision in multi-threaded use by using BCryptGenRandom to get random data.
Edit June 16, 2025: Corrected external functions prototypes (credit: Carl Barnes). Added AUTO to randomData.
Edit June 17, 2025: Added pseudo-constants, removed tmp64. AI-improved variable names and comments.

5 Likes

Hello
I am using your code together with MariaDB v11.8.2 (latest stable version) and MariaDB ODBC v3.2.6 (latest stable version) and when testing writing from 3 different threads (simultaneously), I sometimes get an error (see image below) - at least once in a few minutes.
Each process writes random (1 to 2000 records) records to the database each 3 seconds.. error occur after few minutes or 100000 records (approx.).
Do you have some idea?

What method are you using to write the records…Access:Filename.Insert() or a Prop:SQL statement?

If you’re not using the latter, give it a try and see if the problem goes away?

Thanks for the suggestion, but it’s not possible to change the entire code to use Prop:SQL. It’s a web project (no browsing or forms) and all data is entered using Filename.Insert.
I’ll go back to the CoCreateGUID API

Hi Daniel

I did some testing using the InMemory driver and yes, I got duplicate IDs even with a few threads. It looks like Clarion’s random generator is not that good. I suspect that in some cases it returns the same value to different threads.

To fix this, I changed the code to call BCryptGenRandom and with that change I was able to generate 1,000,000 IDs in 100 threads (10,000 IDs each) without collisions.

Please download the updated version from GitHub. You’ll also need to download bcryptrandom.lib and save it in accessory\lib or the app folder.

Thanks for reporting this. I’ll appreciate your feedback if you try the updated version

Regards,

Carlos

2 Likes

Hi Carlos,

We use table based autonumbering because that way we can make a difference between filling tables partly with our data and for the bigger other part with the date from the client. In example a range of or ie all negative numberd record id’s are and can only be changed by ower company.

Do you see a opportunity this could be achieved by your id generator?

Hello @CarlosGtrz , just tested and it works fine now.
thank you for the update.

@Dirk1, we found that using auto-numbering is great when your software is used by only one user. If it is used by 2 users simultaneously (or more), you should avoid it (we learned this the hard way).

Hi Dirk,

I’m not sure if I understood your question, but I don’t think encoding ownership of the record in the ID is a good idea.

I’d suggest using a compound key that clearly indicates who owns and can change a record, something like:

Table:
Id STRING(16) !SGUID
Owner LONG !0 for internal record, 1 for client record or maybe some client ID number

Primary key:
Id

Secondary key:
Owner,Id

If I didn’t understand your need, please feel free to elaborate, and maybe add some example values about how you are currently using autonumbered keys.

1 Like

Hi Carlos,

Thanks, it’s a good approach.