Another lexicographically sortable unique id generator

Using randomly generated unique ids has a lot of advantages over table based autonumbered fields.

The most common option is to use UUIDs/GUIDs, which are 128 bits long but commonly stored as a 36 bytes string.

Another option available for Clarion programmers is to use StringTheory guids, which are 16 bytes strings with uppercase letters and numbers.

I find two disadvantages of random ids:

  • There is no way to know in what order the records were added, which may be useful in many situations
  • They cause index fragmentation in SQL data bases

A solution to this exists for UUIDs/GUIDs with UUIDv7, there is a Clarion implementation in this site. But I prefer StringTheory guids, as they are more compact and easier to use in manual sql queries.

I had the idea to use Clarion standard date and time to create a big number that expreses the current date and time in hundreths for seconds since december 28 1800, that is, a Clarion standard date/time (similar to Unix time), encode it using base 36 to get the first 8 bytes of the id, and then fill the rest of the id with random base 36 digits.

I uploaded an implementation and sample project to github (tested in C11). To add it to your application, copy MakeSGuid.inc and MakeSGuid.clw to your app’s folder or to accessory\libsrc\win, and add INCLUDE('MakeSGuid.inc'),ONCE inside the global map.

Some comments about the code:

!!! <summary>
!!! Creates a lexicographically sortable unique id with uppercase letters and digits. Example: 90DKZP0B6WYLIFYZ
!!! </summary>
!!! <param name="pLength">The length of the id. Must be between 8 and 32, default is 16</param>
!!! <param name="pDate">Optional, used only when creating an id for old data</param>
!!! <param name="pTime">Optional, used only when creating an id for old data</param>
MakeSGuid           PROCEDURE(LONG pLength = 16,LONG pDate = 0,LONG pTime = 0)!,STRING
sysdt                 LIKE(_SYSTEMTIME),AUTO           !To get the system local date and time
dt64                  LIKE(INT64),AUTO                 !To store date/time in hundredths of seconds since December 28 1800
tmp64                 LIKE(INT64),AUTO                 !Temporary variable to use with i64 operations
mod64                 LIKE(INT64),AUTO                 !To store dt % 36
guid                  STRING(32),AUTO                  !The returned id
idx                   LONG,AUTO                        !Index for string slicing
base36                STRING('0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZ') !Lookup table for base 36 encoding
digitsfordt           EQUATE(8)                        !Base 36 digits for the date/time part. Enough for dates until year 2694
  CODE
  IF pLength < digitsfordt THEN pLength = digitsfordt. !Check por valid length
  IF pLength > SIZE(guid) THEN pLength = SIZE(guid).
  IF NOT pDate
    GetLocalTime(sysdt)                                !Better accuracy than CLOCK() at the hundredths level
    pDate = DATE(sysdt.wMonth,sysdt.wDay,sysdt.wYear)  !Convert to clarion standard date and time
    pTime = sysdt.wHour * 360000 + sysdt.wMinute * 6000 + sysdt.wSecond * 100 + sysdt.wMilliseconds * .10 + 1 !60*60*100, 60*100
  END
  i64Assign(dt64,pDate)                                !dt64 = pDate
  i64Assign(tmp64,8640000) ; i64Mult(dt64,tmp64,dt64)  !dt64 *= 8640000  (24*60*60*100)
  i64Assign(tmp64,pTime)   ; i64Add(dt64,tmp64,dt64)   !dt64 += pTime
  i64Assign(tmp64,36)                                  !To use inside the loop
  LOOP idx = digitsfordt TO 1 BY -1                    !Convert to base 36, starting with last digit
    i64Mod(dt64,tmp64,mod64)                           !mod64 = dt64 % 36
    guid[idx] = base36[ mod64.lo + 1 ]                 !Get the encoded the digit. mod64.lo is a ULONG with the lower part of the int64
    i64Div(dt64,tmp64,dt64)                            !dt64 /= 36
  END
  LOOP idx = digitsfordt + 1 TO pLength                !Fill the rest of the string with random digits
    guid[idx] = base36[ RANDOM(1,36) ]
  END
  RETURN guid[1 : pLength]                             !Return clipped id

In my laptop (modern i9) it creates one million guids in 1.1 seconds. The initial version used DECIMAL for the big number operations, changing to int64 increased the speed about 50 times. Adding AUTO to the local variables reduced the time in half.

Clarion 11 i64.inc doesn’t include the prototype for i64Mod, but I found it exists in ClaRUN.dll. Copying the prototype from i64Div worked. svapifnc.inc doesn’t include GetLocalTime, but the prototype for GetSystemTime also worked. Both protoypes are in the module’s MAP.

Initially I interrupted the first loop using IF i64Is0(dt64) and did some string operations to left pad the id with zeros. But it turns out that not breaking the loop does the same, with less code and probably faster.

For many applications, it may be enough to use a length of 12 (8 bytes for date/time and 4 bytes of random), but in my tests of creating one million queue records in a few seconds, there were some duplicates. Using the default length of 16 worked reliably.

Using Myid = MakeSGuid(8,Mydate,Mytime) is a way to encode date and time in a single sortable string field. Not very practical for databases, as it’s not easily readable, but may be useful for some in memory queue based algorithms.

I hope this is useful to somebody.

Carlos

5 Likes