StringTheory and GUID from .GuidEncode

Hello,

I have a field in MS SQL databe, that is UUID type. I am getting wrong order in characters when using .GuidEncode() method.

Server Management Studio displays that the value is
8CE489C4-03EE-4F86-A5E4-612F3DEB233F
StringTheory encodes value as
C489E48C-EE03-864F-A5E4-612F3DEB233F

For the last two parts, orders of the characters are OK. But, first three parts are not. I have looked at the documentation, and I can not find anything. Is there some option that needs to be set?

Out of curiosity, I asked ChatGPT for its opinion:

Summary
-------
What you’re seeing is not a StringTheory option or bug. It’s a byte-order (endianness) mismatch between:
1) how SQL Server stores/returns a unique identifier as 16 raw bytes, and
2) how the “standard” GUID text format is displayed (like SSMS shows).

What GuidEncode() does
----------------------
StringTheory’s GuidEncode() expects the current string value to be EXACTLY 16 bytes (binary GUID).
It does NOT reorder any bytes. It simply:
- converts the 16 bytes to 32 hex digits (ToHex)
- formats them into 8-4-4-4-12 groups with hyphens (and optional braces)

So GuidEncode() will output a “correct” textual representation of the raw byte sequence you give it.

Why SSMS shows one value but GuidEncode shows a different “scrambled” value
---------------------------------------------------------------------------
SQL Server’s unique identifier has a well-known quirk:
When represented as raw bytes, the first 3 fields are stored/returned in little-endian order:
- first 4 bytes (Data1) are reversed
- next 2 bytes (Data2) are reversed
- next 2 bytes (Data3) are reversed
The remaining 8 bytes are in the expected order.

So if you fetch a GUID from SQL Server as 16 raw bytes and then simply hex+format it,
the first three groups appear “reversed”, while the last two groups look correct.

Example:
SSMS displays:
  8CE489C4-03EE-4F86-A5E4-612F3DEB233F

GuidEncode() outputs (from raw bytes):
  C489E48C-EE03-864F-A5E4-612F3DEB233F

Notice:
- 8CE489C4 -> C489E48C (bytes reversed in first 4-byte field)
- 03EE     -> EE03     (bytes reversed in next 2-byte field)
- 4F86     -> 864F     (bytes reversed in next 2-byte field)
- A5E4-612F3DEB233F remains the same (last 8 bytes unchanged)

Answer to “Is there an option that needs to be set?”
----------------------------------------------------
No. GuidEncode() has no “SQL GUID byte swap” option. It assumes your 16 bytes are already in the
standard display order.

Fix option A (recommended when you already have 16 bytes): swap the first 3 fields
---------------------------------------------------------------------------------
Reorder the 16 bytes before calling GuidEncode():

- reverse bytes 1–4
- reverse bytes 5–6
- reverse bytes 7–8
- leave bytes 9–16 unchanged

Clarion example:

GuidFixForDisplay PROCEDURE(STRING pGuidBin16),STRING
fixed STRING(16)
  CODE
  ! Reverse first 4 bytes
  fixed[1] = pGuidBin16[4]
  fixed[2] = pGuidBin16[3]
  fixed[3] = pGuidBin16[2]
  fixed[4] = pGuidBin16[1]

  ! Reverse next 2
  fixed[5] = pGuidBin16[6]
  fixed[6] = pGuidBin16[5]

  ! Reverse next 2
  fixed[7] = pGuidBin16[8]
  fixed[8] = pGuidBin16[7]

  ! Last 8 unchanged
  fixed[9 : 16] = pGuidBin16[9 : 16]

  RETURN fixed

Then:
  st.SetValue(GuidFixForDisplay(MyGuid16))
  st.GuidEncode()

Fix option B: don’t fetch as raw bytes
-------------------------------------
If you can, return the GUID already as text from SQL Server and skip GuidEncode():

  SELECT CONVERT(char(36), MyGuidCol) ...
or
  SELECT CAST(MyGuidCol AS varchar(36)) ...

Then treat it as a normal 36-character string.

Practical note
--------------
Exactly where to apply the fix depends on how Clarion/your SQL driver is returning the column:
- If it returns a 16-byte buffer (binary), use Fix A.
- If it returns a 36-char string (already formatted), do NOT use GuidEncode() at all.

A GUID is defined as Long + 2 UShorts + 8 Bytes

typedef struct _GUID {
  unsigned long  Data1;
  unsigned short Data2;
  unsigned short Data3;
  unsigned char  Data4[8];
} GUID;

I would guess ST is just 16 bytes of Hex so Little Endian for Long and Short.

The other reflects that it’s
LONG(8CE489C4h) +USHORT(03EEh) +USHORT(4F86h)
showing Big Endian. That’s the hex you would need in code to specify numbers, unless you specify it as a String. You will see common values like IUnknown in code.

Which is correct? Seems like it would matter how the field is treated when it is a Key. I.e. is it sorted as a STRING(16) so LE, or as LONG+SHORTs so BE. But does sort order really matter with what is a random number.

1 Like

This may be relevant: GUID (Microsoft COM UUID Mixed-Endian) constants in Clarion

1 Like

So, this is Microsoft’s implementation of the GUID. And solution is to rearrange string.

uuidSorted = uuidStr.GetValue()
uuidSorted = uuidSorted[7:8] & uuidSorted[5:6] & uuidSorted[3:4] & uuidSorted[1:2] & |
uuidSorted[9] & |
uuidSorted[12:13] & uuidSorted[10:11] & |
uuidSorted[14] & |
uuidSorted[17:18] & uuidSorted[15:16] & |
uuidSorted[19:36]

Thank you all for help, and explanation.