Clarion 12 way to handle UTF-8 Unicode String type?

Not disagreeing with what you’re saying, but if there was a UTF-8 string type, that would be awesome. And then if saying MyUTF16String = MyUTF8String would convert them, that would also be awesome.

Right now I’m not sure whether using WinAPI would be necessary to consume UTF-8.

This seems like a separate topic, and important as the web is commonly UTF-8.

The current help on TOUNICODE() and TOANSI() mentions UTF-8 strings can be converted using code page 65001.

The TOUNICODE function also allows to convert ANSI and Unicode strings to a character sequence encoded in UTF-8 by passing the value 65001 as the second parameter.

As a side effect, the TOANSI function also allows to convert ANSI and Unicode strings to a character sequence encoded in UTF-8 by passing value 65001 as the second parameter.
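
For anyone who wants to see the round trip those help entries describe, here is a rough sketch. It is in Python rather than Clarion, and Python's own codecs stand in for the RTL calls, so treat it as an illustration of the conversion only, not of the actual TOUNICODE/TOANSI behaviour. The helper names utf8_to_utf16 and utf16_to_utf8 are made up for this example. (65001 is simply the Windows code page identifier for UTF-8.)

```python
# Illustration only: Python codecs stand in for the Clarion RTL conversion calls.
# The function names below are hypothetical helpers, not part of any library.

def utf8_to_utf16(utf8_bytes: bytes) -> bytes:
    """Decode UTF-8 bytes and re-encode them as UTF-16 little-endian (no BOM)."""
    return utf8_bytes.decode("utf-8").encode("utf-16-le")


def utf16_to_utf8(utf16_bytes: bytes) -> bytes:
    """Decode UTF-16 little-endian bytes and re-encode them as UTF-8."""
    return utf16_bytes.decode("utf-16-le").encode("utf-8")


# Nine characters: seven ASCII, plus "ü" (2 bytes in UTF-8) and "€" (3 bytes).
utf8_text = "Zürich €5".encode("utf-8")
utf16_text = utf8_to_utf16(utf8_text)
round_trip = utf16_to_utf8(utf16_text)

print(len(utf8_text), len(utf16_text), round_trip == utf8_text)
```

The byte counts differ between the two encodings, but the round trip is lossless, which is the property you would rely on when moving web data in and out of a UTF-16 string.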


If you search for Delphi and UTF-8 you’ll find older examples where they used their regular ANSI String type to hold UTF-8. They had functions like ToUnicode to encode.

Newer Delphi versions have an explicit UTF8String type.

UTF8String represents a string encoded using UTF-8 (variable number of bytes Unicode). It is a AnsiStringBase type with a UTF-8 code page. In Delphi, UTF8String is a true compiler type. The compiler does conversions between UnicodeString and UTF8String as necessary.

I agree it would be useful to have an explicit UTF-8 type like EString (8 string) or NString (narrow string).
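
A small sketch of what can go wrong when UTF-8 sits in a plain byte string that nothing labels as UTF-8. This is Python, not Clarion, and Windows-1252 is only assumed here as a typical ANSI code page. The point is that the same bytes read correctly or as garbage depending on which code page the reader assumes, which is exactly the bookkeeping an explicit type would hand to the compiler.

```python
# Illustration only: cp1252 is assumed as a stand-in for "the ANSI code page".

utf8_bytes = "café".encode("utf-8")  # the "é" occupies two bytes in UTF-8

# Misread as Windows-1252, each of the two bytes of "é" becomes its own character.
misread = utf8_bytes.decode("cp1252")

# Read with the right encoding, the original text comes back.
correct = utf8_bytes.decode("utf-8")

print(len(utf8_bytes), len(correct))  # byte length vs character length
print(misread)
print(correct)
```

Note the byte length and the character length also disagree, so anything that slices or measures the untyped string by bytes needs the same care.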


While a native type would be cool, it’s not necessary. We already have access to UTF-8 support in StringTheory (which makes use of Win32 API calls).

The only thing we lack at the moment is support for any Unicode encoding on Windows and Reports. Presumably USTRING will adequately serve that purpose.

Well, UTF-8 is made by the RTL using ToUnicode and ToAnsi (,65001), using the same API calls.

The question is whether we want a native UTF-8 type, so the compiler knows the string type and can help us.

Bruce Barrington thought the compiler should work for the developer and make coding life easier. The Delphi team did it both ways and in the end decided to create a native UTF-8 type.

What we want and what can realistically be achieved are two different things.

From a language design point of view, a UTF-8 type would be great. From a “spend time on it” point of view, I’d rather they work on 64-bit.

A UTF-8 type won’t offer any functionality we can’t already get today. It’s a nice-to-have, not a necessity.
