How do I get Baltic characters saved in database and in e-mail

Clarion 10, NetTalk 9.3

I have a requirement for Baltic characters in e-mails. I played a little with the charset (ISO-8859-1), used HTML, and tried system{Prop:Charset}=CharSet:Baltic.
At first glance it looked correct. However, the characters are changed. ANSI?
This
ą, č, ę, ė, į, š, ų, ū, ž, Ą, Č, Ę, Ė, Į, Š, Ų, Ū, Ž.
becomes
a, c, e, e, i, š, u, u, ž, A, C, E, E, I, Š, U, U, Ž.

Most are distorted.

Upon closer investigation, the distortion happens somewhere during string assignment. The string is saved distorted, and if I save it directly to the database, it is distorted on screen when read back.

So, how do I make my program support Baltic?

Hi, the charset should be cp1257 (Windows1257)
In the case of HTML it's better to use UTF-8, but that depends on the DB you are using and the application.
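
To illustrate why the code page matters, here is a minimal sketch in Python (used purely for illustration; it is not Clarion or NetTalk code, and `covered` is a hypothetical helper name). It checks which of the characters from the question have a slot in Windows-1252 compared with Windows-1257. Only š, ž, Š and Ž exist in Windows-1252, which matches the characters that survived in the question.

```python
# Characters from the question that were being distorted.
BALTIC = "ąčęėįšųūžĄČĘĖĮŠŲŪŽ"

def covered(text, codepage):
    """Return the characters of `text` that `codepage` can encode (hypothetical helper)."""
    kept = []
    for ch in text:
        try:
            ch.encode(codepage)
            kept.append(ch)
        except UnicodeEncodeError:
            pass  # this character has no slot in the code page
    return "".join(kept)

in_1252 = covered(BALTIC, "cp1252")  # Western European code page
in_1257 = covered(BALTIC, "cp1257")  # Baltic code page

print(in_1252)            # the characters Windows-1252 can hold
print(in_1257 == BALTIC)  # whether Windows-1257 holds all of them
```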

I can’t find that charset. Can you tell me where I can find cp1257 and how to set it in this context?

What database are you storing the data in?

It is MSSQL. The conversion happens when assigning string to string, or something like that, because it happens with strings before they are saved and when they are retrieved.

I think I have solved it now though.
Within regional settings, there is a link to “Administrative settings” (or something like that). This leads to an “Area” setting, where I can set “Language for non-Unicode programs”. I set this to Lithuanian. Once this is done and the computer restarted, I can write to and read from the database without distortion.
For pure text e-mails, the strings now work with ISO-8859-13.
For HTML, there are still too many wrong characters, so I do a search and replace of each character with its HTML equivalent.
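
The search-and-replace idea for HTML can be sketched as follows. This is a Python illustration only, not the code used in the thread, and `to_html_entities` is a hypothetical name. It replaces every non-ASCII character with a numeric character reference, so the HTML body no longer depends on the charset declared for the message. It does not escape `<`, `>` or `&`, on the assumption that the body already contains markup.

```python
def to_html_entities(text):
    """Replace each non-ASCII character with a numeric reference such as &#269;."""
    # The "xmlcharrefreplace" error handler performs the substitution for us.
    return text.encode("ascii", "xmlcharrefreplace").decode("ascii")

html_body = to_html_entities("Ačiū, Ø")
print(html_body)
```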

Yes, this was the thing I was going to point you to. Non-unicode apps, which Clarion ones obviously are, use the OS default codepage (which differs depending on your location).
This setting will change the OS default codepage to the codepage of your choice, the assumption being that you’ve already checked that the codepage you pick contains all the characters you want.
MS has a whole website devoted to globalisation & internationalisation, but a lot of it assumes you’re using unicode and we aren’t there yet.

Things looked so good, but now my Norwegian characters are distorted.
If I set 0{prop:fontcharset} = CharSet:Baltic, they look OK, but it seems that this has a negative impact on lists.

Do you want the chars to appear in the email, or on the screen? The two outputs require quite different techniques.

Also, I'm not sure if NT 9 can do it. It might, but it's old so I don't remember.
NT has properties for code page and charset.

Your first goal should be to understand the encoding and charset of the text in your database. This will then flow into a String, and from there into the email object.

Understanding the encoding at each part of the flow makes it fairly easy. The NT webinar is likely a good place to ask a question and see how you can follow the flow.

Displaying the string at any point on the screen will likely confuse you.

I wonder how they are going to implement the database indexing, like TPS files.

Obviously for SQL database stores it’s a non issue. SQL uses “collations” which determine things like “how characters are ordered” and so on. So the behaviour is described at the database level, not the program level.

The code with something like TPS is more tricky, because at its heart, the ordering of characters is arbitrary. Those of us raised on English have a “universal understanding” of word ordering, but in the real world it’s anything but.

For example, in German, words are ordered by the base letter. So Ä comes between A and B. Whereas in Swedish and Finnish Ä (and other letters with diacritics) come after Z. Indeed most European languages have their own sorting rules.
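
As a rough illustration of how the same words order differently under two rule sets, here is a small Python sketch. The key functions are deliberately simplified assumptions, not real collation implementations: `german_key` folds umlauts onto their base letters and ignores ß, and `swedish_key` simply places å, ä, ö after z.

```python
def german_key(word):
    """Simplified German dictionary order: umlauts sort as their base letters."""
    return word.translate(str.maketrans("äöüÄÖÜ", "aouAOU")).lower()

# Simplified Swedish alphabet: å, ä and ö come after z.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"

def swedish_key(word):
    """Rank each letter by its position in the Swedish alphabet."""
    return [SWEDISH_ALPHABET.index(ch) for ch in word.lower()]

words = ["Zebra", "Ärger", "Bach", "Apfel"]
german_order = sorted(words, key=german_key)
swedish_order = sorted(words, key=swedish_key)

print(german_order)
print(swedish_order)
```

A database collation plays the role of these key functions on the server side, which is why the program itself does not need to know the rules.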

That’s before we consider pictograph languages, where the whole concept of “alphabetic ordering” simply does not exist. For example, perhaps you can order the following string alphabetically?
:slight_smile: :frowning: :winking_face_with_tongue: :monkey_face:

However this is nothing new for TPS. Clarion has a LOCALE command, and one of the settings there is CLACASE. This allows you to specify the ordering you desire. So, in other words, I imagine that TPS indexes would behave just as they always have.

I do not need to display strings for it to confuse me. I pass the string from SendTo_Email, where it is entered in a textbox, to SendEmail and set the required properties for the e-mail, but somewhere in the passing of that string, the characters are converted to ANSI, and the ANSI string is what is saved to the database.
As mentioned, I had it working using the Windows setting, but it is not possible to support both Norwegian and Lithuanian in the same non-unicode program. Values stored in the database are changed when activating support. One does not mess with customer data.
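
One possible explanation for the changed Norwegian values, sketched in Python for illustration only: this assumes the existing data was written as Windows-1252 bytes and is then read under Windows-1257, which the thread does not confirm. The bytes in the database stay the same, but several of them map to different letters in the Baltic code page.

```python
norwegian = "æøåÆØÅ"

# Assumption: the old data was stored under the Western European code page.
stored = norwegian.encode("cp1252")

# Reading the same bytes after switching the system to the Baltic code page.
misread = stored.decode("cp1257")
print(misread)

# Windows-1257 does contain these letters, but at different byte values.
as_1257 = norwegian.encode("cp1257")
print(stored.hex(" "), "vs", as_1257.hex(" "))
```

If that assumption holds, the existing rows would need a one-time conversion before both languages could share a single code page.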

You’re on the right track. What you need to determine is where exactly the conversion happens.

For example, it’s unlikely that sending the email causes the data in the database to change. But the database will have a collation set. So, eliminating the email aspect for now, you’ll want to be able to see the string come into your program, see it get written to the database, and then determine how it is being stored.

Once that pipeline is sorted, and understood, then using the database fields in the email becomes the second task. For that, since you know the string being read, and the contents of that string (i.e. the encoding), you can then set the correct parameters for the email encoding and, if necessary, convert from one encoding to another.
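
Converting from one encoding to another, once the source encoding is known, can be sketched like this. It is a Python illustration rather than Clarion or NetTalk code; `transcode` is a hypothetical helper, and the sample bytes assume the database field holds Windows-1257 text.

```python
def transcode(raw, source, target="utf-8"):
    """Decode bytes from the known source encoding and re-encode for the target."""
    return raw.decode(source).encode(target)

# Assumption: these bytes came from the database as Windows-1257 ("ąčę").
from_db = b"\xe0\xe8\xe6"

utf8_body = transcode(from_db, "cp1257")
print(utf8_body.hex(" "))
```

The step only works when `source` really matches how the bytes were written; guessing wrong produces the same kind of distortion described earlier in the thread.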

It’s impossible to say what settings you need for the email object, since that is highly dependent on the encoding of the string as it flows into the email object.

Using peekram allows you to inspect the string at each point, and so determine what you are dealing with.

Cheers
Bruce
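
As a rough analogue of inspecting the string at each point, here is a Python sketch that prints the raw bytes of a value under a few encodings. It is not the PeekRam API, and `hex_dump` is a hypothetical helper. Comparing the bytes you actually see against these patterns suggests which encoding you are holding.

```python
def hex_dump(raw):
    """Format bytes as space-separated upper-case hex pairs."""
    return " ".join(f"{b:02X}" for b in raw)

text = "ąž"
dumps = {cp: hex_dump(text.encode(cp)) for cp in ("cp1257", "iso8859_13", "utf-8")}

for codepage, dump in dumps.items():
    print(codepage, dump)

# If the accents were already lost upstream, only plain ASCII bytes remain.
print("lost", hex_dump("az".encode("ascii")))
```

One byte per character points to a single-byte code page, two bytes per character starting with C4 or C5 points to UTF-8, and plain 61 7A means the conversion happened before that point in the flow.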