MySQL Record Insert Error,: Incorrect string value: '\x81

greg · June 30, 2022, 9:43pm

I have a batch process copying data from TPS to MySQL table. For a couple of records I’m getting an error

Record Insert Error,An error was experienced during the creation of a record.|File: StanMag|Error: Incorrect string value: ‘\x81 …’ for column mydb.mytable.myfield at row 1 (S1000)

This is a string field containing some descriptive information, I suspect end users may be copying content from a website or word document containing unicode characters.

How can I quickly remove these before assigning field values?

vitesse · July 1, 2022, 1:18pm

just loop through the characters in the field and remove anything with decimal value greater than 127.

or if you look at the character and see a certain obvious replacement then do that. There was something similar on the newsgroups recently with utf-8 text pasted from Word and there were quotes and degrees symbols that could be replaced:

st.replace(‘<0E2h,080h,099h>’,‘’‘’) ! single quote
st.replace(‘<0E2h,080h,09Dh>’,‘"’) ! double quotes
st.replace(‘<0C2h,0B0h>’,‘<0F8h>’) ! degrees

That is using StringTheory (and if you don’t have that then use SystemStringClass) to do the removals or replacements but it is pretty straightforward in raw Clarion code.

greg · July 1, 2022, 2:11pm

Thanks. I can’t remove anything decimal value greater than 127 as there are some national characters with acute accents above 127. But I just decided to replace CHR(129) with CHR(32) - space - and that did the trick. Not sure why MySQL doesn’t like CHR(129) and if there are any more characters MySQL would complain about. It does not complain about quotes.

vitesse · July 2, 2022, 4:52am

no worries Greg

<129> looks to be a u with an Umlaut over it so depending on context maybe consider replacing it with the letter ‘u’ For example if it is in the middle of a word.

however it is hard to know without seeing the context eg. what codepage is in use and has someone pasted some UTF-8 characters. I did read somewhere that “by default, MySQL databases and tables are created using a UTF-8 with 3-bytes encoding”, so maybe that is coming into play?

cheers

Geoff R

greg · August 12, 2022, 12:23am

Yes, MySQL data is stored in UTF-8 regardless of the app supplying the data . They do have ODBC drivers supporting Code Pages, however this may become an issue as it was in my case. In code page I was using the character was not “u” with an umlaut, but an unprintable character which cased the issue. The main issue is that Clarion still does not support UTF-8. Which is a shame, as it’s been out for 20+ years or more https://www.joelonsoftware.com/2003/10/08/the-absolute-minimum-every-software-developer-absolutely-positively-must-know-about-unicode-and-character-sets-no-excuses/