I have a batch process copying data from TPS to MySQL table. For a couple of records I’m getting an error
Record Insert Error,An error was experienced during the creation of a record.|File: StanMag|Error: Incorrect string value: ‘\x81 …’ for column mydb.mytable.myfield at row 1 (S1000)
This is a string field containing some descriptive information, I suspect end users may be copying content from a website or word document containing unicode characters.
How can I quickly remove these before assigning field values?
just loop through the characters in the field and remove anything with decimal value greater than 127.
or if you look at the character and see a certain obvious replacement then do that. There was something similar on the newsgroups recently with utf-8 text pasted from Word and there were quotes and degrees symbols that could be replaced:
That is using StringTheory (and if you don’t have that then use SystemStringClass) to do the removals or replacements but it is pretty straightforward in raw Clarion code.
Thanks. I can’t remove anything decimal value greater than 127 as there are some national characters with acute accents above 127. But I just decided to replace CHR(129) with CHR(32) - space - and that did the trick. Not sure why MySQL doesn’t like CHR(129) and if there are any more characters MySQL would complain about. It does not complain about quotes.
<129> looks to be a u with an Umlaut over it so depending on context maybe consider replacing it with the letter ‘u’ For example if it is in the middle of a word.
however it is hard to know without seeing the context eg. what codepage is in use and has someone pasted some UTF-8 characters. I did read somewhere that “by default, MySQL databases and tables are created using a UTF-8 with 3-bytes encoding”, so maybe that is coming into play?