Midrange News for the IBM i Community


Posted by: renojim
enforcing utf-8
Published: 20 Dec 2013
Revised: 23 Dec 2013

So, I'm reading a DB2 table and creating an XML file with the data. Came across a name in the table that has an illegal UTF-8 character. An umlaut or some such thing. Looks like an O with an apostrophe over it. I dunno. Don't care, really. I write that to the XML, and it blows up later when that XML is used. I've been asked to find a way to write only valid UTF-8 characters to the XML. In other words, catch the umlauts and whatever, and get rid of them. I don't have a clue how to do it. Anybody know?

COMMENTS

Posted by: renojim
Premium member *
Comment on: enforcing utf-8
Posted: 10 years 4 months 6 days 6 hours 12 minutes ago

Looks like iconv() will do it.

Posted by: Ringer
Premium member *
Comment on: enforcing utf-8
Posted: 10 years 4 months 6 days 6 hours 5 minutes ago

Are you writing to the IFS root file system or a DB2 table? Either way, I'd think CCSID 1208 would work. Then just WRITE, and OS/400 will convert from the job's CCSID (probably 37) to the output file's CCSID. But I've been wrong before!
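A minimal Python sketch of the conversion described above, for readers who want to see the bytes. It assumes Python's `cp037` codec is a fair stand-in for CCSID 37 and `utf-8` for CCSID 1208. The sample name and the helper `ccsid37_to_utf8` are hypothetical, and this is not the IBM i API itself.

```python
def ccsid37_to_utf8(raw: bytes) -> bytes:
    """Decode EBCDIC bytes (Python codec cp037) and re-encode them as UTF-8."""
    return raw.decode("cp037").encode("utf-8")


# Hypothetical field content: the name "Böhm" as it would sit in a CCSID 37 field.
ebcdic_name = "Böhm".encode("cp037")
utf8_name = ccsid37_to_utf8(ebcdic_name)

print(ebcdic_name.hex())  # EBCDIC: one byte per character
print(utf8_name.hex())    # UTF-8: the o-umlaut now takes two bytes
print(len(ebcdic_name), len(utf8_name))  # 4 5
```

The point is that the umlaut is not illegal in UTF-8. It only has to be converted rather than copied byte for byte.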

Chris Ringer

Posted by: renojim
Premium member *
Comment on: enforcing utf-8
Posted: 10 years 4 months 6 days 3 hours 57 minutes ago

No, I'm working with fields. Reading a name from a DB2 file, and writing it to a large character field as part of the XML. Can't make iconv() work. If I run the example in Mr. Cozzi's article at http://www.mcpressonline.com/programming/rpg/converting-between-character-sets.html using a from CCSID of 37 and a to CCSID of 1208, it translates ABCDEFG to  âäàáãå.

Posted by: bobcozzi
Site Admin ****
Chicagoland
Comment on: enforcing utf-8
Posted: 10 years 4 months 6 days 3 hours 44 minutes ago

You have a Database Field with a Field Name containing a character different from your system's CCSID?

Can you use DSPFD and see what the file's CCSID is currently set to?

So the only issue is that you need to translate that field name to a UTF-8 value. To do that, you'd have to know the CCSID of the file's field name. Converting between character sets and then looking at the data in Debug doesn't usually show the proper characters: once the data is converted, Debug tries to display it in the CCSID of the job unless a specific CCSID is assigned to the field.
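The Debug effect can be reproduced off the box. The Python sketch below assumes the `cp037` codec matches CCSID 37. The helper `view_as_ccsid37` is a hypothetical name that mimics a viewer interpreting bytes in the job CCSID.

```python
def view_as_ccsid37(data: bytes) -> str:
    """Interpret raw bytes as EBCDIC (cp037), the way a CCSID 37 job would show them."""
    return data.decode("cp037")


# "ABCDEFG" correctly converted to UTF-8 is the bytes 41 42 43 44 45 46 47.
utf8_bytes = "ABCDEFG".encode("utf-8")

# Viewing those same bytes as if they were still EBCDIC gives accented letters.
print(repr(view_as_ccsid37(utf8_bytes)))  # '\xa0âäàáãå'
```

If that matches what was seen, it suggests the conversion from 37 to 1208 may have worked and only the display was misleading.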

If the field name is already in CCSID(1208) and you convert it from CCSID(37) to CCSID(1208), you'll get invalid results. That's probably what you're seeing.
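A short Python sketch of that double-conversion case, again assuming `cp037` stands in for CCSID 37. The helper name `convert_37_to_1208` is hypothetical.

```python
def convert_37_to_1208(raw: bytes) -> bytes:
    """Treat the input as EBCDIC (cp037) and re-encode it as UTF-8."""
    return raw.decode("cp037").encode("utf-8")


# Data that is ALREADY UTF-8: O-umlaut is the two bytes C3 96.
already_utf8 = "Ö".encode("utf-8")

# Converting it again as if it were CCSID 37 silently produces the wrong text.
wrong = convert_37_to_1208(already_utf8)
print(wrong)  # b'Co'
```

No error is raised, because every byte value means something in CCSID 37. The damage only shows up when the output is read.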

You might try reading the field name into a field defined as a UCS-2 field.

Posted by: Ringer
Premium member *
Comment on: enforcing utf-8
Posted: 10 years 4 months 3 days 6 hours 21 minutes ago

Perhaps just write the XML entity, e.g. &#214; for Ö.
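One way to sketch the entity approach in Python, assuming the goal is pure ASCII output with numeric character references. The function name `to_ascii_xml` and the sample name are hypothetical. `xmlcharrefreplace` is a standard Python encoding error handler, and `xml.sax.saxutils.escape` handles `&`, `<` and `>`.

```python
from xml.sax.saxutils import escape


def to_ascii_xml(text: str) -> str:
    """Escape XML markup characters, then replace every non-ASCII character
    with a numeric character reference such as &#214;."""
    return escape(text).encode("ascii", "xmlcharrefreplace").decode("ascii")


print(to_ascii_xml("Öztürk & Co"))  # &#214;zt&#252;rk &amp; Co
```

The output contains only ASCII bytes, so it reads the same whether the consumer expects ASCII, Latin-1 or UTF-8.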

Posted by: DaleB
Premium member *
Reading, PA
Comment on: enforcing utf-8
Posted: 10 years 4 months 3 days 5 hours 30 minutes ago

All of the above taken into account, are you aware that UTF-8 uses variable-width encoding? A little trickier with EBCDIC, but ASCII 0-127 encode directly to UTF-8; the high-order bit is always 0. Any code point greater than ASCII 127 is encoded in UTF-8 with multiple bytes. So maybe what you're seeing is not a single character (O-umlaut or whatever), but part of a multi-byte character.
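A small Python illustration of the variable-width point. The helper `is_complete_utf8` is a hypothetical name, and the sample characters are arbitrary.

```python
def is_complete_utf8(data: bytes) -> bool:
    """Return True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


# ASCII stays one byte; characters above 127 take two or more bytes.
for ch in ("A", "Ö", "€"):
    encoded = ch.encode("utf-8")
    print(ch, len(encoded), encoded.hex())

# Cutting a multi-byte character in half leaves invalid UTF-8.
o_umlaut = "Ö".encode("utf-8")
print(is_complete_utf8(o_umlaut))      # True
print(is_complete_utf8(o_umlaut[:1]))  # False
```

This is why truncating a UTF-8 value at a fixed byte length, or looking at it one byte at a time, can make a valid name look corrupted.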