On the Theme of Unexpected Behaviour...

I was processing a file with UTF-8 text and finding that my output was coming out in ISO-8859-1 (a.k.a. latin1). I verified this first using od -c | less and then by running recode latin1..utf8. Either Perl, XML::Parser or my code was silently converting the text.

It turned out that Perl was responsible this time. Or at least, the fix was at the Perl level. Again, the answer is in the friendly man pages, specifically in perluniintro. There are three alternatives:

  1. open FH, ">:utf8", "file";
  2. binmode(STDOUT, ":utf8");
  3. use open ':utf8'; # open as a pragma
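
To see what the layer actually changes, here is a minimal sketch built on the first alternative. The sample string and the in-memory filehandles (opening onto a scalar reference instead of a real file) are my own assumptions for illustration, not part of the original script; they just make the bytes easy to inspect without od.

```perl
use strict;
use warnings;

# Hypothetical sample text: "caf" plus e-acute (U+00E9), one character above ASCII.
my $text = "caf\x{e9}";

# No encoding layer: Perl writes the character as the single latin1 byte 0xE9.
my $raw = '';
open(my $plain, '>', \$raw) or die "open failed: $!";
print {$plain} $text;
close($plain);

# With the :utf8 layer: the same character goes out as the two bytes 0xC3 0xA9.
my $encoded = '';
open(my $utf8, '>:utf8', \$encoded) or die "open failed: $!";
print {$utf8} $text;
close($utf8);

# Show the byte count and a hex dump of each result.
printf "no layer: %d bytes (%s)\n", length($raw),     unpack('H*', $raw);
printf ":utf8:    %d bytes (%s)\n", length($encoded), unpack('H*', $encoded);
```

The first line of output ends in e9 and the second in c3a9, which is the same difference od -c showed me between the latin1 output and the UTF-8 I expected.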

That makes not one but two recent cases where the solution was in the man pages. It helps that these are new-style man pages, with lots of tasty example code.

About this Entry

This page contains a single entry by Christian published on May 27, 2005 8:17 AM.
