Trying to understand encoding.

Started by Tomás Di Doménico, about 18 years ago, 3 messages, general
#1 Tomás Di Doménico
tdidomenico@avature.net

Greetings.

I'm currently using 8.3, but I've been coping with this since previous
versions.

I'm trying to integrate some LATIN1 and some UTF8 DBs into a single UTF8
one. To avoid the "Invalid UNICODE character..." error, I used iconv to
convert the LATIN1 dumps to UTF8.
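
For reference, the iconv step amounts to decoding the dump bytes as LATIN1 and re-encoding them as UTF-8. Below is a minimal Python sketch of that conversion. The function name `latin1_to_utf8` and the sample bytes are illustrative assumptions, not anything taken from the original dumps.

```python
def latin1_to_utf8(raw: bytes) -> bytes:
    """Re-encode LATIN1 (ISO-8859-1) bytes as UTF-8."""
    # Every byte value is a valid LATIN1 character, so this decode never
    # fails; the caller must already know the input really is LATIN1.
    return raw.decode("latin-1").encode("utf-8")


if __name__ == "__main__":
    sample = b"Neuqu\xe9n"  # "Neuquén" as it would appear in a LATIN1 dump
    converted = latin1_to_utf8(sample)
    # The single byte 0xE9 becomes the two-byte UTF-8 sequence 0xC3 0xA9.
    assert converted == b"Neuqu\xc3\xa9n"
    print(converted)
```

Running the conversion on data that is already UTF-8 would double-encode it, so it should only be applied to the LATIN1 dumps.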

Now I have the data in the UTF8 DB, and using graphical clients
everything seems to be great. The thing is, when I query the data via
psql, with \encoding UTF8 I get weird data ("NeuquÃ©n" for "Neuquén").
However, with \encoding LATIN1, everything looks fine.

So, I have a UTF8 DB, (what I think is) UTF8 data, and I can only see it
right by setting \encoding to LATIN1 in psql, or using a graphical client.
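
The symptom described above is consistent with correct UTF-8 bytes being drawn by something that treats each byte as a separate LATIN1 character. A small Python sketch of that effect follows. The helper name `as_seen_by_latin1_display` is hypothetical, and the code only simulates the display side; it does not talk to psql.

```python
def as_seen_by_latin1_display(text: str) -> str:
    """Simulate a LATIN1-only display rendering UTF-8 encoded output."""
    utf8_bytes = text.encode("utf-8")    # bytes sent when the client encoding is UTF8
    return utf8_bytes.decode("latin-1")  # each byte drawn as one LATIN1 character


if __name__ == "__main__":
    garbled = as_seen_by_latin1_display("Neuqu\u00e9n")  # "Neuquén"
    # The two UTF-8 bytes of the accented e (0xC3 0xA9) show up as
    # U+00C3 (A with tilde) followed by U+00A9 (copyright sign).
    assert garbled == "Neuqu\u00c3\u00a9n"
    print(ascii(garbled))
```

With \encoding LATIN1 the same character is sent as the single byte 0xE9, which a LATIN1 display draws correctly. That would explain why the data looks fine in that mode even though the stored data is valid UTF-8.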

If anyone could help me try and understand this mess, I'd really
appreciate it.

Ah, these are my locale settings, in case it helps.

LANG=en_US.UTF-8
LC_CTYPE=C
LC_NUMERIC="en_US.UTF-8"
LC_TIME="en_US.UTF-8"
LC_COLLATE=C
LC_MONETARY="en_US.UTF-8"
LC_MESSAGES="en_US.UTF-8"
LC_PAPER="en_US.UTF-8"
LC_NAME="en_US.UTF-8"
LC_ADDRESS="en_US.UTF-8"
LC_TELEPHONE="en_US.UTF-8"
LC_MEASUREMENT="en_US.UTF-8"
LC_IDENTIFICATION="en_US.UTF-8"

#2 Doug McNaught
doug@mcnaught.org
In reply to: Tomás Di Doménico (#1)
Re: Trying to understand encoding.

On 2/15/08, Tomás Di Doménico <tdidomenico@avature.net> wrote:

Now I have the data into the UTF8 DB, and using graphical clients
everything seems to be great. The thing is, when I query the data via
psql, with \encoding UTF8 I get weird data ("NeuquÃ©n" for "Neuquén").
However, with \encoding LATIN1, everything looks fine.

Maybe your terminal program doesn't support UTF8, or it's
misconfigured? If you create a UTF8-encoded file and 'cat' it, is the
output correct?

-Doug
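
The check suggested above can be sketched in Python: write a file with known UTF-8 bytes to 'cat' in the terminal, and print the character encoding that the current locale advertises. The file name `utf8_test.txt` and the function name `write_utf8_sample` are assumptions made for illustration.

```python
import locale
from pathlib import Path


def write_utf8_sample(path: Path, text: str = "Neuqu\u00e9n\n") -> bytes:
    """Write text to path as UTF-8 and return the exact bytes written."""
    data = text.encode("utf-8")
    path.write_bytes(data)
    return data


if __name__ == "__main__":
    target = Path("utf8_test.txt")  # hypothetical file name
    write_utf8_sample(target)
    print(f"wrote {target}; now run: cat {target}")
    # This reports what Python infers from LC_ALL / LC_CTYPE / LANG.
    # The terminal emulator has its own, separate encoding setting.
    print("locale encoding:", locale.getpreferredencoding(False))
```

If 'cat' shows two odd characters in place of the accented letter, the terminal is not decoding its input as UTF-8, whatever the locale variables say.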

#3 Tomás Di Doménico
tdidomenico@avature.net
In reply to: Doug McNaught (#2)
Re: Trying to understand encoding.

Geez. My default terminal didn't support UNICODE. Shame on me :P

Thanks!

Douglas McNaught wrote:

On 2/15/08, Tomás Di Doménico <tdidomenico@avature.net> wrote:

Now I have the data into the UTF8 DB, and using graphical clients
everything seems to be great. The thing is, when I query the data via
psql, with \encoding UTF8 I get weird data ("NeuquÃ©n" for "Neuquén").
However, with \encoding LATIN1, everything looks fine.

Maybe your terminal program doesn't support UTF8, or it's
misconfigured? If you create a UTF8-encoded file and 'cat' it, is the
output correct?

-Doug