encoding question
Hi,
In phpPgAdmin, we automatically set the HTML page encoding to an encoding
that allows us to properly display the encoding of the current PostgreSQL
database. I have a small problem with SQL_ASCII. Theoretically (and this is
what we currently do), we should set the page encoding to US-ASCII. However,
Postgres seems to allow umlauts and all sorts of extra 8-bit data in ASCII
databases, so what encoding should I use? Is ISO-8859-1 a better choice?
Is SQL_ASCII basically equivalent to the LATIN1 encoding?
My other question: we play around with bytea fields to escape nulls and
chars < 32 and so on, so that when someone browses the table, they get
'\000<unknown>\000...', etc. However, are there other field types for which
we have to do this? Can you put nulls and the like in text/varchar/char
fields? What about other fields?
Thanks,
Chris
> My other question is we play around with bytea fields to escape nulls and
> chars < 32 and stuff so that when someone browses the table, they get
> '\000<unknown>\000...', etc. However, are there other field types for which
> we have to do this? Can you put nulls and stuff in text/varchar/char
> fields? What about other fields?
pg_escape_string
pg_escape_bytea
Escape everything :)
I don't think you see what I mean :)
I want to display the data on a webpage to the user. This means that a
varchar containing the string "I don't want it" should not appear as "I
don''t want it". So pg_escape_string isn't used there. bytea is different
though, because the default display isn't terribly useful...
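To make the distinction concrete, here is a minimal sketch in Python (phpPgAdmin itself is PHP; Python is used only for illustration). The helper `sql_quote_doubling` is a hypothetical stand-in that mimics only the quote-doubling behaviour described above, not the real pg_escape_string, and `for_html` is likewise an assumed name. The idea is that escaping for an SQL statement and escaping for an HTML page are separate steps with different rules.

```python
import html

def sql_quote_doubling(value: str) -> str:
    """Hypothetical stand-in for SQL-literal escaping: doubles single quotes."""
    return value.replace("'", "''")

def for_html(value: str) -> str:
    """Escape only what HTML text needs (&, <, >); quotes stay as typed."""
    return html.escape(value, quote=False)

text = "I don't want it <b>"

# What you would embed in an SQL statement (not what the user should see).
in_sql = sql_quote_doubling(text)

# What you would place in the page body for the user to read.
in_page = for_html(text)
```

With this split, the apostrophe reaches the browser unchanged, while markup characters in the data are still neutralised.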
Chris
----- Original Message -----
From: "Rod Taylor" <rbt@rbt.ca>
To: "Christopher Kings-Lynne" <chriskl@familyhealth.com.au>
Cc: "Hackers" <pgsql-hackers@postgresql.org>
Sent: Thursday, August 07, 2003 9:46 AM
Subject: Re: [HACKERS] encoding question
> My other question is we play around with bytea fields to escape nulls and
> chars < 32 and stuff so that when someone browses the table, they get
> '\000<unknown>\000...', etc. However, are there other field types for which
> we have to do this? Can you put nulls and stuff in text/varchar/char
> fields? What about other fields?
pg_escape_string
pg_escape_bytea
Escape everything :)
Chris,
SQL_ASCII means that the data could be anything. It could be Latin1,
UTF-8, Latin9, or whatever the code inserting the data sends to the server.
The server accepts anything as SQL_ASCII. This generally
doesn't cause any problems as long as all the clients have a common
understanding of what the real encoding of the data is. However, if you
set CLIENT_ENCODING, then the server does assume that the data is really
7-bit ASCII.
In the JDBC driver we only support US-ASCII data if the character set is
SQL_ASCII, because we set CLIENT_ENCODING to UTF8 so that the server
performs the necessary conversion for us (Java needs Unicode strings).
If you store anything other than US-ASCII data in a SQL_ASCII database,
the server will return invalid UTF-8 data to the client.
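As a rough illustration of why 8-bit data in a SQL_ASCII database is ambiguous, the Python sketch below decodes the same bytes under three assumed encodings. The sample bytes and the helper name `try_decode` are made up for the example; nothing here talks to a real server.

```python
# Hypothetical sample: bytes as a Latin1 client might have stored them.
raw = b"M\xfcnchen"

def try_decode(data: bytes, encoding: str) -> str:
    """Return the decoded text, or a marker string if the bytes are invalid."""
    try:
        return data.decode(encoding)
    except UnicodeDecodeError:
        return "<invalid " + encoding + ">"

as_ascii = try_decode(raw, "ascii")      # 0xFC is outside 7-bit ASCII
as_latin1 = try_decode(raw, "latin-1")   # every byte maps to a character
as_utf8 = try_decode(raw, "utf-8")       # a lone 0xFC is not valid UTF-8
```

Only a client that already knows the data is Latin1 gets the umlaut back; a client that expects ASCII or UTF-8 sees invalid input.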
thanks,
--Barry
Christopher Kings-Lynne wrote on Thu, 07.08.2003 at 04:33:
> My other question is we play around with bytea fields to escape nulls and
> chars < 32 and stuff so that when someone browses the table, they get
> '\000<unknown>\000...', etc.
Actually, bytea *stores* char(0); you get \000 or \x0 or ¬@ or whatever,
depending on what you use for displaying it.
The escaping is done only to fit the data into an SQL statement when
inserting the data into the database. SELECT returns straight bytes from
bytea.
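Since the display form is up to the client, a browsing tool has to pick one. Below is a minimal Python sketch of one possible choice, rendering non-printable bytes as \ooo octal escapes. The function name `escape_bytes_for_display` and the exact rules (doubling backslashes, treating bytes outside 32..126 as non-printable) are assumptions for illustration, not what phpPgAdmin or PostgreSQL actually do.

```python
def escape_bytes_for_display(data: bytes) -> str:
    """Render bytes as text, showing non-printable bytes as \\ooo octal escapes."""
    out = []
    for b in data:
        if b == 0x5C:            # backslash is doubled so the output is unambiguous
            out.append("\\\\")
        elif 32 <= b < 127:      # printable ASCII passes through unchanged
            out.append(chr(b))
        else:                    # NUL, control characters, high-bit bytes
            out.append("\\%03o" % b)
    return "".join(out)

shown = escape_bytes_for_display(b"\x00abc\x1f\\")
```

The result is plain printable ASCII, so it still needs the usual HTML escaping before it goes into a page.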
> However, are there other field types for which
> we have to do this? Can you put nulls and stuff in text/varchar/char
> fields?
No. Null bytes are not allowed in text/varchar fields.
-------------
Hannu