UTF-8 to ASCII

Started by Martín Marqués · almost 19 years ago · 10 messages · general
#1Martín Marqués
martin@bugs.unl.edu.ar

I have a doubt about the function to_ascii() and what the documentation
says.

Basically, I passed my DB from latin1 to UTF-8, and I started getting an
error when using the to_ascii() function on a field of one of my DB [1]:

ERROR: la conversión de codificación de UTF8 a ASCII no está soportada

OK, it's in Spanish, but basically it says that the conversion UTF8 to
ASCII is not supported, but in the documentation [2] I see this in the
"Table 9-7. Built-in Conversions":

utf8_to_ascii UTF8 SQL_ASCII

Is the documentation wrong or something?

I'm on postgresql-8.1.8, and as you can see, I'm checking the
corresponding documentation.

[1]: This I already solved using convert() to pass from UTF to Latin1, and after that I do a to_ascii().
[2]: http://www.postgresql.org/docs/8.1/interactive/functions-string.html#FTN.AEN7625

--
21:50:04 up 2 days, 9:07, 0 users, load average: 0.92, 0.37, 0.18
---------------------------------------------------------
Lic. Martín Marqués | SELECT 'mmarques' ||
Centro de Telemática | '@' || 'unl.edu.ar';
Universidad Nacional | DBA, Programador,
del Litoral | Administrador
---------------------------------------------------------

#2LEGEAY Jérôme
jerome.legeay@ffcat.coop
In reply to: Martín Marqués (#1)
Re: UTF-8 to ASCII

To convert my DB, I use this process:

createdb -T "old_DB" "copy_old_DB"
dropdb "old_DB"
createdb -E LATIN1 -T "copy_old_DB" "new_DB_name"

maybe this process will help you.

regards

Jérôme LEGEAY

At 14:13 on 11/05/2007, you wrote:

I have a doubt about the function to_ascii() and what the documentation says.

Basically, I passed my DB from latin1 to UTF-8, and I started getting an
error when using the to_ascii() function on a field of one of my DB [1]:

ERROR: la conversión de codificación de UTF8 a ASCII no está soportada

OK, it's in spanish, but basically it says that the conversion UTF8 to
ASCII is not supported, but in the documentation [2] I see this in the
"Table 9-7. Built-in Conversions":

utf8_to_ascii UTF8 SQL_ASCII

Is the documentation wrong or something?

I'm on postgresql-8.1.8, and as you can see, I'm checking the
corresponding documentation.

[1]: This I already solved using convert() to pass from UTF to Latin1, and
after that I do a to_ascii().
[2]:
http://www.postgresql.org/docs/8.1/interactive/functions-string.html#FTN.AEN7625


#3Martín Marqués
martin@bugs.unl.edu.ar
In reply to: LEGEAY Jérôme (#2)
Re: UTF-8 to ASCII

LEGEAY Jérôme wrote:

To convert my DB, I use this process:

createdb -T "old_DB" "copy_old_DB"
dropdb "old_DB"
createdb -E LATIN1 -T "copy_old_DB" "new_DB_name"

maybe this process will help you.

As I said in my original mail, the DB conversion went OK, but I see some
discrepancies in the documentation.

My question is if the documentation is correct, and if so, why don't I
get the right behavior?

#4Arnaud Lesauvage
arnaud.lesauvage@supermail.fr
In reply to: Martín Marqués (#1)
Re: UTF-8 to ASCII

Martin Marques wrote:

I have a doubt about the function to_ascii() and what the documentation
says.

Basically, I passed my DB from latin1 to UTF-8, and I started getting an
error when using the to_ascii() function on a field of one of my DB [1]:

ERROR: la conversión de codificación de UTF8 a ASCII no está soportada

OK, it's in spanish, but basically it says that the conversion UTF8 to
ASCII is not supported, but in the documentation [2] I see this in the
"Table 9-7. Built-in Conversions":

utf8_to_ascii UTF8 SQL_ASCII

Is the documentation wrong or something?

Hi Martin,
I think the documentation of 8.1 is wrong.
It looks different in the documentation of 8.2:
to_ascii: Convert string to ASCII from another encoding *(only supports conversion from LATIN1, LATIN2, LATIN9, and WIN1250 encodings)*

I ran into this problem too, and I wrote a function that converts from the DB encoding to LATIN9 before doing the to_ascii conversion: to_ascii(convert(mystring, 'LATIN9'), 'LATIN9')

Regards
--
Arnaud
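
The two-step idea in the message above (first move the text into a single-byte Latin encoding, then strip the accents) can be mimicked outside the database. The following is only a sketch in Python, not PostgreSQL's implementation: the function name to_ascii_via_latin9 is hypothetical, and Unicode decomposition is used here as a stand-in for whatever mapping to_ascii() applies internally.

```python
import unicodedata


def to_ascii_via_latin9(text: str) -> str:
    """Rough analogue of to_ascii(convert(text, 'LATIN9'), 'LATIN9')."""
    # Step 1: force the text into LATIN9 (ISO 8859-15). This raises
    # UnicodeEncodeError if a character has no LATIN9 equivalent.
    latin9_bytes = text.encode("iso8859_15")
    # Step 2: decode again, decompose accented letters into base letter
    # plus combining mark, and drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", latin9_bytes.decode("iso8859_15"))
    # Characters without a decomposition (for example 'ß') pass through
    # unchanged, so the result is not guaranteed to be pure ASCII.
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


print(to_ascii_via_latin9("Martín Marqués"))  # Martin Marques
```

The explicit intermediate encoding is the point of the workaround: the accent stripping only ever sees text that is known to fit in LATIN9.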

#5Albe Laurenz
all@adv.magwien.gv.at
In reply to: Martín Marqués (#1)
Re: UTF-8 to ASCII

I have a doubt about the function to_ascii() and what the
documentation says.

Basically, I passed my DB from latin1 to UTF-8, and I started

What do you mean by 'passed the DB from Latin1 to UTF8'?

getting an error when using the to_ascii() function on a field
of one of my DB [1]:

ERROR: la conversión de codificación de UTF8 a ASCII no está soportada

OK, it's in spanish, but basically it says that the conversion
UTF8 to ASCII is not supported, but in the documentation [2] I see
this in the "Table 9-7. Built-in Conversions":

utf8_to_ascii UTF8 SQL_ASCII

Is the documentation wrong or something?

I'm on postgresql-8.1.8, and as you can see, I'm checking the
corresponding documentation.

[1]: This I already solved using convert() to pass from UTF
to Latin1, and after that I do a to_ascii().
[2]:
http://www.postgresql.org/docs/8.1/interactive/functions-string.html#FTN.AEN7625

Well, the documentation for to_ascii states clearly:
"The to_ascii function supports conversion from LATIN1, LATIN2,
LATIN9, and WIN1250 encodings only."

The table of conversions you quote belongs to the function convert().

So that should answer your question.

I am not sure what you are trying to achieve.
If you tell us, I might be able to tell you HOW to achieve it.

Yours,
Laurenz Albe

#6Martín Marqués
martin@bugs.unl.edu.ar
In reply to: Albe Laurenz (#5)
Re: UTF-8 to ASCII

Albe Laurenz wrote:

[2]:
http://www.postgresql.org/docs/8.1/interactive/functions-string.html#FTN.AEN7625

Well, the documentation for to_ascii states clearly:
"The to_ascii function supports conversion from LATIN1, LATIN2,
LATIN9, and WIN1250 encodings only."

Sorry, didn't see the footnote on the table.

#7Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Martín Marqués (#1)
Re: UTF-8 to ASCII

Martin Marques wrote:

I have a doubt about the function to_ascii() and what the documentation
says.

Basically, I passed my DB from latin1 to UTF-8, and I started getting an
error when using the to_ascii() function on a field of one of my DB [1]:

ERROR: la conversión de codificación de UTF8 a ASCII no está soportada

Well, the to_ascii() documentation says that it only supports LATIN1,
LATIN2, LATIN9, and WIN1250. This is on a footnote.

I do think that there's something strange in the vicinity anyway,
because calling convert() with an explicit conversion function gives a
mismatching error for me (local environment is UTF8, as is
client_encoding):

alvherre=# select convert('Martín' using utf8_to_ascii);
ERROR: character 0xc3 of encoding "MULE_INTERNAL" has no equivalent in "SQL_ASCII"

Why on earth is it talking about MULE_INTERNAL?

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support
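
The byte 0xc3 named in the error above appears to be the first byte of the UTF-8 encoding of 'í'. A small Python sketch (an illustration only, it does not reproduce the server's conversion code) shows where that byte comes from and why a strict conversion to ASCII has to fail on it:

```python
name = "Martín"

utf8_bytes = name.encode("utf-8")
# 'í' (U+00ED) takes two bytes in UTF-8: 0xc3 0xad.
print(utf8_bytes)  # b'Mart\xc3\xadn'

# The first byte outside the 7-bit ASCII range is the one the error names.
first_non_ascii = next(b for b in utf8_bytes if b > 0x7F)
print(hex(first_non_ascii))  # 0xc3

# A strict conversion to ASCII is rejected for the same character.
try:
    name.encode("ascii")
except UnicodeEncodeError as exc:
    print("cannot encode:", exc.object[exc.start:exc.end])
```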

#8Martin Gainty
mgainty@hotmail.com
In reply to: Martín Marqués (#1)
Re: UTF-8 to ASCII

Apparently you will need to implement a UNICODE aware JDBC driver
http://archives.postgresql.org/pgsql-general/2004-01/msg01649.php
Martín

----- Original Message -----
From: "Alvaro Herrera" <alvherre@commandprompt.com>
To: "Martin Marques" <martin@bugs.unl.edu.ar>
Cc: <pgsql-general@postgresql.org>
Sent: Friday, May 11, 2007 9:33 AM
Subject: Re: [GENERAL] UTF-8 to ASCII

Martin Marques wrote:

I have a doubt about the function to_ascii() and what the documentation
says.

Basically, I passed my DB from latin1 to UTF-8, and I started getting an
error when using the to_ascii() function on a field of one of my DB [1]:

ERROR: la conversión de codificación de UTF8 a ASCII no está soportada

Well, the to_ascii() documentation says that it only supports LATIN1,
LATIN2, LATIN9, and WIN1250. This is on a footnote.

I do think that there's something strange in the vicinity anyway,
because calling convert() with an explicit conversion function gives a
mismatching error for me (local environment is UTF8, as is
client_encoding):

alvherre=# select convert('Martín' using utf8_to_ascii);
ERROR: character 0xc3 of encoding "MULE_INTERNAL" has no equivalent in
"SQL_ASCII"

Why on earth is it talking about MULE_INTERNAL?


#9Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#7)
Re: UTF-8 to ASCII

Alvaro Herrera <alvherre@commandprompt.com> writes:

Why on earth is it talking about MULE_INTERNAL?

IIRC, a lot of the conversions translate through some common
intermediate charset to save on code/table space. In such cases
the problem will usually be detected on the backend conversion...

regards, tom lane

#10Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#9)
Re: UTF-8 to ASCII

Tom Lane wrote:

Alvaro Herrera <alvherre@commandprompt.com> writes:

Why on earth is it talking about MULE_INTERNAL?

IIRC, a lot of the conversions translate through some common
intermediate charset to save on code/table space. In such cases
the problem will usually be detected on the backend conversion...

Interesting, but it doesn't explain why the conversion doesn't work.
AFAICS the operation I am requesting is valid.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.