Bug Pgsql

Started by Казорез Александр Олегович · over 18 years ago · 2 messages · bugs

Hi to all,

I came across a problem while switching from 8.0 to 8.1.4 (or, now, to 8.2.4). Here it is:

QUOTE: Some users are having problems loading UTF-8 data into 8.1.X. This is because previous versions allowed invalid UTF-8 byte sequences to be entered into the database, and this release properly accepts only valid UTF-8 sequences. One way to correct a dumpfile is to run the command iconv -c -f UTF-8 -t UTF-8 -o cleanfile.sql dumpfile.sql. The -c option removes invalid character sequences. A diff of the two files will show the sequences that are invalid. iconv reads the entire input file into memory so it might be necessary to use split to break up the dump into multiple smaller files for processing.
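The memory limitation mentioned in the quoted note can be sidestepped by cleaning the dump in fixed-size chunks instead of splitting it. What follows is only a minimal sketch in Python, not part of the release notes: the function name `clean_utf8_stream` and the chunk size are illustrative assumptions, and Python's idea of valid UTF-8 may differ in edge cases from what the server accepts.

```python
import codecs
import io


def clean_utf8_stream(src, dst, chunk_size=1 << 20):
    """Copy binary stream src to dst, dropping bytes that are not valid UTF-8.

    Returns the number of bytes that were dropped.
    """
    # The incremental decoder keeps an incomplete multi-byte sequence buffered
    # between calls, so a character split across two chunks is not lost.
    decoder = codecs.getincrementaldecoder("utf-8")(errors="ignore")
    bytes_read = 0
    bytes_written = 0
    while True:
        chunk = src.read(chunk_size)
        if not chunk:
            break
        bytes_read += len(chunk)
        cleaned = decoder.decode(chunk).encode("utf-8")
        bytes_written += len(cleaned)
        dst.write(cleaned)
    # Flush the decoder; a truncated sequence at end of input is dropped.
    tail = decoder.decode(b"", final=True).encode("utf-8")
    bytes_written += len(tail)
    dst.write(tail)
    return bytes_read - bytes_written


# Small in-memory demonstration; real use would pass files opened in "rb"/"wb".
demo_in = io.BytesIO(b"valid text \xff invalid byte")
demo_out = io.BytesIO()
print(clean_utf8_stream(demo_in, demo_out), demo_out.getvalue())
```

As with iconv -c, a diff of the input and output shows which sequences were removed.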

This quotation deals with loading a dump as plain text. But here we come to another problem: a large database is impractical to load as plain text, so another dump format is needed. I therefore made a backup with '-Ft'. But a tar archive cannot simply be passed through 'iconv', so I unpacked it and ran 'find ./ -exec iconv'. After that I could not put the archive back together in a form that 'pg_restore' would accept. I had to run 'ls | cat | psql databasename'. After 15 hours of work nothing changed! :)
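Before unpacking and rewriting a tar-format dump, it may help to find out which archive members actually contain invalid bytes. The sketch below only reads the archive and reports member names; it does not try to rebuild an archive that pg_restore would accept. The function name `find_invalid_members` is a hypothetical helper, and binary members of the archive (such as its table of contents) can be expected to show up in the result as well.

```python
import codecs
import io
import tarfile


def find_invalid_members(tar_fileobj, chunk_size=1 << 20):
    """Return the names of regular-file members that are not valid UTF-8."""
    bad = []
    with tarfile.open(fileobj=tar_fileobj, mode="r:*") as archive:
        for member in archive:
            if not member.isfile():
                continue
            stream = archive.extractfile(member)
            # A fresh strict decoder per member; it raises on the first
            # invalid or truncated sequence.
            decoder = codecs.getincrementaldecoder("utf-8")(errors="strict")
            try:
                while True:
                    chunk = stream.read(chunk_size)
                    if not chunk:
                        break
                    decoder.decode(chunk)
                decoder.decode(b"", final=True)
            except UnicodeDecodeError:
                bad.append(member.name)
    return bad
```

A call such as `find_invalid_members(open("backup.tar", "rb"))` (the file name is an assumption) narrows the cleanup to the affected data files only.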

Also, the query 'select * from table1 where lower(field[1]) like 'test'' fails with an error. It looks like this:

ERROR: invalid byte sequence for encoding "UTF 8"

HINT: This error can also happen if the byte sequence does not match the encoding expected by the server, which is controlled by "client_encoding".

This error occurs only when we run queries with 'lower/upper' on tables containing arrays; in all other cases everything works properly.

Have you ever run into such an error? How did you get around it?

*****************************************

Senior Administrator, ОСА УИТ

ОАО "ИнвестКапиталБанк"

Казорез Александр Олегович

(347)291-37-60, ext. 2021

a.kazorez@investcapitalbank.ru

ICQ 400-475-046

*****************************************

#2 Rodriguez Fernando
rodriguez@ort.edu.uy
In reply to: Казорез Александр Олегович (#1)
Re: Bug Pgsql

Hi,
try this:
dump your db
create a db with SQL_ASCII encoding
import the dump into the new db
export from the new db and then try to import into UTF8.
I tried something like this in 8.1.

good luck.
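The steps above can be sketched as a sequence of command lines. This is only an illustration that builds and prints the commands rather than running them; the database names, the work directory, and the helper name `round_trip_commands` are all assumptions, and the thread does not confirm that the round trip removes the invalid bytes.

```python
import shlex


def round_trip_commands(source_db, scratch_db, target_db, workdir="."):
    """Build the command lines for the dump / SQL_ASCII / UTF8 round trip."""
    first_dump = f"{workdir}/{source_db}.sql"
    second_dump = f"{workdir}/{scratch_db}.sql"
    return [
        # 1. dump the original database as plain SQL
        ["pg_dump", "-f", first_dump, source_db],
        # 2. create a scratch database that accepts any byte sequence
        ["createdb", "-E", "SQL_ASCII", "-T", "template0", scratch_db],
        # 3. load the dump into the scratch database
        ["psql", "-d", scratch_db, "-f", first_dump],
        # 4. dump the scratch database again
        ["pg_dump", "-f", second_dump, scratch_db],
        # 5. create the UTF8 target and load the second dump into it
        ["createdb", "-E", "UTF8", "-T", "template0", target_db],
        ["psql", "-d", target_db, "-f", second_dump],
    ]


# Hypothetical database names, printed for review before running anything.
for argv in round_trip_commands("olddb", "scratch_ascii", "newdb"):
    print(shlex.join(argv))
```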
