JDBC bug in 7.1b4

Started by Rainer Mager almost 25 years ago, 15 messages
#1 Rainer Mager
rmager@vgkk.com

Hi,

While trying 7.1b4 I got this using JDBC2:

ERROR: A request from 10.0.0.46 (10.0.0.46) resulted in
java.lang.NumberFormatException: 20 18:46:53+09
java.lang.NumberFormatException: 20 18:46:53+09
at java.lang.Integer.parseInt(Integer.java, Compiled Code)
at java.lang.Integer.parseInt(Integer.java, Compiled Code)
at java.sql.Date.valueOf(Date.java, Compiled Code)
at org.postgresql.jdbc2.ResultSet.getDate(ResultSet.java, Compiled Code)
at org.postgresql.jdbc2.ResultSet.getDate(ResultSet.java, Compiled Code)

Sorry for the incomplete stack trace (and lack of line numbers) but the
rest of it shouldn't matter. BTW, I am using the new 7.1 JDBC driver.
I'll try to look at the Java code tomorrow but I'm hoping someone
already has a fix.
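
As a rough illustration of what the trace suggests: the text "20 18:46:53+09" looks like the tail of a timestamp value that was handed to a date-only parser, which split on the two dashes and then tried to read everything after the second dash as the day number. The Python sketch below mimics that behaviour. The function names and the sample value are hypothetical (only the tail of the string appears in the trace), and the real driver and JDK code may differ.

```python
def parse_sql_date(s):
    """Parse 'yyyy-mm-dd' by splitting on its two dashes (date-only parser)."""
    first = s.index("-")
    second = s.index("-", first + 1)
    year = int(s[:first])
    month = int(s[first + 1:second])
    day = int(s[second + 1:])  # raises ValueError if a time part follows
    return year, month, day


def parse_date_from_timestamp(s):
    """Possible workaround: drop everything after the first space, then parse."""
    return parse_sql_date(s.split(" ", 1)[0])


# Hypothetical value; year and month are assumptions, the tail is from the trace.
sample = "2001-02-20 18:46:53+09"

try:
    parse_sql_date(sample)
except ValueError as exc:
    print("date-only parser failed:", exc)

print(parse_date_from_timestamp(sample))
```

If this reading is right, a fix in the driver would only need to cut the value down to its date part before parsing it.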

--Rainer

#2 Rainer Mager
rmager@vgkk.com
In reply to: Rainer Mager (#1)
Problem with 7.0.3 dump -> 7.1b4 restore

We have a Unicode (UTF-8) database that we are trying to upgrade to 7.1b4.
We did a pg_dumpall (yes, using the old version) and then tried a restore.
We hit the following 3 problems:

1. Some of the text is large, about 20k characters, and is multiline. For
almost all of the lines this was fine (postgres put a \ at the end of the
previous line) but for some it was not. The lines I looked at all had
non-English characters (Japanese and/or Korean) at the end of the line. When
the restore encountered these lines it failed and, since the dump uses COPY,
the entire table was left blank.

2. Some two-byte dash/hyphen characters DID get correctly imported into the
database but could not be read out again via JDBC, that is, when read the
record was truncated at the character. This _might_ be related to a
long-standing Java core bug regarding improper conversions between certain
languages and the internal Unicode representation for hyphens.

3. One other character, a two-byte apostrophe, was not restorable,
similarly to the hyphen problem.

After fighting the above, I decided to try doing the dump with the -dn
flags. This fixed problem #1 but not 2 or 3. If needed I can try to get
details about the problem characters.

Finally, not a bug, but we have written a small Perl script that wraps
every 500 INSERT lines of a PG dump in a transaction. This speeds up
large restores by about 100 times! Really! I think this might be a good
thing for the dump command to do automatically.
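
The batching idea can be sketched roughly as follows. This is not the Perl script mentioned above, only a minimal Python illustration of the same technique. It assumes that every INSERT statement occupies exactly one line of the dump (which will not hold for values containing raw newlines), and the function name `wrap_inserts` is hypothetical.

```python
def wrap_inserts(lines, batch_size=500):
    """Yield dump lines with BEGIN/COMMIT around runs of INSERT lines.

    Assumes one INSERT statement per line. Any other non-blank statement
    closes the open transaction first, so it runs outside a batch.
    """
    open_count = 0  # INSERT lines emitted in the currently open transaction
    for line in lines:
        stripped = line.strip()
        is_insert = stripped.upper().startswith("INSERT ")
        if stripped and not is_insert and open_count:
            yield "COMMIT;\n"
            open_count = 0
        if is_insert and open_count == 0:
            yield "BEGIN;\n"
        yield line
        if is_insert:
            open_count += 1
            if open_count == batch_size:
                yield "COMMIT;\n"
                open_count = 0
    if open_count:
        yield "COMMIT;\n"


# Tiny demonstration with a batch size of 2 instead of 500.
dump = ["CREATE TABLE t (i int);\n"]
dump += ["INSERT INTO t VALUES (%d);\n" % i for i in range(5)]
wrapped = list(wrap_inserts(dump, batch_size=2))
print("".join(wrapped), end="")
```

Applied to a real dump, the generator could be fed the lines of the dump file and its output written to a new file before running the restore.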

Best regards,

--Rainer

#3 Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Rainer Mager (#2)
Re: Problem with 7.0.3 dump -> 7.1b4 restore

This might be related to a known bug with 7.0.x. Can you grab a patch
from ftp://ftp.sra.co.jp/pub/cmd/postgres/7.0.3/patches/copy.patch.gz
and try again?

Or even better, can you give me a minimum set of data that reproduces
your problem?
--
Tatsuo Ishii

#4 Rainer Mager
rmager@vgkk.com
In reply to: Tatsuo Ishii (#3)
RE: Problem with 7.0.3 dump -> 7.1b4 restore

Well, I tried the patch and the newly produced dump was identical to the bad
dump from before, so the patch had no effect. I will try to trim it down to
a reasonably small file and email it to you.

--Rainer

-----Original Message-----
From: pgsql-bugs-owner@postgresql.org
[mailto:pgsql-bugs-owner@postgresql.org]On Behalf Of Tatsuo Ishii
Sent: Friday, February 23, 2001 10:32 AM
To: rmager@vgkk.com
Cc: pgsql-bugs@postgresql.org; pgsql-hackers@postgresql.org
Subject: Re: [BUGS] Problem with 7.0.3 dump -> 7.1b4 restore

#5 Rainer Mager
rmager@vgkk.com
In reply to: Tatsuo Ishii (#3)
1 attachment(s)
RE: Problem with 7.0.3 dump -> 7.1b4 restore

Attached is a single INSERT that shows the problem. The character after the
word "Fiber" truncates the text when using JDBC. NOTE, the text IS in the
database, that is, the dump/restore seems ok, the problem is when trying to
read the text later. The database is UTF8 and I just tested with beta 5.

Oh, BTW, if I try to set (INSERT) this same character via JDBC and then
retrieve it again then everything is fine.

--Rainer

Attachments:

example.sql (application/octet-stream)

#6 Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Rainer Mager (#5)
RE: Problem with 7.0.3 dump -> 7.1b4 restore

Thanks. I'll dig into it.
--
Tatsuo Ishii

#7 Tatsuo Ishii
t-ishii@sra.co.jp
In reply to: Rainer Mager (#5)
RE: Problem with 7.0.3 dump -> 7.1b4 restore

I have tested your data using psql:

unicode=# create table pr_prop_info(i1 int, i2 int, i3 int, t text);
CREATE
unicode=# \encoding LATIN1
unicode=# \i example.sql
INSERT 2378114 1
unicode=# select * from pr_prop_info;

The character after the word "Fiber" looks like "�Optic Cable". So as
long as the server/client encoding is set correctly, it looks OK. I guess
we have some problems with the JDBC driver. Unfortunately I am not a Java
guru at all. Can anyone look into our JDBC driver regarding this
problem?
--
Tatsuo Ishii

#8 Rainer Mager
rmager@vgkk.com
In reply to: Tatsuo Ishii (#7)
Problems with latest tests

Hi all,

I haven't been following the current thread on failed tests but I just had
some so I thought I'd mention it. If this is a repeat then I apologize.

I configured with:

./configure --enable-multibyte --enable-syslog --with-java --with-maxbackends=70

And the tests give me this error:

Running with noclean mode on. Mistakes will not be cleaned up.
/opt/home/rmager/pgsql/src/test/regress/./tmp_check/install//usr/local/pgsql/bin/pg_encoding: error while loading shared libraries: /opt/home/rmager/pgsql/src/test/regress/./tmp_check/install//usr/local/pgsql/bin/pg_encoding: undefined symbol: pg_char_to_encoding
initdb: pg_encoding failed

Perhaps you did not configure PostgreSQL for multibyte support or
the program was not successfully installed.

--Rainer

#9 Rainer Mager
rmager@vgkk.com
In reply to: Tatsuo Ishii (#7)
Problem with test results submission form

I tried to submit the results of my regression tests and got this:

Warning: PostgreSQL query failed: ERROR: parser: parse error at or near "t" in /home/projects/pgsql/developers/vev/public_html/regress/regress.php on line 359
Database write failed.

--Rainer

#10 Rainer Mager
rmager@vgkk.com
In reply to: Tatsuo Ishii (#7)
Problems with Multibyte in 7.1 beta?

I'm trying to run the latest CVS code's regression tests and have a problem.
They fail at initdb with this:

Running with noclean mode on. Mistakes will not be cleaned up.
/opt/home/rmager/devel/External/pgsql/src/test/regress/./tmp_check/install//usr/local/pgsql/bin/pg_encoding: error while loading shared libraries: /opt/home/rmager/devel/External/pgsql/src/test/regress/./tmp_check/install//usr/local/pgsql/bin/pg_encoding: undefined symbol: pg_char_to_encoding
initdb: pg_encoding failed

Perhaps you did not configure PostgreSQL for multibyte support or
the program was not successfully installed.

I ran configure with this:

./configure --enable-multibyte --enable-syslog --with-java

Any ideas?

--Rainer

#11 Rainer Mager
rmager@vgkk.com
In reply to: Tatsuo Ishii (#7)
Dead locks

Hi all,

We're using PG 7.0 and 7.1beta and are having deadlock problems. The docs
say that Postgres detects deadlocks and automatically rolls back one
transaction to recover, but this is not our experience. Are the docs
incorrect or is this more serious?

Thanks,

--Rainer

#12 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Rainer Mager (#11)
Re: Dead locks

"Rainer Mager" <rmager@vgkk.com> writes:

We're using PG 7.0 and 7.1beta and are having deadlock problems. The docs
say that Postgres detects deadlocks and automatically rolls back one
transaction to recover, but this is not our experience. Are the docs
incorrect or is this more serious?

Which beta release?

There are some known undetected-deadlock cases in 7.0, which were
repaired in late January --- that would have been beta4 or possibly
beta5, I forget now. If you still see this behavior with 7.1RC1 then
I would like details.

regards, tom lane

#13 Rainer Mager
rmager@vgkk.com
In reply to: Tatsuo Ishii (#7)
RE: Problem with 7.0.3 dump -> 7.1b4 restore

I just tested a bug I originally found in 7.1b4 with the new 7.1RC3 and it
still exists. I would consider this a major bug because I know of no
workaround.

Basically what happens is that a dump of an existing Unicode database (from
7.0.3) has a double-byte hyphen character that becomes \255 in the dump. When
the data is imported into the new 7.1 database it seems to appear correctly
(verified via psql) BUT when reading this record via JDBC the data is
truncated at this character.

I communicated briefly with Ishii-san regarding this a while back but I
never followed up. Considering RC3 is now out I thought I should revisit the
issue. It should be easy to test by editing any Postgres Unicode database
dump and putting \255 somewhere in a string. I'm not sure if it matters but
the dump was done with the "-dn" flags.
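
For anyone trying to reproduce this, the following Python sketch shows why a bare \255 is suspicious in a UTF-8 dump. The octal escape \255 is the single byte 0xAD, which cannot stand alone in UTF-8. The sample string around it is an assumption based on the "Fiber" example earlier in the thread, not the actual dump contents.

```python
def is_valid_utf8(data):
    """Return True if the byte string decodes cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


stray = bytes([0o255])   # the byte written as \255 in the dump
soft_hyphen = "\u00ad"   # U+00AD SOFT HYPHEN

# Hypothetical sample modelled on the "Fiber ... Optic Cable" text.
broken = b"Fiber" + stray + b"Optic Cable"
proper = ("Fiber" + soft_hyphen + "Optic Cable").encode("utf-8")

print(hex(stray[0]))                            # 0xad
print(is_valid_utf8(broken))                    # False: lone byte over 0x7F
print(is_valid_utf8(proper))                    # True
print(soft_hyphen.encode("utf-8"))              # b'\xc2\xad'
print(stray.decode("latin-1") == soft_hyphen)   # True
```

In other words, the lone byte reads as a soft hyphen only when interpreted as Latin-1. In UTF-8 the same character would be the two bytes 0xC2 0xAD.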

Thanks,

--Rainer

-----Original Message-----
From: Tatsuo Ishii [mailto:t-ishii@sra.co.jp]
Sent: Wednesday, February 28, 2001 11:02 AM
To: rmager@vgkk.com
Cc: pgsql-bugs@postgresql.org; pgsql-hackers@postgresql.org
Subject: RE: [BUGS] Problem with 7.0.3 dump -> 7.1b4 restore

#14 Rainer Mager
rmager@vgkk.com
In reply to: Rainer Mager (#13)
RE: Problem with 7.0.3 dump -> 7.1b4 restore

I noticed that 7.1 has officially been released. Does anyone know the status
of the bug I reported regarding encoding problems when dumping a 7.0 db and
restoring on 7.1?

Thanks,

--Rainer

#15 Rainer Mager
rmager@vgkk.com
In reply to: Rainer Mager (#14)
RE: [BUGS] Problem with 7.0.3 dump -> 7.1b4 restore

Hi,

I'm trying to see if I can patch this bug myself because we are under some
time constraints. Can anyone give me a tip regarding where in the postgres
source the internal UTF-8 code is converted during a dump?

I believe that the character 0xAD is an 8-bit (Latin-1) character that looks
like a dash. According to the UTF-8 spec, any byte over 0x7F must be part of
a multi-byte sequence (which, I think, means that you should never see the
0xAD byte by itself in a postgres dump, but I am seeing this). So, I'm
guessing that some piece of the UTF-8 conversion routine is a bit off.

Any tips on where to start? I would try to hack a fix by searching for the
offending character in the dump and replacing it with a normal dash but
unfortunately 0xAD is a valid byte when paired with other bytes and these
also exist in our dump.
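
One possible way around the "valid when paired" problem, sketched in Python: decode the dump as UTF-8 with the `surrogateescape` error handler, so that only bytes which are not part of a valid sequence get marked, then replace the marked 0xAD bytes and encode again. This is a workaround on the dump file under the assumption that the stray bytes really are meant to be dashes. It does not address the conversion routine itself, and the function name is hypothetical.

```python
def replace_stray_0xad(data, replacement="-"):
    """Replace 0xAD bytes that are not part of a valid UTF-8 sequence.

    Invalid bytes decode to lone surrogates (0xAD becomes U+DCAD), while
    0xAD bytes inside valid multi-byte characters decode normally and are
    left untouched.
    """
    text = data.decode("utf-8", errors="surrogateescape")
    fixed = text.replace("\udcad", replacement)
    return fixed.encode("utf-8", errors="surrogateescape")


# Hypothetical sample: one stray 0xAD, plus two valid characters whose
# UTF-8 encodings also contain 0xAD (U+306D is E3 81 AD, U+00AD is C2 AD).
sample = b"Fiber\xadOptic " + "\u306d\u00ad".encode("utf-8")
print(replace_stray_0xad(sample))
```

Other invalid bytes, if any, pass through unchanged, so the result may still need checking before a restore.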

--Rainer

-----Original Message-----
From: pgsql-bugs-owner@postgresql.org
[mailto:pgsql-bugs-owner@postgresql.org]On Behalf Of Rainer Mager
Sent: Monday, April 16, 2001 12:15 PM
To: pgsql-bugs@postgresql.org; pgsql-hackers@postgresql.org
Subject: RE: [BUGS] Problem with 7.0.3 dump -> 7.1b4 restore
