pg_dump/pg_restore problems with 7.4.1

Started by Ben Markleinover 22 years ago14 messagesgeneral
Jump to latest
#1Ben Marklein
the_pharcyde@yahoo.com

I'm trying to migrate from 7.2 to 7.4.1 via
pg_dump/pg_restore and have encountered a couple of
problems:

1) Index creation on a table fails:

db=# CREATE UNIQUE INDEX person_info_username_ix ON
person_info USING btree (username);
ERROR: could not create unique index
DETAIL: Table contains duplicated values.

Of course, this index existed in the 7.2 DB, so it
seems odd that it should not be possible to recreate.
Furthermore, a search for duplicate values fails:

db=# select username, count(*) from person_info group
by username having count(*) > 1;
username | count
----------+-------
(0 rows)

Is this a known issue or possibly a bug?

2) While importing I get: "ERROR: invalid memory
alloc request size 1073741824". I tracked this down
to a line with some non-ASCII characters. The type of
the new database is LATIN1, and these are valid LATIN1
characters. I googled for the error message and found
a bug report that sounds like the same problem:

http://www.mail-archive.com/pgsql-bugs@postgresql.org/msg07363.html

Anyone know if this was confirmed as a bug, and
whether it's being looked into? Thanks.

Ben

Ben

__________________________________
Do you Yahoo!?
Yahoo! Hotjobs: Enter the "Signing Bonus" Sweepstakes
http://hotjobs.sweepstakes.yahoo.com/signingbonus

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ben Marklein (#1)
Re: pg_dump/pg_restore problems with 7.4.1

Ben Marklein <the_pharcyde@yahoo.com> writes:

I'm trying to migrate from 7.2 to 7.4.1 via
pg_dump/pg_restore and have encountered a couple of
problems:

1) Index creation on a table fails:

db=# CREATE UNIQUE INDEX person_info_username_ix ON
person_info USING btree (username);
ERROR: could not create unique index
DETAIL: Table contains duplicated values.

Of course, this index existed in the 7.2 DB, so it
seems odd that it should not be possible to recreate.

That's fairly interesting. It could be an issue of strings that are
equal under your current locale but weren't equal before ... but why
wouldn't the GROUP BY search find those duplicates too? I agree this
sounds like a bug. Can you provide test data to duplicate the problem?

2) While importing I get: "ERROR: invalid memory
alloc request size 1073741824". I tracked this down
to a line with some non-ASCII characters.

I have been trying to reproduce a recent report of this, without
success. Again, a test case would be of great value.

Also, what is your platform exactly, what locale settings, etc? Did
you build Postgres yourself (with what config settings) or use an RPM
(whose)?

regards, tom lane

#3Ben Marklein
the_pharcyde@yahoo.com
In reply to: Tom Lane (#2)
Re: pg_dump/pg_restore problems with 7.4.1

I can send you the test data, but it's confidential so
I'll need to ask that you handle it carefully. Can
you contact me off-list about this? I tried to send
to your account directly but was bounced by your spam
filter from both this and another account.

Ben

--- Tom Lane <tgl@sss.pgh.pa.us> wrote:

Ben Marklein <the_pharcyde@yahoo.com> writes:

I'm trying to migrate from 7.2 to 7.4.1 via
pg_dump/pg_restore and have encountered a couple

of

problems:

1) Index creation on a table fails:

db=# CREATE UNIQUE INDEX person_info_username_ix

ON

person_info USING btree (username);
ERROR: could not create unique index
DETAIL: Table contains duplicated values.

Of course, this index existed in the 7.2 DB, so it
seems odd that it should not be possible to

recreate.

That's fairly interesting. It could be an issue of
strings that are
equal under your current locale but weren't equal
before ... but why
wouldn't the GROUP BY search find those duplicates
too? I agree this
sounds like a bug. Can you provide test data to
duplicate the problem?

2) While importing I get: "ERROR: invalid memory
alloc request size 1073741824". I tracked this

down

to a line with some non-ASCII characters.

I have been trying to reproduce a recent report of
this, without
success. Again, a test case would be of great
value.

Also, what is your platform exactly, what locale
settings, etc? Did
you build Postgres yourself (with what config
settings) or use an RPM
(whose)?

regards, tom lane

__________________________________
Do you Yahoo!?
Yahoo! Hotjobs: Enter the "Signing Bonus" Sweepstakes
http://hotjobs.sweepstakes.yahoo.com/signingbonus

#4Bruce Momjian
bruce@momjian.us
In reply to: Ben Marklein (#1)
Re: pg_dump/pg_restore problems with 7.4.1

Hope you got this fixed. We have never heard of similar problems with
upgrades --- other problems, yea, but not these. :-)

---------------------------------------------------------------------------

Ben Marklein wrote:

I'm trying to migrate from 7.2 to 7.4.1 via
pg_dump/pg_restore and have encountered a couple of
problems:

1) Index creation on a table fails:

db=# CREATE UNIQUE INDEX person_info_username_ix ON
person_info USING btree (username);
ERROR: could not create unique index
DETAIL: Table contains duplicated values.

Of course, this index existed in the 7.2 DB, so it
seems odd that it should not be possible to recreate.
Furthermore, a search for duplicate values fails:

db=# select username, count(*) from person_info group
by username having count(*) > 1;
username | count
----------+-------
(0 rows)

Is this a known issue or possibly a bug?

2) While importing I get: "ERROR: invalid memory
alloc request size 1073741824". I tracked this down
to a line with some non-ASCII characters. The type of
the new database is LATIN1, and these are valid LATIN1
characters. I googled for the error message and found
a bug report that sounds like the same problem:

http://www.mail-archive.com/pgsql-bugs@postgresql.org/msg07363.html

Anyone know if this was confirmed as a bug, and
whether it's being looked into? Thanks.

Ben

Ben

__________________________________
Do you Yahoo!?
Yahoo! Hotjobs: Enter the "Signing Bonus" Sweepstakes
http://hotjobs.sweepstakes.yahoo.com/signingbonus

---------------------------(end of broadcast)---------------------------
TIP 8: explain analyze is your friend

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#5Bruce Momjian
bruce@momjian.us
In reply to: Ben Marklein (#1)
Re: pg_dump/pg_restore problems with 7.4.1

Ben Marklein wrote:

I'm trying to migrate from 7.2 to 7.4.1 via
pg_dump/pg_restore and have encountered a couple of
problems:

1) Index creation on a table fails:

db=# CREATE UNIQUE INDEX person_info_username_ix ON
person_info USING btree (username);
ERROR: could not create unique index
DETAIL: Table contains duplicated values.

Of course, this index existed in the 7.2 DB, so it
seems odd that it should not be possible to recreate.
Furthermore, a search for duplicate values fails:

db=# select username, count(*) from person_info group
by username having count(*) > 1;
username | count
----------+-------
(0 rows)

Is this a known issue or possibly a bug?

Not sure. Never heard of this before.

2) While importing I get: "ERROR: invalid memory
alloc request size 1073741824". I tracked this down
to a line with some non-ASCII characters. The type of
the new database is LATIN1, and these are valid LATIN1
characters. I googled for the error message and found
a bug report that sounds like the same problem:

http://www.mail-archive.com/pgsql-bugs@postgresql.org/msg07363.html

Anyone know if this was confirmed as a bug, and
whether it's being looked into? Thanks.

Sounds like the same problem, but I don't remember the solution.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#6Eric Ridge
ebr@tcdi.com
In reply to: Bruce Momjian (#4)
ps output and postgres

I like how postgres backends modify the 'command' display in a (unix)
process listing to show what database they are connected to, and the
query being executed.

Has any thought been put into being able to make this configurable per
backend connection (maybe via a SET variable)? I'm thinking it would
be cool if a user process could provide its own status there,
identifying application-specific information. Similar to how Java
let's you set the "name" of a Thread (which sadly doesn't appear in a
process listing, but oh well).

I realize this could open up some security concerns by allowing us to
expose more information than we should, but for those of us that want
to shoot ourselves in the foot, it might be a useful feature.

Any thoughts?

eric

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#5)
Re: pg_dump/pg_restore problems with 7.4.1

Bruce Momjian <pgman@candle.pha.pa.us> writes:

Ben Marklein wrote:

db=# CREATE UNIQUE INDEX person_info_username_ix ON
person_info USING btree (username);
ERROR: could not create unique index
DETAIL: Table contains duplicated values.

Of course, this index existed in the 7.2 DB, so it
seems odd that it should not be possible to recreate.

Not sure. Never heard of this before.

It seems odd to me too. Is it possible that the new database uses a
different character set encoding and/or different locale, such that
two strings previously considered unequal are now equal? (Of course,
you'd think the count(*) check would find any such cases...)

2) While importing I get: "ERROR: invalid memory
alloc request size 1073741824". I tracked this down
to a line with some non-ASCII characters.

Sounds like the same problem, but I don't remember the solution.

I think this is this problem:

2004-01-17 21:15 tgl

* src/backend/commands/: copy.c (REL7_4_STABLE), copy.c: Don't use
%s-with-precision format spec to truncate data being displayed in a
COPY error message. It seems that glibc gets indigestion if it is
asked to truncate strings that contain invalid UTF-8 encoding
sequences. vsnprintf will return -1 in such cases, leading to
looping and eventual memory overflow in elog.c. Instead use our
own, more robust pg_mbcliplen routine. I believe this problem
accounts for several recent reports of unexpected 'out of memory'
errors during COPY IN.

Until we get around to issuing a 7.4.2, the easiest solution is to fix
the data to match the database encoding (or vice versa). You'd actually
have to do that anyway --- the bug is merely that you get a useless
error message instead of a helpful one when the incoming data is bogus
according to the database encoding.

regards, tom lane

#8Bruce Momjian
bruce@momjian.us
In reply to: Eric Ridge (#6)
Re: ps output and postgres

Eric Ridge wrote:

I like how postgres backends modify the 'command' display in a (unix)
process listing to show what database they are connected to, and the
query being executed.

Has any thought been put into being able to make this configurable per
backend connection (maybe via a SET variable)? I'm thinking it would
be cool if a user process could provide its own status there,
identifying application-specific information. Similar to how Java
let's you set the "name" of a Thread (which sadly doesn't appear in a
process listing, but oh well).

I realize this could open up some security concerns by allowing us to
expose more information than we should, but for those of us that want
to shoot ourselves in the foot, it might be a useful feature.

No one really has thought of that before. We could do it, though there
are admin reasons for restricting that ability. If we said only
superusers could change it, it wouldn't be very useful. It would be
cool if SET could change it, but it seems that would make it pretty
useless for administrator usage.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#9Eric Ridge
ebr@tcdi.com
In reply to: Bruce Momjian (#8)
Re: ps output and postgres

On Feb 11, 2004, at 10:00 PM, Bruce Momjian wrote:

No one really has thought of that before. We could do it, though there
are admin reasons for restricting that ability. If we said only
superusers could change it, it wouldn't be very useful.

That's a good point.

It would be cool if SET could change it, but it seems that would make
it pretty
useless for administrator usage.

Ran into a situation yesterday where all connections were exhausted on
a development database, and thanks to our nat-ing firewall, couldn't
tell where all the connections were coming from. It made me think that
intelligently mucking with the ps output might have made things easier
for me to find the person to yell at.

One could just as easily report info like "real" client ip, client
application state, etc, to a table, but having that stuff via 'ps' just
seemed like a cool idea.

In addition, some of our applications have a few background threads
that maintain persistent connections to the database. Being able to
logically label those processes would make it easier to identify which
backend processes are still connected.

eric

#10Bruce Momjian
bruce@momjian.us
In reply to: Eric Ridge (#9)
Re: ps output and postgres

Eric Ridge wrote:

On Feb 11, 2004, at 10:00 PM, Bruce Momjian wrote:

No one really has thought of that before. We could do it, though there
are admin reasons for restricting that ability. If we said only
superusers could change it, it wouldn't be very useful.

That's a good point.

It would be cool if SET could change it, but it seems that would make
it pretty
useless for administrator usage.

Ran into a situation yesterday where all connections were exhausted on
a development database, and thanks to our nat-ing firewall, couldn't
tell where all the connections were coming from. It made me think that
intelligently mucking with the ps output might have made things easier
for me to find the person to yell at.

One could just as easily report info like "real" client ip, client
application state, etc, to a table, but having that stuff via 'ps' just
seemed like a cool idea.

In addition, some of our applications have a few background threads
that maintain persistent connections to the database. Being able to
logically label those processes would make it easier to identify which
backend processes are still connected.

Well, let's see if someone else like the feature because adding it might
reduce its usability as a guaranteed value for admins.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#11Eric Ridge
ebr@tcdi.com
In reply to: Bruce Momjian (#10)
Re: ps output and postgres

On Feb 12, 2004, at 11:54 AM, Bruce Momjian wrote:

Well, let's see if someone else like the feature because adding it
might
reduce its usability as a guaranteed value for admins.

Maybe making the ability to do this a configuration option (off by
default, of course) would make everyone happy?

eric

#12Bruce Momjian
bruce@momjian.us
In reply to: Eric Ridge (#11)
Re: ps output and postgres

Eric Ridge wrote:

On Feb 12, 2004, at 11:54 AM, Bruce Momjian wrote:

Well, let's see if someone else like the feature because adding it
might
reduce its usability as a guaranteed value for admins.

Maybe making the ability to do this a configuration option (off by
default, of course) would make everyone happy?

Yes, I think that would be the only way to go, but I would like to have
some other folks interested in extending the ps display before adding
such a capability.

-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#13Cott Lang
cott@internetstaff.com
In reply to: Bruce Momjian (#4)
Re: pg_dump/pg_restore problems with 7.4.1

On Wed, 2004-02-11 at 15:12, Bruce Momjian wrote:

2) While importing I get: "ERROR: invalid memory
alloc request size 1073741824". I tracked this down
to a line with some non-ASCII characters. The type of
the new database is LATIN1, and these are valid LATIN1
characters. I googled for the error message and found
a bug report that sounds like the same problem:

We saw this behavior on one of several development boxes we converted to
7.4.1. We tracked it down to some 8 bit ascii in a particular table.

#14Eric Ridge
ebr@tcdi.com
In reply to: Bruce Momjian (#12)
Re: ps output and postgres

On Feb 12, 2004, at 12:08 PM, Bruce Momjian wrote:

Yes, I think that would be the only way to go, but I would like to have
some other folks interested in extending the ps display before adding
such a capability.

Too bad nobody responded. I still think it would be a useful feature.

eric