Table has duplicate keys, what did I do

Started by John Gateleyabout 18 years ago10 messagesgeneral
Jump to latest
#1John Gateley
gateley@jriver.com

Somehow I have managed to have two tables with duplicate keys.
In both tables, the key is an integer, filled from a sequence.
There is only 1 duplicated entry in each table: in the first
table, there are two ID "1"s, and in the second table there are
two ID "123456"s (the second table entry is linked to the first
table's ID 1).

I noticed this because a pg_dump followed by a psql < dumpfile
will not reload.

I've figured out a fix: a script that cleans the dump file, removing
the two duplicate lines (leaving the original).

But, mostly, I'm wondering how I managed to get in this state,
if it was something I did, or perhaps caused by killing the
postmaster the wrong way (I don't think I ever did this, but
maybe), or a crash.

I did do a brief search, didn't find anything seemingly related to this.

Thanks,

j
--
John Gateley <gateley@jriver.com>

#2Dann Corbit
DCorbit@connx.com
In reply to: John Gateley (#1)
Re: Table has duplicate keys, what did I do

-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-
owner@postgresql.org] On Behalf Of John Gateley
Sent: Monday, January 28, 2008 2:04 PM
To: pgsql-general@postgresql.org
Subject: [GENERAL] Table has duplicate keys, what did I do

Somehow I have managed to have two tables with duplicate keys.
In both tables, the key is an integer, filled from a sequence.
There is only 1 duplicated entry in each table: in the first
table, there are two ID "1"s, and in the second table there are
two ID "123456"s (the second table entry is linked to the first
table's ID 1).

Because of the nature of the values of the id's (1 and 123456) it sounds
very much like a manual insertion. Is there a unique index on the
column? It definitely sounds like there should be. At any rate, I
guess that someone manually inserted the data. Without a unique index
on the column, there is no protection against this.

I noticed this because a pg_dump followed by a psql < dumpfile
will not reload.

I've figured out a fix: a script that cleans the dump file, removing
the two duplicate lines (leaving the original).

But, mostly, I'm wondering how I managed to get in this state,
if it was something I did, or perhaps caused by killing the
postmaster the wrong way (I don't think I ever did this, but
maybe), or a crash.

I did do a brief search, didn't find anything seemingly related to

this.

Thanks,

j
--
John Gateley <gateley@jriver.com>

---------------------------(end of

broadcast)---------------------------

Show quoted text

TIP 2: Don't 'kill -9' the postmaster

#3John Gateley
gateley@jriver.com
In reply to: Dann Corbit (#2)
Re: Table has duplicate keys, what did I do

On Mon, 28 Jan 2008 14:11:21 -0800
"Dann Corbit" <DCorbit@connx.com> wrote:

-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-
owner@postgresql.org] On Behalf Of John Gateley
Sent: Monday, January 28, 2008 2:04 PM
To: pgsql-general@postgresql.org
Subject: [GENERAL] Table has duplicate keys, what did I do

Somehow I have managed to have two tables with duplicate keys.
In both tables, the key is an integer, filled from a sequence.
There is only 1 duplicated entry in each table: in the first
table, there are two ID "1"s, and in the second table there are
two ID "123456"s (the second table entry is linked to the first
table's ID 1).

Because of the nature of the values of the id's (1 and 123456) it sounds
very much like a manual insertion. Is there a unique index on the
column? It definitely sounds like there should be. At any rate, I
guess that someone manually inserted the data. Without a unique index
on the column, there is no protection against this.

Yes, the id 1 definitely indicates to me that I did something.
However, there is an index on the column: it's the primary key
for the table. I'm not sure how I could manually insert it if
there were an existing index, or later create the index if it
didn't exist when I did the insert.

Thanks,

j

--
John Gateley <gateley@jriver.com>

#4Scott Marlowe
scott.marlowe@gmail.com
In reply to: John Gateley (#3)
Re: Table has duplicate keys, what did I do

On Jan 28, 2008 4:26 PM, John Gateley <gateley@jriver.com> wrote:

On Mon, 28 Jan 2008 14:11:21 -0800
"Dann Corbit" <DCorbit@connx.com> wrote:

-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-
owner@postgresql.org] On Behalf Of John Gateley
Sent: Monday, January 28, 2008 2:04 PM
To: pgsql-general@postgresql.org
Subject: [GENERAL] Table has duplicate keys, what did I do

Somehow I have managed to have two tables with duplicate keys.
In both tables, the key is an integer, filled from a sequence.
There is only 1 duplicated entry in each table: in the first
table, there are two ID "1"s, and in the second table there are
two ID "123456"s (the second table entry is linked to the first
table's ID 1).

Because of the nature of the values of the id's (1 and 123456) it sounds
very much like a manual insertion. Is there a unique index on the
column? It definitely sounds like there should be. At any rate, I
guess that someone manually inserted the data. Without a unique index
on the column, there is no protection against this.

Yes, the id 1 definitely indicates to me that I did something.
However, there is an index on the column: it's the primary key
for the table. I'm not sure how I could manually insert it if
there were an existing index, or later create the index if it
didn't exist when I did the insert.

Are you running with fsync=off and / or hardware that lies about fsync
(ATA / SATA are notorious for this) and possibly having an emergency
power outage of some kind? That's the most common cause of such
problems.

#5Li, Jingfa
jinli@paypal.com
In reply to: John Gateley (#3)
Re: Table has duplicate keys, what did I do

for now, are you able to insert duplicate keys(primary-key) into the two
tables you mentioned? if you can, check if your index is valid or not.
if index is valid, check if the unique contraint is still valid or not
-- perhaps you turned off the unique constraint, and insert the dup key,
and didn't turn back on the unique constrain validation...

JF

-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of John Gateley
Sent: Monday, January 28, 2008 2:26 PM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Table has duplicate keys, what did I do

On Mon, 28 Jan 2008 14:11:21 -0800
"Dann Corbit" <DCorbit@connx.com> wrote:

-----Original Message-----
From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-
owner@postgresql.org] On Behalf Of John Gateley
Sent: Monday, January 28, 2008 2:04 PM
To: pgsql-general@postgresql.org
Subject: [GENERAL] Table has duplicate keys, what did I do

Somehow I have managed to have two tables with duplicate keys.
In both tables, the key is an integer, filled from a sequence.
There is only 1 duplicated entry in each table: in the first table,
there are two ID "1"s, and in the second table there are two ID
"123456"s (the second table entry is linked to the first table's ID
1).

Because of the nature of the values of the id's (1 and 123456) it
sounds very much like a manual insertion. Is there a unique index on
the column? It definitely sounds like there should be. At any rate,
I guess that someone manually inserted the data. Without a unique
index on the column, there is no protection against this.

Yes, the id 1 definitely indicates to me that I did something.
However, there is an index on the column: it's the primary key for the
table. I'm not sure how I could manually insert it if there were an
existing index, or later create the index if it didn't exist when I did
the insert.

Thanks,

j

--
John Gateley <gateley@jriver.com>

---------------------------(end of broadcast)---------------------------
TIP 6: explain analyze is your friend

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: John Gateley (#1)
Re: Table has duplicate keys, what did I do

John Gateley <gateley@jriver.com> writes:

Somehow I have managed to have two tables with duplicate keys.
...
But, mostly, I'm wondering how I managed to get in this state,

What PG version is this? We've fixed some bugs in the past that
could give rise to duplicated rows.

regards, tom lane

#7John Gateley
gateley@jriver.com
In reply to: Tom Lane (#6)
Re: Table has duplicate keys, what did I do

On Mon, 28 Jan 2008 21:36:35 -0500
Tom Lane <tgl@sss.pgh.pa.us> wrote:

John Gateley <gateley@jriver.com> writes:

Somehow I have managed to have two tables with duplicate keys.
...
But, mostly, I'm wondering how I managed to get in this state,

What PG version is this? We've fixed some bugs in the past that
could give rise to duplicated rows.

Currently 8.1.4, and it's been tracking Ubuntu 6.06 LTS for
updates, so it could have been earlier than 8.1.4 when the duplicate
row actually was created.

Thanks,

j

--
John Gateley <gateley@jriver.com>

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: John Gateley (#7)
Re: Table has duplicate keys, what did I do

John Gateley <gateley@jriver.com> writes:

Tom Lane <tgl@sss.pgh.pa.us> wrote:

John Gateley <gateley@jriver.com> writes:

Somehow I have managed to have two tables with duplicate keys.

What PG version is this? We've fixed some bugs in the past that
could give rise to duplicated rows.

Currently 8.1.4, and it's been tracking Ubuntu 6.06 LTS for
updates, so it could have been earlier than 8.1.4 when the duplicate
row actually was created.

There was a fix released in 8.1.9 for a problem that could cause VACUUM
FULL to create duplicate copies of a row that had recently been updated.
Does that sound like a plausible scenario for your usage?

regards, tom lane

#9John Gateley
gateley@jriver.com
In reply to: Tom Lane (#8)
Re: Table has duplicate keys, what did I do

On Tue, 29 Jan 2008 12:53:11 -0500
Tom Lane <tgl@sss.pgh.pa.us> wrote:

John Gateley <gateley@jriver.com> writes:

Tom Lane <tgl@sss.pgh.pa.us> wrote:

John Gateley <gateley@jriver.com> writes:

Somehow I have managed to have two tables with duplicate keys.

What PG version is this? We've fixed some bugs in the past that
could give rise to duplicated rows.

Currently 8.1.4, and it's been tracking Ubuntu 6.06 LTS for
updates, so it could have been earlier than 8.1.4 when the duplicate
row actually was created.

There was a fix released in 8.1.9 for a problem that could cause VACUUM
FULL to create duplicate copies of a row that had recently been updated.
Does that sound like a plausible scenario for your usage?

Yes, it does, very much so. The row in question is updated once
a minute (it is "test" data that is used by our system monitor
to ensure that the database is up, and one of the tests is updating
the row), and the database is vacuumed full once a day.

Thanks,

j

--
John Gateley <gateley@jriver.com>

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: John Gateley (#9)
Re: Table has duplicate keys, what did I do

John Gateley <gateley@jriver.com> writes:

Tom Lane <tgl@sss.pgh.pa.us> wrote:

There was a fix released in 8.1.9 for a problem that could cause VACUUM
FULL to create duplicate copies of a row that had recently been updated.
Does that sound like a plausible scenario for your usage?

Yes, it does, very much so. The row in question is updated once
a minute (it is "test" data that is used by our system monitor
to ensure that the database is up, and one of the tests is updating
the row), and the database is vacuumed full once a day.

Yeah, that fits exactly. IIRC that VACUUM FULL bug could only be
triggered if there had been a series of multiple updates to the same row
within the lifespan of the oldest open transaction, so repeated updates
on a short timescale would form part of the triggering condition.

Sounds like you need to pester Ubuntu to freshen their package ...

regards, tom lane