Adding PRIMARY KEY: Table contains duplicated values

Started by Alexander Farberabout 13 years ago11 messagesgeneral

alexander.farber@gmail.com

about 13 years ago

Hello,

when trying to add a forgotten primary key pair
to a PostgreSQL 8.4.13 table I get the error:

# alter table pref_rep add primary key(id, author);
NOTICE: ALTER TABLE / ADD PRIMARY KEY will create implicit index
"pref_rep_pkey" for table "pref_rep"
ERROR: could not create unique index "pref_rep_pkey"
DETAIL: Table contains duplicated values.

How could I find those duplicated pairs of id and author?

I've tried following, but this of course doesn't give me "pairs":

# select id, count(id) from pref_rep group by id order by count desc limit 5;
id | count
----------------+-------
OK408547485023 | 706
OK261593357402 | 582
DE11198 | 561
DE13041 | 560
OK347613386893 | 556
(5 rows)

Thank you
Alex

P.S. I've also asked my question also at SO,
hope it is okay to "crosspost" that way :-)

http://stackoverflow.com/questions/14688523/adding-primary-key-table-contains-duplicated-values

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Adrian Klaver

adrian.klaver@aklaver.com

about 13 years ago

In reply to: Alexander Farber (#1)

Re: Adding PRIMARY KEY: Table contains duplicated values

On 02/04/2013 06:17 AM, Alexander Farber wrote:

Hello,

when trying to add a forgotten primary key pair
to a PostgreSQL 8.4.13 table I get the error:

# \d pref_rep
Table "public.pref_rep"
Column | Type | Modifiers
-----------+-----------------------------+-----------------------------------------------------------
id | character varying(32) |
author | character varying(32) |
good | boolean |
fair | boolean |
nice | boolean |
about | character varying(256) |
stamp | timestamp without time zone | default now()
author_ip | inet |
rep_id | integer | not null default
nextval('pref_rep_rep_id_seq'::regclass)
Check constraints:
"pref_rep_check" CHECK (id::text <> author::text)
Foreign-key constraints:
"pref_rep_author_fkey" FOREIGN KEY (author) REFERENCES
pref_users(id) ON DELETE CASCADE
"pref_rep_id_fkey" FOREIGN KEY (id) REFERENCES pref_users(id) ON
DELETE CASCADE

# alter table pref_rep add primary key(id, author);
NOTICE: ALTER TABLE / ADD PRIMARY KEY will create implicit index
"pref_rep_pkey" for table "pref_rep"
ERROR: could not create unique index "pref_rep_pkey"
DETAIL: Table contains duplicated values.

How could I find those duplicated pairs of id and author?

I've tried following, but this of course doesn't give me "pairs":

# select id, count(id) from pref_rep group by id order by count desc limit 5;
id | count
----------------+-------
OK408547485023 | 706
OK261593357402 | 582
DE11198 | 561
DE13041 | 560
OK347613386893 | 556
(5 rows)

SELECT * FROM (SELECT count(*) AS ct, id, author FROM pref_rep GROUP BY
id, author) AS dup WHERE dup.ct >1;

Thank you
Alex

--
Adrian Klaver
adrian.klaver@gmail.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Alexander Farber

alexander.farber@gmail.com

about 13 years ago

In reply to: Alexander Farber (#1)

Re: Adding PRIMARY KEY: Table contains duplicated values

Thank you -

On Mon, Feb 4, 2013 at 3:26 PM, Andrew Jaimes <andrewjaimes@hotmail.com> wrote:

SELECT id, author, count(1)
FROM pref_rep
GROUP BY id, author
HAVING count(1) >1

From: alexander.farber@gmail.com
http://stackoverflow.com/questions/14688523/adding-primary-key-table-contains-duplicated-values

this has worked and has delivered me 190 records
(I still wonder how they could have happened,
because I only used a stored procedure
with UPDATE - if NOT FOUND - INSERT
Is it maybe pgbouncer's fault?):

id | author | count
------------------------+------------------------+-------
DE10598 | OK495480409724 | 2
DE12188 | MR17925810634439466500 | 3
DE13529 | OK471161192902 | 2
DE13963 | OK434087948702 | 2
DE14037 | DE7692 | 2
......
VK45132921 | DE3544 | 2
VK6152782 | OK261593357402 | 2
VK72883921 | OK506067284178 | 2
(190 rows)

And then I'm trying to construct a query which
would delete the older (the "stamp" column)
of such pairs - but this also doesn't work:

# SELECT id, author, count(1), stamp
FROM pref_rep
GROUP BY id, author, stamp
HAVING count(1) >1;
id | author | count | stamp
----------------+-----------------------+-------+----------------------------
OK14832267156 | OK419052078016 | 2 | 2012-04-11 12:54:02.980239
OK333460361587 | VK151946174 | 2 | 2012-07-04 07:08:22.172663
OK351109431016 | OK165881471481 | 2 | 2011-09-18 18:29:33.51362
OK367507493096 | OK342027384470 | 5 | 2012-02-10 20:58:11.488184
OK430882956135 | OK331014635822 | 2 | 2012-11-21 18:38:23.141298
OK446355841129 | OK353460633855 | 2 | 2012-06-15 21:31:56.791688
OK450700410618 | OK511055704249 | 2 | 2012-03-16 15:19:50.27776
OK458979640673 | OK165881471481 | 2 | 2011-08-18 22:31:17.540112
OK468333888972 | MR5100358507294433874 | 2 | 2012-12-05 14:16:15.870061
OK485109177380 | DE12383 | 2 | 2011-09-16 16:00:38.625038
OK505164304516 | OK165881471481 | 2 | 2012-03-24 13:54:27.968482
(11 rows)

Any suggestions please? Should I use a temp table here?

Thank you
Alex

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Import Notes

Reply to msg id not found: BLU002-W75D3EFB47179B68DCFF7E1DD010@phx.gbl

Alexander Farber

alexander.farber@gmail.com

about 13 years ago

In reply to: Alexander Farber (#3)

Re: Adding PRIMARY KEY: Table contains duplicated values

Trying to delete the older of the duplicated pairs:

# SELECT id, author, count(1), max(stamp) as maxx
FROM pref_rep
GROUP BY id, author
HAVING count(1) >1 and stamp < maxx;
ERROR: column "maxx" does not exist
LINE 4: HAVING count(1) >1 and stamp < maxx;
^

On Mon, Feb 4, 2013 at 3:36 PM, Alexander Farber
<alexander.farber@gmail.com> wrote:

http://stackoverflow.com/questions/14688523/adding-primary-key-table-contains-duplicated-values

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Adrian Klaver

adrian.klaver@aklaver.com

about 13 years ago

In reply to: Alexander Farber (#4)

Re: Adding PRIMARY KEY: Table contains duplicated values

On 02/04/2013 06:45 AM, Alexander Farber wrote:

Trying to delete the older of the duplicated pairs:

# SELECT id, author, count(1), max(stamp) as maxx
FROM pref_rep
GROUP BY id, author
HAVING count(1) >1 and stamp < maxx;
ERROR: column "maxx" does not exist
LINE 4: HAVING count(1) >1 and stamp < maxx;

How about:
SELECT id, author, count(1), max(stamp) as maxx
FROM pref_rep
GROUP BY id, author
HAVING count(1) >1 and stamp < max(stamp);

^

On Mon, Feb 4, 2013 at 3:36 PM, Alexander Farber
<alexander.farber@gmail.com> wrote:

http://stackoverflow.com/questions/14688523/adding-primary-key-table-contains-duplicated-values

--
Adrian Klaver
adrian.klaver@gmail.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Alexander Farber

alexander.farber@gmail.com

about 13 years ago

In reply to: Adrian Klaver (#5)

Re: Adding PRIMARY KEY: Table contains duplicated values

Unfortunately that fails -

On Mon, Feb 4, 2013 at 3:55 PM, Adrian Klaver <adrian.klaver@gmail.com> wrote:

On 02/04/2013 06:45 AM, Alexander Farber wrote:

Trying to delete the older of the duplicated pairs:

How about:

SELECT id, author, count(1), max(stamp) as maxx
FROM pref_rep
GROUP BY id, author
HAVING count(1) >1 and stamp < max(stamp);

On Mon, Feb 4, 2013 at 3:36 PM, Alexander Farber
<alexander.farber@gmail.com> wrote:

http://stackoverflow.com/questions/14688523/adding-primary-key-table-contains-duplicated-values

# SELECT id, author, count(1), max(stamp) as maxx
pref-> FROM pref_rep
pref-> GROUP BY id, author
pref-> HAVING count(1) >1 and stamp < max(stamp);
ERROR: column "pref_rep.stamp" must appear in the GROUP BY clause or
be used in an aggregate function
LINE 4: HAVING count(1) >1 and stamp < max(stamp);

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Adrian Klaver

adrian.klaver@aklaver.com

about 13 years ago

In reply to: Alexander Farber (#4)

Re: Adding PRIMARY KEY: Table contains duplicated values

On 02/04/2013 06:45 AM, Alexander Farber wrote:

Trying to delete the older of the duplicated pairs:

# SELECT id, author, count(1), max(stamp) as maxx
FROM pref_rep
GROUP BY id, author
HAVING count(1) >1 and stamp < maxx;
ERROR: column "maxx" does not exist
LINE 4: HAVING count(1) >1 and stamp < maxx;
^

Caffeine has not reached critical mass yet, so test before using:

SELECT * FROM pref_rep JOIN (SELECT id, author, count(1) AS ct,
max(stamp) AS maxx FROM pref_rep GROUP BY id,author) max_time ON
max_time.id=pref_rep.id WHERE ct > 1 AND stamp < maxx;

--
Adrian Klaver
adrian.klaver@gmail.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Andreas Kretschmer

akretschmer@spamfence.net

about 13 years ago

In reply to: Alexander Farber (#1)

Re: Adding PRIMARY KEY: Table contains duplicated values

Alexander Farber <alexander.farber@gmail.com> wrote:

# alter table pref_rep add primary key(id, author);
NOTICE: ALTER TABLE / ADD PRIMARY KEY will create implicit index
"pref_rep_pkey" for table "pref_rep"
ERROR: could not create unique index "pref_rep_pkey"
DETAIL: Table contains duplicated values.

How could I find those duplicated pairs of id and author?

similar example:

test=*# select * from foo;
id1 | id2
-----+-----
1 | 1
1 | 2
1 | 3
2 | 1
2 | 2
2 | 3
1 | 2
3 | 1
3 | 2
3 | 3
3 | 1
(11 rows)

Time: 0,151 ms
test=*# alter table foo add primary key (id1,id2);
NOTICE: ALTER TABLE / ADD PRIMARY KEY will create implicit index "foo_pkey" for table "foo"
ERROR: could not create unique index "foo_pkey"
DETAIL: Key (id1, id2)=(1, 2) is duplicated.
Time: 1,394 ms
test=*# select id1, id2, count(*) as c from foo group by id1, id2 having count(*) > 1;
id1 | id2 | c
-----+-----+---
3 | 1 | 2
1 | 2 | 2
(2 rows)

Time: 0,331 ms

HTH.

Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect. (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly." (unknown)
Kaufbach, Saxony, Germany, Europe. N 51.05082ï¿½, E 13.56889ï¿½

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Andreas Kretschmer

akretschmer@spamfence.net

about 13 years ago

In reply to: Andreas Kretschmer (#8)

Re: Adding PRIMARY KEY: Table contains duplicated values

Andreas Kretschmer <akretschmer@spamfence.net> wrote:

Alexander Farber <alexander.farber@gmail.com> wrote:

# alter table pref_rep add primary key(id, author);
NOTICE: ALTER TABLE / ADD PRIMARY KEY will create implicit index
"pref_rep_pkey" for table "pref_rep"
ERROR: could not create unique index "pref_rep_pkey"
DETAIL: Table contains duplicated values.

How could I find those duplicated pairs of id and author?

similar example:

test=*# select * from foo;
id1 | id2
-----+-----
1 | 1
1 | 2
1 | 3
2 | 1
2 | 2
2 | 3
1 | 2
3 | 1
3 | 2
3 | 3
3 | 1
(11 rows)

Time: 0,151 ms
test=*# alter table foo add primary key (id1,id2);
NOTICE: ALTER TABLE / ADD PRIMARY KEY will create implicit index "foo_pkey" for table "foo"
ERROR: could not create unique index "foo_pkey"
DETAIL: Key (id1, id2)=(1, 2) is duplicated.
Time: 1,394 ms
test=*# select id1, id2, count(*) as c from foo group by id1, id2 having count(*) > 1;
id1 | id2 | c
-----+-----+---
3 | 1 | 2
1 | 2 | 2
(2 rows)

Time: 0,331 ms

If your next question is 'how to delete ...', my answer:
(yeah, reading SO ;-) )

test=*# select ctid,* from foo;
ctid | id1 | id2
--------+-----+-----
(0,1) | 1 | 1
(0,2) | 1 | 2
(0,3) | 1 | 3
(0,4) | 2 | 1
(0,5) | 2 | 2
(0,6) | 2 | 3
(0,7) | 1 | 2
(0,8) | 3 | 1
(0,9) | 3 | 2
(0,10) | 3 | 3
(0,11) | 3 | 1
(11 rows)

Time: 0,170 ms
test=*# delete from foo where ctid in (select min(ctid) from foo where (id1,id2) in (select id1, id2 from foo group by id1, id2 having count(*) > 1) group by id1,id2);
DELETE 2
Time: 0,559 ms
test=*# select ctid,* from foo;
ctid | id1 | id2
--------+-----+-----
(0,1) | 1 | 1
(0,3) | 1 | 3
(0,4) | 2 | 1
(0,5) | 2 | 2
(0,6) | 2 | 3
(0,7) | 1 | 2
(0,9) | 3 | 2
(0,10) | 3 | 3
(0,11) | 3 | 1
(9 rows)

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#10

Alexander Farber

alexander.farber@gmail.com

about 13 years ago

In reply to: Andreas Kretschmer (#8)

Re: Adding PRIMARY KEY: Table contains duplicated values

Thank you -

On Mon, Feb 4, 2013 at 5:06 PM, Andreas Kretschmer
<akretschmer@spamfence.net> wrote:

# alter table pref_rep add primary key(id, author);
NOTICE: ALTER TABLE / ADD PRIMARY KEY will create implicit index
"pref_rep_pkey" for table "pref_rep"
ERROR: could not create unique index "pref_rep_pkey"
DETAIL: Table contains duplicated values.

how do you get this DETAIL, is it a setting for psql prompt?

I've got a nice answer for my question at Stackoverflow:

DELETE FROM pref_rep p USING (
SELECT id, author, max(stamp) stamp
FROM pref_rep
GROUP BY id, author
HAVING count(1) > 1) AS f
WHERE p.id=f.id AND p.author=f.author AND p.stamp<f.stamp;

and learnt about SQL Fiddle too -
http://sqlfiddle.com/#!11/59fbc/11

Regards
Alex

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#11

Andreas Kretschmer

akretschmer@spamfence.net

about 13 years ago

In reply to: Alexander Farber (#10)

Re: Adding PRIMARY KEY: Table contains duplicated values

Alexander Farber <alexander.farber@gmail.com> wrote:

Thank you -

On Mon, Feb 4, 2013 at 5:06 PM, Andreas Kretschmer
<akretschmer@spamfence.net> wrote:

# alter table pref_rep add primary key(id, author);
NOTICE: ALTER TABLE / ADD PRIMARY KEY will create implicit index
"pref_rep_pkey" for table "pref_rep"
ERROR: could not create unique index "pref_rep_pkey"
DETAIL: Table contains duplicated values.

how do you get this DETAIL, is it a setting for psql prompt?

You means the 'DETAIL: Key (id1, id2)=(1, 2) is duplicated.'?

I'm using 9.2, i think, that's the reason.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general