"Unvisible" duplicates in SELECT [...] HAVING count(attrib) > 1

Started by Krzysztof Gajdemskialmost 25 years ago3 messagesbugs
Jump to latest

============================================================================
POSTGRESQL BUG REPORT TEMPLATE
============================================================================

Your name : Krzysztof Gajdemski
Your email address : songo@vi.pl

System Configuration
---------------------
Architecture (example: Intel Pentium) : Intel Pentium

Operating System (example: Linux 2.0.26 ELF) : 2.2.19 SMP

PostgreSQL version (example: PostgreSQL-7.1): PostgreSQL-7.1rc4

Compiler used (example: gcc 2.8.0) : gcc 2.95.3

Please enter a FULL description of your problem:
------------------------------------------------

SELECT attrib FROM table GROUP BY attrib HAVING count(attrib) > 1;

query doesn't produce reliable results for some specific circumstances
(I suppose that number of records and/or strange ASCII chars in varchar
fields are one of required conditions to reproduce this behaviour, but I
can't define additional for now). Table which I use in following example
is available at: http://i.use.vi.pl/postgres/dup_example.sql.gz (size
400KB, this is why I don't place it here).

Please describe a way to repeat the problem. Please try to provide a
concise reproducible example, if at all possible:
----------------------------------------------------------------------
songo=> SELECT version();
version
----------------------------------------------------------------
PostgreSQL 7.1rc4 on i686-pc-linux-gnu, compiled by GCC 2.95.3
(1 row)

songo=> \i dup_example.sql
psql:dup_example.sql:14: NOTICE: CREATE TABLE/PRIMARY KEY will create implicit
index 'user_id_pkey' for table 'user_id'
CREATE
CHANGE
CHANGE
songo=> SELECT nick FROM user_id GROUP BY nick HAVING count(nick) > 1;
nick
------------
sexylady\@
(1 row)

songo=> DELETE FROM user_id WHERE nick IN (
songo(> SELECT nick FROM user_id GROUP BY nick HAVING count(nick) > 1);
DELETE 2
songo=> SELECT nick FROM user_id GROUP BY nick HAVING count(nick) > 1;
nick
------
(0 rows)

songo=> CREATE UNIQUE INDEX nick_idx ON user_id(nick);
ERROR: Cannot create unique index. Table contains non-unique values
songo=> SELECT nick FROM user_id WHERE nick LIKE 'dr dre%';
nick
--------------------
dr dre\$\$\$\$\$\$
dr dre\$\$\$\$\$\$
(2 rows)

songo=> SELECT nick FROM user_id GROUP BY nick HAVING count(nick) > 1;
nick
------
(0 rows)

If you know how this problem might be fixed, list the solution below:
---------------------------------------------------------------------

k.
--
- - - - - - Krzysztof Gajdemski | songo @ vi.pl | KG4751-RIPE
Registered Linux User # 133457 | BLUG Registered Member # 0005
PGP public keys at: keyserver.linux.bydg.org * KeyID: D336705B
,,Szanuj� was wszystkich, kt�rzy pozostajecie w cieniu'' SNERG

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Krzysztof Gajdemski (#1)
Re: "Unvisible" duplicates in SELECT [...] HAVING count(attrib) > 1

Krzysztof Gajdemski <songo@vi.pl> writes:

PostgreSQL version (example: PostgreSQL-7.1): PostgreSQL-7.1rc4

query doesn't produce reliable results for some specific circumstances

I tried your example in current sources and get reasonable-looking
behavior:

regression=# SELECT nick FROM user_id GROUP BY nick HAVING count(nick) > 1;
nick
--------------------
dr dre\$\$\$\$\$\$
sexylady\@
(2 rows)

regression=# DELETE FROM user_id WHERE nick IN (
regression(# SELECT nick FROM user_id GROUP BY nick HAVING count(nick) > 1);
DELETE 4
regression=# SELECT nick FROM user_id GROUP BY nick HAVING count(nick) > 1;
nick
------
(0 rows)

regression=# CREATE UNIQUE INDEX nick_idx ON user_id(nick);
CREATE
regression=# SELECT nick FROM user_id WHERE nick LIKE 'dr dre%';
nick
------
(0 rows)

regression=# SELECT nick FROM user_id GROUP BY nick HAVING count(nick) > 1;
nick
------
(0 rows)

regression=#

It could be that there's a locale dependency here (I did this in a
non-locale-enabled compilation), but I think it's more likely that
you've run into one of the complex-query bugs we corrected since rc4.
Please update to 7.1.1 (or better 7.1.2, which should be out very soon)
and see if you still see a problem.

regards, tom lane

In reply to: Tom Lane (#2)
Re: "Unvisible" duplicates in SELECT [...] HAVING count(attrib) > 1

18.05.2001, 11:12:27, Tom Lane wrote:

Krzysztof Gajdemski <songo@vi.pl> writes:

PostgreSQL version (example: PostgreSQL-7.1): PostgreSQL-7.1rc4

query doesn't produce reliable results for some specific circumstances

I tried your example in current sources and get reasonable-looking
behavior:

regression=# SELECT nick FROM user_id GROUP BY nick HAVING count(nick) > 1;
nick
--------------------
dr dre\$\$\$\$\$\$
sexylady\@
(2 rows)

[ cut ]

It could be that there's a locale dependency here (I did this in a
non-locale-enabled compilation),

I've used binary PostgreSQL package from Debian/unstable distribution,
so could it be, that this is Debian specific problem if you can't
reproduce it.

but I think it's more likely that
you've run into one of the complex-query bugs we corrected since rc4.
Please update to 7.1.1 (or better 7.1.2, which should be out very soon)
and see if you still see a problem.

Yes, it is a good idea. I've to try compile it directly from source
(with locale support enabled).

k.
--
- - - - - - Krzysztof Gajdemski | songo @ vi.pl | KG4751-RIPE
Registered Linux User # 133457 | BLUG Registered Member # 0005
PGP public keys at: keyserver.linux.bydg.org * KeyID: D336705B
,,Szanuj� was wszystkich, kt�rzy pozostajecie w cieniu'' SNERG