DUPS in tables columns ERROR: column ". . . " does not exist

Started by Albretch Muelleralmost 18 years ago8 messagesgeneral

lbrtchx@gmail.com

almost 18 years ago

Hi,
~
I am trying to get dups from some data from files which md5sums I
previously calculated
~
Here is my mere mortal SQL
~
SELECT md5, COUNT(md5) AS md5cnt
FROM jdk1_6_0_07_txtfls_md5
WHERE (md5cnt > 1)
GROUP BY md5
ORDER BY md5cnt DESC;
~
and this is what I get:
~
jpk=# SELECT md5, COUNT(md5) AS md5cnt
FROM jdk1_6_0_07_txtfls_md5
WHERE (md5cnt > 1)
GROUP BY md5
ORDER BY md5cnt DESC;
jpk-# jpk-# jpk-# jpk-# ERROR: column "md5cnt" does not exist
LINE 3: WHERE (md5cnt > 1)
~
I think I know what that one means based on the clear error message,
namely md5cntis not a table column itself, but I still think there
should be a way to formulate a simple query like this because PG does
take "ORDER BY md5cnt DESC" even if md5cnt is not a table column, why
on earth then it does not swallow and digest the "WHERE (md5cnt > 1)"
part?
~
You could go the monkey way running a query like:
~
SELECT md5, COUNT(md5) AS md5cnt FROM jdk1_6_0_07_txtfls_md5 GROUP BY
md5 ORDER BY md5cnt DESC;
~
and then use code to jump of the loop when md5cnt becomes 1 or you
could use nested SQL statements
~
How can you find duplicate records in a table?
~
Thanks
lbrtchx

Albretch Mueller

lbrtchx@gmail.com

almost 18 years ago

In reply to: Albretch Mueller (#1)

Re: DUPS in tables columns ERROR: column ". . . " does not exist

Also I know there is a DISTINCT keyword, but I also need to know how
many times the particular data in the column is repeated if it is,
that is why I need to go:
~
SELECT md5, COUNT(md5) AS md5cnt
FROM jdk1_6_0_07_txtfls_md5
WHERE (md5cnt > 1)
GROUP BY md5
ORDER BY md5cnt DESC;
~
Thanks
lbrtchx

Stefan Kaltenbrunner

stefan@kaltenbrunner.cc

almost 18 years ago

In reply to: Albretch Mueller (#1)

Re: DUPS in tables columns ERROR: column ". . . " does not exist

Albretch Mueller wrote:

Hi,
~
I am trying to get dups from some data from files which md5sums I
previously calculated
~
Here is my mere mortal SQL
~
SELECT md5, COUNT(md5) AS md5cnt
FROM jdk1_6_0_07_txtfls_md5
WHERE (md5cnt > 1)
GROUP BY md5
ORDER BY md5cnt DESC;

I think you are looking for HAVING as in:

SELECT md5, COUNT(md5)
FROM jdk1_6_0_07_txtfls_md5
GROUP BY md5
HAVING count(md5) > 1

Stefan

Albretch Mueller

lbrtchx@gmail.com

almost 18 years ago

In reply to: Stefan Kaltenbrunner (#3)

Re: DUPS in tables columns ERROR: column ". . . " does not exist

thank you Stefan your SQL worked, but still; I am just asking and my
programming bias will certainly show, but aren't you effectivly
"calling" count on the table three times if you go:
~
SELECT md5, COUNT(md5)
FROM jdk1_6_0_07_txtfls_md5
GROUP BY md5
HAVING COUNT(md5) > 1
ORDER BY COUNT(md5) DESC;
~
Shouldn't
~
SELECT md5, COUNT(md5) AS CNT
FROM jdk1_6_0_07_txtfls_md5
GROUP BY md5
HAVING CNT > 1
ORDER BY CNT DESC;
~
work?
~
jpk=# SELECT md5, COUNT(md5) AS CNT
FROM jdk1_6_0_07_txtfls_md5
GROUP BY md5
HAVING CNT > 1
ORDER BY CNT DESC;
jpk-# jpk-# jpk-# jpk-# ERROR: column "cnt" does not exist
LINE 4: HAVING CNT > 1
~
Thanks
lbrtchx

Tom Lane

tgl@sss.pgh.pa.us

almost 18 years ago

In reply to: Albretch Mueller (#4)

Re: DUPS in tables columns ERROR: column ". . . " does not exist

"Albretch Mueller" <lbrtchx@gmail.com> writes:

thank you Stefan your SQL worked, but still; I am just asking and my
programming bias will certainly show, but aren't you effectivly
"calling" count on the table three times if you go:

The system is smart enough to only do the count() once.

regards, tom lane

Albretch Mueller

lbrtchx@gmail.com

almost 18 years ago

In reply to: Tom Lane (#5)

Re: DUPS in tables columns ERROR: column ". . . " does not exist

The system is smart enough to only do the count() once.

~
But not smart enough to make a variable you declare point to that
internal variable so that things are clearer/ easier ;-)
~
Thanks
lbrtchx

Martijn van Oosterhout

kleptog@svana.org

almost 18 years ago

In reply to: Albretch Mueller (#6)

Re: DUPS in tables columns ERROR: column ". . . " does not exist

On Sat, Aug 30, 2008 at 01:36:25PM -0400, Albretch Mueller wrote:

The system is smart enough to only do the count() once.

~
But not smart enough to make a variable you declare point to that
internal variable so that things are clearer/ easier ;-)

The SQL standard has pretty clear rules about what variables can be
referenced from where, and this is one of those places (the rationale
is probably in there too).

Have a nice day
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

Show quoted text

Please line up in a tree and maintain the heap invariant while
boarding. Thank you for flying nlogn airlines.

Lew

noone@lwsc.ehost-services.com

almost 18 years ago

In reply to: Albretch Mueller (#2)

Re: DUPS in tables columns ERROR: column ". . . " does not exist

Albretch Mueller wrote:

Also I know there is a DISTINCT keyword, but I also need to know how
many times the particular data in the column is repeated if it is,
that is why I need to go:
~
SELECT md5, COUNT(md5) AS md5cnt
FROM jdk1_6_0_07_txtfls_md5
WHERE (md5cnt > 1)
GROUP BY md5
ORDER BY md5cnt DESC;

Use HAVING instead of WHERE.

--
Lew