count (DISTINCT expression [ , ... ] ) and documentation
I noticed that starting from 8.2 the documentation at
http://www.postgresql.org/docs/8.2/interactive/sql-expressions.html
say that multiple distinct expressions are supported
aggregate_name (DISTINCT expression [, expression] )
While previous docs just listed one:
aggregate_name (DISTINCT expression)
Still I'm using 8.3 and
select count(distinct c1, c2) from table1;
report:
No function matches the given name and argument types. You might
need to add explicit type casts.
What should I write in spite of?
select count(distinct c1, c2) from table1;
--
Ivan Sergio Borgonovo
http://www.webthatworks.it
Hello
count has only one argument,
try:
postgres=# select * from fooa;
a | b
----+----
10 | 20
(1 row)
postgres=# select count(distinct a,b) from fooa;
ERROR: function count(integer, integer) does not exist
LINE 1: select count(distinct a,b) from fooa;
^
HINT: No function matches the given name and argument types. You
might need to add explicit type casts.
postgres=# select count(distinct (a,b)) from fooa;
count
-------
1
(1 row)
regards
Pavel Stehule
2008/12/26 Ivan Sergio Borgonovo <mail@webthatworks.it>:
Show quoted text
I noticed that starting from 8.2 the documentation at
http://www.postgresql.org/docs/8.2/interactive/sql-expressions.html
say that multiple distinct expressions are supportedaggregate_name (DISTINCT expression [, expression] )
While previous docs just listed one:
aggregate_name (DISTINCT expression)
Still I'm using 8.3 and
select count(distinct c1, c2) from table1;
report:
No function matches the given name and argument types. You might
need to add explicit type casts.What should I write in spite of?
select count(distinct c1, c2) from table1;
--
Ivan Sergio Borgonovo
http://www.webthatworks.it--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On Fri, 26 Dec 2008 15:46:48 +0100
"Pavel Stehule" <pavel.stehule@gmail.com> wrote:
count has only one argument,
then what was changed between 8.1 and 8.2 to change the docs?
None of the functions listed in:
http://www.postgresql.org/docs/8.2/static/functions-aggregate.html
seems to support
aggregate(distinct exp [,exp])
Does the change reflect the change in the possibility to write user
defined aggregates that support more then one distinct expression?
The first thing that comes to my mind to emulate
count(distinct a,b)
would be to
create table test.dist (a int, b int);
insert into test.dist values(1,0);
insert into test.dist values(1,0);
insert into test.dist values(1,1);
insert into test.dist values(0,0);
select count(*) from (select distinct a,b from test.dist ) a;
but still I can't think of anything that would work with
aggregate(distinct a,b)
not just count.
--
Ivan Sergio Borgonovo
http://www.webthatworks.it
2008/12/26 Ivan Sergio Borgonovo <mail@webthatworks.it>:
On Fri, 26 Dec 2008 15:46:48 +0100
"Pavel Stehule" <pavel.stehule@gmail.com> wrote:count has only one argument,
then what was changed between 8.1 and 8.2 to change the docs?
None of the functions listed in:
http://www.postgresql.org/docs/8.2/static/functions-aggregate.html
seems to support
aggregate(distinct exp [,exp])
http://www.postgresql.org/docs/8.2/interactive/functions-aggregate.html
Does the change reflect the change in the possibility to write user
defined aggregates that support more then one distinct expression?
CREATE AGGREGATE name ( input_data_type [ , ... ] ) (
SFUNC = sfunc,
STYPE = state_data_type
[ , FINALFUNC = ffunc ]
[ , INITCOND = initial_condition ]
[ , SORTOP = sort_operator ]
)
you are able to write multi param aggregates.
regards
Pavel Stehule
Show quoted text
The first thing that comes to my mind to emulate
count(distinct a,b)
would be tocreate table test.dist (a int, b int);
insert into test.dist values(1,0);
insert into test.dist values(1,0);
insert into test.dist values(1,1);
insert into test.dist values(0,0);
select count(*) from (select distinct a,b from test.dist ) a;but still I can't think of anything that would work with
aggregate(distinct a,b)
not just count.--
Ivan Sergio Borgonovo
http://www.webthatworks.it--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On Fri, 26 Dec 2008 16:23:52 +0100
"Pavel Stehule" <pavel.stehule@gmail.com> wrote:
2008/12/26 Ivan Sergio Borgonovo <mail@webthatworks.it>:
On Fri, 26 Dec 2008 15:46:48 +0100
"Pavel Stehule" <pavel.stehule@gmail.com> wrote:count has only one argument,
then what was changed between 8.1 and 8.2 to change the docs?
None of the functions listed in:
http://www.postgresql.org/docs/8.2/static/functions-aggregate.html
seems to support
aggregate(distinct exp [,exp])http://www.postgresql.org/docs/8.2/interactive/functions-aggregate.html
yeah but no function seems to support
aggregate(distinct x, y)
CREATE AGGREGATE name ( input_data_type [ , ... ] ) (
SFUNC = sfunc,
STYPE = state_data_type
[ , FINALFUNC = ffunc ]
[ , INITCOND = initial_condition ]
[ , SORTOP = sort_operator ]
)
OK... but how am I going to implement an user defined aggregate that
support without resorting to C?
myaggfunc(distinct x, y)?
Otherwise to what is it referring
http://www.postgresql.org/docs/8.2/interactive/sql-expressions.html
aggregate_name (DISTINCT expression [ , ... ] )
--
Ivan Sergio Borgonovo
http://www.webthatworks.it
On Fri, Dec 26, 2008 at 03:34:33PM +0100, Ivan Sergio Borgonovo wrote:
I noticed that starting from 8.2 the documentation at
http://www.postgresql.org/docs/8.2/interactive/sql-expressions.html
say that multiple distinct expressions are supportedaggregate_name (DISTINCT expression [, expression] )
In 8.4, you'll be able to do:
WITH d AS (
SELECT DISTINCT c1, c2 FROM table1
)
SELECT count(*) FROM d;
and very likely an OLAP version. :)
Cheers,
David.
While previous docs just listed one:
aggregate_name (DISTINCT expression)
Still I'm using 8.3 and
select count(distinct c1, c2) from table1;
report:
No function matches the given name and argument types. You might
need to add explicit type casts.What should I write in spite of?
select count(distinct c1, c2) from table1;
--
Ivan Sergio Borgonovo
http://www.webthatworks.it--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
2008/12/26 Ivan Sergio Borgonovo <mail@webthatworks.it>:
On Fri, 26 Dec 2008 16:23:52 +0100
"Pavel Stehule" <pavel.stehule@gmail.com> wrote:2008/12/26 Ivan Sergio Borgonovo <mail@webthatworks.it>:
On Fri, 26 Dec 2008 15:46:48 +0100
"Pavel Stehule" <pavel.stehule@gmail.com> wrote:count has only one argument,
then what was changed between 8.1 and 8.2 to change the docs?
None of the functions listed in:
http://www.postgresql.org/docs/8.2/static/functions-aggregate.html
seems to support
aggregate(distinct exp [,exp])http://www.postgresql.org/docs/8.2/interactive/functions-aggregate.html
yeah but no function seems to support
aggregate(distinct x, y)
CREATE AGGREGATE name ( input_data_type [ , ... ] ) (
SFUNC = sfunc,
STYPE = state_data_type
[ , FINALFUNC = ffunc ]
[ , INITCOND = initial_condition ]
[ , SORTOP = sort_operator ]
)OK... but how am I going to implement an user defined aggregate that
support without resorting to C?myaggfunc(distinct x, y)?
Otherwise to what is it referring
http://www.postgresql.org/docs/8.2/interactive/sql-expressions.html
aggregate_name (DISTINCT expression [ , ... ] )
ok, I tested and it isn't supported yet. This is documentation bug.
DISTINCT is allowed only for single argument aggregate.
Regards
Pavel Stehule
Show quoted text
--
Ivan Sergio Borgonovo
http://www.webthatworks.it--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On Fri, 26 Dec 2008 10:43:25 -0800
David Fetter <david@fetter.org> wrote:
On Fri, Dec 26, 2008 at 03:34:33PM +0100, Ivan Sergio Borgonovo
wrote:I noticed that starting from 8.2 the documentation at
http://www.postgresql.org/docs/8.2/interactive/sql-expressions.html
say that multiple distinct expressions are supportedaggregate_name (DISTINCT expression [, expression] )
In 8.4, you'll be able to do:
WITH d AS (
SELECT DISTINCT c1, c2 FROM table1
)
SELECT count(*) FROM d;
Nice, but what will be the difference from
select count(*) from (select distinct c1, c2 from t);
?
Optimisation?
Furthermore... I was actually looking at docs because I needed to
find a way supported by both postgresql and mysql and I've heard
that mysql is not that good at subselect and I doubt it supports
WITH AS. (OK not really a postgresql problem...).
Meanwhile what would you suggest as a general approach to stuff like
select count(distinct c1, c2) from t;
regardless of mysql support?
and considering mysql support?
I was thinking to find some way to exploit group by, but I didn't
come to anything useful yet.
and very likely an OLAP version. :)
What's "an OLAP version" of WITH d AS...
--
Ivan Sergio Borgonovo
http://www.webthatworks.it
On Fri, Dec 26, 2008 at 08:03:30PM +0100, Ivan Sergio Borgonovo wrote:
On Fri, 26 Dec 2008 10:43:25 -0800
David Fetter <david@fetter.org> wrote:On Fri, Dec 26, 2008 at 03:34:33PM +0100, Ivan Sergio Borgonovo
wrote:I noticed that starting from 8.2 the documentation at
http://www.postgresql.org/docs/8.2/interactive/sql-expressions.html
say that multiple distinct expressions are supportedaggregate_name (DISTINCT expression [, expression] )
In 8.4, you'll be able to do:
WITH d AS (
SELECT DISTINCT c1, c2 FROM table1
)
SELECT count(*) FROM d;Nice, but what will be the difference from
select count(*) from (select distinct c1, c2 from t);
?
Optimisation?
None especially.
Furthermore... I was actually looking at docs because I needed to
find a way supported by both postgresql and mysql
Generally, it's *not* a good idea to try to support more than one
back-end. You wind up maintaining several disparate code bases, all
of which must do exactly the same thing, or you create your own RDBMS
in your client code, or worst of all, some of each.
Unless the most important attribute of the software, i.e. you can
jettison any other feature to support it, is to support more than one
RDBMS back-end, don't even try. Examples of software which needs to
support multiple RDBMS back-ends include, and are pretty much limited
to, ERD generators and migration tools.
and very likely an OLAP version. :)
What's "an OLAP version" of WITH d AS...
OLAP includes clauses like WINDOW() and OVER(), but since it's not
committed yet, I don't want to get too far into it :)
Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
On Fri, 26 Dec 2008 12:04:48 -0800
David Fetter <david@fetter.org> wrote:
On Fri, Dec 26, 2008 at 08:03:30PM +0100, Ivan Sergio Borgonovo
wrote:
aggregate_name (DISTINCT expression [, expression] )
In 8.4, you'll be able to do:
WITH d AS (
SELECT DISTINCT c1, c2 FROM table1
)
SELECT count(*) FROM d;
Nice, but what will be the difference from
select count(*) from (select distinct c1, c2 from t);
?
Optimisation?
None especially.
So what would be the advantage compared to subselect?
Furthermore... I was actually looking at docs because I needed to
find a way supported by both postgresql and mysqlGenerally, it's *not* a good idea to try to support more than one
back-end. You wind up maintaining several disparate code bases,
Not really my main target... I was just investigating if it could
come for free ;)
--
Ivan Sergio Borgonovo
http://www.webthatworks.it
Ivan Sergio Borgonovo <mail@webthatworks.it> writes:
David Fetter <david@fetter.org> wrote:
In 8.4, you'll be able to do:
WITH d AS (
SELECT DISTINCT c1, c2 FROM table1
)
SELECT count(*) FROM d;
Nice, but what will be the difference from
select count(*) from (select distinct c1, c2 from t);
?
Optimisation?
None especially.
So what would be the advantage compared to subselect?
None, David just has WITH on the brain ;-)
The subselect syntax certainly seems like the one most likely to work
across different SQL implementations. WITH is a pretty recent addition
to the SQL spec, and DISTINCT with multiple aggregate arguments isn't
in the spec at all. The COUNT(DISTINCT ROW(x,y)) hack is a cute idea
but I'm dubious that that's portable either (it certainly doesn't work
in pre-8.4 PG).
regards, tom lane
On Fri, Dec 26, 2008 at 07:13:48PM -0500, Tom Lane wrote:
Ivan Sergio Borgonovo <mail@webthatworks.it> writes:
David Fetter <david@fetter.org> wrote:
In 8.4, you'll be able to do:
WITH d AS (
SELECT DISTINCT c1, c2 FROM table1
)
SELECT count(*) FROM d;Nice, but what will be the difference from
select count(*) from (select distinct c1, c2 from t);
?
Optimisation?None especially.
So what would be the advantage compared to subselect?
None, David just has WITH on the brain ;-)
LOL!
You're only saying that because it's true ;)
Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
On Fri, 26 Dec 2008 19:13:48 -0500
Tom Lane <tgl@sss.pgh.pa.us> wrote:
The subselect syntax certainly seems like the one most likely to
work across different SQL implementations. WITH is a pretty
subselects actually works on mysql too but on a 1M table with about
300K unique columns it performs more than 4 times slower than
select (distinct a,b) from table
18sec vs. 4sec
Times were similar for innodb and myisam.
Postgresql needs 17sec with subselect.
I didn't try to see how both db could perform with indexes.
mysql performance is impressive. I thought that most of the time
would be spent on "distinct" where postgresql shouldn't suffer from
its "count" implementation. But well still 300K rows to count on 1M
aren't few.
--
Ivan Sergio Borgonovo
http://www.webthatworks.it