idea - new aggregates median, listagg

Started by Pavel Stehuleabout 16 years ago8 messages
#1Pavel Stehule
pavel.stehule@gmail.com

Hello

I am looking on new feature - ORDER clause in aggregate, and I thing,
so we are able to effectively implement some non standard, but well
known aggregates.

a) function median - it is relative frequent request - with usually
slow implementation

b) function listagg (it is analogy of group_concat from MySQL) - it
should simplify report generating and some other

What is your opinion? Do you like to see these functions in core?

Regards
Pavel Stehule

#2David Fetter
david@fetter.org
In reply to: Pavel Stehule (#1)
Re: idea - new aggregates median, listagg

On Tue, Dec 15, 2009 at 10:28:49PM +0100, Pavel Stehule wrote:

Hello

I am looking on new feature - ORDER clause in aggregate, and I thing,
so we are able to effectively implement some non standard, but well
known aggregates.

a) function median - it is relative frequent request - with usually
slow implementation

Makes a lot of sense. I suspect we'll have to provide several
different medians, as there are several precise, useful, and
conflicting definitions. Some examples below:

http://www.simple-talk.com/sql/t-sql-programming/median-workbench/

b) function listagg (it is analogy of group_concat from MySQL) - it
should simplify report generating and some other

This is redundant, as it's equivalent to array_to_string(array_agg()).

If it's done as syntactic sugar over that, it's fine. If it's a
separate implementation, it's a bad idea. Probably best as an
optional module of some kind.

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#3Pavel Stehule
pavel.stehule@gmail.com
In reply to: David Fetter (#2)
Re: idea - new aggregates median, listagg

2009/12/15 David Fetter <david@fetter.org>:

On Tue, Dec 15, 2009 at 10:28:49PM +0100, Pavel Stehule wrote:

Hello

I am looking on new feature - ORDER clause in aggregate, and I thing,
so we are able to effectively implement some non standard, but well
known aggregates.

a) function median - it is relative frequent request - with usually
slow implementation

Makes a lot of sense.  I suspect we'll have to provide several
different medians, as there are several precise, useful, and
conflicting definitions.  Some examples below:

http://www.simple-talk.com/sql/t-sql-programming/median-workbench/

In this article the are two medians - statistical and financial. I am
for both. But only one can be named "median".

b) function listagg (it is analogy of group_concat from MySQL) - it
should simplify report generating and some other

This is redundant, as it's equivalent to array_to_string(array_agg()).

when I implement it in orafce - the almost all code was parameters
checking. Implementation is trivial, because important part of work
was done with array_agg support. It is wrapper over stringInfo.

If it's done as syntactic sugar over that, it's fine.  If it's a
separate implementation, it's a bad idea.  Probably best as an
optional module of some kind

I am not sure if implementation as syntactic sugar is best. It needs
parser changes. And final implementation will not be effective.
listagg should be faster - string operations are significantly
simpler than array op. What more - PostgreSQL cannot well wrap SRF and
aggregates. The implementation could be analogy with functions
generate_series and generate_subscripts. generate_subscripts is
redundant, but well implementation needs C coding. Implementation as
syntactic sugar is possible - it could be nice too - listagg is more
shorter than array_to_string(array_agg()) - but it has overhead with
array.

There could be some steps. First - implementation as wrapper over
arrays, second - if this functionality will be popular - native
implementation in C?

Regards
Pavel

Show quoted text

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate

#4Robert Haas
robertmhaas@gmail.com
In reply to: Pavel Stehule (#3)
Re: idea - new aggregates median, listagg

On Wed, Dec 16, 2009 at 1:36 AM, Pavel Stehule <pavel.stehule@gmail.com> wrote:

b) function listagg (it is analogy of group_concat from MySQL) - it
should simplify report generating and some other

This is redundant, as it's equivalent to array_to_string(array_agg()).

when I implement it in orafce - the almost all code was parameters
checking. Implementation is trivial, because important part of work
was done with array_agg support. It is wrapper over stringInfo.

If it's done as syntactic sugar over that, it's fine.  If it's a
separate implementation, it's a bad idea.  Probably best as an
optional module of some kind

I am not sure if implementation as syntactic sugar is best. It needs
parser changes.

I don't think this is important enough to change the parser for it. I
don't see a concrete proposal for syntax here, but it seems like the
standard function call syntax should be adequate.

...Robert

#5Kevin Grittner
Kevin.Grittner@wicourts.gov
In reply to: Pavel Stehule (#3)
Re: idea - new aggregates median, listagg

Pavel Stehule <pavel.stehule@gmail.com> wrote:

2009/12/15 David Fetter <david@fetter.org>:

http://www.simple-talk.com/sql/t-sql-programming/median-workbench/

In this article the are two medians - statistical and financial. I
am for both. But only one can be named "median".

Well, since the statistical median requires that you specify whether
you want the left or right median (in case there are an even number
of values), you could either have two median functions, one of which
would take an extra parameter for this, or you could have median,
left_median, and right_median functions.

-Kevin

#6Thom Brown
thombrown@gmail.com
In reply to: Pavel Stehule (#1)
Re: idea - new aggregates median, listagg

2009/12/15 Pavel Stehule <pavel.stehule@gmail.com>

Hello

I am looking on new feature - ORDER clause in aggregate, and I thing,
so we are able to effectively implement some non standard, but well
known aggregates.

a) function median - it is relative frequent request - with usually
slow implementation

b) function listagg (it is analogy of group_concat from MySQL) - it
should simplify report generating and some other

What is your opinion? Do you like to see these functions in core?

I'm probably missing the point here, but when I originally saw MySQL's
group_concat function, I found it odd that it featured ordering
functionality. Shouldn't the order by determined by the query itself?
Otherwise it's almost as if its separating the relationship between the
result column and the resultset.

Thom

#7Pavel Stehule
pavel.stehule@gmail.com
In reply to: Thom Brown (#6)
Re: idea - new aggregates median, listagg

2009/12/16 Thom Brown <thombrown@gmail.com>:

2009/12/15 Pavel Stehule <pavel.stehule@gmail.com>

Hello

I am looking on new feature - ORDER clause in aggregate, and I thing,
so we are able to effectively implement some non standard, but well
known aggregates.

a) function median - it is relative frequent request - with usually
slow implementation

b) function listagg (it is analogy of group_concat from MySQL) - it
should simplify report generating and some other

What is your opinion? Do you like to see these functions in core?

I'm probably missing the point here, but when I originally saw MySQL's
group_concat function, I found it odd that it featured ordering
functionality.  Shouldn't the order by determined by the query itself?
Otherwise it's almost as if its separating the relationship between the
result column and the resultset.

Aggregates as group_concat or listagg are not typical SQL aggregates.
With these aggregates we are able to do some reports on SQL level
without stored procedures. What I know, order is determined only for
non hash aggregates - and you cannot specify method of aggregation, so
possibility to specify ORDER is important. But this feature isn't
related to this "proposal". It was commited yesterday - so you can
look on discussion about this feature.

Regards
Pavel Stehuke

Show quoted text

Thom

#8Scott Bailey
artacus@comcast.net
In reply to: Thom Brown (#6)
Re: idea - new aggregates median, listagg

Thom Brown wrote:

2009/12/15 Pavel Stehule <pavel.stehule@gmail.com
<mailto:pavel.stehule@gmail.com>>

Hello

I am looking on new feature - ORDER clause in aggregate, and I thing,
so we are able to effectively implement some non standard, but well
known aggregates.

a) function median - it is relative frequent request - with usually
slow implementation

b) function listagg (it is analogy of group_concat from MySQL) - it
should simplify report generating and some other

What is your opinion? Do you like to see these functions in core?

I'm probably missing the point here, but when I originally saw MySQL's
group_concat function, I found it odd that it featured ordering
functionality. Shouldn't the order by determined by the query itself?
Otherwise it's almost as if its separating the relationship between the
result column and the resultset.

For xmlagg in particular, it is quite useful to be able order the
results. And sorting the query doesn't work for ordering the agg unless
you do it in a subquery. Oracle has this functionality and it is quite
handy. It would be nice to see listagg with the option to order as well.

Scott