CREATE AGGREGATE array_cat

Started by Vlad Bokovabout 5 years ago9 messages
#1Vlad Bokov
vlad@razum2um.me

Hi, I wonder why there's no function to aggregate arrays by
concatenation out of the box?

There is a usual function `array_cat(anyarray, anyarray)`, but it
doesn't seamlessly work with grouping.

Wouldn't it be natural to have this:

CREATE AGGREGATE array_cat (anyarray)
(
    sfunc = array_cat,
    stype = anyarray,
    initcond = '{}'
);

Thanks,
Vlad

#2David G. Johnston
david.g.johnston@gmail.com
In reply to: Vlad Bokov (#1)
Re: CREATE AGGREGATE array_cat

On Wednesday, November 18, 2020, Vlad Bokov <vlad@razum2um.me> wrote:

Hi, I wonder why there's no function to aggregate arrays by
concatenation out of the box?

See array_agg(...)

David J.

#3Vik Fearing
vik@postgresfriends.org
In reply to: David G. Johnston (#2)
Re: CREATE AGGREGATE array_cat

On 11/18/20 11:19 PM, David G. Johnston wrote:

On Wednesday, November 18, 2020, Vlad Bokov <vlad@razum2um.me> wrote:

Hi, I wonder why there's no function to aggregate arrays by
concatenation out of the box?

See array_agg(...)

Why? That doesn't do what is wanted.

vik=# select array_agg(a) from (values (array[1]), (array[2])) as v(a);
array_agg
-----------
{{1},{2}}
(1 row)

vik=# CREATE AGGREGATE array_cat (anyarray)
vik-# (
vik(# sfunc = array_cat,
vik(# stype = anyarray,
vik(# initcond = '{}'
vik(# );
CREATE AGGREGATE

vik=# select array_cat(a) from (values (array[1]), (array[2])) as v(a);
array_cat
-----------
{1,2}
(1 row)

--
Vik Fearing

#4David G. Johnston
david.g.johnston@gmail.com
In reply to: Vik Fearing (#3)
Re: CREATE AGGREGATE array_cat

On Wed, Nov 18, 2020 at 5:37 PM Vik Fearing <vik@postgresfriends.org> wrote:

On 11/18/20 11:19 PM, David G. Johnston wrote:

On Wednesday, November 18, 2020, Vlad Bokov <vlad@razum2um.me> wrote:

Hi, I wonder why there's no function to aggregate arrays by
concatenation out of the box?

See array_agg(...)

Why? That doesn't do what is wanted.

Sorry, I did not read closely enough.

I doubt there is any substantial resistance to including such a function
but it would have to be written in C.

vik=# select array_agg(a) from (values (array[1]), (array[2])) as v(a);
array_agg
-----------
{{1},{2}}
(1 row)

And it's not too hard to work the system to get what you want even without
a custom aggregate.

select array_agg(b) from (values (array[1]), (array[2])) as v(a), unnest(a)
as w(b);

vik=# select array_cat(a) from (values (array[1]), (array[2])) as v(a);

array_cat
-----------
{1,2}
(1 row)

David J.

#5Chapman Flack
chap@anastigmatix.net
In reply to: David G. Johnston (#4)
Re: CREATE AGGREGATE array_cat

On 11/18/20 19:46, David G. Johnston wrote:

I doubt there is any substantial resistance to including such a function
but it would have to be written in C.

Would anything have to be written at all, save the CREATE AGGREGATE
suggested in the original message, using the existing array_cat as the
state transition function?

I suppose one might add an optimization to the existing array_cat to
detect the aggregate case, and realize it could clobber its left argument.
(But I'm not sure how much that would save, with arrays.)

Regards,
-Chap

#6David G. Johnston
david.g.johnston@gmail.com
In reply to: Chapman Flack (#5)
Re: CREATE AGGREGATE array_cat

On Wed, Nov 18, 2020 at 5:54 PM Chapman Flack <chap@anastigmatix.net> wrote:

On 11/18/20 19:46, David G. Johnston wrote:

I doubt there is any substantial resistance to including such a function
but it would have to be written in C.

Would anything have to be written at all, save the CREATE AGGREGATE
suggested in the original message, using the existing array_cat as the
state transition function?

I suppose one might add an optimization to the existing array_cat to
detect the aggregate case, and realize it could clobber its left argument.
(But I'm not sure how much that would save, with arrays.)

Outside my particular area of involvement really; it may be sufficient.

David J.

#7Vik Fearing
vik@postgresfriends.org
In reply to: Chapman Flack (#5)
Re: CREATE AGGREGATE array_cat

On 11/19/20 1:54 AM, Chapman Flack wrote:

On 11/18/20 19:46, David G. Johnston wrote:

I doubt there is any substantial resistance to including such a function
but it would have to be written in C.

Would anything have to be written at all, save the CREATE AGGREGATE
suggested in the original message, using the existing array_cat as the
state transition function?

Nope. As my example showed.

One could imagine extending it with an inverse transition function for
use in windows (small w) but that's about it.
--
Vik Fearing

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Vik Fearing (#7)
Re: CREATE AGGREGATE array_cat

Vik Fearing <vik@postgresfriends.org> writes:

On 11/19/20 1:54 AM, Chapman Flack wrote:

Would anything have to be written at all, save the CREATE AGGREGATE
suggested in the original message, using the existing array_cat as the
state transition function?

Nope. As my example showed.

But by the same token, anybody who wants that can trivially make it.
I think if we're going to bother, we should strive for an implementation
of efficiency comparable to array_agg, and that will take some bespoke
code.

It might also be worth looking at 9a00f03e4, which addressed the fact
that anyone who had made a custom aggregate depending on array_append
was going to be hurting performance-wise. The same would be true of
custom aggregates depending on array_cat, and maybe we should try
to alleviate that even if we're providing a new built-in aggregate.

regards, tom lane

#9Andres Freund
andres@anarazel.de
In reply to: Chapman Flack (#5)
Re: CREATE AGGREGATE array_cat

Hi,

On 2020-11-18 19:54:52 -0500, Chapman Flack wrote:

On 11/18/20 19:46, David G. Johnston wrote:

I doubt there is any substantial resistance to including such a function
but it would have to be written in C.

Would anything have to be written at all, save the CREATE AGGREGATE
suggested in the original message, using the existing array_cat as the
state transition function?

Using array_cat() as the transition function essentially is O(N^2). And
I don't think there's a good way to solve that in array_cat() itself, at
least not compared to just using similar logic to array_agg.

I suppose one might add an optimization to the existing array_cat to
detect the aggregate case, and realize it could clobber its left argument.
(But I'm not sure how much that would save, with arrays.)

I don't immediately see how clobbering the left arg would work
reliably. That's easy enough for in-place modifications of types that
have a fixed width, but for an arbitrary width type that's much
harder. You could probably hack something together by inquiring about
the actual memory allocation size in aset.c etc, but that's pretty ugly.

Greetings,

Andres Freund