About aggregates...
Hello guys,
I would like to ask whether there is any way to make an aggregate function
take a set of tuples as input. I know that an aggregate function normally
receives tuples one at a time and processes each on the fly. However, I
want to store the tuples incrementally so that I can process them as a
batch in the aggregate's final function. Think, for example, of
implementing logistic regression (which is an OLAP query by nature). I
want to support it with the features PostgreSQL currently provides, of
which the closest is an aggregate. However, an aggregate function feeds me
one tuple per call, whereas I would like access to a batch of tuples per
function call. Is there any way to do something like this?
Thank you very much for your time,
Michael
Hi,
On 30 November 2012 08:06, Michael Giannakopoulos <miccagiann@gmail.com> wrote:
However, an aggregate function
feeds me one tuple per call, but I would like to have access to a
batch of tuples per function call. Is there any possible way to perform
something like this?
Yes, window functions might be good for you. From the CREATE FUNCTION
documentation:
WINDOW
WINDOW indicates that the function is a window function rather than a
plain function. This is currently only useful for functions written in
C. The WINDOW attribute cannot be changed when replacing an existing
function definition.
http://www.postgresql.org/docs/9.1/static/sql-createfunction.html
Apart from C, you can also use this in PL/R:
http://www.joeconway.com/plr/doc/plr-window-funcs.html
--
Ondrej
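
To make the window idea concrete, here is a minimal sketch in plain Python (not PostgreSQL code) of what a frame gives a function: for each row it sees the current row plus a few earlier ones, roughly what a frame such as ROWS BETWEEN 2 PRECEDING AND CURRENT ROW would expose. The names frames and moving_average are hypothetical and chosen only for illustration.

```python
def frames(rows, preceding=2):
    """Yield (current_row, frame) pairs.

    The frame holds the current row and up to `preceding` rows before it,
    so the function applied to it sees a small batch on every call.
    """
    for i, row in enumerate(rows):
        start = max(0, i - preceding)
        yield row, rows[start:i + 1]


def moving_average(frame):
    """Hypothetical per-frame computation: the mean of the rows in the frame."""
    return sum(frame) / len(frame)


values = [2.0, 4.0, 6.0, 8.0, 10.0]
averages = [moving_average(frame) for _, frame in frames(values)]
```

Each call receives several rows at once, but the batch is bounded by the frame, so this suits sliding computations better than algorithms that need the whole input.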
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Michael
Giannakopoulos
Sent: Thursday, November 29, 2012 4:07 PM
To: pgsql-general@postgresql.org
Subject: [GENERAL] About aggregates...
=====================================
Not sure how the system would decide between (1-at-a-time) and
(everything-at-once).
The only approach I can think of would be to build out an array of "tuples"
and then have the aggregate process a single array value each time.
As Ondrej points out in a parallel reply, you can also try window functions
(probably with a frame definition).
Hopefully this helps, but I am not familiar enough with the use case to be
more specific.
David J.
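
The array approach described above can be modelled outside the database. The sketch below is plain Python, not PostgreSQL code: it gathers the rows of each group into one list (much as array_agg would build one array per group) and then hands the whole list to a batch routine in a single call. The row layout (group key, x, y) and the names collect_groups and process_batch are assumptions made for illustration.

```python
from itertools import groupby
from operator import itemgetter


def collect_groups(rows, key_index=0):
    """Return {key: [row, ...]}, one list of rows per group key."""
    ordered = sorted(rows, key=itemgetter(key_index))
    return {
        key: list(group)
        for key, group in groupby(ordered, key=itemgetter(key_index))
    }


def process_batch(batch):
    """Hypothetical batch routine: it receives every row of a group at once."""
    xs = [x for _, x, _ in batch]
    ys = [y for _, _, y in batch]
    return {"n": len(batch), "mean_x": sum(xs) / len(xs), "positives": sum(ys)}


# Assumed row layout: (group key, feature x, label y).
rows = [("a", 1.0, 0), ("b", 4.0, 1), ("a", 3.0, 1), ("b", 6.0, 1)]
results = {key: process_batch(batch) for key, batch in collect_groups(rows).items()}
```

The batch routine never sees a single tuple; the cost is that every group has to be materialised in full before processing starts.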
Michael Giannakopoulos wrote:
I would like to ask if there is any way to make an aggregate function
take a set of tuples as an input variable. I know that an actual
aggregate function receives each tuple one at a time and processes it
on the fly. However, I want to store tuples in an incremental fashion
so as to process them in a batch approach in the final function. Think
for example of implementing logistic regression (which is an OLAP query
by its nature). I want to support it with the current features that
PostgreSQL provides, of which the closest is an aggregate. However, an
aggregate function feeds me one tuple per call, but I would like to
have access to a batch of tuples per function call. Is there any
possible way to perform something like this?
If you write in C, nothing keeps you from storing all the incoming
rows in memory allocated in a suitable MemoryContext and processing
them all at the end.
You might run out of memory, though.
Yours,
Laurenz Albe
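
The "store everything, process at the end" pattern can be sketched as follows. This is a plain-Python model of the transition/final function protocol, not the C implementation the reply refers to: the transition step only appends each row to the state, and the final step fits a one-feature logistic regression over all stored rows with batch gradient descent. The function names, the learning rate, the iteration count, and the sample data are all assumptions made for illustration.

```python
import math


def transition(state, row):
    """Transition step: called once per row; it only stores the row."""
    if state is None:
        state = []
    state.append(row)
    return state


def _sigmoid(z):
    """Logistic function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def final(state, learning_rate=0.5, iterations=5000):
    """Final step: batch gradient descent over every stored (x, y) row.

    Returns (intercept, slope), or None when no rows were seen.
    """
    if not state:
        return None
    intercept, slope = 0.0, 0.0
    n = len(state)
    for _ in range(iterations):
        grad_intercept = grad_slope = 0.0
        for x, y in state:
            error = _sigmoid(intercept + slope * x) - y
            grad_intercept += error
            grad_slope += error * x
        intercept -= learning_rate * grad_intercept / n
        slope -= learning_rate * grad_slope / n
    return intercept, slope


# Made-up sample data: (feature x, label y), deliberately not separable.
rows = [(0.5, 0), (1.0, 0), (1.5, 0), (2.0, 1),
        (2.5, 0), (3.0, 1), (3.5, 1), (4.0, 1)]
state = None
for row in rows:
    state = transition(state, row)
intercept, slope = final(state)
```

As the reply warns, the state grows with the input, so memory use is proportional to the number of rows in the group.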