question about window function in C
Hi !
I'm trying to write a window function in C .
In the function I'm using a tuplesort to sort the window data and then do
some processing.
Now when I'm running the function I get this: 'WARNING: temporary file
leak: File 43 still referenced'
The warning comes from my failure to call tuplesort_end at the appropriate
time.
From the beginning I thought that calling tuplesort_end when current_pos in
the window approched WinGetPartitionRowCount would do it.
That seemed to work at first but when I put a limit-clause in the query
that doesn't work.
So I think I need to use some callback mechanism to get that to work.
I've found some reference in the postgres source code to
RegisterExprContextCallback which seemed promising but I have no idea how
to get the right expression context to put in as the first argument to that
function.
So my question is, how do I shut down the tuplesort properly after the last
call to my window function ?
I'm running "PostgreSQL 9.3.5 on i686-pc-linux-gnu, compiled by
gcc-4.4.real (Debian 4.4.5-8) 4.4.5, 32-bit"
Best Regards
Dan S
Dan S <strd911@gmail.com> writes:
I'm trying to write a window function in C .
In the function I'm using a tuplesort to sort the window data and then do
some processing.
Hmm ... why do you feel you need to do that? The window function's input
should already be sorted according to the window specification.
Now when I'm running the function I get this: 'WARNING: temporary file
leak: File 43 still referenced'
The warning comes from my failure to call tuplesort_end at the appropriate
time.
Unsurprising.
So I think I need to use some callback mechanism to get that to work.
I've found some reference in the postgres source code to
RegisterExprContextCallback which seemed promising but I have no idea how
to get the right expression context to put in as the first argument to that
function.
I don't think there is one :-(. WindowAgg has a per-input-tuple econtext,
and a per-output-tuple econtext, but what you'd need for this is a
partition-lifespan econtext, which doesn't exist.
It's possible that we could promote the "partcontext" memory context into
a full econtext so as to support this sort of behavior. But I'd want to
see a reasonably convincing use-case for it.
regards, tom lane
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Well I'm trying to implement a window-function that works on range_types
and produces 'atomic ranges' for each input range.
Let's say I have a set of ranges some overlapping some not, and I want to
split each range at every boundary of every overlapping range and return
those.
So for each range r I want to return an array of ranges that consists of
range r split at every overlapping range boundary.
I need to consider both upper and lower boundaries and to be able to do
this in one pass over the data I need to sort both the upper and lower
boundaries and also sort the original ranges so I can loop over the ranges
and boundaries in lockstep to produce the arrays.
As a last step I sort back the arrays of atomic ranges in the original
order in the window so I can read out the tuplesort in that order and
return each array to its corresponding range r.
(The result can be used to answer questions like what are the maximum
number of simultaneously overlapping ranges and at which ranges the maximum
occurs)
Best Regards
Dan S
2014-12-26 18:57 GMT+01:00 Tom Lane <tgl@sss.pgh.pa.us>:
Show quoted text
Dan S <strd911@gmail.com> writes:
I'm trying to write a window function in C .
In the function I'm using a tuplesort to sort the window data and then do
some processing.Hmm ... why do you feel you need to do that? The window function's input
should already be sorted according to the window specification.Now when I'm running the function I get this: 'WARNING: temporary file
leak: File 43 still referenced'
The warning comes from my failure to call tuplesort_end at theappropriate
time.
Unsurprising.
So I think I need to use some callback mechanism to get that to work.
I've found some reference in the postgres source code to
RegisterExprContextCallback which seemed promising but I have no idea how
to get the right expression context to put in as the first argument tothat
function.
I don't think there is one :-(. WindowAgg has a per-input-tuple econtext,
and a per-output-tuple econtext, but what you'd need for this is a
partition-lifespan econtext, which doesn't exist.It's possible that we could promote the "partcontext" memory context into
a full econtext so as to support this sort of behavior. But I'd want to
see a reasonably convincing use-case for it.regards, tom lane
On Fri, Dec 26, 2014 at 1:19 PM, Dan S <strd911@gmail.com> wrote:
Well I'm trying to implement a window-function that works on range_types and
produces 'atomic ranges' for each input range.
Let's say I have a set of ranges some overlapping some not, and I want to
split each range at every boundary of every overlapping range and return
those.
So for each range r I want to return an array of ranges that consists of
range r split at every overlapping range boundary.
I need to consider both upper and lower boundaries and to be able to do this
in one pass over the data I need to sort both the upper and lower boundaries
and also sort the original ranges so I can loop over the ranges and
boundaries in lockstep to produce the arrays.
As a last step I sort back the arrays of atomic ranges in the original order
in the window so I can read out the tuplesort in that order and return each
array to its corresponding range r.(The result can be used to answer questions like what are the maximum number
of simultaneously overlapping ranges and at which ranges the maximum occurs)Best Regards
Dan S2014-12-26 18:57 GMT+01:00 Tom Lane <tgl@sss.pgh.pa.us>:
Dan S <strd911@gmail.com> writes:
I'm trying to write a window function in C .
In the function I'm using a tuplesort to sort the window data and then
do
some processing.Hmm ... why do you feel you need to do that? The window function's input
should already be sorted according to the window specification.Now when I'm running the function I get this: 'WARNING: temporary file
leak: File 43 still referenced'
The warning comes from my failure to call tuplesort_end at the
appropriate
time.Unsurprising.
So I think I need to use some callback mechanism to get that to work.
I've found some reference in the postgres source code to
RegisterExprContextCallback which seemed promising but I have no idea
how
to get the right expression context to put in as the first argument to
that
function.I don't think there is one :-(. WindowAgg has a per-input-tuple econtext,
and a per-output-tuple econtext, but what you'd need for this is a
partition-lifespan econtext, which doesn't exist.It's possible that we could promote the "partcontext" memory context into
a full econtext so as to support this sort of behavior. But I'd want to
see a reasonably convincing use-case for it.regards, tom lane
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On Fri, Dec 26, 2014 at 11:41 PM, Merlin Moncure <mmoncure@gmail.com> wrote:
On Fri, Dec 26, 2014 at 1:19 PM, Dan S <strd911@gmail.com> wrote:
Well I'm trying to implement a window-function that works on range_types and
produces 'atomic ranges' for each input range.
Let's say I have a set of ranges some overlapping some not, and I want to
split each range at every boundary of every overlapping range and return
those.
So for each range r I want to return an array of ranges that consists of
range r split at every overlapping range boundary.
I need to consider both upper and lower boundaries and to be able to do this
in one pass over the data I need to sort both the upper and lower boundaries
and also sort the original ranges so I can loop over the ranges and
boundaries in lockstep to produce the arrays.
As a last step I sort back the arrays of atomic ranges in the original order
in the window so I can read out the tuplesort in that order and return each
array to its corresponding range r.(The result can be used to answer questions like what are the maximum number
of simultaneously overlapping ranges and at which ranges the maximum occurs)
shoot -- sorry for empty mail -- misclick.
anyways, can you give some precise examples of what you want to occur?
for example, a set of sql taking actual inputs and the output you
would like to see...
merlin
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
2014-12-27 6:43 GMT+01:00 Merlin Moncure <mmoncure@gmail.com>:
On Fri, Dec 26, 2014 at 11:41 PM, Merlin Moncure <mmoncure@gmail.com>
wrote:On Fri, Dec 26, 2014 at 1:19 PM, Dan S <strd911@gmail.com> wrote:
Well I'm trying to implement a window-function that works on
range_types and
produces 'atomic ranges' for each input range.
Let's say I have a set of ranges some overlapping some not, and I wantto
split each range at every boundary of every overlapping range and return
those.
So for each range r I want to return an array of ranges that consists of
range r split at every overlapping range boundary.
I need to consider both upper and lower boundaries and to be able to dothis
in one pass over the data I need to sort both the upper and lower
boundaries
and also sort the original ranges so I can loop over the ranges and
boundaries in lockstep to produce the arrays.
As a last step I sort back the arrays of atomic ranges in the originalorder
in the window so I can read out the tuplesort in that order and return
each
array to its corresponding range r.
(The result can be used to answer questions like what are the maximum
number
of simultaneously overlapping ranges and at which ranges the maximum
occurs)
shoot -- sorry for empty mail -- misclick.
anyways, can you give some precise examples of what you want to occur?
for example, a set of sql taking actual inputs and the output you
would like to see...merlin
Say I have theses ranges (in a table or as output of a sql statement):
[2000-01-01,2000-05-01)
[2000-02-01,2003-07-01)
[2002-06-01,2003-03-01)
[2003-04-01,2004-08-01)
then for the first range [2000-01-01,2000-05-01) I want the array output :
{[2000-01-01,2000-02-01),[2000-02-01,2000-05-01)}
for the next range [2000-02-01,2003-07-01) I want the array output:
{[2000-02-01,2000-05-01),[2000-05-01,2002-06-01),[2002-06-01,2003-03-01),[2003-03-01,2003-04-01),[2003-04-01,2003-07-01)}
and for the third range [2002-06-01,2003-03-01) I want the output:
{[2002-06-01,2003-03-01)}
and for the last range [2003-04-01,2004-08-01) the output should be:
{[2003-04-01,2003-07-01),[2003-07-01,2004-08-01)}
As you can see each input range is split up at each boundary where it is
overlapped by the other ranges in the set.
I have written a window function which does this.
On the first call into the window function it calculates all the split up
ranges and stores the resulting arrays into a tuplesort and sort these back
into the original order in the window function.
On subsequent calls into the window function it pulls one array of ranges
at a time from the tuplesort and returns that as output.
Then on the last call into the window function I need to release the
tuplesort so to not leak resources.
The problem is that I can't determine which is the last call into the
function so I can't release the tuplesort at the appropriate time.
I need a way to determine when to release the tuplesort, for example a way
to register a callback at appropriate time.
Sql wise the table can look like this and say I have the 4 ranges above in
the table.
create table t1 (id int, dr daterange);
and the query with my window function: select id, dr,
window_range_split(dr) over () from t1
this query will exhaust the whole window and I can determine when to
release the tuplesort, I release it when I pull out the last tuple in the
window.
However this query won't work: select id, dr, window_range_split(dr) over
() from t1 limit 2
the number of tuples in the window is 4 and I have no idea that there are
only going to be 2 calls into the window function and so I fail release the
tuplesort on the second call.
I should also mention that it's not specifically dateranges I'm targeting.
the window function takes anyrange as argument type and returns anyarray.
Best Regards
Dan S