[PATCH] binary heap implementation
Hi.
There are two or three places in the Postgres source that implement heap
sort (e.g. tuplesort.c, nodeMergeAppend.c), and it's also needed by the
BDR code. It seemed reasonable to factor out the functionality.
I've attached a patch (binaryheap.diff) that contains a straightforward
implementation of a binary heap (originally written by Andres, with a
few bugfixes and cleanups by me).
There doesn't seem to be any place to put unit tests for code like this,
so at Alvaro's suggestion, I have attached a small standalone program I
used for testing (testbinheap.c) and a Makefile. If you copy them both
to src/backend/lib and run "make -f Makefile.binheap", it'll build the
test program. It's nothing exciting, just exercises the functions in
various ways.
I've also attached a patch (nodeMergeAppend.diff) that converts the code
in nodeMergeAppend.c to use binaryheap. It passes "make check", and also
the following test (which is planned as a merge append):
CREATE OR REPLACE VIEW gen AS SELECT * FROM
generate_series(1, 100000) g(i) ORDER BY g.i OFFSET 0;
SELECT * FROM (
SELECT * FROM gen UNION ALL SELECT * FROM gen UNION ALL
SELECT * FROM gen UNION ALL SELECT * FROM gen UNION ALL
SELECT * FROM gen UNION ALL SELECT * FROM gen) s
ORDER BY i OFFSET 1000000;
Initially there was a slight performance degradation with my patch, but
I speeded things up and now my code is at least at par with, and maybe
slightly faster than, the old code. On my laptop, both consistently
take ~2.4s on average to execute the above SELECT.
Comments? Suggestions?
-- Abhijit
On 2012-11-14 18:41:12 +0530, Abhijit Menon-Sen wrote:
There are two or three places in the Postgres source that implement heap
sort (e.g. tuplesort.c, nodeMergeAppend.c), and it's also needed by the
BDR code. It seemed reasonable to factor out the functionality.
This replaces the "simpleheap.[ch]" I had in the last submitted series.
I've attached a patch (binaryheap.diff) that contains a straightforward
implementation of a binary heap (originally written by Andres, with a
few bugfixes and cleanups by me).
I want to emphasise that the only good thing about my submitted version
was that it actually compiled. So quite a bit of the work here is from
Abhijit...
Greetings,
Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Nov 14, 2012 at 8:11 AM, Abhijit Menon-Sen <ams@2ndquadrant.com> wrote:
Comments? Suggestions?
It could use a run through pgindent. And the header comments are
separated by a blank line from the functions to which they are
attached, which is not project style.
+ if (heap->size + 1 == heap->space)
+ Assert("binary heap is full");
This doesn't really make any sense. Assert("binary heap is full")
will never fail, because that expression is a character string which
is, therefore, not NULL. I think you want Assert(heap->size <
heap->space). Or you could use (heap->size + 1 >= heap->space)
elog(LEVEL, message) if there's some reason this needs to be checked
in non-debug builds.
+ {
+ sift_down(heap, i);
+ }
Project style is to omit braces for a single-line body. This comes up
a few other places as well.
Other than the coding style issues, I think this looks fine. If you
can fix that up, I believe I could go ahead and commit this, unless
anyone else objects.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Robert Haas escribió:
On Wed, Nov 14, 2012 at 8:11 AM, Abhijit Menon-Sen <ams@2ndquadrant.com> wrote:
Comments? Suggestions?
It could use a run through pgindent. And the header comments are
separated by a blank line from the functions to which they are
attached, which is not project style.
Also there are some comments in the .h file which we typically don't
carry.
Other than the coding style issues, I think this looks fine. If you
can fix that up, I believe I could go ahead and commit this, unless
anyone else objects.
Does this include the changes in nodeMergeappend.c?
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Assert(!"description of error") is an idiom I've seen more than once,
although I'm not sure I understand why it's not written as Robert says with
the condition in the brackets (or as a print to STDERR followed by abort()
instead).
On 15 November 2012 15:11, Robert Haas <robertmhaas@gmail.com> wrote:
Show quoted text
On Wed, Nov 14, 2012 at 8:11 AM, Abhijit Menon-Sen <ams@2ndquadrant.com>
wrote:Comments? Suggestions?
It could use a run through pgindent. And the header comments are
separated by a blank line from the functions to which they are
attached, which is not project style.+ if (heap->size + 1 == heap->space) + Assert("binary heap is full");This doesn't really make any sense. Assert("binary heap is full")
will never fail, because that expression is a character string which
is, therefore, not NULL. I think you want Assert(heap->size <
heap->space). Or you could use (heap->size + 1 >= heap->space)
elog(LEVEL, message) if there's some reason this needs to be checked
in non-debug builds.+ { + sift_down(heap, i); + }Project style is to omit braces for a single-line body. This comes up
a few other places as well.Other than the coding style issues, I think this looks fine. If you
can fix that up, I believe I could go ahead and commit this, unless
anyone else objects.--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 11/15/2012 10:11 AM, Robert Haas wrote:
+ { + sift_down(heap, i); + }Project style is to omit braces for a single-line body. This comes up
a few other places as well.
I thought we modified that some years ago, although my memory of it is a
bit hazy. Personally I think it's a bad rule, at least as stated, in the
case where there's an else clause with a compound statement. I'd prefer
to see a rule that says that either both branches of an if/else should
be compound statements or neither should be. I think the cases where
only one branch is a compound statement are rather ugly.
cheers
andrew
On Thu, Nov 15, 2012 at 10:21 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
Other than the coding style issues, I think this looks fine. If you
can fix that up, I believe I could go ahead and commit this, unless
anyone else objects.Does this include the changes in nodeMergeappend.c?
I think so. I was going to double-check the performance before
committing, but it doesn't look like there should be a problem.
Coding style changes appear needed in that file as well, however.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2012-11-14 18:41:12 +0530, Abhijit Menon-Sen wrote:
There are two or three places in the Postgres source that implement heap
sort (e.g. tuplesort.c, nodeMergeAppend.c), and it's also needed by the
BDR code. It seemed reasonable to factor out the functionality.
pg_dump also contains a binary heap implementation if memory serves
right which makes me wonder whether we should try binaryheap.[ch]
backend clean... It currently uses palloc/pfree.
Too bad we don't have a memory allocation routine which easily works in
the backend and frontend... palloc references MemoryContextAlloc and
CurrentMemoryContext so thats not that easy to emulate.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Andrew Dunstan escribió:
On 11/15/2012 10:11 AM, Robert Haas wrote:
+ { + sift_down(heap, i); + }Project style is to omit braces for a single-line body. This comes up
a few other places as well.I thought we modified that some years ago, although my memory of it
is a bit hazy.
No, we only modified pg_indent to not take the braces off, because
of PG_TRY blocks. But we keep using single statements instead of
compound in many places; but there is no hard rule about it. To me,
using braces were they are not needed is pointless and ugly. We're very
careful about ensuring that our macro definitions work nicely with
single-statement if/else, for example.
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Nov 15, 2012 at 11:22 AM, Andres Freund <andres@2ndquadrant.com> wrote:
On 2012-11-14 18:41:12 +0530, Abhijit Menon-Sen wrote:
There are two or three places in the Postgres source that implement heap
sort (e.g. tuplesort.c, nodeMergeAppend.c), and it's also needed by the
BDR code. It seemed reasonable to factor out the functionality.pg_dump also contains a binary heap implementation if memory serves
right which makes me wonder whether we should try binaryheap.[ch]
backend clean... It currently uses palloc/pfree.Too bad we don't have a memory allocation routine which easily works in
the backend and frontend... palloc references MemoryContextAlloc and
CurrentMemoryContext so thats not that easy to emulate.
I think there's a clear need for a library of code that can be shared
by frontend and backend code. I don't necessarily want to punt
everything into libpq, though. What I wonder if we might do is create
something like src/lib/pgguts that contains routines for memory
allocation, error handling, stringinfos (so that you don't have to use
PQexpBuffer in the frontend and StringInfo in the backend), etc. and
make that usable by both front-end and back-end code. I think that
would be a tremendously nice thing to have.
I don't think we should make it this patch's problem to do that, but
I'd be strongly in favor of such a thing if someone wants to put it
together. It's been a huge nuisance for me in the past and all
indicates are that the work you're doing on LR is just going to
exacerbate that problem.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2012-11-15 11:37:16 -0500, Robert Haas wrote:
On Thu, Nov 15, 2012 at 11:22 AM, Andres Freund <andres@2ndquadrant.com> wrote:
On 2012-11-14 18:41:12 +0530, Abhijit Menon-Sen wrote:
There are two or three places in the Postgres source that implement heap
sort (e.g. tuplesort.c, nodeMergeAppend.c), and it's also needed by the
BDR code. It seemed reasonable to factor out the functionality.pg_dump also contains a binary heap implementation if memory serves
right which makes me wonder whether we should try binaryheap.[ch]
backend clean... It currently uses palloc/pfree.Too bad we don't have a memory allocation routine which easily works in
the backend and frontend... palloc references MemoryContextAlloc and
CurrentMemoryContext so thats not that easy to emulate.
I think there's a clear need for a library of code that can be shared
by frontend and backend code. I don't necessarily want to punt
everything into libpq, though.
Aggreed.
What I wonder if we might do is create
something like src/lib/pgguts that contains routines for memory
allocation, error handling, stringinfos (so that you don't have to use
PQexpBuffer in the frontend and StringInfo in the backend), etc. and
make that usable by both front-end and back-end code. I think that
would be a tremendously nice thing to have.
Yes. But it also sounds like quite a bit of work :(
I don't think we should make it this patch's problem to do that
Me neither. I was thinking about letting the "user" allocate enough
memory like:
binaryheap *heap = palloc(binaryheap_size(/*capacity*/ 10));
binaryheap_init(heap, 10, comparator);
But thats pretty ugly.
but
I'd be strongly in favor of such a thing if someone wants to put it
together. It's been a huge nuisance for me in the past and all
indicates are that the work you're doing on LR is just going to
exacerbate that problem.
I actually hope I won't have to fight with that all that much, but we
will see :/
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Nov 15, 2012 at 11:50 AM, Andres Freund <andres@2ndquadrant.com> wrote:
Me neither. I was thinking about letting the "user" allocate enough
memory like:binaryheap *heap = palloc(binaryheap_size(/*capacity*/ 10));
binaryheap_init(heap, 10, comparator);But thats pretty ugly.
Yeah. I would vote against doing that for now. We can always uglify
the code later if we decide it's absolutely necessary.
One trick that we could potentially use to make code run in the
frontend and backend is to put it in src/port. That code gets
compiled twice, once within -DFRONTEND and once without. So you can
use #ifdef to handle things that need to work different ways. The
only real problem with that approach is that you end up putting the
code in libpgport where it seems rather out of place. Still, maybe we
could have a src/framework directory that uses the same trick to
produce a libpgframework that frontend code can use, and a lib
pgframework_srv that can be linked into the backend. That's might not
actually be that much work; we'd just need to clone all the existing
src/port logic.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
At 2012-11-15 10:11:58 -0500, robertmhaas@gmail.com wrote:
It could use a run through pgindent.
Done.
I think you want Assert(heap->size < heap->space).
Of course. Thanks for catching that.
Project style is to omit braces for a single-line body.
I've removed most of them. In the few cases where one branch of an
if/else would have a one-line body and the other would not, I chose
to rewrite the code to avoid the problem (because, like Andrew, I
hate having to do that).
I wasn't sure how to present the following:
if (right_off < heap->size &&
heap->compare(&heap->nodes[node_off],
&heap->nodes[right_off]) < 0)
{
/* swap with the larger child */
if (!swap_off ||
heap->compare(&heap->nodes[left_off],
&heap->nodes[right_off]) < 0)
{
swap_off = right_off;
}
}
…so I left it alone. If you want me to remove the inner braces and
resubmit, despite the multi-line condition, let me know.
I didn't know which header comments Álvaro meant I should remove,
so I haven't done that either. I did remove the TESTBINHEAP stuff,
though.
I have attached the updated patch here. Thanks for your review.
-- Abhijit
Attachments:
binaryheap.difftext/x-diff; charset=us-asciiDownload+474-79
On Thu, 2012-11-15 at 17:50 +0100, Andres Freund wrote:
Me neither. I was thinking about letting the "user" allocate enough
memory like:binaryheap *heap = palloc(binaryheap_size(/*capacity*/ 10));
binaryheap_init(heap, 10, comparator);But thats pretty ugly.
Why not pass the allocator in as a function pointer? It could either be
an argument to the initialization, or we could make a new global
variable that's present inside the server and outside.
(Slight complication: palloc is a macro)
Regards,
Jeff Davis
On Thu, Nov 15, 2012 at 8:56 PM, Abhijit Menon-Sen <ams@2ndquadrant.com> wrote:
[ new patch ]
I went over this again today with a view to getting it committed, but
discovered some compiler warnings that look like this:
warning: cast to pointer from integer of different size
The problem seems to be that the binary heap implementation wants the
key and value to be a void *, but the way it's being used in
nodeMergeAppend.c the value is actually an int. I don't think we want
to commit a binary heap implementation which has an impedance mismatch
of this type with its only client. The obvious solution seems to be
to make the key and value each a Datum, and then clients can use
WhateverGetDatum and DatumGetWhatever to pass whatever built-in data
type they happen to have. I'm trying that approach now and will post
an updated patch based on that approach if it seems to work OK and
nobody suggests something better between now and then.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2012-11-20 14:03:12 -0500, Robert Haas wrote:
On Thu, Nov 15, 2012 at 8:56 PM, Abhijit Menon-Sen <ams@2ndquadrant.com> wrote:
[ new patch ]
I went over this again today with a view to getting it committed, but
discovered some compiler warnings that look like this:warning: cast to pointer from integer of different size
The problem seems to be that the binary heap implementation wants the
key and value to be a void *, but the way it's being used in
nodeMergeAppend.c the value is actually an int. I don't think we want
to commit a binary heap implementation which has an impedance mismatch
of this type with its only client. The obvious solution seems to be
to make the key and value each a Datum, and then clients can use
WhateverGetDatum and DatumGetWhatever to pass whatever built-in data
type they happen to have. I'm trying that approach now and will post
an updated patch based on that approach if it seems to work OK and
nobody suggests something better between now and then.
Sounds like a good plan, one other alternative would have been declaring
the parameters a size_t (and thus explicitly always big enough to store
a pointer, but also valid to store inline data) instead of using a
void*. But given there is DatumGetPointer/PointerGetDatum et al, using
the postgres-y datatype seems fine to me.
In the LCR case those really are pointers, so preserving that case is
important to me ;), but your proposal does that, so ...
Andres
On Tue, Nov 20, 2012 at 2:29 PM, Andres Freund <andres@2ndquadrant.com> wrote:
On 2012-11-20 14:03:12 -0500, Robert Haas wrote:
On Thu, Nov 15, 2012 at 8:56 PM, Abhijit Menon-Sen <ams@2ndquadrant.com> wrote:
[ new patch ]
I went over this again today with a view to getting it committed, but
discovered some compiler warnings that look like this:warning: cast to pointer from integer of different size
The problem seems to be that the binary heap implementation wants the
key and value to be a void *, but the way it's being used in
nodeMergeAppend.c the value is actually an int. I don't think we want
to commit a binary heap implementation which has an impedance mismatch
of this type with its only client. The obvious solution seems to be
to make the key and value each a Datum, and then clients can use
WhateverGetDatum and DatumGetWhatever to pass whatever built-in data
type they happen to have. I'm trying that approach now and will post
an updated patch based on that approach if it seems to work OK and
nobody suggests something better between now and then.Sounds like a good plan, one other alternative would have been declaring
the parameters a size_t (and thus explicitly always big enough to store
a pointer, but also valid to store inline data) instead of using a
void*. But given there is DatumGetPointer/PointerGetDatum et al, using
the postgres-y datatype seems fine to me.In the LCR case those really are pointers, so preserving that case is
important to me ;), but your proposal does that, so ...
On further reflection, I'm wondering if it wouldn't be a smarter idea
to just get rid of the binaryheap_node concept altogether. Instead,
when you allocate a binary heap, you specify an "element size" as well
as a capacity. Then, the binary heap can truly treat the elements as
an opaque array of bytes rather than a struct of any particular type.
Of course, the disadvantage of this approach is that it's likely to be
slightly slower, because we'll need to do a multiplication by a
variable rather than simple bit shift. But the extra flexibility
might be worth it, because for example for merge append this is kind
of inefficient from a storage perspective. We only really need N+1
pointers, but we've got storage for 2N. With the above change plus a
user-specified third argument for the comparator function (passed as
yet another argument to binaryheap_allocate) we could get around that
while preserving the option for LCR or any other client code to do as
it sees fit.
Thoughts?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2012-11-20 15:06:58 -0500, Robert Haas wrote:
On Tue, Nov 20, 2012 at 2:29 PM, Andres Freund <andres@2ndquadrant.com> wrote:
On 2012-11-20 14:03:12 -0500, Robert Haas wrote:
On Thu, Nov 15, 2012 at 8:56 PM, Abhijit Menon-Sen <ams@2ndquadrant.com> wrote:
[ new patch ]
I went over this again today with a view to getting it committed, but
discovered some compiler warnings that look like this:warning: cast to pointer from integer of different size
The problem seems to be that the binary heap implementation wants the
key and value to be a void *, but the way it's being used in
nodeMergeAppend.c the value is actually an int. I don't think we want
to commit a binary heap implementation which has an impedance mismatch
of this type with its only client. The obvious solution seems to be
to make the key and value each a Datum, and then clients can use
WhateverGetDatum and DatumGetWhatever to pass whatever built-in data
type they happen to have. I'm trying that approach now and will post
an updated patch based on that approach if it seems to work OK and
nobody suggests something better between now and then.Sounds like a good plan, one other alternative would have been declaring
the parameters a size_t (and thus explicitly always big enough to store
a pointer, but also valid to store inline data) instead of using a
void*. But given there is DatumGetPointer/PointerGetDatum et al, using
the postgres-y datatype seems fine to me.In the LCR case those really are pointers, so preserving that case is
important to me ;), but your proposal does that, so ...
On further reflection, I'm wondering if it wouldn't be a smarter idea
to just get rid of the binaryheap_node concept altogether. Instead,
when you allocate a binary heap, you specify an "element size" as well
as a capacity. Then, the binary heap can truly treat the elements as
an opaque array of bytes rather than a struct of any particular type.
Of course, the disadvantage of this approach is that it's likely to be
slightly slower, because we'll need to do a multiplication by a
variable rather than simple bit shift. But the extra flexibility
might be worth it, because for example for merge append this is kind
of inefficient from a storage perspective. We only really need N+1
pointers, but we've got storage for 2N. With the above change plus a
user-specified third argument for the comparator function (passed as
yet another argument to binaryheap_allocate) we could get around that
while preserving the option for LCR or any other client code to do as
it sees fit.
Thoughts?
I don't have a fundamental problem with it, but I don't think it's worth
the cost. If you have variable sized (from the point of the compiler)
values you either have more complex offset computations to the
individual elements or one more indirection due to a pointer array.
Do you feel strongly about it? If so, go ahead, otherwise I would prefer
the current way (with parameters changed to be Datum's of course).
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Nov 20, 2012 at 3:19 PM, Andres Freund <andres@2ndquadrant.com> wrote:
I don't have a fundamental problem with it, but I don't think it's worth
the cost. If you have variable sized (from the point of the compiler)
values you either have more complex offset computations to the
individual elements or one more indirection due to a pointer array.Do you feel strongly about it? If so, go ahead, otherwise I would prefer
the current way (with parameters changed to be Datum's of course).
Here's a revised patch that takes the approach of just changing void *
to Datum; it includes a bunch of other cleanups that I felt were
worthwhile. I tested this using the following setup:
CREATE TABLE sample (a int, b numeric);
DO $$
BEGIN
FOR i IN 1 .. 1000 LOOP
EXECUTE 'CREATE TABLE sample' || i || ' (CHECK (b = ' || i
|| ')) INHERITS (sample)';
EXECUTE 'INSERT INTO sample' || i
|| ' SELECT g, ' || i || ' FROM
generate_series(1,100000) g;';
EXECUTE 'ALTER TABLE sample' || i
|| ' ADD PRIMARY KEY (a)';
END LOOP;
END
$$;
VACUUM ANALYZE;
And this test query:
select sum(1) from (select * from sample order by a limit 250) x;
On my system, on repeated execution, this takes about 113 ms with
unpatched master and about 114 ms with the patch. I'm inclined to
think that's not worth getting excited about, as it could be simply
code shifting around across cache line boundaries and not indicative
of an actual regression due to the difference in logic. Even if there
is a regression, it's pretty darn small: most people will not have
1000 partitions, and those that do will often find query execution
time dominated by other factors. Against that, there is positive
value in having this code somewhat more abstracted.
With respect to the question of the API, no, I don't feel
overwhelmingly strongly that we have to whack the API with a bat
that's larger than the one I've used here. On the flip side, it
wouldn't surprise me if, as the code acquires more users, we find that
we actually do want to make some changes, either the one I already
proposed or something else. Fortunately, it's not a lot of code, so
if we end up refactoring it some more later, it shouldn't be a big
deal. So I'm happy enough to commit this the way I have it and sort
out the rest as we go. If there are no objections I'll go ahead and
do that.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
binaryheap-rmh.patchapplication/octet-stream; name=binaryheap-rmh.patchDownload+411-84
On 2012-11-20 15:47:25 -0500, Robert Haas wrote:
On Tue, Nov 20, 2012 at 3:19 PM, Andres Freund <andres@2ndquadrant.com> wrote:
I don't have a fundamental problem with it, but I don't think it's worth
the cost. If you have variable sized (from the point of the compiler)
values you either have more complex offset computations to the
individual elements or one more indirection due to a pointer array.Do you feel strongly about it? If so, go ahead, otherwise I would prefer
the current way (with parameters changed to be Datum's of course).Here's a revised patch that takes the approach of just changing void *
to Datum; it includes a bunch of other cleanups that I felt were
worthwhile. I tested this using the following setup:CREATE TABLE sample (a int, b numeric);
DO $$
BEGIN
FOR i IN 1 .. 1000 LOOP
EXECUTE 'CREATE TABLE sample' || i || ' (CHECK (b = ' || i
|| ')) INHERITS (sample)';
EXECUTE 'INSERT INTO sample' || i
|| ' SELECT g, ' || i || ' FROM
generate_series(1,100000) g;';
EXECUTE 'ALTER TABLE sample' || i
|| ' ADD PRIMARY KEY (a)';
END LOOP;
END
$$;
VACUUM ANALYZE;And this test query:
select sum(1) from (select * from sample order by a limit 250) x;
On my system, on repeated execution, this takes about 113 ms with
unpatched master and about 114 ms with the patch. I'm inclined to
think that's not worth getting excited about, as it could be simply
code shifting around across cache line boundaries and not indicative
of an actual regression due to the difference in logic. Even if there
is a regression, it's pretty darn small: most people will not have
1000 partitions, and those that do will often find query execution
time dominated by other factors. Against that, there is positive
value in having this code somewhat more abstracted.
To make sure were not regressing I ran some more testcases that I could
think of/find and didn't find anything problematic. Either both are
equally fast or the new code is faster.
FWIW for me the new code is very slightly faster in your example, but
only if I apply it ontop of fklocks (where I applied it accidentally at
first), not if ontop of 273986bf0d39e5166eb15ba42ebff4803e23a688 where
its reversed, so your code-layout theory sounds likely.
With respect to the question of the API, no, I don't feel
overwhelmingly strongly that we have to whack the API with a bat
that's larger than the one I've used here. On the flip side, it
wouldn't surprise me if, as the code acquires more users, we find that
we actually do want to make some changes, either the one I already
proposed or something else. Fortunately, it's not a lot of code, so
if we end up refactoring it some more later, it shouldn't be a big
deal. So I'm happy enough to commit this the way I have it and sort
out the rest as we go. If there are no objections I'll go ahead and
do that.
Looks good to me. Adapting when we have the actual requirements sounds
fine & sensible as well...
Thanks,
Andres
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services