Allow substitute allocators for PGresult.
Hello. This message proposes a pair of patches that allow the
memory allocator for PGresult in libpq to be replaced.
The comment at the beginning of pqexpbuffer.c says that libpq
should not rely on palloc(). Besides, Tom Lane said that palloc
should not be visible outside the backend (*1), and I agree with
that.
*1: http://archives.postgresql.org/pgsql-hackers/1999-02/msg00364.php
On the other hand, dblink, dblink-plus (our product!), and
perhaps FDWs that connect to other PostgreSQL servers seem to
copy the query result contained in the PGresult into a
tuplestore. I guess this is done to avoid leaking memory when
execution is terminated halfway through.
But copying the whole PGresult is rather expensive, and the cost
grows as the received data gets larger. Furthermore, it requires
about twice as much memory as the net size of the data. And it
would be fruitless to copy and modify libpq, or to reinvent it
from scratch. So it would be nice to be able to use palloc in
libpq, at least for PGresult, in such cases, in spite of the policy.
For these reasons, I propose to make allocators for PGresult
replaceable.
The modifications are made up into two patches.
1. dupEvents() and pqAddTuple() currently get new memory blocks
   with malloc, but the acquired blocks are ultimately linked
   into the PGresult. So I think it is preferable to use
   pqResultAlloc() or its descendants, for consistency with
   where the blocks end up linked.
   But there is no PQresultRealloc() and adding one would be
   costly, so pqAddTuple() is not modified in this patch.
2. Define three function pointers
PQpgresult_(malloc|realloc|free) and replace the calls to
malloc/realloc/free in the four functions below with these
pointers.
PQmakeEmptyPGresult()
pqResultAlloc()
PQclear()
pqAddTuple()
These patches make it possible for tools that run in a backend
process and use libpq to handle the PGresult as-is, with no copy
and no extra memory. (Of course, anyone who wants to use a custom
allocator for PGresult in a standalone tool could also do so with
this feature.)
Three files are attached to this message.
First, the patch with respect to "1" above.
Second, the patch with respect to "2" above.
Third, a very simple sample program.
I have built and briefly tested on CentOS6, with the sample
program mentioned above and valgrind, but not on Windows.
What do you think about this?
Regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
On 11.11.2011 11:18, Kyotaro HORIGUCHI wrote:
The comment at the beginning of pqexpbuffer.c says that libpq
should not rely on palloc(). Besides, Tom Lane said that palloc
should not be visible outside the backend (*1), and I agree with
that.
*1: http://archives.postgresql.org/pgsql-hackers/1999-02/msg00364.php
On the other hand, dblink, dblink-plus (our product!), and
perhaps FDWs that connect to other PostgreSQL servers seem to
copy the query result contained in the PGresult into a
tuplestore. I guess this is done to avoid leaking memory when
execution is terminated halfway through.
But copying the whole PGresult is rather expensive, and the cost
grows as the received data gets larger. Furthermore, it requires
about twice as much memory as the net size of the data. And it
would be fruitless to copy and modify libpq, or to reinvent it
from scratch. So it would be nice to be able to use palloc in
libpq, at least for PGresult, in such cases, in spite of the policy.
For these reasons, I propose to make the allocators for PGresult
replaceable.
You could use the resource owner mechanism to track them. Register a
callback function with RegisterResourceReleaseCallback(). Whenever a
PGresult is returned from libpq, add it to e.g a linked list, kept in
TopMemoryContext, and also store a reference to CurrentResourceOwner in
the list element. In the callback function, scan through the list and
free all the PGresults associated with the resource owner that's being
released.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
On 11.11.2011 11:18, Kyotaro HORIGUCHI wrote:
The comment at the beginning of pqexpbuffer.c says that libpq
should not rely on palloc(). Besides, Tom Lane said that palloc
should not be visible outside the backend (*1), and I agree with
that.
*1: http://archives.postgresql.org/pgsql-hackers/1999-02/msg00364.php
On the other hand, dblink, dblink-plus (our product!), and
perhaps FDWs that connect to other PostgreSQL servers seem to
copy the query result contained in the PGresult into a
tuplestore. I guess this is done to avoid leaking memory when
execution is terminated halfway through.
But copying the whole PGresult is rather expensive, and the cost
grows as the received data gets larger. Furthermore, it requires
about twice as much memory as the net size of the data. And it
would be fruitless to copy and modify libpq, or to reinvent it
from scratch. So it would be nice to be able to use palloc in
libpq, at least for PGresult, in such cases, in spite of the policy.
For these reasons, I propose to make the allocators for PGresult
replaceable.
You could use the resource owner mechanism to track them.
Heikki's idea is probably superior so far as PG backend usage is
concerned in isolation, but I wonder if there are scenarios where a
client application would like to be able to manage libpq's allocations.
If so, Kyotaro-san's approach would solve more problems than just
dblink's.
However, the bigger picture here is that I think Kyotaro-san's desire to
not have dblink return a tuplestore may be misplaced. Tuplestores can
spill to disk, while PGresults don't; so the larger the result, the
more important it is to push it into a tuplestore and PQclear it as soon
as possible.
Despite that worry, it'd likely be a good idea to adopt one or the other
of these solutions anyway, because I think there are corner cases where
dblink.c can leak a PGresult --- for instance, what if dblink_res_error
fails due to out-of-memory before reaching PQclear? And we could get
rid of the awkward and none-too-cheap PG_TRY blocks that it uses to try
to defend against such leaks in other places.
So I'm in favor of making a change along that line, although I'd want
to see more evidence before considering changing dblink to not return
tuplestores.
regards, tom lane
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
Heikki's idea is probably superior so far as PG backend usage is
concerned in isolation, but I wonder if there are scenarios where a
client application would like to be able to manage libpq's allocations.
The answer to that is certainly 'yes'. It was one of the first things
that I complained about when moving from Oracle to PG. With OCI, you
can bulk load results directly into application-allocated memory areas.
Haven't been following the dblink discussion, so not going to comment
about that piece.
Thanks,
Stephen
Stephen Frost <sfrost@snowman.net> writes:
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
Heikki's idea is probably superior so far as PG backend usage is
concerned in isolation, but I wonder if there are scenarios where a
client application would like to be able to manage libpq's allocations.
The answer to that is certainly 'yes'. It was one of the first things
that I complained about when moving from Oracle to PG. With OCI, you
can bulk load results directly into application-allocated memory areas.
Well, loading data in a form whereby the application can access it
without going through the PGresult accessor functions would be an
entirely different (and vastly larger) project. I'm not sure I want
to open that can of worms --- it seems like you could write a huge
amount of code trying to provide every format someone might want,
and still find that there were impedance mismatches for many
applications.
AIUI Kyotaro-san is just suggesting that the app should be able to
provide a substitute malloc function for use in allocating PGresult
space (and not, I think, anything else that libpq allocates internally).
Basically this would allow PGresults to be cleaned up with methods other
than calling PQclear on each one. It wouldn't affect how you'd interact
with one while you had it. That seems like pretty much exactly what we
want for preventing memory leaks in the backend; but is it going to be
useful for other apps?
regards, tom lane
On Sat, Nov 12, 2011 at 12:48 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
AIUI Kyotaro-san is just suggesting that the app should be able to
provide a substitute malloc function for use in allocating PGresult
space (and not, I think, anything else that libpq allocates internally).
Basically this would allow PGresults to be cleaned up with methods other
than calling PQclear on each one. It wouldn't affect how you'd interact
with one while you had it. That seems like pretty much exactly what we
want for preventing memory leaks in the backend; but is it going to be
useful for other apps?
I think it will.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Kyotaro HORIGUCHI <horiguchi.kyotaro@oss.ntt.co.jp> writes:
Hello. This message is a proposal of a pair of patches that
enables the memory allocator for PGresult in libpq to be
replaced.
Since there seems to be rough consensus that something like this would
be a good idea, I looked more closely at the details of the patch.
I think the design could use some adjustment.
To start with, the patch proposes exposing some global variables that
affect the behavior of libpq process-wide. This seems like a pretty bad
design, because a single process could contain multiple usages of libpq
with different requirements. As an example, if dblink.c were to set
these variables inside a backend process, it would break usage of libpq
from PL/Perl via DBI. Global variables also tend to be a bad idea
whenever you think about multi-threaded applications --- they require
locking facilities, which are not in this patch.
I think it'd be better to consider the PGresult alloc/free functions to
be a property of a PGconn, which you'd set with a function call along the
lines of PQsetResultAllocator(conn, alloc_func, realloc_func, free_func)
after having successfully opened a connection. Then we just have some
more PGconn fields (and I guess PGresult will need a copy of the
free_func pointer) and no new global variables.
I am also feeling dubious about whether it's a good idea to expect the
functions to have exactly the signature of malloc/free. They are
essentially callbacks, and in most places where a library provides for
callbacks, it's customary to include a "void *" passthrough argument
in case the callback needs some context information. I am not sure that
dblink.c would need such a thing, but if we're trying to design a
general-purpose feature, then we probably should have it. The cost
would be having shim functions inside libpq for the default case, but
it doesn't seem likely that they'd cost enough to notice.
The patch lacks any user documentation, which it surely must have if
we are claiming this is a user-visible feature. And I think it could
use some attention to updating code comments, notably the large block
about PGresult space management near the top of fe-exec.c.
Usually, when writing a feature of this sort, it's a good idea to
implement a prototype use-case to make sure you've not overlooked
anything. So I'd feel happier about the patch if it came along with
a patch to make dblink.c use it to prevent memory leaks.
regards, tom lane
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
On 11.11.2011 11:18, Kyotaro HORIGUCHI wrote:
For these reasons, I propose to make allocators for PGresult
replaceable.
You could use the resource owner mechanism to track them.
BTW, I just thought of a potentially fatal objection to making PGresult
allocation depend on palloc: libpq is absolutely not prepared to handle
losing control on out-of-memory. While I'm not certain that its
behavior with malloc is entirely desirable either (it tends to loop in
hopes of getting the memory next time), we cannot just plop in palloc
in place of malloc and imagine that we're not breaking it.
This makes me think that Heikki's approach is by far the more tenable
one, so far as dblink is concerned. Perhaps the substitute-malloc idea
is still useful for some other application, but I'm inclined to put that
idea on the back burner until we have a concrete use case for it.
regards, tom lane
On 12/11/2011 07:36, Robert Haas wrote:
On Sat, Nov 12, 2011 at 12:48 AM, Tom Lane<tgl@sss.pgh.pa.us> wrote:
AIUI Kyotaro-san is just suggesting that the app should be able to
provide a substitute malloc function for use in allocating PGresult
space (and not, I think, anything else that libpq allocates internally).
Basically this would allow PGresults to be cleaned up with methods other
than calling PQclear on each one. It wouldn't affect how you'd interact
with one while you had it. That seems like pretty much exactly what we
want for preventing memory leaks in the backend; but is it going to be
useful for other apps?
I think it will.
Maybe I'm just talking nonsense, as I have only a little
experience hacking the pgsql and pdo-pgsql extensions, but this
seems like something that could easily avoid an extra duplication
of the data returned by PQgetvalue. To me that seems like a
pretty nice win.
Cheers
--
Matteo Beccati
Development & Consulting - http://www.beccati.com/
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
Well, loading data in a form whereby the application can access it
without going through the PGresult accessor functions would be an
entirely different (and vastly larger) project.
Looking through the thread, I agree that it's a different thing than
what's being discussed here.
I'm not sure I want
to open that can of worms --- it seems like you could write a huge
amount of code trying to provide every format someone might want,
and still find that there were impedance mismatches for many
applications.
The OCI approach is actually very similar to how we handle our
catalogs internally.. Imagine you define a C struct that matches
your table structure, then allocate 5000 (or however many) of
those, and give the base pointer to the 'getResult' call along
with an integer array of offsets into that structure for each of
the columns. There were a few other minor things (like some
notion of how to handle NULLs), but it was pretty
straightforward from the C perspective, imv.
Trying to provide alternative formats (I'm guessing you were referring
to something like XML..? Or some complex structure?) would certainly be
a whole different ballgame.
Thanks,
Stephen
Hello,
At Fri, 11 Nov 2011 11:29:30 +0200, Heikki Linnakangas wrote
You could use the resource owner mechanism to track
them. Register a callback function with
RegisterResourceReleaseCallback().
Thank you for letting me know about it. I have dug up a message
in pgsql-hackers referring to this mechanism, in a discussion
about postgresql-fdw. I'll put further thought into dblink-plus,
taking it into account.
By the way, memory management for the result in libpq is worth
considering as a separate issue.
At Sat, 12 Nov 2011 12:29:50 -0500, Tom Lane wrote
To start with, the patch proposes exposing some global
variables that affect the behavior of libpq process-wide. This
seems like a pretty bad design, because a single process could
contain multiple usages of libpq
You're right that the design is bad. I designed it to have
minimal impact on libpq by limiting its usage and placing all
responsibility on the users, that is, the developers of the
modules using it. If there are other applications that want to
use their own allocators, there are some points to consider.
Considering multi-threading, I think it is preferable to make
libpq write the PGresult into memory blocks passed in by the
application, as OCI does, instead of having libpq itself request
them.
This approach hands the responsibility of memory management to
the user and gives them the capability to avoid memory exhaustion
by their own measures.
On the other hand, this approach can produce a situation where
libpq cannot fit all of the data received from the server into
the memory block it was handed. So the API must be able to
return partial data to the caller.
Going further, if libpq could store the result directly into
user-allocated memory through a tuplestore-like interface,
performance would be even better when the final destination is a
tuplestore itself.
I would be happy with part-by-part passing of the result, so I
will consider this as the next issue.
So I'd feel happier about the patch if it came along with a
patch to make dblink.c use it to prevent memory leaks.
I take it this is about my original patch.
Mmm, I heard that dblink copies the received data from the
PGresult into a tuplestore not only because of the memory leak,
but also for lower memory usage (after the copy is finished). I
think I could show you a patch that ignores the latter, but it
might take me some time to understand dblink and tuplestore
closely enough...
If I find RegisterResourceReleaseCallback insufficient for our
requirements, I will show it. If not, I will withdraw this patch
from the ongoing CF and propose another patch based on the
discussion above at another time.
Please let me have a little more time.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
Hello,
me> I'll put further thought into dblink-plus taking it into
me> account.
..
me> Please let me have a little more time.
I've asked the developer of dblink-plus about
RegisterResourceReleaseCallback(). He said that the function is
a poor fit for the current implementation. In addition to this,
storing directly into a tuplestore seems to me a better idea
than a palloc'ed PGresult.
So I tried to make libpq/PGresult able to hand received tuples
to an alternative tuple store via hints on the PGconn, and
modified dblink to use the mechanism as the first sample code.
I will show it as a series of patches in next message.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
Hello, This is the next version of Allow substitute allocators
for PGresult.
Completely changing the concept from the previous one, this
patch allows libpq to hand received tuples to an alternative
tuple store.
Design guidelines are shown below.
- No need to modify existing client code of libpq.
- Existing libpq clients run with roughly the same performance,
  and dblink with the modification runs somewhat faster and
  requires less memory.
I roughly measured run time and memory requirements for three
configurations on CentOS 6 in VirtualBox (2GB RAM, 4 cores)
running on Win7/Core i7, transferring (30 bytes * 2 cols) *
2,000,000 tuples (120MB net) within this virtual machine. The
results are below.
                                xfer time   Peak RSS
Original                      :   6.02s      850MB
libpq patch + original dblink :   6.11s      850MB
full patch                    :   4.44s      643MB
The xfer time here is the mean of five 'real' times measured by
running an SQL script like the following after a warmup run.
=== test.sql
select dblink_connect('c', 'host=localhost port=5432 dbname=test');
select * from dblink('c', 'select a,c from foo limit 2000000') as (a text, b bytea) limit 1;
select dblink_disconnect('c');
===
$ for i in $(seq 1 10); do time psql test -f t.sql; done
===
Peak RSS is measured by picking up heap Rss in /proc/[pid]/smaps.
It seems somewhat slower with patched libpq and the original
dblink, but that also seems to be within the error range. If
this amount of slowdown is not permissible, it might be improved
by restoring the previous static call route, at the cost of some
redundancy in the code.
On the other hand, the full-patch version is clearly faster and
requires less memory. Isn't that nice?
This patch consists of two sub patches.
The first is a patch for libpq that allows the tuple storage
mechanism to be replaced. The default behavior is unchanged, so
existing libpq clients should run with it as-is.
The second modifies dblink to store received tuples directly
into a tuplestore using the mechanism above.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
Ouch! I'm sorry for making a reverse patch for the first modification.
This is an amendment of the message below. The body text is
copied into this message.
http://archives.postgresql.org/message-id/20111201.192419.103527179.horiguchi.kyotaro@oss.ntt.co.jp
=======
Hello, This is the next version of Allow substitute allocators
for PGresult.
Completely changing the concept from the previous one, this
patch allows libpq to hand received tuples to an alternative
tuple store.
Design guidelines are shown below.
- No need to modify existing client code of libpq.
- Existing libpq clients run with roughly the same performance,
  and dblink with the modification runs somewhat faster and
  requires less memory.
I roughly measured run time and memory requirements for three
configurations on CentOS 6 in VirtualBox (2GB RAM, 4 cores)
running on Win7/Core i7, transferring (30 bytes * 2 cols) *
2,000,000 tuples (120MB net) within this virtual machine. The
results are below.
                                xfer time   Peak RSS
Original                      :   6.02s      850MB
libpq patch + original dblink :   6.11s      850MB
full patch                    :   4.44s      643MB
The xfer time here is the mean of five 'real' times measured by
running an SQL script like the following after a warmup run.
=== test.sql
select dblink_connect('c', 'host=localhost port=5432 dbname=test');
select * from dblink('c', 'select a,c from foo limit 2000000') as (a text, b bytea) limit 1;
select dblink_disconnect('c');
===
$ for i in $(seq 1 10); do time psql test -f t.sql; done
===
Peak RSS is measured by picking up heap Rss in /proc/[pid]/smaps.
It seems somewhat slower with patched libpq and the original
dblink, but that also seems to be within the error range. If
this amount of slowdown is not permissible, it might be improved
by restoring the previous static call route, at the cost of some
redundancy in the code.
On the other hand, the full-patch version is clearly faster and
requires less memory. Isn't that nice?
This patch consists of two sub patches.
The first is a patch for libpq that allows the tuple storage
mechanism to be replaced. The default behavior is unchanged, so
existing libpq clients should run with it as-is.
The second modifies dblink to store received tuples directly
into a tuplestore using the mechanism above.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
Hello,
The documentation had slipped my mind.
This is a patch that adds documentation for the PGresult custom
storage feature. It appears in section '31.19. Alternative
result storage'.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
libpq_subst_storage_doc_v1.patch (text/x-patch; charset=us-ascii) +319 -0
On 12/01/2011 05:48 AM, Kyotaro HORIGUCHI wrote:
                                xfer time   Peak RSS
Original                      :   6.02s      850MB
libpq patch + original dblink :   6.11s      850MB
full patch                    :   4.44s      643MB
These look like interesting results. Currently Tom is listed as the
reviewer on this patch, based on comments made before the CF really
started. And the patch has incorrectly been sitting in "Waiting
for author" for the last week; oops. I'm not sure what to do with this
one now except raise a general call to see if anyone wants to take a
look at it, now that it seems to be in good enough shape to deliver
measurable results.
--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.us
Greg Smith <greg@2ndQuadrant.com> writes:
On 12/01/2011 05:48 AM, Kyotaro HORIGUCHI wrote:
                                xfer time   Peak RSS
Original                      :   6.02s      850MB
libpq patch + original dblink :   6.11s      850MB
full patch                    :   4.44s      643MB
These look like interesting results. Currently Tom is listed as the
reviewer on this patch, based on comments made before the CF really
started. And the patch has been incorrectly been sitting in "Waiting
for author" for the last week; oops. I'm not sure what to do with this
one now except raise a general call to see if anyone wants to take a
look at it, now that it seems to be in good enough shape to deliver
measurable results.
I did list myself as reviewer some time ago, but if anyone else wants to
take it I won't be offended ;-)
regards, tom lane
On Thu, Dec 8, 2011 at 5:41 AM, Kyotaro HORIGUCHI
<horiguchi.kyotaro@oss.ntt.co.jp> wrote:
This is a patch that adds documentation for the PGresult custom
storage feature. It appears in section '31.19. Alternative
result storage'.
It would be good to consolidate this into the main patch.
I find the names of the functions added here to be quite confusing and
would suggest renaming them. I expected PQgetAsCstring to do
something similar to PQgetvalue, but the code is completely different,
and even after reading the documentation I still don't understand what
that function is supposed to be used for. Why "as cstring"? What
would the other option be?
I also don't think the "add tuple" terminology is particularly good.
It's not obvious from the name that what you're doing is overriding
the way memory is allocated and results are stored.
Also, what about the problem Tom mentioned here?
http://archives.postgresql.org/message-id/1042.1321123761@sss.pgh.pa.us
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Hello, thank you for taking the time to comment.
At Wed, 21 Dec 2011 11:09:59 -0500, Robert Haas <robertmhaas@gmail.com> wrote
I find the names of the functions added here to be quite
confusing and would suggest renaming them. I expected
PQgetAsCstring to do something similar to PQgetvalue, but the
code is completely different,
To be honest, I've also felt that kind of perplexity. If the
problem is simply one of naming, I can propose another name,
"PQreadAttValue"... though that is not great either...
But...
and even after reading the documentation I still don't
understand what that function is supposed to be used for. Why
"as cstring"? What would the other option be?
Is it a problem with the poor description, or with the raison
d'être of the function?
The immediate cause of the function's existence is that
getAnotherTuple internally stores the field values of the tuples
sent from the server in the form of PGresAttValue, and I have
found only one route to store a tuple into a tuplestore:
BuildTupleFromCStrings() followed by tuplestore_puttuple(),
which is what dblink does in materializeResult(). And of course
a C string is the most natural format in a C program, I
hesitated to modify execTuples.c, and I wanted to hide the
details of PGresAttValue.
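That route, the same one dblink's materializeResult() takes, can be sketched as backend code. This fragment assumes backend headers and an already-initialized Tuplestorestate and AttInMetadata, so it is an illustration rather than a standalone program:

```c
#include "postgres.h"
#include "funcapi.h"
#include "utils/tuplestore.h"

/* Sketch of the BuildTupleFromCStrings() -> tuplestore_puttuple() route.
 * 'values' holds one C string (or NULL for SQL NULL) per column. */
static void
store_row_as_cstrings(Tuplestorestate *tupstore,
                      AttInMetadata *attinmeta,
                      char **values)
{
    /* Convert the C-string representation of one row into a HeapTuple
     * using the result's input metadata... */
    HeapTuple tuple = BuildTupleFromCStrings(attinmeta, values);

    /* ...and append it to the tuplestore, which may spill to disk. */
    tuplestore_puttuple(tupstore, tuple);
}
```

This is why the intermediate representation the "add tuple" callback exposes ends up mattering: whatever form libpq hands over has to be convertible to C strings (or tuples) cheaply at this point.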
Assuming that the values are passed as PGresAttValue* (for
reasons of performance and to limit the extent of the
modification), the "adding tuple" functions have to get the
value out of that struct. From the viewpoint of encapsulation
and convenience, this can be done in two ways: with knowledge of
the structure, or without it. PQgetAsCstring is the latter
approach. (And it is inconsistent with the fact that the
definition of PGresAttValue was moved from libpq-int.h into
libpq-fe.h; the details of the structure should be hidden, like
PGresult, in this approach.)
But it is not obvious that this choice is better than the other.
If we consider PGresAttValue too simple and stable to be worth
hiding, PQgetAsCstring can be taken out and the problem goes
away. PGresAttValue would need documentation in that case.
I prefer to handle PGresAttValue directly if there is no problem
with that.
I also don't think the "add tuple" terminology is particularly good.
It's not obvious from the name that what you're doing is overriding
the way memory is allocated and results are stored.
This phrase was originally taken from pqAddTuple() in fe-exec.c
and has not been changed since that function was integrated with
the others.
I propose "tuple storage handler" as an alternative.
- typedef void *(*addTupleFunction)(...);
+ typedef void *(*tupleStorageHandler)(...);
- typedef enum { ADDTUP_*, } AddTupFunc;
+ typedef enum { TSHANDLER_*, } TSHandlerCommand;
- void *PQgetAddTupleParam(...);
+ void *PQgetTupleStorageHandlerContext(...);
- void PQregisterTupleAdder(...);
+ void PQregisterTupleStoreHandler(...);
- addTupleFunction PGresult.addTupleFunc;
+ tupleStorageHandler PGresult.tupleStorageHandlerFunc;
- void *PGresult.addTupleFuncParam;
+ void *PGresult.tupleStorageHandlerContext;
- char *PGresult.addTupleFuncErrMes;
+ char *PGresult.tupleStorageHandlerErrMes;
Also, what about the problem Tom mentioned here?
http://archives.postgresql.org/message-id/1042.1321123761@sss.pgh.pa.us
The plan to simply replace malloc with something like palloc has
been abandoned because of its narrow scope.
dblink-plus copies the whole PGresult into a tuplestore in order
to avoid orphaning memory on SIGINT. The resource owner
mechanism is in principle applicable to that, but practically
hard, because the current implementation could not accept it
without radical modification. In addition, dblink also does the
same thing, maybe for the same reason as dblink-plus, plus
another reason as far as I have heard.
Whatever the reason, both dblink and dblink-plus do the same
thing, which can lower performance below what is expected.
If a tuplestore (TupleDesc) is preferable to PGresult for
in-backend use, while ordinary (client-side) libpq users can
handle only PGresult, then a mechanism like this patch would be
required to maintain compatibility, I think. Conversely, if
there is no significant reason to use a tuplestore in the
backend - that is, if existing mechanisms like resource owners
can cheaply save the backend from the possible inconvenience of
using PGresult storage in backends - then PGresult should be
used as-is.
I think a tuplestore prepared for backend use is preferable for
this usage, and I don't want the data to make a detour via
PGresult.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
One patch that fell off the truck during a turn in the November
CommitFest was "Allow substitute allocators for PGresult". Re-reading
the whole thing again, it actually turned into a rather different
submission in the middle, and I know I didn't follow that shift
correctly then. I'm replying to its thread but have changed the subject
to reflect that change. From a procedural point of view, I don't feel
right kicking this back to its author on a Friday night when the
deadline for resubmitting it would be Sunday. Instead I've refreshed
the patch myself and am adding it to the January CommitFest. The new
patch is a single file; it's easy enough to split out the dblink changes
if someone wants to work with the pieces separately.
After my meta-review I think we should get another reviewer familiar
with using dblink to look at this next. This is fundamentally a
performance patch now. Some results and benchmarking code were
submitted along with it; the other issues are moot if those aren't
reproducible. The secondary goal for a new review here is to provide
another opinion on the naming issues and abstraction concerns raised so far.
To clear out the original line of thinking, this is not a replacement
low-level storage allocator anymore. The idea of using such a mechanism
to help catch memory leaks has also been dropped.
Instead this adds and documents a new path for libpq callers to more
directly receive tuples, for both improved speed and lower memory
usage. dblink has been modified to use this new mechanism.
Benchmarking by the author suggests no significant change in libpq speed
when only that change was made, while the modified dblink using the new
mechanism was significantly faster. It jumped from 332K tuples/sec to
450K, a 35% gain, and had a lower memory footprint too. Test
methodology and those results are at
http://archives.postgresql.org/pgsql-hackers/2011-12/msg00008.php
Robert Haas did a quick code review of this already; the review,
with the author's responses mixed in, is at
http://archives.postgresql.org/pgsql-hackers/2011-12/msg01149.php
I see two areas of contention there:
-There are several naming bits no one is happy with yet. Robert didn't
like some of them, but neither did Kyotaro. I don't have an opinion
myself. Is it the case that some changes to the existing code's
terminology are what's actually needed to make this all better? Or is
this just fundamentally warty and there's nothing to be done about it.
Dunno.
-There is an abstraction wrapper vs. coding convenience trade-off
centering around PGresAttValue. It sounded to me like it raised always
fun questions like "where's the right place for the line between
libpq-fe.h and libpq-int.h to be?"
dblink is pretty popular, and this is a big performance win for it. If
naming and API boundary issues are the worst problems here, this sounds
like something well worth pursuing as part of 9.2's still advancing
performance theme.
--
Greg Smith 2ndQuadrant US greg@2ndQuadrant.com Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com