pg_background (and more parallelism infrastructure patches)
Attached is a contrib module that lets you launch arbitrary commands in
a background worker, and supporting infrastructure patches for core.
You can launch queries and fetch the results back, much as you could
do with a dblink connection back to the local database but without the
hassles of dealing with authentication; and you can also run utility
commands, like VACUUM. For people who have always wanted to be able
to launch a vacuum (or an autonomous transaction, or a background
task) from a procedural language ... enjoy.
Here's an example of running vacuum and then fetching the results.
Notice that the notices from the background worker are propagated to
our session; if an error had occurred, it would be re-thrown locally
when we try to read the results.
rhaas=# create table foo (a int);
CREATE TABLE
rhaas=# select pg_background_launch('vacuum verbose foo');
pg_background_launch
----------------------
51101
(1 row)
rhaas=# select * from pg_background_result(51101) as (x text);
INFO: vacuuming "public.foo"
INFO: "foo": found 0 removable, 0 nonremovable row versions in 0 out of 0 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
x
--------
VACUUM
(1 row)
Here's an overview of the attached patches:
Patches 1 and 2 add a few new interfaces to the shm_mq and dsm APIs
that happen to be convenient for later patches in the series. I'm
pretty sure I could make all this work without these, but it would
take more code and be less efficient, so here they are.
Patch 3 adds the ability for a backend to request that the protocol
messages it would normally send to the frontend get redirected to a
shm_mq. I did this by adding a couple of hook functions. The best
design is definitely arguable here, so if you'd like to bikeshed, this
is probably the patch to look at. This patch also adds a function to
help you parse an ErrorResponse or NoticeResponse and re-throw the
error or notice in the originating backend. Obviously, parallelism is
going to need this kind of functionality, but I suspect a variety of
other applications people may develop using background workers may
want it too; and it's certainly important for pg_background itself.
Patch 4 adds infrastructure that allows one session to save all of its
non-default GUC values and another session to reload those values.
This was written by Amit Khandekar and Noah Misch. It allows
pg_background to start up the background worker with the same GUC
settings that the launching process is using. I intend this as a
demonstration of how to synchronize any given piece of state between
cooperating backends. For real parallelism, we'll need to synchronize
snapshots, combo CIDs, transaction state, and so on, in addition to
GUCs. But GUCs are ONE of the things that we'll need to synchronize
in that context, and this patch shows the kind of API we're thinking
about for these sorts of problems.
Patch 5 is a trivial patch to add a function to get the authenticated
user ID. Noah pointed out to me that it's important for the
authenticated user ID, session user ID, and current user ID to all
match between the original session and the background worker.
Otherwise, pg_background could be used to circumvent restrictions that
we normally impose when those values differ from each other. The
session and current user IDs are restored by the GUC save-and-restore
machinery ("session_authorization" and "role") but the authenticated
user ID requires special treatment. To make that happen, it has to be
exposed somehow.
Patch 6 is pg_background itself. I'm quite pleased with how easily
this came together. The existing background worker, dsm, shm_toc, and
shm_mq infrastructure handles most of the heavy lifting here -
obviously with some exceptions addressed by the preceding patches.
Again, this is the kind of set-up that I'm expecting will happen in a
background worker used for actual parallelism - clearly, more state
will need to be restored there than here, but nonetheless the general
flow of the code here is about what I'm imagining, just with somewhat
more different kinds of state. Most of the work of writing this patch
was actually figuring out how to execute the query itself; what I
ended up with is mostly copied from exec_simple_query, but with some
differences here and there. I'm not sure if it would be
possible/advisable to try to refactor to reduce duplication.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Attachments:
0001-Extend-shm_mq-API-with-new-functions-shm_mq_sendv-an.patch (+124/-17)
0002-Extend-dsm-API-with-a-new-function-dsm_unkeep_mappin.patch (+19/-1)
0003-Support-frontend-backend-protocol-communication-usin.patch (+295/-4)
0004-Add-infrastructure-to-save-and-restore-GUC-values.patch (+402/-1)
0005-Add-a-function-to-get-the-authenticated-user-ID.patch (+11/-1)
0006-pg_background-Run-commands-in-a-background-worker-an.patch (+959/-1)
On Fri, Jul 25, 2014 at 02:11:32PM -0400, Robert Haas wrote:
+    pq_mq_busy = true;
+
+    iov[0].data = &msgtype;
+    iov[0].len = 1;
+    iov[1].data = s;
+    iov[1].len = len;
+
+    Assert(pq_mq_handle != NULL);
+    result = shm_mq_sendv(pq_mq_handle, iov, 2, false);
+
+    pq_mq_busy = false;
Don't you need a PG_TRY block here to reset pq_mq_busy?
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hi,
On 2014-07-25 14:11:32 -0400, Robert Haas wrote:
Attached is a contrib module that lets you launch arbitrary commands in
a background worker, and supporting infrastructure patches for core.
Cool.
I assume this 'fell out' of the work towards parallelism? Do you think
all of the patches (except the contrib one) are required for that, or are
some, e.g. 3), only required to demonstrate the others?
Patch 3 adds the ability for a backend to request that the protocol
messages it would normally send to the frontend get redirected to a
shm_mq. I did this by adding a couple of hook functions. The best
design is definitely arguable here, so if you'd like to bikeshed, this
is probably the patch to look at.
Uh. This doesn't sound particularly nice. Shouldn't this rather be
clearly layered by making reading/writing from the client a proper API
instead of adding hook functions here and there?
Also, you seem to have only touched receiving from the client, and not
sending back to the subprocess. Is that actually sufficient? I'd expect
that for this facility to be fully useful it'd have to be two way
communication. But perhaps I'm overestimating what it could be used for.
This patch also adds a function to
help you parse an ErrorResponse or NoticeResponse and re-throw the
error or notice in the originating backend. Obviously, parallelism is
going to need this kind of functionality, but I suspect a variety of
other applications people may develop using background workers may
want it too; and it's certainly important for pg_background itself.
I would have had use for it previously.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Fri, Jul 25, 2014 at 4:16 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
On Fri, Jul 25, 2014 at 02:11:32PM -0400, Robert Haas wrote:
+    pq_mq_busy = true;
+
+    iov[0].data = &msgtype;
+    iov[0].len = 1;
+    iov[1].data = s;
+    iov[1].len = len;
+
+    Assert(pq_mq_handle != NULL);
+    result = shm_mq_sendv(pq_mq_handle, iov, 2, false);
+
+    pq_mq_busy = false;

Don't you need a PG_TRY block here to reset pq_mq_busy?
No. If shm_mq_sendv is interrupted, we can't use the shm_mq any more.
But since that should only happen if an interrupt arrives while the
queue is full, I think that's OK. (Think about the alternatives: if
the queue is full, we have no way of notifying the launching process
without waiting for it to retrieve the results, but it might not do
that right away, and if we've been killed we need to die *now* not
later.)
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Sat, Jul 26, 2014 at 4:37 AM, Andres Freund <andres@2ndquadrant.com> wrote:
On 2014-07-25 14:11:32 -0400, Robert Haas wrote:
Attached is a contrib module that lets you launch arbitrary commands in
a background worker, and supporting infrastructure patches for core.

Cool.
I assume this 'fell out' of the work towards parallelism? Do you think
all of the patches (except the contrib one) are required for that, or are
some, e.g. 3), only required to demonstrate the others?
I'm fairly sure that patches 3, 4, and 5 are all required in some form
as building blocks for parallelism. Patch 1 contains two functions,
one of which (shm_mq_set_handle) I think is generally useful for
people using background workers, but not absolutely required; and one
of which is infrastructure for patch 3 which might not be necessary
with different design choices (shm_mq_sendv). Patch 2 is only
included because pg_background can benefit from it; we could instead
use an eoxact callback, at the expense of doing cleanup at
end-of-transaction rather than end-of-query. But it's a mighty small
patch and seems like a reasonable extension to the API, so I lean
toward including it.
Patch 3 adds the ability for a backend to request that the protocol
messages it would normally send to the frontend get redirected to a
shm_mq. I did this by adding a couple of hook functions. The best
design is definitely arguable here, so if you'd like to bikeshed, this
is probably the patch to look at.

Uh. This doesn't sound particularly nice. Shouldn't this rather be
clearly layered by making reading/writing from the client a proper API
instead of adding hook functions here and there?
I don't know exactly what you have in mind here. There is an API for
writing to the client that is used throughout the backend, but right
now "the client" always has to be a socket. Hooking a couple of parts
of that API lets us write someplace else instead. If you've got
another idea how to do this, suggest away...
Also, you seem to have only touched receiving from the client, and not
sending back to the subprocess. Is that actually sufficient? I'd expect
that for this facility to be fully useful it'd have to be two way
communication. But perhaps I'm overestimating what it could be used for.
Well, the basic shm_mq infrastructure can be used to send any kind of
messages you want between any pair of processes that care to establish
them. But in general I expect that data is going to flow mostly in
one direction - the user backend will launch workers and give them an
initial set of instructions, and then results will stream back from
the workers to the user backend. Other messaging topologies are
certainly possible, and probably useful for something, but I don't
really know exactly what those things will be yet, and I'm not sure
the FEBE protocol will be the right tool for the job anyway. But
error propagation, which is the main thrust of this, seems like a need
that will likely be pretty well ubiquitous.
This patch also adds a function to
help you parse an ErrorResponse or NoticeResponse and re-throw the
error or notice in the originating backend. Obviously, parallelism is
going to need this kind of functionality, but I suspect a variety of
other applications people may develop using background workers may
want it too; and it's certainly important for pg_background itself.

I would have had use for it previously.
Cool. I know Petr was interested as well (possibly for the same project?).
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2014-07-26 12:20:34 -0400, Robert Haas wrote:
On Sat, Jul 26, 2014 at 4:37 AM, Andres Freund <andres@2ndquadrant.com> wrote:
On 2014-07-25 14:11:32 -0400, Robert Haas wrote:
Attached is a contrib module that lets you launch arbitrary commands in
a background worker, and supporting infrastructure patches for core.

Cool.
I assume this 'fell out' of the work towards parallelism? Do you think
all of the patches (except the contrib one) are required for that, or are
some, e.g. 3), only required to demonstrate the others?

I'm fairly sure that patches 3, 4, and 5 are all required in some form
as building blocks for parallelism. Patch 1 contains two functions,
one of which (shm_mq_set_handle) I think is generally useful for
people using background workers, but not absolutely required; and one
of which is infrastructure for patch 3 which might not be necessary
with different design choices (shm_mq_sendv). Patch 2 is only
included because pg_background can benefit from it; we could instead
use an eoxact callback, at the expense of doing cleanup at
end-of-transaction rather than end-of-query. But it's a mighty small
patch and seems like a reasonable extension to the API, so I lean
toward including it.
Don't get me wrong, I don't object to anything in here. It's just that
the bigger picture can help in giving sensible feedback.
Patch 3 adds the ability for a backend to request that the protocol
messages it would normally send to the frontend get redirected to a
shm_mq. I did this by adding a couple of hook functions. The best
design is definitely arguable here, so if you'd like to bikeshed, this
is probably the patch to look at.

Uh. This doesn't sound particularly nice. Shouldn't this rather be
clearly layered by making reading/writing from the client a proper API
instead of adding hook functions here and there?

I don't know exactly what you have in mind here. There is an API for
writing to the client that is used throughout the backend, but right
now "the client" always has to be a socket. Hooking a couple of parts
of that API lets us write someplace else instead. If you've got
another idea how to do this, suggest away...
What I'm thinking of is providing an actual API for the writes instead
of hooking into the socket API in a couple places. I.e. have something
like
typedef struct DestIO DestIO;
struct DestIO
{
void (*flush)(struct DestIO *io);
int (*putbytes)(struct DestIO *io, const char *s, size_t len);
int (*getbytes)(struct DestIO *io, const char *s, size_t len);
...
}
and do everything through it. I haven't thought much about the specific
API we want, but abstracting the communication properly instead of
adding hooks here and there is imo much more likely to succeed in the
long term.
Also, you seem to have only touched receiving from the client, and not
sending back to the subprocess. Is that actually sufficient? I'd expect
that for this facility to be fully useful it'd have to be two way
communication. But perhaps I'm overestimating what it could be used for.

Well, the basic shm_mq infrastructure can be used to send any kind of
messages you want between any pair of processes that care to establish
them. But in general I expect that data is going to flow mostly in
one direction - the user backend will launch workers and give them an
initial set of instructions, and then results will stream back from
the workers to the user backend. Other messaging topologies are
certainly possible, and probably useful for something, but I don't
really know exactly what those things will be yet, and I'm not sure
the FEBE protocol will be the right tool for the job anyway.
It's imo not particularly unreasonable to e.g. COPY to/from a bgworker, which
would require the ability to both read and write from the other side.
But
error propagation, which is the main thrust of this, seems like a need
that will likely be pretty well ubiquitous.
Agreed.
This patch also adds a function to
help you parse an ErrorResponse or NoticeResponse and re-throw the
error or notice in the originating backend. Obviously, parallelism is
going to need this kind of functionality, but I suspect a variety of
other applications people may develop using background workers may
want it too; and it's certainly important for pg_background itself.

I would have had use for it previously.
Cool. I know Petr was interested as well (possibly for the same project?).
Well, I was aware of Petr's project, but I also have my own pet project
I'd been playing with :). Nothing real.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Mon, Jul 28, 2014 at 1:50 PM, Andres Freund <andres@2ndquadrant.com> wrote:
Don't get me wrong, I don't object to anything in here. It's just that
the bigger picture can help in giving sensible feedback.
Right. I did not get you wrong. :-)
The reason I'm making a point of it is that, if somebody wants to
object to the way those facilities are designed, it'd be good to get
that out of the way now rather than waiting until 2 or 3 patch sets
from now and then saying "Uh, could you guys go back and rework all
that stuff?". I'm not going to complain too loudly now if somebody
wants something in there done in a different way, but it's easier to
do that now while there's only pg_background sitting on top of it.
What I'm thinking of is providing an actual API for the writes instead
of hooking into the socket API in a couple places. I.e. have something
like

typedef struct DestIO DestIO;
struct DestIO
{
void (*flush)(struct DestIO *io);
int (*putbytes)(struct DestIO *io, const char *s, size_t len);
int (*getbytes)(struct DestIO *io, const char *s, size_t len);
...
}

and do everything through it. I haven't thought much about the specific
API we want, but abstracting the communication properly instead of
adding hooks here and there is imo much more likely to succeed in the
long term.
This sounds suspiciously like the DestReceiver thing we've already
got, except that the DestReceiver only applies to tuple results, not
errors and notices and so on. I'm not totally unamenable to a bigger
refactoring here, but right now it looks to me like a solution in
search of a problem. The hooks are simple and seem to work fine; I
don't want to add notation for its own sake.
Also, you seem to have only touched receiving from the client, and not
sending back to the subprocess. Is that actually sufficient? I'd expect
that for this facility to be fully useful it'd have to be two way
communication. But perhaps I'm overestimating what it could be used for.

Well, the basic shm_mq infrastructure can be used to send any kind of
messages you want between any pair of processes that care to establish
them. But in general I expect that data is going to flow mostly in
one direction - the user backend will launch workers and give them an
initial set of instructions, and then results will stream back from
the workers to the user backend. Other messaging topologies are
certainly possible, and probably useful for something, but I don't
really know exactly what those things will be yet, and I'm not sure
the FEBE protocol will be the right tool for the job anyway.

It's imo not particularly unreasonable to e.g. COPY to/from a bgworker, which
would require the ability to both read and write from the other side.
Well, that should work fine if the background worker and user backend
generate the CopyData messages via some bespoke code rather than
expecting to be able to jump into copy.c and have everything work. If
you want that to work, why? It doesn't make much sense for
pg_background, because I don't think it would be sensible for SELECT
pg_background_result(...) to return CopyInResponse or CopyOutResponse,
and even if it were sensible it doesn't seem useful. And I can't
think of any other application off-hand, either.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Sat, Jul 26, 2014 at 9:32 PM, Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, Jul 25, 2014 at 4:16 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:On Fri, Jul 25, 2014 at 02:11:32PM -0400, Robert Haas wrote:
+    pq_mq_busy = true;
+
+    iov[0].data = &msgtype;
+    iov[0].len = 1;
+    iov[1].data = s;
+    iov[1].len = len;
+
+    Assert(pq_mq_handle != NULL);
+    result = shm_mq_sendv(pq_mq_handle, iov, 2, false);
+
+    pq_mq_busy = false;

Don't you need a PG_TRY block here to reset pq_mq_busy?
No. If shm_mq_sendv is interrupted, we can't use the shm_mq any more.
But since that should only happen if an interrupt arrives while the
queue is full, I think that's OK.
I think that not only an interrupt but any other error in the
shm_mq_sendv() path (WaitLatch(), for example)
could lead to the same behaviour.
(Think about the alternatives: if
the queue is full, we have no way of notifying the launching process
without waiting for it to retrieve the results, but it might not do
that right away, and if we've been killed we need to die *now* not
later.)
So in such cases, what is the advice to users? Currently they will
see the below message:
postgres=# select * from pg_background_result(5124) as (x int);
ERROR: lost connection to worker process with PID 5124
One way is to ask them to check the logs, but what if they want
to handle the error and take some action?
Another point about error handling: to execute the SQL in
function pg_background_worker_main(), it starts a transaction
which I think doesn't get aborted if an error occurs; similarly, handling
of timeouts seems to be missing in the error path.
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Mon, Sep 8, 2014 at 10:39 AM, Amit Kapila <amit.kapila16@gmail.com>
wrote:
On Sat, Jul 26, 2014 at 9:32 PM, Robert Haas <robertmhaas@gmail.com>
wrote:
On Fri, Jul 25, 2014 at 4:16 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:On Fri, Jul 25, 2014 at 02:11:32PM -0400, Robert Haas wrote:
+    pq_mq_busy = true;
+
+    iov[0].data = &msgtype;
+    iov[0].len = 1;
+    iov[1].data = s;
+    iov[1].len = len;
+
+    Assert(pq_mq_handle != NULL);
+    result = shm_mq_sendv(pq_mq_handle, iov, 2, false);
+
+    pq_mq_busy = false;

Don't you need a PG_TRY block here to reset pq_mq_busy?
No. If shm_mq_sendv is interrupted, we can't use the shm_mq any more.
But since that should only happen if an interrupt arrives while the
queue is full, I think that's OK.

I think that not only an interrupt but any other error in the
shm_mq_sendv() path (WaitLatch(), for example)
could lead to the same behaviour.

(Think about the alternatives: if
the queue is full, we have no way of notifying the launching process
without waiting for it to retrieve the results, but it might not do
that right away, and if we've been killed we need to die *now* not
later.)

So in such cases, what is the advice to users? Currently they will
see the below message:
postgres=# select * from pg_background_result(5124) as (x int);
ERROR: lost connection to worker process with PID 5124

One way is to ask them to check the logs, but what if they want
to handle the error and take some action?

Another point about error handling: to execute the SQL in
function pg_background_worker_main(), it starts a transaction
which I think doesn't get aborted if an error occurs
For this I was just referring to the error handling code of
StartBackgroundWorker(); however, during shutdown of the process, it
will call AbortOutOfAnyTransaction(), which will take care of aborting
the transaction.
and similarly handling
of timeouts seems to be missing in the error path.

As we are going to exit on error anyway, I am not sure this will be
required; however, having it for a clean exit seems better.
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Mon, Sep 8, 2014 at 1:09 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
Don't you need a PG_TRY block here to reset pq_mq_busy?
No. If shm_mq_sendv is interrupted, we can't use the shm_mq any more.
But since that should only happen if an interrupt arrives while the
queue is full, I think that's OK.

I think that not only an interrupt but any other error in the
shm_mq_sendv() path (WaitLatch(), for example)
could lead to the same behaviour.
Probably true. But I think we generally count on that to be no-fail,
or close to it, so I'm not really worried about it.
(Think about the alternatives: if
the queue is full, we have no way of notifying the launching process
without waiting for it to retrieve the results, but it might not do
that right away, and if we've been killed we need to die *now* not
later.)

So in such cases, what is the advice to users? Currently they will
see the below message:
postgres=# select * from pg_background_result(5124) as (x int);
ERROR: lost connection to worker process with PID 5124

One way is to ask them to check the logs, but what if they want
to handle the error and take some action?
They have to check the logs. To reiterate what I said before, there
is no reasonable way to both have the background worker terminate
quickly and also guarantee that the full error message is received by
the process that started it. You have to pick one, and I stick by the
one I picked.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Mon, Sep 8, 2014 at 2:05 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:
Another point about error handling is that to execute the sql in
function pg_background_worker_main(), it starts the transaction
which I think doesn't get aborted if an error occurs

For this I was just referring to the error handling code of
StartBackgroundWorker(); however, during shutdown of the process, it
will call AbortOutOfAnyTransaction(), which will take care of aborting
the transaction.
Yeah.
and similarly handling
of timeouts seems to be missing in the error path.

As we are going to exit on error anyway, I am not sure this will be
required; however, having it for a clean exit seems better.

Can you be more specific?
Can you be more specific?
I'm generally of the view that there's little point in spending time
cleaning things up that will go away anyway when the process exits.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Tue, Sep 9, 2014 at 2:32 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Mon, Sep 8, 2014 at 2:05 AM, Amit Kapila <amit.kapila16@gmail.com>
wrote:
Another point about error handling is that to execute the sql in
function pg_background_worker_main(), it starts the transaction
which I think doesn't get aborted if an error occurs

For this I was just referring to the error handling code of
StartBackgroundWorker(); however, during shutdown of the process, it
will call AbortOutOfAnyTransaction(), which will take care of aborting
the transaction.

Yeah.
and similarly handling
of timeouts seems to be missing in the error path.

As we are going to exit on error anyway, I am not sure this will be
required; however, having it for a clean exit seems better.

Can you be more specific?

I was thinking to handle errors similar to what PostgresMain does;
however, if it is going to exit, then it might not be worth it.
I'm generally of the view that there's little point in spending time
cleaning things up that will go away anyway when the process exits.
Yeah, in that case it might not make much sense. The argument here
could be why it has to exit: why can't it wait till the launcher asks it
to exit? You mentioned in a previous mail that you want to stick to
the approach taken by the patch, but it is not clear why you think that
is better. If I think about how the worker backend should behave
in case the master backend assigns some task to the worker, then I think
it would be sensible for the worker not to exit after completing its task
(whether it has finished the execution of the assigned work or otherwise),
as the master backend can assign further tasks to the same worker.
Another advantage is that the setup and teardown cost of the worker will
be saved. Now, if we just think w.r.t. pg_background, it might not be so
important to let the worker wait till the launcher asks it to quit;
however, as you mentioned upthread that you are expecting this kind of
setup for actual parallelism, it might not be completely unreasonable to
expect the worker to wait till it has been asked to quit.
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On 29/07/14 18:51, Robert Haas wrote:
On Mon, Jul 28, 2014 at 1:50 PM, Andres Freund <andres@2ndquadrant.com> wrote:
What I'm thinking of is providing an actual API for the writes instead
of hooking into the socket API in a couple places. I.e. have something
like

typedef struct DestIO DestIO;
struct DestIO
{
void (*flush)(struct DestIO *io);
int (*putbytes)(struct DestIO *io, const char *s, size_t len);
int (*getbytes)(struct DestIO *io, const char *s, size_t len);
...
}

and do everything through it. I haven't thought much about the specific
API we want, but abstracting the communication properly instead of
adding hooks here and there is imo much more likely to succeed in the
long term.

This sounds suspiciously like the DestReceiver thing we've already
got, except that the DestReceiver only applies to tuple results, not
errors and notices and so on. I'm not totally unamenable to a bigger
refactoring here, but right now it looks to me like a solution in
search of a problem. The hooks are simple and seem to work fine; I
don't want to add notation for its own sake.
I am not sure if what Andres proposed is the right solution, but I do
agree here that the hook definitely isn't.
I am also not sure that I like pq_redirect_to_shm_mq being directly
exposed for use in a bgworker. What I would like is an elog interface
to which you tell that you want errors sent to an shm_mq handle, and
which would then do the necessary calls. (It's basically the same
principle, but for the sake of cleanliness/correctness I think it should
be elog and not pq from the point of view of the bgworker.)
So the interface needs work.
I do agree that we don't need two-way communication over this channel; I
think the shm_mq can be used directly quite well for the generic use case.
Otherwise I like the patch, and I am glad you also made the GUC
save/restore infrastructure.
Please don't forget to add this to next commitfest.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Sep 9, 2014 at 12:33 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:
I am not sure if what Andres proposed is the right solution, but I do agree
here that the hook definitely isn't.
Well, that doesn't get me very far. Andres never responded to my
previous comments about why I did it that way, and you're not
proposing a clear alternative that I can either implement or comment
on.
I am also not sure that I like the pq_redirect_to_shm_mq being directly
exposed for use in bgworker, what I would like is to have elog interface to
which you tell that you want errors sent to shm_mq handle and that one would
then do the necessary calls (It's basically same principle but for the sake
of cleanliness/correctness I think it should be elog and not pq from the
point of bgworker).
I think that's completely wrong. As the patch series demonstrates,
it's not limited to propagating ErrorResponse and NoticeResponse. It
can also propagate NotifyResponse and RowDescription and DataRow and
anything else that comes along. We are not just propagating errors;
we are propagating all protocol messages of whatever type. So tying
it to elog specifically is not right.
So the interface needs work.
I do agree that we don't need two way communication over this channel, I
think the dsm_mq can be used directly quite well for generic usecase.
I'm glad you agree, but that leaves me baffled as to what's wrong with
the hook approach. There's no crying need for extensibility in this
code; it's barely changed in a very long time, and I know of no other
need which might soon require changing it again. The hook is very
lightweight and shouldn't materially affect performance when it's not
used, as a more complex approach might.
Otherwise I like the patch, and I am glad you also made the GUC save/restore
infrastructure.
Cool.
Please don't forget to add this to next commitfest.
OK, done. But it would be awfully nice if we could actually make some
progress on hammering out a design everyone can live with here.
Letting it sit for another 5 weeks is not going to help anything, and
it will hold up getting any more stuff after this done in time for
9.5.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 09/09/14 18:49, Robert Haas wrote:
On Tue, Sep 9, 2014 at 12:33 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:
I am also not sure that I like the pq_redirect_to_shm_mq being directly
exposed for use in bgworker, what I would like is to have elog interface to
which you tell that you want errors sent to shm_mq handle and that one would
then do the necessary calls (It's basically same principle but for the sake
of cleanliness/correctness I think it should be elog and not pq from the
point of bgworker).
I think that's completely wrong. As the patch series demonstrates,
it's not limited to propagating ErrorResponse and NoticeResponse. It
can also propagate NotifyResponse and RowDescription and DataRow and
anything else that comes along. We are not just propagating errors;
we are propagating all protocol messages of whatever type. So tying
it to elog specifically is not right.
Oh, in that case, I think what Andres proposed is actually quite good. I
know the hook works fine; it just seems like a somewhat hackish
solution to save 20 lines of code.
The reason why I am not proposing anything better is that my solution,
when I did prototyping in this area, was to just check whether my pq_dsm_mq
handle is set or not and call the appropriate function from the wrapper
based on that, which does not seem like a big improvement over the hook
approach... Anything better/more flexible that I could come up with is
basically the same idea Andres already wrote.
Please don't forget to add this to next commitfest.
OK, done. But it would be awfully nice if we could actually make some
progress on hammering out a design everyone can live with here.
Letting it sit for another 5 weeks is not going to help anything, and
it will hold up getting any more stuff after this done in time for
9.5.
Yeah, I said that just as a formality; I wrote the response now and not in
5 weeks exactly for this reason. I am all for discussing this now, and I
think it's important to hammer this infrastructure out sooner rather
than later.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Tue, Sep 9, 2014 at 1:18 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:
I think that's completely wrong. As the patch series demonstrates,
it's not limited to propagating ErrorResponse and NoticeResponse. It
can also propagate NotifyResponse and RowDescription and DataRow and
anything else that comes along. We are not just propagating errors;
we are propagating all protocol messages of whatever type. So tying
it to elog specifically is not right.
Oh, in that case, I think what Andres proposed is actually quite good. I know
the hook works fine; it just seems like a somewhat hackish solution to
save 20 lines of code.
If it's 20 lines of code, I'm probably fine to go that way. Let's see
if we can figure out what those 20 lines look like.
libpq.h exports 29 functions that do a variety of different things.
Of those, 20 are in pqcomm.c and the others are in be-secure.c. I
presume that pluggability for the latter group, if needed at all, is a
separate project. The remaining ones break down like this:
- StreamServerPort, StreamConnection, StreamClose, and
TouchSocketFiles are intended to be called only from the postmaster,
to set up and tear down the listening socket and individual
connections. Presumably this is not what we care about here.
- pq_getbytes(), pq_getstring(), pq_getmessage(), pq_getbyte(),
pq_peekbyte(), and pq_getbyte_if_available() handle *reading* data
from the socket. Since you previously agreed that we didn't need to
build two-way communication on top of this, I would think that would
mean that these don't need to be pluggable either. But maybe I'm
wrong.
- The remaining functions are pq_init(), pq_comm_reset(), pq_flush(),
pq_flush_if_writable(), pq_is_send_pending(), pq_putmessage(),
pq_putmessage_noblock(), pq_startcopyout(), and pq_endcopyout().
These are the ones that I think are potentially interesting.
I didn't choose to provide hooks for all of these in the submitted
patch because they're not all needed for what I want to do here:
pq_startcopyout() and pq_endcopyout() are only needed for V2 protocol
support, which did not interest me (nor did COPY, really);
pq_putmessage_noblock(), pq_flush_if_writable(), and
pq_is_send_pending() are only used for the walsender protocol, which
doesn't seem useful to redirect to a non-socket; and I just didn't
happen to have any use for pq_init() or pq_comm_reset(). Hence what I
ended up with.
But, I could revisit that. Suppose I provide a structure with 10
function pointers for all ten of those functions, or maybe 9 since
pq_init() is called so early that it's not likely we'll have control
to put the hooks in place before that point, and anyway whatever code
installs the hooks can do its own initialization then. Then make a
global variable like pqSendMethods and #define pq_comm_reset() to be
pqSendMethods->comm_reset(), pq_flush() to be pqSendMethods->flush(),
and so on for all 9 or 10 methods. Then the pqmq code could just
change pqSendMethods to point to its own method structure instead of
the default one. Would that address this concern? It's
more than 20 lines of code, but it's not TOO bad.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 09/09/14 22:48, Robert Haas wrote:
On Tue, Sep 9, 2014 at 1:18 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:
I think that's completely wrong. As the patch series demonstrates,
it's not limited to propagating ErrorResponse and NoticeResponse. It
can also propagate NotifyResponse and RowDescription and DataRow and
anything else that comes along. We are not just propagating errors;
we are propagating all protocol messages of whatever type. So tying
it to elog specifically is not right.Oh in that case, I think what Andres proposed is actually quite good. I know
the hook works fine it just seems like using somewhat hackish solution to
save 20 lines of code.
If it's 20 lines of code, I'm probably fine to go that way. Let's see
if we can figure out what those 20 lines look like.
libpq.h exports 29 functions that do a variety of different things.
Of those, 20 are in pqcomm.c and the others are in be-secure.c. I
presume that pluggability for the latter group, if needed at all, is a
separate project. The remaining ones break down like this:
Ugh
- The remaining functions are pq_init(), pq_comm_reset(), pq_flush(),
pq_flush_if_writable(), pq_is_send_pending(), pq_putmessage(),
pq_putmessage_noblock(), pq_startcopyout(), and pq_endcopyout().
These are the ones that I think are potentially interesting.
I didn't choose to provide hooks for all of these in the submitted
patch because they're not all needed for what I want to do here:
pq_startcopyout() and pq_endcopyout() are only needed for V2 protocol
support, which did not interest me (nor did COPY, really);
pq_putmessage_noblock(), pq_flush_if_writable(), and
pq_is_send_pending() are only used for the walsender protocol, which
doesn't seem useful to redirect to a non-socket; and I just didn't
happen to have any use for pq_init() or pq_comm_reset(). Hence what I
ended up with.
But, I could revisit that. Suppose I provide a structure with 10
function pointers for all ten of those functions, or maybe 9 since
pq_init() is called so early that it's not likely we'll have control
to put the hooks in place before that point, and anyway whatever code
installs the hooks can do its own initialization then. Then make a
global variable like pqSendMethods and #define pq_comm_reset() to be
pqSendMethods->comm_reset(), pq_flush() to be pqSendMethods->flush(),
and so on for all 9 or 10 methods. Then the pqmq code could just
change pqSendMethods to point to its own method structure instead of
the default one. Would that address this concern? It's
more than 20 lines of code, but it's not TOO bad.
Yes, although my issue with the hooks was not that you only provided
them for 2 functions, but the fact that it had no structure and the
implementation was "if hook set do this, else do that", which I don't
see as a good way of doing it.
All I personally want is structure and extensibility, so a struct with just
the needed functions is good enough for me, and I would be OK with leaving
the other 8 functions as an option for whomever needs to make them
overridable in the future.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Wed, Sep 10, 2014 at 2:18 AM, Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Sep 9, 2014 at 1:18 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:
I think that's completely wrong. As the patch series demonstrates,
it's not limited to propagating ErrorResponse and NoticeResponse. It
can also propagate NotifyResponse and RowDescription and DataRow and
anything else that comes along. We are not just propagating errors;
we are propagating all protocol messages of whatever type. So tying
it to elog specifically is not right.
Oh, in that case, I think what Andres proposed is actually quite good. I
know the hook works fine; it just seems like a somewhat hackish solution
to save 20 lines of code.
If it's 20 lines of code, I'm probably fine to go that way. Let's see
if we can figure out what those 20 lines look like.
libpq.h exports 29 functions that do a variety of different things.
Of those, 20 are in pqcomm.c and the others are in be-secure.c. I
presume that pluggability for the latter group, if needed at all, is a
separate project. The remaining ones break down like this:
- StreamServerPort, StreamConnection, StreamClose, and
TouchSocketFiles are intended to be called only from the postmaster,
to set up and tear down the listening socket and individual
connections. Presumably this is not what we care about here.
- pq_getbytes(), pq_getstring(), pq_getmessage(), pq_getbyte(),
pq_peekbyte(), and pq_getbyte_if_available() handle *reading* data
from the socket. Since you previously agreed that we didn't need to
build two-way communication on top of this, I would think that would
mean that these don't need to be pluggable either. But maybe I'm
wrong.
- The remaining functions are pq_init(), pq_comm_reset(), pq_flush(),
pq_flush_if_writable(), pq_is_send_pending(), pq_putmessage(),
pq_putmessage_noblock(), pq_startcopyout(), and pq_endcopyout().
These are the ones that I think are potentially interesting.
I didn't choose to provide hooks for all of these in the submitted
patch because they're not all needed for what I want to do here:
pq_startcopyout() and pq_endcopyout() are only needed for V2 protocol
support, which did not interest me (nor did COPY, really);
pq_putmessage_noblock(), pq_flush_if_writable(), and
pq_is_send_pending() are only used for the walsender protocol, which
doesn't seem useful to redirect to a non-socket; and I just didn't
happen to have any use for pq_init() or pq_comm_reset(). Hence what I
ended up with.
But, I could revisit that. Suppose I provide a structure with 10
function pointers for all ten of those functions, or maybe 9 since
pq_init() is called so early that it's not likely we'll have control
to put the hooks in place before that point, and anyway whatever code
installs the hooks can do its own initialization then.
Can we use pq_init() to install function pointers?
Let us say that it will take COMM_METHOD (tcp, shm_mq) as
input and then install function pointers based on the communication
method. We could call this from the main function of the bgworker (in the
case of this patch, from pg_background_worker_main()) with COMM_METHOD
as shm_mq, and BackendInitialize() would pass it as tcp.
Then make a
global variable like pqSendMethods and #define pq_comm_reset() to be
pqSendMethods->comm_reset(), pq_flush() to be pqSendMethods->flush(),
and so on for all 9 or 10 methods. Then the pqmq code could just
change pqSendMethods to point to its own method structure instead of
the default one. Would that address this concern? It's
more than 20 lines of code, but it's not TOO bad.
This idea seems better than directly using hooks; however, I
don't see any harm in defining pqReceiveMethods for the get APIs as
well, because it can make the whole layer extendable. Having said
that, since there is currently no use for it, we can leave that
for another patch as well.
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com
On Tue, Sep 9, 2014 at 6:03 PM, Petr Jelinek <petr@2ndquadrant.com> wrote:
- The remaining functions are pq_init(), pq_comm_reset(), pq_flush(),
pq_flush_if_writable(), pq_is_send_pending(), pq_putmessage(),
pq_putmessage_noblock(), pq_startcopyout(), and pq_endcopyout().
These are the ones that I think are potentially interesting.
I didn't choose to provide hooks for all of these in the submitted
patch because they're not all needed for what I want to do here:
pq_startcopyout() and pq_endcopyout() are only needed for V2 protocol
support, which did not interest me (nor did COPY, really);
pq_putmessage_noblock(), pq_flush_if_writable(), and
pq_is_send_pending() are only used for the walsender protocol, which
doesn't seem useful to redirect to a non-socket; and I just didn't
happen to have any use for pq_init() or pq_comm_reset(). Hence what I
ended up with.
But, I could revisit that. Suppose I provide a structure with 10
function pointers for all ten of those functions, or maybe 9 since
pq_init() is called so early that it's not likely we'll have control
to put the hooks in place before that point, and anyway whatever code
installs the hooks can do its own initialization then. Then make a
global variable like pqSendMethods and #define pq_comm_reset() to be
pqSendMethods->comm_reset(), pq_flush() to be pqSendMethods->flush(),
and so on for all 9 or 10 methods. Then the pqmq code could just
change pqSendMethods to point to its own method structure instead of
the default one. Would that address this concern? It's
more than 20 lines of code, but it's not TOO bad.
Yes, although my issue with the hooks was not that you only provided them
for 2 functions, but the fact that it had no structure and the
implementation was "if hook set do this, else do that", which I don't see
as a good way of doing it.
We've done it that way in a bunch of other places, like ExecutorRun().
An advantage of this approach (I think) is that jumping to a fixed
address is faster than jumping through a function pointer - so with
the approach I've taken here, the common case where we're talking to
the client incurs only the overhead of a null-test, and the larger
overhead of the function pointer jump is incurred only when the hook
is in use. Maybe that's not enough of a difference to matter to
anything, but I think the contention that I've invented some novel
kind of interface here doesn't hold up to scrutiny. We have lots of
hooks that work just like what I did here.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On Wed, Sep 10, 2014 at 4:01 PM, Robert Haas <robertmhaas@gmail.com> wrote:
Yes, although my issue with the hooks was not that you only provided them
for 2 functions, but the fact that it had no structure and the
implementation was "if hook set do this, else do that", which I don't see
as a good way of doing it.
We've done it that way in a bunch of other places, like ExecutorRun().
An advantage of this approach (I think) is that jumping to a fixed
address is faster than jumping through a function pointer - so with
the approach I've taken here, the common case where we're talking to
the client incurs only the overhead of a null-test, and the larger
overhead of the function pointer jump is incurred only when the hook
is in use. Maybe that's not enough of a difference to matter to
anything, but I think the contention that I've invented some novel
kind of interface here doesn't hold up to scrutiny. We have lots of
hooks that work just like what I did here.
Here's what the other approach looks like. I can't really see doing
it this way and then only providing hooks for those two functions, so
this is with hooks for all the send-side stuff.
Original version: 9 files changed, 295 insertions(+), 3 deletions(-)
This version: 9 files changed, 428 insertions(+), 47 deletions(-)
There is admittedly a certain elegance to providing a complete set of
hooks, so maybe this is the way to go. The remaining patches in the
patch series work with either the old version or this one; the changes
here don't affect anything else.
Anyone else have an opinion on which way is better here?
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company