Re: pg_dump possible fix, need testers. (was: Re: pg_dump disaster)
Can someone remind me of where we left this?
Um, I didn't have any trouble at all reproducing Patrick's complaint.
pg_dump any moderately large table (I used tenk1 from the regress
database) and try to load the script with psql. Kaboom.

This is after or before my latest patch?
Before. I haven't updated since yesterday...
I can't seem to reproduce this problem,
Odd. Maybe there is something different about the kernel's timing of
message sending on your platform. I see it very easily on HPUX 10.20,
and Patrick sees it very easily on whatever he's using (netbsd I think).
You might try varying the situation a little, say
psql mydb <dumpfile
psql -f dumpfile mydb
psql mydb
\i dumpfile
and the same with -h localhost (to get a TCP/IP connection instead of
Unix domain). At the moment (pre-patch) I see failures with the
first two of these, but not with the \i method. -h doesn't seem to
matter for me, but it might for you.

Telling me something is wrong without giving suggestions on how
to fix it, nor direct pointers to where it fails, doesn't help me
one bit. You're not offering constructive criticism, you're not
even offering valid criticism; you're just waving your finger at
"problems" that you say exist but don't pin down to anything specific.

I have been explaining it as clearly as I could. Let's try it
one more time.

I spent hours looking over what I did to pqFlush and pqPutnBytes
because of what you said earlier, when the whole bug seems to come
down to my having missed that the socket is now set to non-blocking
in all cases.

Letting the socket mode default to blocking will hide the problems from
existing clients that don't care about non-block mode. But people who
try to actually use the nonblock mode are going to see the same kinds of
problems that psql is exhibiting.

The old sequence of events that happened was as follows:

user sends data almost filling the output buffer...
user sends another line of text overflowing the buffer...
pqFlush is invoked blocking the user until the output pipe clears...
and repeat.

Right.

The nonblocking code allows sends to fail so the user can abort
sending stuff to the backend in order to process other work:

user sends data almost filling the output buffer...
user sends another line of text that may overflow the buffer...
pqFlush is invoked,
if the pipe can't be cleared an error is returned allowing the user to
retry the send later.
if the flush succeeds then more data is queued and success is returned.

But you haven't thought through the mechanics of the "error is returned
allowing the user to retry" code path clearly enough. Let's take
pqPutBytes for an example. If it returns EOF, is that a hard error or
does it just mean that the application needs to wait a while? The
application *must* distinguish these cases, or it will do the wrong
thing: for example, if it mistakes a hard error for "wait a while",
then it will wait forever without making any progress or producing
an error report.

You need to provide a different return convention that indicates
what happened, say
EOF (-1) => hard error (same as old code)
0 => OK
1 => no data was queued due to risk of blocking
And you need to guarantee that the application knows what the state is
when the can't-do-it-yet return is made; note that I specified "no data
was queued" above. If pqPutBytes might queue some of the data before
returning 1, the application is in trouble again. While you apparently
foresaw that in recoding pqPutBytes, your code doesn't actually work.
There is the minor code bug that you fail to update "avail" after the
first pqFlush call, and the much more fundamental problem that you
cannot guarantee to have queued all or none of the data. Think about
what happens if the passed nbytes is larger than the output buffer size.
You may pass the first pqFlush successfully, then get into the loop and
get a won't-block return from pqFlush in the loop. What then?
You can't simply refuse to support the case nbytes > bufsize at all,
because that will cause application failures as well (too long query
sends it into an infinite loop trying to queue data, most likely).

A possible answer is to specify that a return of +N means "N bytes
remain unqueued due to risk of blocking" (after having queued as much
as you could). This would put the onus on the caller to update his
pointers/counts properly; propagating that into all the internal uses
of pqPutBytes would be no fun. (Of course, so far you haven't updated
*any* of the internal callers to behave reasonably in case of a
won't-block return; PQfn is just one example.)

Another possible answer is to preserve pqPutBytes' old API, "queue or
bust", by the expedient of enlarging the output buffer to hold whatever
we can't send immediately. This is probably more attractive, even
though a long query might suck up a lot of space that won't get
reclaimed as long as the connection lives. If you don't do this then
you are going to have to make a lot of ugly changes in the internal
callers to deal with won't-block returns. Actually, a bulk COPY IN
would probably be the worst case --- the app could easily load data into
the buffer far faster than it could be sent. It might be best to extend
PQputline to have a three-way return and add code there to limit the
growth of the output buffer, while allowing all internal callers to
assume that the buffer is expanded when they need it.

pqFlush has the same kind of interface design problem: the same EOF code
is returned for either a hard error or can't-flush-yet, but it would be
disastrous to treat those cases alike. You must provide a 3-way return
code.

Furthermore, the same sort of 3-way return code convention will have to
propagate out through anything that calls pqFlush (with corresponding
documentation updates). pqPutBytes can be made to hide a pqFlush won't-
block return by trying to enlarge the output buffer, but in most other
places you won't have a choice except to punt it back to the caller.

PQendcopy has the same interface design problem. It used to be that
(unless you passed a null pointer) PQendcopy would *guarantee* that
the connection was no longer in COPY state on return --- by resetting
it, if necessary. So the return code was mainly informative; the
application didn't have to do anything different if PQendcopy reported
failure. But now, a nonblocking application does need to pay attention
to whether PQendcopy completed or not --- and you haven't provided a way
for it to tell. If 1 is returned, the connection might still be in
COPY state, or it might not (PQendcopy might have reset it). If the
application doesn't distinguish these cases then it will fail.

I also think that you want to take a hard look at the automatic "reset"
behavior upon COPY failure, since a PQreset call will block the
application until it finishes. Really, what is needed to close down a
COPY safely in nonblock mode is a pair of entry points along the line of
"PQendcopyStart" and "PQendcopyPoll", with API conventions similar to
PQresetStart/PQresetPoll. This gives you the ability to do the reset
(if one is necessary) without blocking the application. PQendcopy
itself will only be useful to blocking applications.

I'm sorry if they don't work for some situations other than COPY IN,
but it's functionality that I needed, and I expect it to be expanded on
by myself and others who take an interest in nonblocking operation.

I don't think that the nonblock code is anywhere near production quality
at this point. It may work for you, if you don't stress it too hard and
never have a communications failure; but I don't want to see us ship it
as part of Postgres unless these issues get addressed.

regards, tom lane
************
--
Bruce Momjian | http://candle.pha.pa.us
pgman@candle.pha.pa.us | (610) 853-3000
+ If your life is a hard drive, | 830 Blythe Avenue
+ Christ can be your backup. | Drexel Hill, Pennsylvania 19026
* Bruce Momjian <pgman@candle.pha.pa.us> [000929 19:30] wrote:
Can someone remind me of where we left this?
I really haven't figured a correct way to deal with the output buffer.
I'll try to consider ways to deal with this.
-Alfred
* Bruce Momjian <pgman@candle.pha.pa.us> [000929 19:32] wrote:
Can someone remind me of where we left this?
[snip]
Sorry for the really long follow-up, but this was posted a long time
ago and I wanted to refresh everyone's memory as to the problem.
The problem is that when doing a COPY from stdin using libpq and
my non-blocking query settings there is a big problem, namely that
the size of the query is limited to the size of the output buffer.
The reason is that I couldn't figure out any way to determine what the
application was doing: basically, if I'm sending a query I want to
expand the output buffer to accommodate the new query, but if
I'm doing a COPY, I would like to be informed if the backend
is lagging behind.
The concept is that someone doing queries needs to be retrieving
results from the database backend between queries, but someone
doing a COPY doesn't until they issue the terminating \. line.
I'm pretty sure I know what to do now; it's pretty simple, actually.
I can examine the state of the connection: if it's in PGASYNC_COPY_IN
then I don't grow the buffer and I inform the application that the
data will block; if it's not PGASYNC_COPY_IN I allow the buffer to grow,
protecting the application from blocking.
Tom, I'd like to know what you think before running off to implement
this.
thanks,
-Alfred
--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."
Alfred Perlstein <bright@wintelcom.net> writes:
I'm pretty sure I know what to do now; it's pretty simple, actually.
I can examine the state of the connection: if it's in PGASYNC_COPY_IN
then I don't grow the buffer and I inform the application that the
data will block; if it's not PGASYNC_COPY_IN I allow the buffer to grow,
protecting the application from blocking.
From what I recall of the prior discussion, it seemed that a state-based
approach probably isn't the way to go. The real issue is how many
routines are you going to have to change to deal with a three-way return
convention; you want to minimize the number of places that have to cope
with that. IIRC the idea was to let pqPutBytes grow the buffer so that
its callers didn't need to worry about a "sorry, won't block" return
condition. If you feel that growing the buffer is inappropriate for a
specific caller, then probably the right answer is for that particular
caller to make an extra check to see if the buffer will overflow, and
refrain from calling pqPutBytes if it doesn't like what will happen.
If you make pqPutBytes' behavior state-based, then callers that aren't
expecting a "won't block" return will fail (silently :-() in some states.
While you might be able to get away with that for PGASYNC_COPY_IN state
because not much of libpq is expected to be exercised in that state,
it strikes me as an awfully fragile coding convention. I think you will
regret that choice eventually, if you make it.
regards, tom lane
* Tom Lane <tgl@sss.pgh.pa.us> [001012 12:14] wrote:
[snip]
It is a somewhat fragile change, but much less intrusive than anything
else I could think of. It removes the three-way return value from
the send code for everything except the COPY IN state, where
it's the application's job to handle it. If a blocking condition
would occur, we do what the non-blocking code does, except that
instead of blocking we buffer the data in its entirety.
My main question was wondering if there's any cases where we might
"go nuts" sending data to the backend with the exception of the
COPY code?
-- or --
I could make a function to check the buffer space and attempt to
flush the buffer (in a non-blocking manner) to be called from
PQputline and PQputnbytes if the connection is non-blocking.
However, since those are external functions, my question is whether you
know of any other uses for those functions besides COPY'ing
into the backend?
Since afaik I'm the only schmoe using my non-blocking stuff,
restricting the check for buffer space to non-blocking connections
wouldn't break any APIs from my PoV.
How should I proceed?
--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."