libpq COPY handling

Started by Robert Haasover 12 years ago6 messages

robertmhaas@gmail.com

over 12 years ago

1 attachment(s)

Noah Misch pointed out something interesting to me:

/*
* PQputCopyEnd - send EOF indication to the backend during COPY IN
*
* After calling this, use PQgetResult() to check command completion status.
*
* Returns 1 if successful, 0 if data could not be sent (only possible
* in nonblock mode), or -1 if an error occurs.
*/

The comment alleges that 0 is a possible return value, but the only
return statements in the code for that function return literal values
of either 1 or -1. I'm not sure whether that's a bug in the code or
the documentation.

Also, I noticed that there are a few places in fe-protocol3.c that
seem not to know about COPY-BOTH mode. I'm not sure that any of these
are actually bugs right now given the current very limited use of
COPY-BOTH mode, but I'm wondering whether it wouldn't be better to
minimize the chance of future surprises. Patch attached.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachments:

libpq-copy-both.patchapplication/octet-stream; name=libpq-copy-both.patchDownload

diff --git a/src/interfaces/libpq/fe-protocol3.c b/src/interfaces/libpq/fe-protocol3.c
index 1e26e19..7fa090a 100644
--- a/src/interfaces/libpq/fe-protocol3.c
+++ b/src/interfaces/libpq/fe-protocol3.c
@@ -1566,7 +1566,8 @@ pqGetline3(PGconn *conn, char *s, int maxlen)
 	int			status;
 
 	if (conn->sock < 0 ||
-		conn->asyncStatus != PGASYNC_COPY_OUT ||
+		(conn->asyncStatus != PGASYNC_COPY_OUT &&
+		 conn->asyncStatus != PGASYNC_COPY_BOTH) ||
 		conn->copy_is_binary)
 	{
 		printfPQExpBuffer(&conn->errorMessage,
@@ -1617,7 +1618,8 @@ pqGetlineAsync3(PGconn *conn, char *buffer, int bufsize)
 	int			msgLength;
 	int			avail;
 
-	if (conn->asyncStatus != PGASYNC_COPY_OUT)
+	if (conn->asyncStatus != PGASYNC_COPY_OUT
+		&& conn->asyncStatus != PGASYNC_COPY_BOTH)
 		return -1;				/* we are not doing a copy... */
 
 	/*
@@ -1671,7 +1673,8 @@ pqEndcopy3(PGconn *conn)
 	PGresult   *result;
 
 	if (conn->asyncStatus != PGASYNC_COPY_IN &&
-		conn->asyncStatus != PGASYNC_COPY_OUT)
+		conn->asyncStatus != PGASYNC_COPY_OUT &&
+		conn->asyncStatus != PGASYNC_COPY_BOTH)
 	{
 		printfPQExpBuffer(&conn->errorMessage,
 						  libpq_gettext("no COPY in progress\n"));
@@ -1679,7 +1682,8 @@ pqEndcopy3(PGconn *conn)
 	}
 
 	/* Send the CopyDone message if needed */
-	if (conn->asyncStatus == PGASYNC_COPY_IN)
+	if (conn->asyncStatus == PGASYNC_COPY_IN ||
+		conn->asyncStatus == PGASYNC_COPY_BOTH)
 	{
 		if (pqPutMsgStart('c', false, conn) < 0 ||
 			pqPutMsgEnd(conn) < 0)

Tom Lane

tgl@sss.pgh.pa.us

over 12 years ago

In reply to: Robert Haas (#1)

Re: libpq COPY handling

Robert Haas <robertmhaas@gmail.com> writes:

Noah Misch pointed out something interesting to me:
/*
* PQputCopyEnd - send EOF indication to the backend during COPY IN
*
* After calling this, use PQgetResult() to check command completion status.
*
* Returns 1 if successful, 0 if data could not be sent (only possible
* in nonblock mode), or -1 if an error occurs.
*/

The comment alleges that 0 is a possible return value, but the only
return statements in the code for that function return literal values
of either 1 or -1. I'm not sure whether that's a bug in the code or
the documentation.

Hm. pqFlush has a three-way return value which PQputCopyEnd is only
checking as a boolean. Probably the intent was to return "data not
sent" if pqFlush reports that.

However, the documentation in libpq.sgml is a bit bogus too, because it
counsels trying the PQputCopyEnd() call again, which will not work
(since we already changed the asyncStatus). We could make that say "a
zero result is informational, you might want to try PQflush() later".
The trouble with this, though, is that any existing callers that were
coded to the old spec would now be broken.

It might be better to consider that the code is correct and fix the
documentation. I notice that the other places in fe-exec.c that call
pqFlush() generally say "In nonblock mode, don't complain if we're unable
to send it all", which is pretty much the spirit of what this is doing
though it lacks that comment.

Also, I noticed that there are a few places in fe-protocol3.c that
seem not to know about COPY-BOTH mode. I'm not sure that any of these
are actually bugs right now given the current very limited use of
COPY-BOTH mode, but I'm wondering whether it wouldn't be better to
minimize the chance of future surprises. Patch attached.

+1 for these changes, anyway.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Robert Haas

robertmhaas@gmail.com

over 12 years ago

In reply to: Tom Lane (#2)

Re: libpq COPY handling

On Thu, Apr 25, 2013 at 1:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

However, the documentation in libpq.sgml is a bit bogus too, because it
counsels trying the PQputCopyEnd() call again, which will not work
(since we already changed the asyncStatus). We could make that say "a
zero result is informational, you might want to try PQflush() later".
The trouble with this, though, is that any existing callers that were
coded to the old spec would now be broken.

It might be better to consider that the code is correct and fix the
documentation. I notice that the other places in fe-exec.c that call
pqFlush() generally say "In nonblock mode, don't complain if we're unable
to send it all", which is pretty much the spirit of what this is doing
though it lacks that comment.

Changing the meaning of a 0 return code seems like a bad idea.
However, not ever returning 0 isn't great either: someone could be
forgiven for writing code that calls PQputCopyData/End() until they
get a 0 result, then waits for the socket to become write-OK before
continuing. They might be surprised to find that this results in
libpq's internal buffer expanding until malloc fails.

I'm gathering that what I should probably do is something like this.
Send some data using PQputCopyData() or an end-of-copy using
PQputCopyEnd(). Then, call PQflush(). If the latter returns 1, then
wait for the socket to become write-OK and try again. Once it returns
0, send the next hunk of copy-data. This is a bit of an exasperating
interface because when the socket buffer fills up, PQputCopyData()
will do a write() but we won't know what happened; we'll have to call
PQflush() and try a second write of the same data without an
intervening wait to find out that the buffer is full. If this worked
as currently documented, that wouldn't be necessary. But it seems
tolerable.

Also, I noticed that there are a few places in fe-protocol3.c that
seem not to know about COPY-BOTH mode. I'm not sure that any of these
are actually bugs right now given the current very limited use of
COPY-BOTH mode, but I'm wondering whether it wouldn't be better to
minimize the chance of future surprises. Patch attached.

+1 for these changes, anyway.

OK, committed, thanks.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Tom Lane

tgl@sss.pgh.pa.us

over 12 years ago

In reply to: Robert Haas (#3)

Re: libpq COPY handling

Robert Haas <robertmhaas@gmail.com> writes:

On Thu, Apr 25, 2013 at 1:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

However, the documentation in libpq.sgml is a bit bogus too, because it
counsels trying the PQputCopyEnd() call again, which will not work
(since we already changed the asyncStatus). We could make that say "a
zero result is informational, you might want to try PQflush() later".
The trouble with this, though, is that any existing callers that were
coded to the old spec would now be broken.

Changing the meaning of a 0 return code seems like a bad idea.
However, not ever returning 0 isn't great either: someone could be
forgiven for writing code that calls PQputCopyData/End() until they
get a 0 result, then waits for the socket to become write-OK before
continuing.

Anybody who tried that would have already discovered that it doesn't
work, so that argument seems pretty hollow.

What I'm suggesting is that we fix the documentation to match what
the code actually does, ie 1 and -1 are the only return codes, but
in nonblock mode it may not actually have flushed all the data.
I do not see how that can break any code that works now.

An alternative possibility is to document the zero return case as
"can't happen now, but defined for possible future expansion", which
I rather suspect was the thought process last time this was looked at.
The trouble with that is that, if we ever did try to actually return
zero, we'd more than likely break some code that should have been
checking for the case and wasn't.

Anyway, in view of the lack of complaints from the field, I think
changing the behavior of this code would be much more likely to cause
problems than fix them.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Robert Haas

robertmhaas@gmail.com

over 12 years ago

In reply to: Tom Lane (#4)

Re: libpq COPY handling

On Fri, Apr 26, 2013 at 10:48 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Thu, Apr 25, 2013 at 1:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

However, the documentation in libpq.sgml is a bit bogus too, because it
counsels trying the PQputCopyEnd() call again, which will not work
(since we already changed the asyncStatus). We could make that say "a
zero result is informational, you might want to try PQflush() later".
The trouble with this, though, is that any existing callers that were
coded to the old spec would now be broken.

Changing the meaning of a 0 return code seems like a bad idea.
However, not ever returning 0 isn't great either: someone could be
forgiven for writing code that calls PQputCopyData/End() until they
get a 0 result, then waits for the socket to become write-OK before
continuing.

Anybody who tried that would have already discovered that it doesn't
work, so that argument seems pretty hollow.

What I'm suggesting is that we fix the documentation to match what
the code actually does, ie 1 and -1 are the only return codes, but
in nonblock mode it may not actually have flushed all the data.
I do not see how that can break any code that works now.

An alternative possibility is to document the zero return case as
"can't happen now, but defined for possible future expansion", which
I rather suspect was the thought process last time this was looked at.
The trouble with that is that, if we ever did try to actually return
zero, we'd more than likely break some code that should have been
checking for the case and wasn't.

Anyway, in view of the lack of complaints from the field, I think
changing the behavior of this code would be much more likely to cause
problems than fix them.

Sure, I don't think we can change the behavior now; I just think it's
a shame that the documented behavior was never implemented. But at
this point we don't have much choice but to live with it.

By the by, might it be time to start thinking about deprecating
protocol version 2, at least for COPY? It seems that supporting it
requires nontrivial contortions of both the front-end and back-end
code, some of which seem possibly relevant to performance. Obviously
we're not going to do anything to 9.3 at this point, but maybe for the
next release...

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Gavin Flower

GavinFlower@archidevsys.co.nz

over 12 years ago

In reply to: Tom Lane (#4)

Re: libpq COPY handling

On 27/04/13 02:48, Tom Lane wrote:

Robert Haas <robertmhaas@gmail.com> writes:

On Thu, Apr 25, 2013 at 1:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

However, the documentation in libpq.sgml is a bit bogus too, because it
counsels trying the PQputCopyEnd() call again, which will not work
(since we already changed the asyncStatus). We could make that say "a
zero result is informational, you might want to try PQflush() later".
The trouble with this, though, is that any existing callers that were
coded to the old spec would now be broken.

Changing the meaning of a 0 return code seems like a bad idea.
However, not ever returning 0 isn't great either: someone could be
forgiven for writing code that calls PQputCopyData/End() until they
get a 0 result, then waits for the socket to become write-OK before
continuing.

Anybody who tried that would have already discovered that it doesn't
work, so that argument seems pretty hollow.

What I'm suggesting is that we fix the documentation to match what
the code actually does, ie 1 and -1 are the only return codes, but
in nonblock mode it may not actually have flushed all the data.
I do not see how that can break any code that works now.

An alternative possibility is to document the zero return case as
"can't happen now, but defined for possible future expansion", which
I rather suspect was the thought process last time this was looked at.
The trouble with that is that, if we ever did try to actually return
zero, we'd more than likely break some code that should have been
checking for the case and wasn't.

Anyway, in view of the lack of complaints from the field, I think
changing the behavior of this code would be much more likely to cause
problems than fix them.

regards, tom lane

The original usage of returns codes was as an offset for the Operating
System to jump to in a list of addresses to execute, after program
completion. Zero offset was to the first address for normal completion,
then 4 for the next address... Addresses were stored in 4 bytes. Hence
an historical tendency to define return codes in multiple of 4. This
dates back to the Mainframe days, pre UNIX, even!

Personally I would prefer zero to indicate an error, on the basis that
is the default value for memory, but historical usage makes that
impractical! So we are stuck with zero being interpreted as the return
code indicating 'Normal' completion - bash, and other Linux shells, are
designed on this assumption.

I first came across this return code convention when I was programming
Mainframes in the early 1970's.

Cheers,
Gavin