COPY IN/BOTH vs. extended query mode

Started by Robert Haasalmost 9 years ago7 messages
#1Robert Haas
robertmhaas@gmail.com

According to the documentation for COPY IN mode, "If the COPY command
was issued via an extended-query message, the backend will now discard
frontend messages until a Sync message is received, then it will issue
ReadyForQuery and return to normal processing." I added a similar
note to the documentation for COPY BOTH mode in
91fa8532f4053468acc08534a6aac516ccde47b7, and the documentation
accurately describes the behavior of the server. However, this seems
to make fully correct error handling for clients using libpq almost
impossible, because PQsendQueryGuts() sends
Parse-Bind-Describe-Execute-Sync in one shot without regard to whether
the command that was just sent invoked COPY mode (cf. the note in
CopyGetData about why we ignore Flush and Sync in that function).

So imagine that the client uses libpq to send (via the extended query
protocol) a COPY IN command (or some hypothetical command that starts
COPY BOTH mode to begin). If the server throws an error before the
Sync message is consumed, it will bounce back to PostgresMain which
will set doing_extended_query_message = true after which it will
consume messages, find the Sync, reset that flag, and send
ReadyForQuery. On the other hand, if the server enters CopyBoth mode,
consumes the Sync message in CopyGetData (or a similar function), and
*then* throws an ERROR, the server will wait for a second Sync message
from the client before issuing ReadyForQuery. There is no sensible
way of coping with this problem in libpq, because there is no way for
the client to know which part of the server code consumed the Sync
message that it already sent. In short, from the client's point of
view, if it enters COPY IN or COPY BOTH mode via the extend query
protocol, and an error occurs on the server, the server MAY OR MAY NOT
expect a further Sync message before issuing ReadyForQuery, and the
client has no way of knowing -- except maybe waiting for a while to
see what happens.

It does not appear to me that there is any good solution to this
problem. Fixing it on the server side would require a wire protocol
change - e.g. one kind of Sync message that is used in a
Parse-Bind-Describe-Execute-Sync sequence that only terminates
non-COPY commands and another kind that is used to signal the end even
of COPY. Fixing it on the client side would require all clients to
know prior to initiating an extended-query-protocol sequence whether
or not the command was going to initiate COPY, which is an awful API
even if didn't constitute an impossible-to-contemplate backward
compatibility break. Perhaps we will have to be content to document
the fact that this part of the protocol is depressingly broken...

...unless of course somebody can see something that I'm missing here
and the situation isn't as bad as it currently appears to me to be.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#2Robert Haas
robertmhaas@gmail.com
In reply to: Robert Haas (#1)
Re: COPY IN/BOTH vs. extended query mode

On Mon, Jan 23, 2017 at 9:12 PM, Robert Haas <robertmhaas@gmail.com> wrote:

According to the documentation for COPY IN mode, "If the COPY command
was issued via an extended-query message, the backend will now discard
frontend messages until a Sync message is received, then it will issue
ReadyForQuery and return to normal processing." I added a similar
note to the documentation for COPY BOTH mode in
91fa8532f4053468acc08534a6aac516ccde47b7, and the documentation
accurately describes the behavior of the server. However, this seems
to make fully correct error handling for clients using libpq almost
impossible, because PQsendQueryGuts() sends
Parse-Bind-Describe-Execute-Sync in one shot without regard to whether
the command that was just sent invoked COPY mode (cf. the note in
CopyGetData about why we ignore Flush and Sync in that function).

So imagine that the client uses libpq to send (via the extended query
protocol) a COPY IN command (or some hypothetical command that starts
COPY BOTH mode to begin). If the server throws an error before the
Sync message is consumed, it will bounce back to PostgresMain which
will set doing_extended_query_message = true after which it will
consume messages, find the Sync, reset that flag, and send
ReadyForQuery. On the other hand, if the server enters CopyBoth mode,
consumes the Sync message in CopyGetData (or a similar function), and
*then* throws an ERROR, the server will wait for a second Sync message
from the client before issuing ReadyForQuery. There is no sensible
way of coping with this problem in libpq, because there is no way for
the client to know which part of the server code consumed the Sync
message that it already sent. In short, from the client's point of
view, if it enters COPY IN or COPY BOTH mode via the extend query
protocol, and an error occurs on the server, the server MAY OR MAY NOT
expect a further Sync message before issuing ReadyForQuery, and the
client has no way of knowing -- except maybe waiting for a while to
see what happens.

It does not appear to me that there is any good solution to this
problem. Fixing it on the server side would require a wire protocol
change - e.g. one kind of Sync message that is used in a
Parse-Bind-Describe-Execute-Sync sequence that only terminates
non-COPY commands and another kind that is used to signal the end even
of COPY. Fixing it on the client side would require all clients to
know prior to initiating an extended-query-protocol sequence whether
or not the command was going to initiate COPY, which is an awful API
even if didn't constitute an impossible-to-contemplate backward
compatibility break. Perhaps we will have to be content to document
the fact that this part of the protocol is depressingly broken...

...unless of course somebody can see something that I'm missing here
and the situation isn't as bad as it currently appears to me to be.

Anybody have any thoughts on this?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Craig Ringer
craig.ringer@2ndquadrant.com
In reply to: Robert Haas (#2)
Re: COPY IN/BOTH vs. extended query mode

On 14 Feb. 2017 06:15, "Robert Haas" <robertmhaas@gmail.com> wrote:

On Mon, Jan 23, 2017 at 9:12 PM, Robert Haas <robertmhaas@gmail.com> wrote:

According to the documentation for COPY IN mode, "If the COPY command
was issued via an extended-query message, the backend will now discard
frontend messages until a Sync message is received, then it will issue
ReadyForQuery and return to normal processing." I added a similar
note to the documentation for COPY BOTH mode in
91fa8532f4053468acc08534a6aac516ccde47b7, and the documentation
accurately describes the behavior of the server. However, this seems
to make fully correct error handling for clients using libpq almost
impossible, because PQsendQueryGuts() sends
Parse-Bind-Describe-Execute-Sync in one shot without regard to whether
the command that was just sent invoked COPY mode (cf. the note in
CopyGetData about why we ignore Flush and Sync in that function).

So imagine that the client uses libpq to send (via the extended query
protocol) a COPY IN command (or some hypothetical command that starts
COPY BOTH mode to begin). If the server throws an error before the
Sync message is consumed, it will bounce back to PostgresMain which
will set doing_extended_query_message = true after which it will
consume messages, find the Sync, reset that flag, and send
ReadyForQuery. On the other hand, if the server enters CopyBoth mode,
consumes the Sync message in CopyGetData (or a similar function), and
*then* throws an ERROR, the server will wait for a second Sync message
from the client before issuing ReadyForQuery. There is no sensible
way of coping with this problem in libpq, because there is no way for
the client to know which part of the server code consumed the Sync
message that it already sent. In short, from the client's point of
view, if it enters COPY IN or COPY BOTH mode via the extend query
protocol, and an error occurs on the server, the server MAY OR MAY NOT
expect a further Sync message before issuing ReadyForQuery, and the
client has no way of knowing -- except maybe waiting for a while to
see what happens.

It does not appear to me that there is any good solution to this
problem. Fixing it on the server side would require a wire protocol
change - e.g. one kind of Sync message that is used in a
Parse-Bind-Describe-Execute-Sync sequence that only terminates
non-COPY commands and another kind that is used to signal the end even
of COPY. Fixing it on the client side would require all clients to
know prior to initiating an extended-query-protocol sequence whether
or not the command was going to initiate COPY, which is an awful API
even if didn't constitute an impossible-to-contemplate backward
compatibility break. Perhaps we will have to be content to document
the fact that this part of the protocol is depressingly broken...

...unless of course somebody can see something that I'm missing here
and the situation isn't as bad as it currently appears to me to be.

Anybody have any thoughts on this?

I've been thinking on it a bit, but don't really have anything that can be
done without a protocol version bump.

We can't really disallow extended query protocol COPY, too much is likely
to break. And we can't fix it without a protocol change.

A warning in the docs for COPY would be appropriate, noting that clients
should use the simple query protocol to issue COPY. It's kind of mixing
layers, since many users won't see the protocol level or have any idea if
their client driver uses ext or simple query, but we can at least advise
libpq users.

Also in the protocol docs, noting that clirnfa sending COPY should prefer
the simple query protocol due to error recovery issues with COPY and
extended query protocol.

#4Corey Huinker
corey.huinker@gmail.com
In reply to: Craig Ringer (#3)
Re: COPY IN/BOTH vs. extended query mode

On Mon, Feb 13, 2017 at 4:29 PM, Craig Ringer <craig.ringer@2ndquadrant.com>
wrote:

On 14 Feb. 2017 06:15, "Robert Haas" <robertmhaas@gmail.com> wrote:

On Mon, Jan 23, 2017 at 9:12 PM, Robert Haas <robertmhaas@gmail.com>
wrote:

According to the documentation for COPY IN mode, "If the COPY command
was issued via an extended-query message, the backend will now discard
frontend messages until a Sync message is received, then it will issue
ReadyForQuery and return to normal processing." I added a similar
note to the documentation for COPY BOTH mode in
91fa8532f4053468acc08534a6aac516ccde47b7, and the documentation
accurately describes the behavior of the server. However, this seems
to make fully correct error handling for clients using libpq almost
impossible, because PQsendQueryGuts() sends
Parse-Bind-Describe-Execute-Sync in one shot without regard to whether
the command that was just sent invoked COPY mode (cf. the note in
CopyGetData about why we ignore Flush and Sync in that function).

So imagine that the client uses libpq to send (via the extended query
protocol) a COPY IN command (or some hypothetical command that starts
COPY BOTH mode to begin). If the server throws an error before the
Sync message is consumed, it will bounce back to PostgresMain which
will set doing_extended_query_message = true after which it will
consume messages, find the Sync, reset that flag, and send
ReadyForQuery. On the other hand, if the server enters CopyBoth mode,
consumes the Sync message in CopyGetData (or a similar function), and
*then* throws an ERROR, the server will wait for a second Sync message
from the client before issuing ReadyForQuery. There is no sensible
way of coping with this problem in libpq, because there is no way for
the client to know which part of the server code consumed the Sync
message that it already sent. In short, from the client's point of
view, if it enters COPY IN or COPY BOTH mode via the extend query
protocol, and an error occurs on the server, the server MAY OR MAY NOT
expect a further Sync message before issuing ReadyForQuery, and the
client has no way of knowing -- except maybe waiting for a while to
see what happens.

It does not appear to me that there is any good solution to this
problem. Fixing it on the server side would require a wire protocol
change - e.g. one kind of Sync message that is used in a
Parse-Bind-Describe-Execute-Sync sequence that only terminates
non-COPY commands and another kind that is used to signal the end even
of COPY. Fixing it on the client side would require all clients to
know prior to initiating an extended-query-protocol sequence whether
or not the command was going to initiate COPY, which is an awful API
even if didn't constitute an impossible-to-contemplate backward
compatibility break. Perhaps we will have to be content to document
the fact that this part of the protocol is depressingly broken...

...unless of course somebody can see something that I'm missing here
and the situation isn't as bad as it currently appears to me to be.

Anybody have any thoughts on this?

I've been thinking on it a bit, but don't really have anything that can be
done without a protocol version bump.

We can't really disallow extended query protocol COPY, too much is likely
to break. And we can't fix it without a protocol change.

A warning in the docs for COPY would be appropriate, noting that clients
should use the simple query protocol to issue COPY. It's kind of mixing
layers, since many users won't see the protocol level or have any idea if
their client driver uses ext or simple query, but we can at least advise
libpq users.

Also in the protocol docs, noting that clirnfa sending COPY should prefer
the simple query protocol due to error recovery issues with COPY and
extended query protocol.

Forgive my ignorance, but is this issue related to the Catch-22 I had with
"COPY as a set returning function", wherein a function that invokes
BeginCopyFrom() basically starts a result set, but then ends it to do the
BeginCopyFrom() having NULL (meaning STDIN) as the file, so that when the
results from the copy come back the 'T' record that was going to preface
the 'D' records emitted by the function is now gone?

#5Robert Haas
robertmhaas@gmail.com
In reply to: Corey Huinker (#4)
Re: COPY IN/BOTH vs. extended query mode

On Thu, Feb 16, 2017 at 1:58 PM, Corey Huinker <corey.huinker@gmail.com> wrote:

Forgive my ignorance, but is this issue related to the Catch-22 I had with
"COPY as a set returning function", wherein a function that invokes
BeginCopyFrom() basically starts a result set, but then ends it to do the
BeginCopyFrom() having NULL (meaning STDIN) as the file, so that when the
results from the copy come back the 'T' record that was going to preface the
'D' records emitted by the function is now gone?

I can't quite understand what you've written here. I would think that
"COPY TO STDOUT", not "COPY FROM", would begin a result set.

If you were trying to write a SQL-callable function that would return
a result set by emitting protocol messages directly, I imagine that
will cause all kinds of problems, because you won't be able to keep
the result set the function produces by emitting protocol messages
cleanly separated from whatever the backend code that's calling that
function does to return whatever it views as the result of the
function call.

If you tried to write an SQL-callable function that internally started
and ended a copy from the client, then I think you would run into this
problem, and probably some others.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Corey Huinker
corey.huinker@gmail.com
In reply to: Robert Haas (#5)
Re: COPY IN/BOTH vs. extended query mode

On Sun, Feb 19, 2017 at 9:04 AM, Robert Haas <robertmhaas@gmail.com> wrote:

If you tried to write an SQL-callable function that internally started

and ended a copy from the client, then I think you would run into this
problem, and probably some others.

That's it. I had a PoC patch submitted that allowed someone to do this

insert into some_table(id, total_sum) select id, sum(a_numeric_metric) from
copy_srf('/a/file/name') group by 1

or

insert into some_table(id, total_sum) select id, sum(a_numeric_metric) from
copy_srf('/a/program/name arg1 arg2',true) group by 1

and those worked just fine, however, attempts to use the STDIN

insert into some_table(id, total_sum) select id, sum(a_numeric_metric) from
copy_srf(null) group by 1

failed, because as it was explained to me, the order of such events would
be:

1. start query
2. send result set format to client
3. start copy which implies that query result set is done
4. finish copy
5. emit query results to client, but the defining result format is gone,
thus error.

I'm just putting this here for future reference in case there is a protocol
change in the works.

#7Jeff Davis
pgsql@j-davis.com
In reply to: Robert Haas (#1)
Re: COPY IN/BOTH vs. extended query mode

On Mon, 2017-01-23 at 21:12 -0500, Robert Haas wrote:

According to the documentation for COPY IN mode, "If the COPY command
was issued via an extended-query message, the backend will now
discard
frontend messages until a Sync message is received, then it will
issue
ReadyForQuery and return to normal processing." I added a similar
note to the documentation for COPY BOTH mode in
91fa8532f4053468acc08534a6aac516ccde47b7, and the documentation
accurately describes the behavior of the server. However, this seems
to make fully correct error handling for clients using libpq almost
impossible, because PQsendQueryGuts() sends
Parse-Bind-Describe-Execute-Sync in one shot without regard to
whether
the command that was just sent invoked COPY mode (cf. the note in
CopyGetData about why we ignore Flush and Sync in that function).

So imagine that the client uses libpq to send (via the extended query
protocol) a COPY IN command (or some hypothetical command that starts
COPY BOTH mode to begin). If the server throws an error before the
Sync message is consumed, it will bounce back to PostgresMain which
will set doing_extended_query_message = true after which it will
consume messages, find the Sync, reset that flag, and send
ReadyForQuery. On the other hand, if the server enters CopyBoth
mode,
consumes the Sync message in CopyGetData (or a similar function), and
*then* throws an ERROR, the server will wait for a second Sync
message
from the client before issuing ReadyForQuery. There is no sensible
way of coping with this problem in libpq, because there is no way for
the client to know which part of the server code consumed the Sync
message that it already sent. In short, from the client's point of
view, if it enters COPY IN or COPY BOTH mode via the extend query
protocol, and an error occurs on the server, the server MAY OR MAY
NOT
expect a further Sync message before issuing ReadyForQuery, and the
client has no way of knowing -- except maybe waiting for a while to
see what happens.

I investigated a bit deeper here, and I'm not sure this is a real
problem (aside from ambiguity in the protocol docs).

If you send "COPY ... FROM STDIN" using the extended query protocol in
libpq, the non-error message flow is something like:

-> Parse + Bind + Describe + Execute + Sync
[ server processes Parse + Bind + Describe + Execute ]
[ server enters copy-in mode ]
<- CopyInResponse
[ server ignores Sync ]
-> CopyData
[ server processes CopyData ]
-> CopyDone
[ server processes CopyDone ]
[ server exits copy-in mode ]
-> Sync
[ server processes Sync ]
<- ReadyForQuery

If an error happens before the server enters copy-in mode (e.g. syntax
error), then you get something like:

-> Parse + Bind + Describe + Execute + Sync
[ server processes
Parse, encounters error ]
<- ErrorResponse
[ server ignores Bind +
Describe + Execute ]
[ server processes Sync ]
<- ReadyForQuery
[
client never got CopyInResponse, so never sent copy messages ]

If an error happens after the CopyInResponse is sent (e.g. malformed
data), you get something like:

-> Parse + Bind + Describe + Execute + Sync
[ server processes Bind + Describe + Execute ]
[ server enters copy-in mode ]
<- CopyInResponse
[ server ignores Sync ]
-> CopyData
[ server processes CopyData, encounters error ]
[ server exits copy-in mode ]
<- ErrorResponse
-> CopyDone
[ server ignores CopyDone ]
-> Sync
[ server processes Sync ]
<- ReadyForQuery

If the backend is canceled after the server sends CopyInResponse but
before it consumes (and ignores) the Sync, you get something like:

-> Parse + Bind + Describe + Execute + Sync
[ server processes Bind + Describe + Execute ]
[ server enters copy-in mode ]
<- CopyInResponse
[ SIGINT, server encounters error ]
<- ErrorResponse
[ server exits copy-in mode ]
[ server processes Sync ]
<- ReadyForQuery
-> CopyData
[ server ignores CopyData ]
-> CopyDone
[ server ignores CopyDone ]
-> Sync
[ server processes Sync ]
<- ReadyForQuery

The last case is not great, because I could imagine it confusing a
client, but I'm not sure about exactly how, and maybe it's something we
can document around?

Regards,
Jeff Davis