Re: [HACKERS] Continued problems with pgdump, Large Objects and crashing backends

Started by Tom Lane, almost 27 years ago (4 messages)
#1 Tom Lane
tgl@sss.pgh.pa.us

Peter T Mount <peter@retep.org.uk> writes:

However, this fails when creating functions that have more than one sql
statement in them. He has some functions that insert into a table
depending on some arguments, then issue a select on the last arg which is
the function's result. However, pgdump doesn't end the select with a ; and
this causes the 6.5 backend to fail. Adding the ; fixes the problem.

What does 'fail' mean exactly? Crash, or just reject the query?
It sounds like there is a pg_dump bug here (omitting a required
semicolon) but I don't understand whether there's also a backend bug.

Running the backend with the -d2 flag, these expand to:

pq_recvbuf: recv() failed, errno 2
proc_exit(0) [#0]
shmem_exit(0) [#0]
exit(0)
/usr/local/pgsql/bin/postmaster: reaping dead processes...
/usr/local/pgsql/bin/postmaster: CleanupProc: pid 6731 exited with status 0
pq_recvbuf: recv() failed, errno 0
proc_exit(0) [#0]
shmem_exit(0) [#0]
exit(0)
/usr/local/pgsql/bin/postmaster: reaping dead processes...
/usr/local/pgsql/bin/postmaster: CleanupProc: pid 6730 exited with status 0

This doesn't look like a segv trace to me --- if the backend was
coredumping then the postmaster should see a nonzero exit status.
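As a rough illustration of the point above, here is a minimal sketch of how a parent process can tell a clean exit(0) from death by a signal such as SIGSEGV, using the standard wait status macros. The names child_fate, classify_status and spawn_and_reap are hypothetical and are not taken from the postmaster source; the demo uses SIGKILL rather than SIGSEGV only to avoid leaving core files around.

```c
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/* Hypothetical classification; these names are not from the PostgreSQL source. */
enum child_fate {
    CHILD_CLEAN_EXIT,        /* exit(0), as in the log above */
    CHILD_ERROR_EXIT,        /* exit() with a nonzero code */
    CHILD_KILLED_BY_SIGNAL,  /* e.g. SIGSEGV from a crashing backend */
    CHILD_UNKNOWN
};

/* Decode a status word as returned by wait() or waitpid(). */
static enum child_fate
classify_status(int status)
{
    if (WIFSIGNALED(status))
        return CHILD_KILLED_BY_SIGNAL;
    if (WIFEXITED(status))
        return WEXITSTATUS(status) == 0 ? CHILD_CLEAN_EXIT : CHILD_ERROR_EXIT;
    return CHILD_UNKNOWN;
}

/*
 * Demo helper: fork a child that either exits cleanly (sig == 0) or
 * kills itself with the given signal, then reap it and report its fate.
 */
static enum child_fate
spawn_and_reap(int sig)
{
    int   status = 0;
    pid_t pid = fork();

    if (pid < 0)
        return CHILD_UNKNOWN;
    if (pid == 0) {
        if (sig != 0)
            raise(sig);      /* child dies here if the signal is fatal */
        _exit(0);
    }
    if (waitpid(pid, &status, 0) != pid)
        return CHILD_UNKNOWN;
    return classify_status(status);
}
```

Calling spawn_and_reap(0) models the backends in the log, which were reaped after a clean exit; spawn_and_reap(SIGKILL) models a child that was killed by a signal.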

The recv() complaints probably indicate that the client application
disconnected ungracefully (ie, without sending the 'X' terminate
message). It's curious that they're not both alike.
That might be a red herring however --- right now pq_recvbuf doesn't
distinguish plain EOF from a true error, and if it's plain EOF then
whatever errno was last set to gets printed. Think I'll go fix that.
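To make the EOF-versus-error distinction concrete, here is a minimal sketch of a receive wrapper that treats a return value of 0 from recv() as end-of-file and consults errno only when recv() returns a negative value. This is not the actual pq_recvbuf code or the fix that went into it; the names recv_status and classify_recv and the message texts are assumptions made for illustration.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical result codes; these names are not from the PostgreSQL source. */
enum recv_status { RECV_DATA, RECV_EOF, RECV_ERROR };

/*
 * Read once from sock and classify the outcome.
 * recv() returns 0 when the peer has closed the connection. errno is not
 * set in that case, so printing it would show a stale, unrelated value.
 */
static enum recv_status
classify_recv(int sock, char *buf, size_t len, ssize_t *nread)
{
    ssize_t r;

    do {
        r = recv(sock, buf, len, 0);
    } while (r < 0 && errno == EINTR);   /* retry if interrupted by a signal */

    *nread = r;
    if (r > 0)
        return RECV_DATA;
    if (r == 0) {
        /* Plain EOF: report it without mentioning errno. */
        fprintf(stderr, "EOF on client connection\n");
        return RECV_EOF;
    }
    /* Genuine failure: errno is meaningful here. */
    fprintf(stderr, "recv() failed: %s\n", strerror(errno));
    return RECV_ERROR;
}
```

With this split, a client that simply closes its socket yields RECV_EOF, and the differing errno values seen in the log would no longer appear for that case.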

Barring more evidence, all I see here is client disconnect, not a
backend failure. What's your basis for claiming a segv crash?

Ok, now the problem. When he sets autocommit to false, the JDBC driver
sends BEGIN to the backend. Ok so far, however, something then fails
during the first large object's load, and causes everything else to fail.

That's not a bug, it's a feature ... allegedly, anyway. Any error
inside a transaction means the entire transaction is aborted. And
the backend will keep reminding you so until you cooperate by ending
the transaction. I don't like the behavior very much either, but
it's operating as designed.
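The behavior described above can be pictured as a small state machine. The following is a toy model only, not backend code: the state names, the run_command function, the fails flag that simulates an error, and the choice of END and ABORT as the commands that close a transaction block are all assumptions made for illustration.

```c
#include <string.h>

/* Hypothetical model of a session's transaction state; not PostgreSQL code. */
enum xact_state { XACT_IDLE, XACT_IN_BLOCK, XACT_ABORTED };

/*
 * Feed one command to the model. 'fails' simulates an error raised while
 * executing the command. Returns 1 if the command was accepted, 0 if it
 * failed or was rejected.
 */
static int
run_command(enum xact_state *st, const char *cmd, int fails)
{
    int ends_block = strcmp(cmd, "END") == 0 || strcmp(cmd, "ABORT") == 0;

    if (*st == XACT_ABORTED) {
        if (!ends_block)
            return 0;            /* rejected until the block is ended */
        *st = XACT_IDLE;         /* ending the block clears the abort state */
        return 1;
    }
    if (fails) {
        if (*st == XACT_IN_BLOCK)
            *st = XACT_ABORTED;  /* one error poisons the whole block */
        return 0;
    }
    if (strcmp(cmd, "BEGIN") == 0)
        *st = XACT_IN_BLOCK;
    else if (ends_block)
        *st = XACT_IDLE;
    return 1;
}
```

In this model, a failure during the first large object load after BEGIN moves the session to XACT_ABORTED, and every later command is rejected until the client sends END or ABORT, which matches the symptom of everything else failing.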

regards, tom lane

#2 Peter T Mount
peter@retep.org.uk
In reply to: Tom Lane (#1)
Re: [HACKERS] Continued problems with pgdump, Large Objects and crashing backends

On Wed, 17 Feb 1999, Tom Lane wrote:

Peter T Mount <peter@retep.org.uk> writes:

However, this fails when creating functions that have more than one sql
statement in them. He has some functions that insert into a table
depending on some arguments, then issue a select on the last arg which is
the function's result. However, pgdump doesn't end the select with a ; and
this causes the 6.5 backend to fail. Adding the ; fixes the problem.

What does 'fail' mean exactly? Crash, or just reject the query?
It sounds like there is a pg_dump bug here (omitting a required
semicolon) but I don't understand whether there's also a backend bug.

I didn't say this was a backend bug; it was just one thing I came across
while looking at the following problem.

Running the backend with the -d2 flag, these expand to:

pq_recvbuf: recv() failed, errno 2
proc_exit(0) [#0]
shmem_exit(0) [#0]
exit(0)
/usr/local/pgsql/bin/postmaster: reaping dead processes...
/usr/local/pgsql/bin/postmaster: CleanupProc: pid 6731 exited with status 0
pq_recvbuf: recv() failed, errno 0
proc_exit(0) [#0]
shmem_exit(0) [#0]
exit(0)
/usr/local/pgsql/bin/postmaster: reaping dead processes...
/usr/local/pgsql/bin/postmaster: CleanupProc: pid 6730 exited with status 0

This doesn't look like a segv trace to me --- if the backend was
coredumping then the postmaster should see a nonzero exit status.

The recv() complaints probably indicate that the client application
disconnected ungracefully (ie, without sending the 'X' terminate
message). It's curious that they're not both alike.
That might be a red herring however --- right now pq_recvbuf doesn't
distinguish plain EOF from a true error, and if it's plain EOF then
whatever errno was last set to gets printed. Think I'll go fix that.

Barring more evidence, all I see here is client disconnect, not a
backend failure.

Hmmm, I've never seen the recv() problem before with any JDBC app, only
this one.

PS: Currently the JDBC driver is still using the 6.3.x protocol. When 6.4
came out I didn't implement the CANCEL stuff, as I was concentrating on
getting more of the innards implemented.

Anyhow, if the terminate message is a problem, I'll upgrade the protocol.

What's your basis for claiming a segv crash?

I think the segv came from Jason (who's run it against 6.3.x and 6.4.x).

Ok, now the problem. When he sets autocommit to false, the JDBC driver
sends BEGIN to the backend. Ok so far, however, something then fails
during the first large object's load, and causes everything else to fail.

That's not a bug, it's a feature ... allegedly, anyway. Any error
inside a transaction means the entire transaction is aborted. And
the backend will keep reminding you so until you cooperate by ending
the transaction. I don't like the behavior very much either, but
it's operating as designed.

I'm going to overhaul the autocommit(false) code. I suspect it's broken,
but I need to sit down and figure what is happening with this problem
first.

Peter

--
Peter T Mount peter@retep.org.uk
Main Homepage: http://www.retep.org.uk
PostgreSQL JDBC Faq: http://www.retep.org.uk/postgres
Java PDF Generator: http://www.retep.org.uk/pdf

#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#1)

Peter T Mount <peter@retep.org.uk> writes:

The recv() complaints probably indicate that the client application
disconnected ungracefully (ie, without sending the 'X' terminate
message). It's curious that they're not both alike.

Hmmm, I've never seen the recv() problem before with any JDBC app, only
this one.

That particular message is new in the 6.5 code (BTW, as of this morning
it should say "pq_recvbuf: unexpected EOF on client connection").

I was about to say that prior versions would also complain about an
unexpected client disconnect, but actually it looks like 6.4.2 doesn't
--- at least not in this low-level code.  I'm not inclined to remove the
message however.  I think we want it there to help detect more serious
problems, like disconnect in the middle of a COPY operation.

PS: Currently the JDBC driver is still using the 6.3.x protocol. When 6.4
came out I didn't implement the CANCEL stuff, as I was concentrating on
getting more of the innards implemented.
Anyhow, if the terminate message is a problem, I'll upgrade the protocol.

The terminate message is defined in the old protocol too; it's not new
for 6.4. As for whether it's a "problem" not to send it, it's only
a problem if you don't like complaints in the postmaster log ;-).
The backend will close up shop just fine without it.
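For a client that wants to avoid the log complaint, a graceful shutdown might look like the sketch below. It assumes the terminate message is the single byte 'X' with no further payload, which is how the quoted text describes it; check the protocol documentation for the version in use before relying on that framing. The function name close_gracefully is hypothetical.

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Send the terminate message and then close the socket.
 * Assumption: the message is the single byte 'X' with no payload.
 * Returns 0 if the byte was sent, -1 otherwise; the socket is closed
 * in either case.
 */
static int
close_gracefully(int sock)
{
    const char terminate = 'X';
    ssize_t    sent = send(sock, &terminate, 1, 0);

    close(sock);
    return sent == 1 ? 0 : -1;
}
```

The server side then sees the 'X' byte followed by EOF, rather than a bare EOF.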

regards, tom lane

#4 Peter T Mount
peter@retep.org.uk
In reply to: Tom Lane (#3)
Re: [HACKERS] Continued problems with pgdump, Large Objects and crashing backends

On Thu, 18 Feb 1999, Tom Lane wrote:

Peter T Mount <peter@retep.org.uk> writes:

The recv() complaints probably indicate that the client application
disconnected ungracefully (ie, without sending the 'X' terminate
message). It's curious that they're not both alike.

Hmmm, I've never seen the recv() problem before with any JDBC app, only
this one.

That particular message is new in the 6.5 code (BTW, as of this morning
it should say "pq_recvbuf: unexpected EOF on client connection").

I was about to say that prior versions would also complain about an
unexpected client disconnect, but actually it looks like 6.4.2 doesn't
--- at least not in this low-level code.  I'm not inclined to remove the
message however.  I think we want it there to help detect more serious
problems, like disconnect in the middle of a COPY operation.

PS: Currently the JDBC driver is still using the 6.3.x protocol. When 6.4
came out I didn't implement the CANCEL stuff, as I was concentrating on
getting more of the innards implemented.
Anyhow, if the terminate message is a problem, I'll upgrade the protocol.

The terminate message is defined in the old protocol too; it's not new
for 6.4. As for whether it's a "problem" not to send it, it's only
a problem if you don't like complaints in the postmaster log ;-).
The backend will close up shop just fine without it.

Looks like something that's been missing since the beginning. Ok, I'll add
the message to it tomorrow, as I'm planning some cleanups this weekend.

Peter
