Wierd panic with 7.4.7

Started by Joshua D. Drakeover 20 years ago5 messages
#1Joshua D. Drake
jd@commandprompt.com

Hello,

Any thoughts on the below? Specifically the PANIC? A customer
was performing a full vacuum when it happen. This is running 7.4.7
on ES 3.0. We run daily vacuums and analyzes as well.

2005-06-27 16:35:02 LOG: recycled transaction log file "0000004D0000006F"
2005-06-27 16:35:02 LOG: recycled transaction log file "0000004D00000070"
2005-06-27 16:35:02 LOG: recycled transaction log file "0000004D00000071"
2005-06-27 16:35:02 LOG: recycled transaction log file "0000004D00000072"
2005-06-27 16:35:02 LOG: recycled transaction log file "0000004D00000073"
2005-06-27 16:35:02 LOG: recycled transaction log file "0000004D00000074"
2005-06-27 16:35:02 LOG: recycled transaction log file "0000004D00000075"
2005-06-27 16:35:02 LOG: recycled transaction log file "0000004D00000076"
2005-06-27 16:36:23 LOG: incomplete startup packet
2005-06-27 16:37:53 ERROR: could not send data to client: Broken pipe
2005-06-27 16:37:53 PANIC: cannot abort transaction 146017848, it was
already committed
2005-06-27 16:37:53 LOG: server process (PID 6931) was terminated by
signal 6
2005-06-27 16:37:53 LOG: terminating any other active server processes
2005-06-27 16:37:53 WARNING: terminating connection because of crash of
another server process
DETAIL: The postmaster has commanded this server process to roll back
the current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.
2005-06-27 16:37:53 WARNING: terminating connection because of crash of
another server process
DETAIL: The postmaster has commanded this server process to roll back
the current transaction and exit, because another server process exited
abnormally and possibly corrupted shared memory.
HINT: In a moment you should be able to reconnect to the database and
repeat your command.

Sincerely,

Joshua D. Drake

--
Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240
PostgreSQL Replication, Consulting, Custom Programming, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joshua D. Drake (#1)
Re: Wierd panic with 7.4.7

"Joshua D. Drake" <jd@commandprompt.com> writes:

2005-06-27 16:37:53 ERROR: could not send data to client: Broken pipe
2005-06-27 16:37:53 PANIC: cannot abort transaction 146017848, it was
already committed

A reasonable guess as to what happened there is:

1. Client process dies just as server is committing a transaction on its
behalf.

2. For some reason, server tries to send a message to client while it's
doing post-commit cleanup (before it gets to the point of resetting
its state to show that it's not in the transaction anymore).

3. Kernel rejects message, causing elog(ERROR), causing entry to
AbortTransaction, causing above panic.

There are a couple of big problems with this theory, though. In the
first place, there aren't any messages sent to the client during
post-commit; unless possibly it's an error message due to a failure
during post-commit, and that should have shown up in the server log.
In the second place, we don't treat communication failures as ERRORs,
so how did step 3 happen?

Do you know what the dead client was doing? Can you reproduce this?

regards, tom lane

#3Joshua D. Drake
jd@commandprompt.com
In reply to: Tom Lane (#2)
Re: Wierd panic with 7.4.7

There are a couple of big problems with this theory, though. In the
first place, there aren't any messages sent to the client during
post-commit; unless possibly it's an error message due to a failure
during post-commit, and that should have shown up in the server log.
In the second place, we don't treat communication failures as ERRORs,
so how did step 3 happen?

Do you know what the dead client was doing?

Unfortunately I don't. We didn't have PID logging turned on so I can't
tell which process it was. The only thing I was told was,

"I am running a Full Vacuum, CRAP the server just died ;)"

Can you reproduce this?

Let me see if the client has a dev database we can try to reproduce with.

Sincerely,

Joshua D. Drake

regards, tom lane

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

--
Your PostgreSQL solutions company - Command Prompt, Inc. 1.800.492.2240
PostgreSQL Replication, Consulting, Custom Programming, 24x7 support
Managed Services, Shared and Dedicated Hosting
Co-Authors: plPHP, plPerlNG - http://www.commandprompt.com/

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joshua D. Drake (#3)
Re: Wierd panic with 7.4.7

"Joshua D. Drake" <jd@commandprompt.com> writes:

Do you know what the dead client was doing?

Unfortunately I don't. We didn't have PID logging turned on so I can't
tell which process it was. The only thing I was told was,
"I am running a Full Vacuum, CRAP the server just died ;)"

Hmm ... VACUUM FULL is a tad weird because it marks itself committed
before it does the last stage of the operation (ie truncating the
relation). I suppose it is possible that an error during that last
stage would have this symptom. There's still the question of why
the postmaster log doesn't show the hypothetical initial error, though.

(Also, it's likely that there's some defense against this scenario in
VACUUM FULL, otherwise we'd hear this kind of report more often, I
should think. I don't have time to go looking right now though.)

regards, tom lane

#5Greg Stark
gsstark@mit.edu
In reply to: Tom Lane (#2)
Re: Wierd panic with 7.4.7

Tom Lane <tgl@sss.pgh.pa.us> writes:

In the second place, we don't treat communication failures as ERRORs,
so how did step 3 happen?

You probably realize this, but just in case: "Broken Pipe" probably means the
backend received SIGPIPE, not just that some file operation syscall returned
-1.

--
greg