help getting a backtrace from 9.2 on Ubuntu 13.04?

Started by Chris Curveyover 12 years ago12 messagesgeneral
Jump to latest
#1Chris Curvey
ccurvey@zuckergoldberg.com

I have a production server running PG 8.4 on RHEL6. I have a development server running PG 9.2 on Ubuntu 13.04. Periodically, I like to take a dump of the production server and load it on the development machine.

But I'm having troubles with the 9.2 server crashing when I'm restoring the dump. I'm using the 9.2 version of pg_dump. I've tried restoring a custom-format dump with pg_restore, and I've tried restoring a text-format dump with pqsl, and both of them are crashing on me.

The data is too sensitive for me to submit a database dump to the community, but I'd like to submit a stack trace, in the hopes that someone might be able to figure out what's going on. But I'm having some trouble getting this done.

I'm looking at the instructions at http://wiki.postgresql.org/wiki/Getting_a_stack_trace_of_a_running_PostgreSQL_backend_on_Linux/BSD

Which points me to https://wiki.edubuntu.org/DebuggingProgramCrash

Which tells me to install the “dbg” version of my package, so I have installed postgresql-9.2-dbg

The PG wiki page tells me to download and run list-dbgsym-packages-v2.sh, but the Edubuntu page says that that script has been supplanted by list-symbols-packages-v2.sh. Okay, so I run

chris@mu:~/Downloads$ sudo bash list-symbols-packages-v2.sh -p $(pidof -s postgres)

and get

warning: no loadable sections found in added symbol-file system-supplied DSO at 0x7fffcbae5000

any idea where I go from here?

Disclaimer

THIS IS A CONFIDENTIAL COMMUNICATION. The information contained in this e-mail is attorney-client privileged and confidential, intended only for the use of the individual or entity named above. If the reader of this message is not the intended recipient, or the employee or agent responsible to deliver it to the intended recipient, you are here by notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this communication in error, please notify us immediately by reply e-mail or by telephone (908) 233-8500.Thank you..

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#2John R Pierce
pierce@hogranch.com
In reply to: Chris Curvey (#1)
Re: help getting a backtrace from 9.2 on Ubuntu 13.04?

On 9/9/2013 7:00 AM, Chris Curvey wrote:

any idea where I go from here?

don't develop on a newer version of the database than you are deploying on.

*maybe* you can use the pgdump from 8.4 to connect to and dump the 9.2
database, but the 9.2 dump is NOT guaranteed to generate 8.4 compatible
SQL, in fact its highly likely it won't, as you can already attest to.

--
john r pierce 37N 122W
somewhere on the middle of the left coast

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3John R Pierce
pierce@hogranch.com
In reply to: John R Pierce (#2)
Re: help getting a backtrace from 9.2 on Ubuntu 13.04?

On 9/9/2013 4:53 PM, John R Pierce wrote:

On 9/9/2013 7:00 AM, Chris Curvey wrote:

any idea where I go from here?

don't develop on a newer version of the database than you are
deploying on.

*maybe* you can use the pgdump from 8.4 to connect to and dump the 9.2
database, but the 9.2 dump is NOT guaranteed to generate 8.4
compatible SQL, in fact its highly likely it won't, as you can already
attest to.

oops, I re-read yur request, you're going the other way.

use the pg_dump from 9.2 to dump the 8.4 database, via either a direct
network connection, or an ssh tunnel, or whatever works best. this is
guaranteed to work

--
john r pierce 37N 122W
somewhere on the middle of the left coast

#4marcin mank
marcin.mank@gmail.com
In reply to: Chris Curvey (#1)
Re: help getting a backtrace from 9.2 on Ubuntu 13.04?

On Mon, Sep 9, 2013 at 4:00 PM, Chris Curvey <ccurvey@zuckergoldberg.com>wrote:

But I'm having troubles with the 9.2 server crashing when I'm restoring
the dump. I'm using the 9.2 version of pg_dump. I've tried restoring a
custom-format dump with pg_restore, and I've tried restoring a text-format
dump with pqsl, and both of them are crashing on me.

The data is too sensitive for me to submit a database dump to the
community, but I'd like to submit a stack trace, in the hopes that someone
might be able to figure out what's going on. But I'm having some trouble
getting this done.

Is it crashing on a specific database object? pg_restore -v will tell you
how far it went. Then try to restore only that object. Is it perhaps
crashing on a specific row?

Try producing a self contained test case (like only the culprit table,
anonymized).

Regards
Marcin Mańk

#5Chris Curvey
ccurvey@zuckergoldberg.com
In reply to: marcin mank (#4)
Re: help getting a backtrace from 9.2 on Ubuntu 13.04?

From: Marcin Mańk [mailto:marcin.mank@gmail.com]
Sent: Monday, September 09, 2013 8:30 PM
To: Chris Curvey
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] help getting a backtrace from 9.2 on Ubuntu 13.04?

On Mon, Sep 9, 2013 at 4:00 PM, Chris Curvey <ccurvey@zuckergoldberg.com<mailto:ccurvey@zuckergoldberg.com>> wrote:
But I'm having troubles with the 9.2 server crashing when I'm restoring the dump. I'm using the 9.2 version of pg_dump. I've tried restoring a custom-format dump with pg_restore, and I've tried restoring a text-format dump with pqsl, and both of them are crashing on me.

The data is too sensitive for me to submit a database dump to the community, but I'd like to submit a stack trace, in the hopes that someone might be able to figure out what's going on. But I'm having some trouble getting this done.

Is it crashing on a specific database object? pg_restore -v will tell you how far it went. Then try to restore only that object. Is it perhaps crashing on a specific row?

Try producing a self contained test case (like only the culprit table, anonymized).

Regards
Marcin Mańk

Good advice. I turned on –verbose, and got a ton of output, ending with:

pg_restore: setting owner and privileges for FK CONSTRAINT user_id_refs_id_7ceef80f
pg_restore: setting owner and privileges for FK CONSTRAINT user_id_refs_id_dfbab7d
pg_restore: [archiver (db)] could not execute query: no connection to the server
Command was: -- Completed on 2013-09-09 11:35:16 EDT

pg_restore: [archiver (db)] could not execute query: no connection to the server
Command was: --
-- PostgreSQL database dump complete

Which I find really odd, because I specified –no-owner –no-privileges –no-tablespace
chris@mu:/sdb$ pg_restore --dbname=certified_mail_ccc2 --format=c --verbose --clean --no-owner --no-privileges --no-tablespaces -h mu -p 5434 cm_Mon.backup

So now I’m up to three questions. (Why the crash? How to get backtrace? Why are we applying permissions when I said not to?) I guess that’s the nature of the universe. Let me see if I can figure out which table that is and try to create a test case.

Disclaimer

THIS IS A CONFIDENTIAL COMMUNICATION. The information contained in this e-mail is attorney-client privileged and confidential, intended only for the use of the individual or entity named above. If the reader of this message is not the intended recipient, or the employee or agent responsible to deliver it to the intended recipient, you are here by notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this communication in error, please notify us immediately by reply e-mail or by telephone (908) 233-8500.Thank you..

#6Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Chris Curvey (#5)
Re: help getting a backtrace from 9.2 on Ubuntu 13.04?

On 09/10/2013 06:57 AM, Chris Curvey wrote:

*From:*Marcin Mańk [mailto:marcin.mank@gmail.com]
*Sent:* Monday, September 09, 2013 8:30 PM
*To:* Chris Curvey
*Cc:* pgsql-general@postgresql.org
*Subject:* Re: [GENERAL] help getting a backtrace from 9.2 on Ubuntu 13.04?

On Mon, Sep 9, 2013 at 4:00 PM, Chris Curvey <ccurvey@zuckergoldberg.com
<mailto:ccurvey@zuckergoldberg.com>> wrote:

But I'm having troubles with the 9.2 server crashing when I'm
restoring the dump. I'm using the 9.2 version of pg_dump. I've
tried restoring a custom-format dump with pg_restore, and I've tried
restoring a text-format dump with pqsl, and both of them are
crashing on me.

The data is too sensitive for me to submit a database dump to the
community, but I'd like to submit a stack trace, in the hopes that
someone might be able to figure out what's going on. But I'm having
some trouble getting this done.

Is it crashing on a specific database object? pg_restore -v will tell
you how far it went. Then try to restore only that object. Is it perhaps
crashing on a specific row?

Try producing a self contained test case (like only the culprit table,
anonymized).

Regards

Marcin Mańk

Good advice. I turned on –verbose, and got a ton of output, ending with:

pg_restore: setting owner and privileges for FK CONSTRAINT
user_id_refs_id_7ceef80f

pg_restore: setting owner and privileges for FK CONSTRAINT
user_id_refs_id_dfbab7d

pg_restore: [archiver (db)] could not execute query: no connection to
the server

Command was: -- Completed on 2013-09-09 11:35:16 EDT

pg_restore: [archiver (db)] could not execute query: no connection to
the server

At this point I would be more worried about the above, 'no connection to
server'.

Command was: --

-- PostgreSQL database dump complete

Which I find really odd, because I specified –no-owner –no-privileges
–no-tablespace

--no-owner does not mean that ownership is not set, just that the
ownership from the source database is not carried over.

http://www.postgresql.org/docs/9.2/interactive/app-pgrestore.html

-O
--no-owner
Do not output commands to set ownership of objects to match the original
database. By default, pg_restore issues ALTER OWNER or SET SESSION
AUTHORIZATION statements to set ownership of created schema elements.
These statements will fail unless the initial connection to the database
is made by a superuser (or the same user that owns all of the objects in
the script). With -O, any user name can be used for the initial
connection, and this user will own all the created objects.

chris@mu:/sdb$ pg_restore --dbname=certified_mail_ccc2 --format=c
--verbose --clean --no-owner --no-privileges --no-tablespaces -h mu -p
5434 cm_Mon.backup

So now I’m up to three questions. (Why the crash? How to get
backtrace? Why are we applying permissions when I said not to?) I
guess that’s the nature of the universe. Let me see if I can figure out
which table that is and try to create a test case.

--
Adrian Klaver
adrian.klaver@gmail.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#7Chris Curvey
ccurvey@zuckergoldberg.com
In reply to: Adrian Klaver (#6)
Re: help getting a backtrace from 9.2 on Ubuntu 13.04?

Good advice. I turned on –verbose, and got a ton of output, ending with:

pg_restore: setting owner and privileges for FK CONSTRAINT
user_id_refs_id_7ceef80f

pg_restore: setting owner and privileges for FK CONSTRAINT
user_id_refs_id_dfbab7d

pg_restore: [archiver (db)] could not execute query: no connection to
the server

Command was: -- Completed on 2013-09-09 11:35:16 EDT

pg_restore: [archiver (db)] could not execute query: no connection to
the server

This was a red herring. That last constraint that was listed is the last thing in the dump/restore (at least according to --list). Perhaps at the end of pg_restore, it's trying to send the comment to the database ("completed on...") and that's confusing things? It does appear that the entire database has been restored, but the cluster restart is disconcerting.

At this point I would be more worried about the above, 'no connection to
server'.

Yep, that would be the point where the cluster is restarting.

--no-owner does not mean that ownership is not set, just that the ownership
from the source database is not carried over.

http://www.postgresql.org/docs/9.2/interactive/app-pgrestore.html

-O
--no-owner
Do not output commands to set ownership of objects to match the original
database. By default, pg_restore issues ALTER OWNER or SET SESSION
AUTHORIZATION statements to set ownership of created schema elements.
These statements will fail unless the initial connection to the database is
made by a superuser (or the same user that owns all of the objects in the
script). With -O, any user name can be used for the initial connection, and
this user will own all the created objects.

Does this mean that the --no-owner is a command flag for pg_dump, but is ignored by pg_restore? Let me run a test and see...

Disclaimer

THIS IS A CONFIDENTIAL COMMUNICATION. The information contained in this e-mail is attorney-client privileged and confidential, intended only for the use of the individual or entity named above. If the reader of this message is not the intended recipient, or the employee or agent responsible to deliver it to the intended recipient, you are here by notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this communication in error, please notify us immediately by reply e-mail or by telephone (908) 233-8500.Thank you..

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#8Jeff Janes
jeff.janes@gmail.com
In reply to: Chris Curvey (#5)
Re: help getting a backtrace from 9.2 on Ubuntu 13.04?

On Tue, Sep 10, 2013 at 6:57 AM, Chris Curvey <ccurvey@zuckergoldberg.com>
wrote:

From: Marcin Mańk [mailto:marcin.mank@gmail.com]

Is it crashing on a specific database object? pg_restore -v will tell

you

how far it went. Then try to restore only that object. Is it perhaps
crashing on a specific row?

Good advice. I turned on –verbose, and got a ton of output, ending with:

pg_restore: setting owner and privileges for FK CONSTRAINT
user_id_refs_id_7ceef80f

pg_restore: setting owner and privileges for FK CONSTRAINT
user_id_refs_id_dfbab7d

pg_restore: [archiver (db)] could not execute query: no connection to the
server

Command was: -- Completed on 2013-09-09 11:35:16 EDT

What does the server log say? It should tell you why the server is
restarting.

Cheers,

Jeff

#9Chris Curvey
ccurvey@zuckergoldberg.com
In reply to: Jeff Janes (#8)
Re: help getting a backtrace from 9.2 on Ubuntu 13.04?

From: Jeff Janes [mailto:jeff.janes@gmail.com]
Sent: Tuesday, September 10, 2013 1:26 PM
To: Chris Curvey
Cc: Marcin Mańk; pgsql-general@postgresql.org
Subject: Re: [GENERAL] help getting a backtrace from 9.2 on Ubuntu 13.04?

On Tue, Sep 10, 2013 at 6:57 AM, Chris Curvey <ccurvey@zuckergoldberg.com<mailto:ccurvey@zuckergoldberg.com>> wrote:

From: Marcin Mańk [mailto:marcin.mank@gmail.com<mailto:marcin.mank@gmail.com>]

Is it crashing on a specific database object? pg_restore -v will tell you
how far it went. Then try to restore only that object. Is it perhaps
crashing on a specific row?

Good advice. I turned on –verbose, and got a ton of output, ending with:

pg_restore: setting owner and privileges for FK CONSTRAINT
user_id_refs_id_7ceef80f

pg_restore: setting owner and privileges for FK CONSTRAINT
user_id_refs_id_dfbab7d

pg_restore: [archiver (db)] could not execute query: no connection to the
server

Command was: -- Completed on 2013-09-09 11:35:16 EDT

What does the server log say? It should tell you why the server is restarting.

Cheers,

Jeff

Great thought. Looking through the logs, it appears that all my failures are on a CREATE INDEX. Usually on my biggest table, but often on another table.

2013-09-10 10:09:46 EDT ERROR: canceling autovacuum task
2013-09-10 10:09:46 EDT CONTEXT: automatic analyze of table "certified_mail_ccc2.public.cm_status_history"
2013-09-10 10:15:13 EDT LOG: server process (PID 14386) was terminated by signal 11: Segmentation fault
2013-09-10 10:15:13 EDT DETAIL: Failed process was running: CREATE INDEX cm_envelope_tracking_number ON cm_envelope USING btree (tracking_number);

2013-09-10 10:15:13 EDT LOG: terminating any other active server processes
2013-09-10 10:15:13 EDT WARNING: terminating connection because of crash of another server process
2013-09-10 10:15:13 EDT DETAIL: The postmaster has commanded this server process to roll back the current transaction and exit, because another server process exited abnormally and possibly corrupted shared memory.

I cannot square this with the fact that when I echo the commands, the last echoed command is about setting privileges.

Another development (possibly unrelated): I tried *dumping* with –no-privileges –no-tablespace –no-owner, and the restore went fine.

Disclaimer

THIS IS A CONFIDENTIAL COMMUNICATION. The information contained in this e-mail is attorney-client privileged and confidential, intended only for the use of the individual or entity named above. If the reader of this message is not the intended recipient, or the employee or agent responsible to deliver it to the intended recipient, you are here by notified that any dissemination, distribution or copying of this communication is strictly prohibited. If you have received this communication in error, please notify us immediately by reply e-mail or by telephone (908) 233-8500.Thank you..

#10Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Chris Curvey (#9)
Re: help getting a backtrace from 9.2 on Ubuntu 13.04?

On 09/10/2013 10:37 AM, Chris Curvey wrote:

Another development (possibly unrelated): I tried **dumping** with
–no-privileges –no-tablespace –no-owner, and the restore went fine.

Probably has to do with whether you are dumping plain text or custom format:

http://www.postgresql.org/docs/9.2/interactive/app-pgdump.html

-O
--no-owner
Do not output commands to set ownership of objects to match the original
database. By default, pg_dump issues ALTER OWNER or SET SESSION
AUTHORIZATION statements to set ownership of created database objects.
These statements will fail when the script is run unless it is started
by a superuser (or the same user that owns all of the objects in the
script). To make a script that can be restored by any user, but will
give that user ownership of all the objects, specify -O.

This option is only meaningful for the plain-text format. For the
archive formats, you can specify the option when you call pg_restore.

--
Adrian Klaver
adrian.klaver@gmail.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#11Jeff Janes
jeff.janes@gmail.com
In reply to: Chris Curvey (#9)
Re: help getting a backtrace from 9.2 on Ubuntu 13.04?

On Tue, Sep 10, 2013 at 10:37 AM, Chris Curvey
<ccurvey@zuckergoldberg.com>wrote:

Great thought. Looking through the logs, it appears that all my failures
are on a CREATE INDEX. Usually on my biggest table, but often on another
table.

2013-09-10 10:09:46 EDT ERROR: canceling autovacuum task

2013-09-10 10:09:46 EDT CONTEXT: automatic analyze of table
"certified_mail_ccc2.public.cm_status_history"

2013-09-10 10:15:13 EDT LOG: server process (PID 14386) was terminated by
signal 11: Segmentation fault

2013-09-10 10:15:13 EDT DETAIL: Failed process was running: CREATE INDEX
cm_envelope_tracking_number ON cm_envelope USING btree (tracking_number);

2013-09-10 10:15:13 EDT LOG: terminating any other active server processes

2013-09-10 10:15:13 EDT WARNING: terminating connection because of crash
of another server process

2013-09-10 10:15:13 EDT DETAIL: The postmaster has commanded this server
process to roll back the current transaction and exit, because another
server process exited abnormally and possibly corrupted shared memory.

I cannot square this with the fact that when I echo the commands, the last
echoed command is about setting privileges.

A backend is crashing, and taking down the entire PostgreSQL system. The
commands you see being echoed are from a different process from the one
that triggered the crash, so it is just an innocent bystander which has no
useful information. Are you using parallel restore? (If not, why is there
someone indexing your biggest table during the restore?)

You will want to get the backtrace of the coredump generated by the crashed
backend, not of the running process. Have you tried taking a bt with gdb?
You said you couldn't find the symbols, but have you tried it anyway? On
CentOS and openSuse I often get warnings about some symbols not being
found, but all the symbols I actually need to interpret the backtrace end
up being there.

Cheers,

Jeff

#12Chris Curvey
chris@chriscurvey.com
In reply to: Jeff Janes (#11)
Re: help getting a backtrace from 9.2 on Ubuntu 13.04?

On Sun, Sep 15, 2013 at 7:49 PM, Jeff Janes <jeff.janes@gmail.com> wrote:

On Tue, Sep 10, 2013 at 10:37 AM, Chris Curvey <ccurvey@zuckergoldberg.com

wrote:

Great thought. Looking through the logs, it appears that all my failures
are on a CREATE INDEX. Usually on my biggest table, but often on another
table.

2013-09-10 10:09:46 EDT ERROR: canceling autovacuum task

2013-09-10 10:09:46 EDT CONTEXT: automatic analyze of table
"certified_mail_ccc2.public.cm_status_history"

2013-09-10 10:15:13 EDT LOG: server process (PID 14386) was terminated
by signal 11: Segmentation fault

2013-09-10 10:15:13 EDT DETAIL: Failed process was running: CREATE INDEX
cm_envelope_tracking_number ON cm_envelope USING btree (tracking_number);

2013-09-10 10:15:13 EDT LOG: terminating any other active server
processes

2013-09-10 10:15:13 EDT WARNING: terminating connection because of crash
of another server process

2013-09-10 10:15:13 EDT DETAIL: The postmaster has commanded this server
process to roll back the current transaction and exit, because another
server process exited abnormally and possibly corrupted shared memory.

I cannot square this with the fact that when I echo the commands, the
last echoed command is about setting privileges.

A backend is crashing, and taking down the entire PostgreSQL system. The
commands you see being echoed are from a different process from the one
that triggered the crash, so it is just an innocent bystander which has no
useful information. Are you using parallel restore? (If not, why is there
someone indexing your biggest table during the restore?)

I'm the only person doing anything, and the only thing going on is the
restore. And I'm not using parallel restore (I thought that might be part
of the issue.)

You will want to get the backtrace of the coredump generated by the
crashed backend, not of the running process. Have you tried taking a bt
with gdb? You said you couldn't find the symbols, but have you tried it
anyway? On CentOS and openSuse I often get warnings about some symbols not
being found, but all the symbols I actually need to interpret the backtrace
end up being there.

I can try, but the instructions said that installing the -dev package by
itself was not sufficient, so I stopped at that point. But as they say in
the lottery ads, "Hey, you never know!"

Cheers,

Jeff

Many thanks!