BUG #6167: pg_dump fails on table lock

Started by Jesper Engmanover 14 years ago4 messagesbugs
Jump to latest
#1Jesper Engman
jesper@engman.net

The following bug has been logged online:

Bug reference: 6167
Logged by: Jesper Engman
Email address: jesper@engman.net
PostgreSQL version: 8.3.10
Operating system: Linux
Description: pg_dump fails on table lock
Details:

I have tables that exists for short time periods, sometimes for as short as
5 min. pg_dump is starting to fail due to a problem to lock these tables:

pg_dump: SQL command failed
pg_dump: Error message from server: ERROR: relation
"vehicle_change_partitions.vehicle_change_export_p4368494" does not exist
pg_dump: The command was: LOCK TABLE
vehicle_change_partitions.vehicle_change_export_p4368494 IN ACCESS SHARE
MODE
Backup failed: PGPASSWORD=xxxxx && export PGPASSWORD && export PGOPTIONS="-c
statement_timeout=0 -c maintenance_work_mem=2147483647" && /usr/bin/pg_dump
-h xxx.xxx.xxx.xxx -U postgres --ignore-version -Fc -Z 6 xxxxxxxx >
/vol/nfs_backup/postgres_dumps/2011_07_13/xxxxxxxx_2011_07_13
Account: xxxxxxxx Backup failed

How is this possible - pg_dump is a serializable transaction? It doesn't
seem to be tripped up by some other backend function since this actually
fails on the lock.

#2Simon Riggs
simon@2ndQuadrant.com
In reply to: Jesper Engman (#1)
Re: BUG #6167: pg_dump fails on table lock

On Thu, Aug 18, 2011 at 2:05 AM, Jesper Engman <jesper@engman.net> wrote:

The following bug has been logged online:

Bug reference:      6167
Logged by:          Jesper Engman
Email address:      jesper@engman.net
PostgreSQL version: 8.3.10
Operating system:   Linux
Description:        pg_dump fails on table lock
Details:

I have tables that exists for short time periods, sometimes for as short as
5 min. pg_dump is starting to fail due to a problem to lock these tables:

pg_dump: SQL command failed
pg_dump: Error message from server: ERROR: relation
"vehicle_change_partitions.vehicle_change_export_p4368494" does not exist
pg_dump: The command was: LOCK TABLE
vehicle_change_partitions.vehicle_change_export_p4368494 IN ACCESS SHARE
MODE
Backup failed: PGPASSWORD=xxxxx && export PGPASSWORD && export PGOPTIONS="-c
statement_timeout=0 -c maintenance_work_mem=2147483647" && /usr/bin/pg_dump
-h xxx.xxx.xxx.xxx -U postgres --ignore-version -Fc -Z 6 xxxxxxxx >
/vol/nfs_backup/postgres_dumps/2011_07_13/xxxxxxxx_2011_07_13
Account: xxxxxxxx Backup failed

How is this possible - pg_dump is a serializable transaction? It doesn't
seem to be tripped up by some other backend function since this actually
fails on the lock.

Well, its not a bug.

You've asked to dump a table and then dropped the table concurrently
with the attempt to dump the table.

Exclude the tables you don't wish to have dumped using command line options.

I don't think we will put in an option to silently exclude missing
tables, not least because it would be technically difficult.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

#3Jesper Engman
jesper@engman.net
In reply to: Simon Riggs (#2)
Re: BUG #6167: pg_dump fails on table lock

DDL is not excluded from MVCC, right? This kind of concurrency should
be handled and it's generally managed just fine. I just did a lab test
attempting to drop a table while pg_dump is running on the same db and
it simply waits to drop the table until pg_dump is done. That is the
expected behavior.

There is some discussion about concurrency and DDL changes in:

http://archives.postgresql.org/pgsql-bugs/2010-02/msg00187.php

But in that case, some specialized backend functions like
pg_get_indexdef is using committed state and that doesn't seem to
occur in this case.

I wonder if there is a small time span between when pg_dump starts and
when all locks have been acquired that may be the problem (if a table
is dropped during that time span). Is there such a small time of
vulnerability?

The database in question does not have a ton of tables like this -
about 10 tables and the tables exists for about 5 min. But this runs
on many databases (more than a thousand). So, if there is a window of
vulnerability (if only small) - chances are we're hitting it.

Excluding tables from the dump is not an option - that will be an
incomplete backup.

Show quoted text

On Thu, Aug 18, 2011 at 4:03 AM, Simon Riggs <simon@2ndquadrant.com> wrote:

On Thu, Aug 18, 2011 at 2:05 AM, Jesper Engman <jesper@engman.net> wrote:

The following bug has been logged online:

Bug reference:      6167
Logged by:          Jesper Engman
Email address:      jesper@engman.net
PostgreSQL version: 8.3.10
Operating system:   Linux
Description:        pg_dump fails on table lock
Details:

I have tables that exists for short time periods, sometimes for as short as
5 min. pg_dump is starting to fail due to a problem to lock these tables:

pg_dump: SQL command failed
pg_dump: Error message from server: ERROR: relation
"vehicle_change_partitions.vehicle_change_export_p4368494" does not exist
pg_dump: The command was: LOCK TABLE
vehicle_change_partitions.vehicle_change_export_p4368494 IN ACCESS SHARE
MODE
Backup failed: PGPASSWORD=xxxxx && export PGPASSWORD && export PGOPTIONS="-c
statement_timeout=0 -c maintenance_work_mem=2147483647" && /usr/bin/pg_dump
-h xxx.xxx.xxx.xxx -U postgres --ignore-version -Fc -Z 6 xxxxxxxx >
/vol/nfs_backup/postgres_dumps/2011_07_13/xxxxxxxx_2011_07_13
Account: xxxxxxxx Backup failed

How is this possible - pg_dump is a serializable transaction? It doesn't
seem to be tripped up by some other backend function since this actually
fails on the lock.

Well, its not a bug.

You've asked to dump a table and then dropped the table concurrently
with the attempt to dump the table.

Exclude the tables you don't wish to have dumped using command line options.

I don't think we will put in an option to silently exclude missing
tables, not least because it would be technically difficult.

--
 Simon Riggs                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jesper Engman (#3)
Re: BUG #6167: pg_dump fails on table lock

Jesper Engman <jesper@engman.net> writes:

I wonder if there is a small time span between when pg_dump starts and
when all locks have been acquired that may be the problem (if a table
is dropped during that time span). Is there such a small time of
vulnerability?

Certainly. pg_dump has to read pg_class to get the names of the tables,
and then try to lock each one. If you drop a table during that window,
the lock command will fail.

The window is actually a bit longer than necessary in existing releases,
because pg_dump was doing some other stuff before it got around to
acquiring the locks. I fixed that recently
http://git.postgresql.org/gitweb/?p=postgresql.git&amp;a=commitdiff&amp;h=68d977a73
but that patch hasn't made it to any released versions yet. In any case
there's still a nonzero window.

Excluding tables from the dump is not an option - that will be an
incomplete backup.

Um ... if you know it's a transient table, why do you care about backing
it up?

regards, tom lane