BUG #6167: pg_dump fails on table lock
The following bug has been logged online:
Bug reference: 6167
Logged by: Jesper Engman
Email address: jesper@engman.net
PostgreSQL version: 8.3.10
Operating system: Linux
Description: pg_dump fails on table lock
Details:
I have tables that exists for short time periods, sometimes for as short as
5 min. pg_dump is starting to fail due to a problem to lock these tables:
pg_dump: SQL command failed
pg_dump: Error message from server: ERROR: relation
"vehicle_change_partitions.vehicle_change_export_p4368494" does not exist
pg_dump: The command was: LOCK TABLE
vehicle_change_partitions.vehicle_change_export_p4368494 IN ACCESS SHARE
MODE
Backup failed: PGPASSWORD=xxxxx && export PGPASSWORD && export PGOPTIONS="-c
statement_timeout=0 -c maintenance_work_mem=2147483647" && /usr/bin/pg_dump
-h xxx.xxx.xxx.xxx -U postgres --ignore-version -Fc -Z 6 xxxxxxxx >
/vol/nfs_backup/postgres_dumps/2011_07_13/xxxxxxxx_2011_07_13
Account: xxxxxxxx Backup failed
How is this possible - pg_dump is a serializable transaction? It doesn't
seem to be tripped up by some other backend function since this actually
fails on the lock.
On Thu, Aug 18, 2011 at 2:05 AM, Jesper Engman <jesper@engman.net> wrote:
The following bug has been logged online:
Bug reference: 6167
Logged by: Jesper Engman
Email address: jesper@engman.net
PostgreSQL version: 8.3.10
Operating system: Linux
Description: pg_dump fails on table lock
Details:I have tables that exists for short time periods, sometimes for as short as
5 min. pg_dump is starting to fail due to a problem to lock these tables:pg_dump: SQL command failed
pg_dump: Error message from server: ERROR: relation
"vehicle_change_partitions.vehicle_change_export_p4368494" does not exist
pg_dump: The command was: LOCK TABLE
vehicle_change_partitions.vehicle_change_export_p4368494 IN ACCESS SHARE
MODE
Backup failed: PGPASSWORD=xxxxx && export PGPASSWORD && export PGOPTIONS="-c
statement_timeout=0 -c maintenance_work_mem=2147483647" && /usr/bin/pg_dump
-h xxx.xxx.xxx.xxx -U postgres --ignore-version -Fc -Z 6 xxxxxxxx >
/vol/nfs_backup/postgres_dumps/2011_07_13/xxxxxxxx_2011_07_13
Account: xxxxxxxx Backup failedHow is this possible - pg_dump is a serializable transaction? It doesn't
seem to be tripped up by some other backend function since this actually
fails on the lock.
Well, its not a bug.
You've asked to dump a table and then dropped the table concurrently
with the attempt to dump the table.
Exclude the tables you don't wish to have dumped using command line options.
I don't think we will put in an option to silently exclude missing
tables, not least because it would be technically difficult.
--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
DDL is not excluded from MVCC, right? This kind of concurrency should
be handled and it's generally managed just fine. I just did a lab test
attempting to drop a table while pg_dump is running on the same db and
it simply waits to drop the table until pg_dump is done. That is the
expected behavior.
There is some discussion about concurrency and DDL changes in:
http://archives.postgresql.org/pgsql-bugs/2010-02/msg00187.php
But in that case, some specialized backend functions like
pg_get_indexdef is using committed state and that doesn't seem to
occur in this case.
I wonder if there is a small time span between when pg_dump starts and
when all locks have been acquired that may be the problem (if a table
is dropped during that time span). Is there such a small time of
vulnerability?
The database in question does not have a ton of tables like this -
about 10 tables and the tables exists for about 5 min. But this runs
on many databases (more than a thousand). So, if there is a window of
vulnerability (if only small) - chances are we're hitting it.
Excluding tables from the dump is not an option - that will be an
incomplete backup.
Show quoted text
On Thu, Aug 18, 2011 at 4:03 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
On Thu, Aug 18, 2011 at 2:05 AM, Jesper Engman <jesper@engman.net> wrote:
The following bug has been logged online:
Bug reference: 6167
Logged by: Jesper Engman
Email address: jesper@engman.net
PostgreSQL version: 8.3.10
Operating system: Linux
Description: pg_dump fails on table lock
Details:I have tables that exists for short time periods, sometimes for as short as
5 min. pg_dump is starting to fail due to a problem to lock these tables:pg_dump: SQL command failed
pg_dump: Error message from server: ERROR: relation
"vehicle_change_partitions.vehicle_change_export_p4368494" does not exist
pg_dump: The command was: LOCK TABLE
vehicle_change_partitions.vehicle_change_export_p4368494 IN ACCESS SHARE
MODE
Backup failed: PGPASSWORD=xxxxx && export PGPASSWORD && export PGOPTIONS="-c
statement_timeout=0 -c maintenance_work_mem=2147483647" && /usr/bin/pg_dump
-h xxx.xxx.xxx.xxx -U postgres --ignore-version -Fc -Z 6 xxxxxxxx >
/vol/nfs_backup/postgres_dumps/2011_07_13/xxxxxxxx_2011_07_13
Account: xxxxxxxx Backup failedHow is this possible - pg_dump is a serializable transaction? It doesn't
seem to be tripped up by some other backend function since this actually
fails on the lock.Well, its not a bug.
You've asked to dump a table and then dropped the table concurrently
with the attempt to dump the table.Exclude the tables you don't wish to have dumped using command line options.
I don't think we will put in an option to silently exclude missing
tables, not least because it would be technically difficult.--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Jesper Engman <jesper@engman.net> writes:
I wonder if there is a small time span between when pg_dump starts and
when all locks have been acquired that may be the problem (if a table
is dropped during that time span). Is there such a small time of
vulnerability?
Certainly. pg_dump has to read pg_class to get the names of the tables,
and then try to lock each one. If you drop a table during that window,
the lock command will fail.
The window is actually a bit longer than necessary in existing releases,
because pg_dump was doing some other stuff before it got around to
acquiring the locks. I fixed that recently
http://git.postgresql.org/gitweb/?p=postgresql.git&a=commitdiff&h=68d977a73
but that patch hasn't made it to any released versions yet. In any case
there's still a nonzero window.
Excluding tables from the dump is not an option - that will be an
incomplete backup.
Um ... if you know it's a transient table, why do you care about backing
it up?
regards, tom lane