pg_dump without explicit table locking

Started by Jürgen Strobel, March 2014 · 12 messages
#1Jürgen Strobel
juergen+pg@strobel.info
1 attachment(s)

Hi,

At work I inherited responsibility for a large PG 8.1 DB with an
extreme number of tables (~300000). Surprisingly this is working quite
well, except for maintenance and backup. I am tasked with finding a way
to do a dump & restore to 9.3 with as little downtime as possible.

Using 9.3's pg_dump with -j12 I found that pg_dump takes 6 hours to
lock tables using a single thread, then does the data dump in 1 more
hour using 12 workers. However, if I patch out the explicit LOCK TABLE
statements the whole run takes only 1 hour. Of course no one else is
using the DB at this time. In a pathological test-case scenario in a
staging environment the dump time decreased from 5 hours to 5 minutes.
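[Editorial sketch: the startup cost described above comes from pg_dump issuing one LOCK TABLE ... IN ACCESS SHARE MODE statement per table over a single connection before any workers start. A minimal illustration of that statement stream; the helper function and table names are made up for this example, not pg_dump code:]

```shell
#!/bin/sh
# Emit the per-table lock statements pg_dump sends at startup, one per
# table from a single connection -- with ~300000 tables these round
# trips alone took ~6 hours. Table names here are illustrative.
emit_locks() {
    for tbl in "$@"; do
        printf 'LOCK TABLE %s IN ACCESS SHARE MODE;\n' "$tbl"
    done
}

emit_locks public.orders public.customers public.items
```

[With ~300000 tables this is ~300000 client/server round trips plus the server-side lock bookkeeping, which matches the hours-long startup measured above.]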

I've googled the problem and others seem to have run into similar
issues, so I made this a command-line option, --no-table-locks, and
wrapped it up in as nice a patch against github/master as I can manage
(I haven't used C in a long time). I hope you find it useful.

regards,
Jürgen Strobel

Attachments:

pg_dump_nolock.diff (text/plain, UTF-8)
commit 393d47cf6b5ce77297e52e7fc390aa862fd2c2fd
Author: Jürgen Strobel <juergen+github@strobel.info>
Date:   Fri Mar 7 16:54:22 2014 +0100

    Add option --no-table-locks to pg_dump.

diff --git a/doc/src/sgml/ref/pg_dump.sgml b/doc/src/sgml/ref/pg_dump.sgml
index 1f0d4de..5725c46 100644
--- a/doc/src/sgml/ref/pg_dump.sgml
+++ b/doc/src/sgml/ref/pg_dump.sgml
@@ -772,6 +772,25 @@ PostgreSQL documentation
      </varlistentry>
 
      <varlistentry>
+      <term><option>--no-table-locks</></term>
+      <listitem>
+       <para>
+        If this option is specified, tables to be dumped will not be locked explicitly
+        at the start of pg_dump. It implies <option>--no-synchronized-snapshots</>.
+        This is potentially dangerous and should only be used by experts,
+        and only while there is no other activity in the database.
+       </para>
+       <para>
+        In the presence of an extreme number of tables pg_dump can exhibit
+        bad startup performance because it needs to issue LOCK TABLE statements for all
+        tables. This is intended as a last-resort option for dump and restore version
+        upgrades, in order to make downtime manageable. In an especially bad case
+        with ~300000 tables this reduced dump time from 6 hours to 1.
+       </para>
+      </listitem>
+     </varlistentry>
+
+     <varlistentry>
       <term><option>--no-tablespaces</option></term>
       <listitem>
        <para>
diff --git a/src/bin/pg_dump/parallel.c b/src/bin/pg_dump/parallel.c
index 6f2634b..ac05356 100644
--- a/src/bin/pg_dump/parallel.c
+++ b/src/bin/pg_dump/parallel.c
@@ -32,6 +32,9 @@
 #define PIPE_READ							0
 #define PIPE_WRITE							1
 
+/* flag for the no-table-locks command-line long option */
+int	no_table_locks = 0;
+
 /* file-scope variables */
 #ifdef WIN32
 static unsigned int tMasterThreadId = 0;
@@ -886,7 +889,7 @@ WaitForCommands(ArchiveHandle *AH, int pipefd[2])
 			 * meantime.  lockTableNoWait dies in this case to prevent a
 			 * deadlock.
 			 */
-			if (strcmp(te->desc, "BLOBS") != 0)
+			if (!no_table_locks && strcmp(te->desc, "BLOBS") != 0)
 				lockTableNoWait(AH, te);
 
 			/*
diff --git a/src/bin/pg_dump/parallel.h b/src/bin/pg_dump/parallel.h
index 7a32a9b..f95a6bb 100644
--- a/src/bin/pg_dump/parallel.h
+++ b/src/bin/pg_dump/parallel.h
@@ -70,6 +70,9 @@ extern bool parallel_init_done;
 extern DWORD mainThreadId;
 #endif
 
+/* flag for the no-table-locks command-line long option */
+extern int	no_table_locks;
+
 extern void init_parallel_dump_utils(void);
 
 extern int	GetIdleWorker(ParallelState *pstate);
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 17bb846..e1fe047 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -357,6 +357,7 @@ main(int argc, char **argv)
 		{"use-set-session-authorization", no_argument, &use_setsessauth, 1},
 		{"no-security-labels", no_argument, &no_security_labels, 1},
 		{"no-synchronized-snapshots", no_argument, &no_synchronized_snapshots, 1},
+		{"no-table-locks", no_argument, &no_table_locks, 1},
 		{"no-unlogged-table-data", no_argument, &no_unlogged_table_data, 1},
 
 		{NULL, 0, NULL, 0}
@@ -562,6 +563,10 @@ main(int argc, char **argv)
 	if (column_inserts)
 		dump_inserts = 1;
 
+	/* --no-table-locks implies --no-synchronized-snapshots */
+	if (no_table_locks)
+		no_synchronized_snapshots = 1;
+
 	if (dataOnly && schemaOnly)
 	{
 		write_msg(NULL, "options -s/--schema-only and -a/--data-only cannot be used together\n");
@@ -902,6 +907,7 @@ help(const char *progname)
 	printf(_("  --inserts                    dump data as INSERT commands, rather than COPY\n"));
 	printf(_("  --no-security-labels         do not dump security label assignments\n"));
 	printf(_("  --no-synchronized-snapshots  do not use synchronized snapshots in parallel jobs\n"));
+	printf(_("  --no-table-locks             do not lock tables before starting to dump\n"));
 	printf(_("  --no-tablespaces             do not dump tablespace assignments\n"));
 	printf(_("  --no-unlogged-table-data     do not dump unlogged table data\n"));
 	printf(_("  --quote-all-identifiers      quote all identifiers, even if not key words\n"));
@@ -4807,7 +4813,7 @@ getTables(Archive *fout, int *numTables)
 		 * NOTE: it'd be kinda nice to lock other relations too, not only
 		 * plain tables, but the backend doesn't presently allow that.
 		 */
-		if (tblinfo[i].dobj.dump && tblinfo[i].relkind == RELKIND_RELATION)
+		if (!no_table_locks && tblinfo[i].dobj.dump && tblinfo[i].relkind == RELKIND_RELATION)
 		{
 			resetPQExpBuffer(query);
 			appendPQExpBuffer(query,
#2Robert Haas
robertmhaas@gmail.com
In reply to: Jürgen Strobel (#1)
Re: pg_dump without explicit table locking

On Mon, Mar 17, 2014 at 7:52 AM, Jürgen Strobel <juergen+pg@strobel.info> wrote:

at work at my company I inherited responsibility for a large PG 8.1 DB,
with a an extreme number of tables (~300000). Surprisingly this is
working quite well, except for maintenance and backup. I am tasked with
finding a way to do dump & restore to 9.3 with as little downtime as
possible.

Using 9.3's pg_dump with -j12 I found out that pg_dump takes 6 hours to
lock tables using a single thread, then does the data dump in 1 more
hour using 12 workers. However if I patch out the explicit LOCK TABLE
statements this only takes 1 hour total. Of course no one else is using
the DB at this time. In a pathological test case scenario in a staging
environment the dump time decreased from 5 hours to 5 minutes.

I've googled the problem and there seem to be more people with similar
problems, so I made this a command line option --no-table-locks and
wrapped it up in as nice a patch against github/master as I can manage
(and I didn't use C for a long time). I hope you find it useful.

Fascinating report. Whether we use your patch or not, that's
interesting to know about. Please add your patch here so we don't
forget about it:

https://commitfest.postgresql.org/action/commitfest_view/open

See also https://wiki.postgresql.org/wiki/Submitting_a_Patch

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Pavel Stehule
pavel.stehule@gmail.com
In reply to: Jürgen Strobel (#1)
Re: pg_dump without explicit table locking

2014-03-17 12:52 GMT+01:00 Jürgen Strobel <juergen+pg@strobel.info>:

Hi,

at work at my company I inherited responsibility for a large PG 8.1 DB,
with a an extreme number of tables (~300000). Surprisingly this is
working quite well, except for maintenance and backup. I am tasked with
finding a way to do dump & restore to 9.3 with as little downtime as
possible.

Using 9.3's pg_dump with -j12 I found out that pg_dump takes 6 hours to
lock tables using a single thread, then does the data dump in 1 more
hour using 12 workers. However if I patch out the explicit LOCK TABLE
statements this only takes 1 hour total. Of course no one else is using
the DB at this time. In a pathological test case scenario in a staging
environment the dump time decreased from 5 hours to 5 minutes.

I've googled the problem and there seem to be more people with similar
problems, so I made this a command line option --no-table-locks and
wrapped it up in as nice a patch against github/master as I can manage
(and I didn't use C for a long time). I hope you find it useful.

Joe Conway sent me a tip: commit eeb6f37d89fc60c6449ca12ef9e91491069369cb
significantly decreases the time necessary for locking, so it can help too.

I am not sure that skipping the locks is fully correct. In the same
situation I thought about some form of database-level lock, so you could
get protected access with one statement.

Regards

Pavel Stehule

regards,
Jürgen Strobel


#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Pavel Stehule (#3)
Re: pg_dump without explicit table locking

Pavel Stehule <pavel.stehule@gmail.com> writes:

2014-03-17 12:52 GMT+01:00 Jürgen Strobel <juergen+pg@strobel.info>:

I've googled the problem and there seem to be more people with similar
problems, so I made this a command line option --no-table-locks and
wrapped it up in as nice a patch against github/master as I can manage
(and I didn't use C for a long time). I hope you find it useful.

Joe Conway sent me a tip so commit eeb6f37d89fc60c6449ca12ef9e914
91069369cb significantly decrease a time necessary for locking. So it can
help to.

Indeed. I think there's zero chance that we'd accept the patch as
proposed. If there's still a performance problem in HEAD, we'd look
for some other way to improve matters more.

(Note that this is only one of assorted O(N^2) behaviors in older versions
of pg_dump; we've gradually stamped them out over time.)

regards, tom lane


#5Jim Nasby
jim@nasby.net
In reply to: Tom Lane (#4)
Re: pg_dump without explicit table locking

On 3/17/14, 8:47 AM, Tom Lane wrote:

Pavel Stehule <pavel.stehule@gmail.com> writes:

2014-03-17 12:52 GMT+01:00 Jürgen Strobel <juergen+pg@strobel.info>:

I've googled the problem and there seem to be more people with similar
problems, so I made this a command line option --no-table-locks and
wrapped it up in as nice a patch against github/master as I can manage
(and I didn't use C for a long time). I hope you find it useful.

Joe Conway sent me a tip so commit eeb6f37d89fc60c6449ca12ef9e914
91069369cb significantly decrease a time necessary for locking. So it can
help to.

Indeed. I think there's zero chance that we'd accept the patch as
proposed. If there's still a performance problem in HEAD, we'd look
for some other way to improve matters more.

(Note that this is only one of assorted O(N^2) behaviors in older versions
of pg_dump; we've gradually stamped them out over time.)

On that note, it's recommended that when you are taking a backup to restore into a newer version of Postgres you create the dump using the NEWER version of pg_dump, not the old one.
--
Jim C. Nasby, Data Architect jim@nasby.net
512.569.9461 (cell) http://jim.nasby.net


#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jim Nasby (#5)
Re: pg_dump without explicit table locking

Jim Nasby <jim@nasby.net> writes:

On 3/17/14, 8:47 AM, Tom Lane wrote:

(Note that this is only one of assorted O(N^2) behaviors in older versions
of pg_dump; we've gradually stamped them out over time.)

On that note, it's recommended that when you are taking a backup to restore into a newer version of Postgres you create the dump using the NEWER version of pg_dump, not the old one.

Right. IIRC, the OP said he *did* use a recent pg_dump ... but this
particular issue got fixed server-side, so the new pg_dump didn't help
against an 8.1 server :-(

regards, tom lane


#7Joe Conway
mail@joeconway.com
In reply to: Tom Lane (#6)
Re: pg_dump without explicit table locking


On 03/17/2014 04:15 PM, Tom Lane wrote:

Jim Nasby <jim@nasby.net> writes:

On 3/17/14, 8:47 AM, Tom Lane wrote:

(Note that this is only one of assorted O(N^2) behaviors in
older versions of pg_dump; we've gradually stamped them out
over time.)

On that note, it's recommended that when you are taking a backup
to restore into a newer version of Postgres you create the dump
using the NEWER version of pg_dump, not the old one.

Right. IIRC, the OP said he *did* use a recent pg_dump ... but
this particular issue got fixed server-side, so the new pg_dump
didn't help against an 8.1 server :-(

Exactly. I backported the patch from 9.3 to 8.4 and saw a schema-only
dump time go from <give-up-and-kill-it-after-5-days> to 1 hour. This
was for a database with about 500k tables.

Joe

- --
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support


#8Craig Ringer
craig@2ndquadrant.com
In reply to: Joe Conway (#7)
Re: pg_dump without explicit table locking


On 03/18/2014 07:20 AM, Joe Conway wrote:

On 03/17/2014 04:15 PM, Tom Lane wrote:

Jim Nasby <jim@nasby.net> writes:

On 3/17/14, 8:47 AM, Tom Lane wrote:

(Note that this is only one of assorted O(N^2) behaviors in
older versions of pg_dump; we've gradually stamped them out
over time.)

On that note, it's recommended that when you are taking a
backup to restore into a newer version of Postgres you create
the dump using the NEWER version of pg_dump, not the old one.

Right. IIRC, the OP said he *did* use a recent pg_dump ... but
this particular issue got fixed server-side, so the new pg_dump
didn't help against an 8.1 server :-(

Exactly. I backported the patch from 9.3 to 8.4 and saw a
schema-only dump time go from <give-up-and-kill-it-after-5-days> to
1 hour. This was for a database with about 500k tables.

I wonder if doing large batches of

LOCK TABLE table1, table2, table3, ...

would help, instead of doing individual statements?
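[Editorial sketch of the batching idea above: instead of one round trip per table, collect the names and emit a single multi-table LOCK TABLE statement. The helper and table names are illustrative, not part of any patch in this thread, and real code would also need to quote identifiers:]

```shell
#!/bin/sh
# Build one batched LOCK TABLE statement for a list of tables,
# saving N-1 client/server round trips versus per-table locking.
batch_lock() {
    first=1
    stmt="LOCK TABLE"
    for tbl in "$@"; do
        if [ "$first" -eq 1 ]; then
            stmt="$stmt $tbl"
            first=0
        else
            stmt="$stmt, $tbl"
        fi
    done
    printf '%s IN ACCESS SHARE MODE;\n' "$stmt"
}

batch_lock table1 table2 table3
# -> LOCK TABLE table1, table2, table3 IN ACCESS SHARE MODE;
```

[Batching trades N-1 round trips for one long statement; as Jeff notes downthread, a patch along these lines helped schema-only dumps but not much for data dumps, since the dominant cost on old servers was server-side lock bookkeeping.]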

- --
Craig Ringer http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


#9Jeff Janes
jeff.janes@gmail.com
In reply to: Craig Ringer (#8)
Re: pg_dump without explicit table locking

On Mon, Mar 17, 2014 at 5:48 PM, Craig Ringer <craig@2ndquadrant.com> wrote:


On 03/18/2014 07:20 AM, Joe Conway wrote:

On 03/17/2014 04:15 PM, Tom Lane wrote:

Jim Nasby <jim@nasby.net> writes:

On 3/17/14, 8:47 AM, Tom Lane wrote:

(Note that this is only one of assorted O(N^2) behaviors in
older versions of pg_dump; we've gradually stamped them out
over time.)

On that note, it's recommended that when you are taking a
backup to restore into a newer version of Postgres you create
the dump using the NEWER version of pg_dump, not the old one.

Right. IIRC, the OP said he *did* use a recent pg_dump ... but
this particular issue got fixed server-side, so the new pg_dump
didn't help against an 8.1 server :-(

Exactly. I backported the patch from 9.3 to 8.4 and saw a
schema-only dump time go from <give-up-and-kill-it-after-5-days> to
1 hour. This was for a database with about 500k tables.

I wonder if doing large batches of

LOCK TABLE table1, table2, table3, ...

would help, instead of doing individual statements?

If I recall correctly, someone did submit a patch to do that. It helped
when dumping schema only, but not much when dumping data.

Cheers,

Jeff

#10Joe Conway
mail@joeconway.com
In reply to: Jeff Janes (#9)
Re: pg_dump without explicit table locking


On 03/17/2014 05:55 PM, Jeff Janes wrote:

On Mon, Mar 17, 2014 at 5:48 PM, Craig Ringer
<craig@2ndquadrant.com> wrote:

I wonder if doing large batches of

LOCK TABLE table1, table2, table3, ...

would help, instead of doing individual statements?

If I recall correctly, someone did submit a patch to do that. It
helped when dumping schema only, but not much when dumping data.

Not surprising at all. The huge time is incurred in taking the locks,
but if you are trying to use pg_upgrade in link mode to speed your
upgrade, you are totally hosed by the time it takes to grab those locks.

This patch applied to 9.3 substantially fixes the issue:
8<-----------------------
commit eeb6f37d89fc60c6449ca12ef9e91491069369cb
Author: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Thu Jun 21 15:01:17 2012 +0300

Add a small cache of locks owned by a resource owner in ResourceOwner.
8<-----------------------

On my 8.4 database, with 500,000 tables there were about 2.5 million
locks taken including toast tables and indexes during the schema dump.
Without the patch grabbing locks took many, many days with that many
objects to lock. With a backported version of the patch, one hour.

So if you have a problem due to many tables on an older than 9.3
version of Postgres, this is the direction to head (a custom patch
applied to your old version just long enough to get successfully
upgraded).
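[Editorial sketch: Joe's ratio (~2.5 million locks for 500k tables) reflects that plain tables, their TOAST tables, and their indexes all end up locked server-side during a schema dump. To estimate the lock count for a given database before attempting an upgrade, a pg_class query along these lines can be run; the psql invocation and database name are placeholders:]

```shell
#!/bin/sh
# Rough lock-count estimate for a schema dump: ordinary tables ('r'),
# TOAST tables ('t'), and indexes ('i'). Run the query with psql
# against the database to be upgraded; "mydb" is illustrative.
LOCK_COUNT_QUERY="SELECT count(*) FROM pg_class WHERE relkind IN ('r', 't', 'i');"
printf 'psql -d mydb -Atc "%s"\n' "$LOCK_COUNT_QUERY"
```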

Joe

- --
Joe Conway
credativ LLC: http://www.credativ.us
Linux, PostgreSQL, and general Open Source
Training, Service, Consulting, & 24x7 Support


#11Jürgen Strobel
juergen+pg@strobel.info
In reply to: Tom Lane (#6)
Re: pg_dump without explicit table locking

On 18.03.14 00:15, Tom Lane wrote:

Jim Nasby <jim@nasby.net> writes:

On 3/17/14, 8:47 AM, Tom Lane wrote:

(Note that this is only one of assorted O(N^2) behaviors in older versions
of pg_dump; we've gradually stamped them out over time.)

On that note, it's recommended that when you are taking a backup to restore into a newer version of Postgres you create the dump using the NEWER version of pg_dump, not the old one.

Right. IIRC, the OP said he *did* use a recent pg_dump ... but this
particular issue got fixed server-side, so the new pg_dump didn't help
against an 8.1 server :-(

Yes, I did use 9.3's pg_dump against my 8.1 DB initially.

The patch was created against github's master.

-Jürgen


#12Jürgen Strobel
juergen+pg@strobel.info
In reply to: Joe Conway (#10)
Re: pg_dump without explicit table locking

On 18.03.14 02:32, Joe Conway wrote:

On 03/17/2014 05:55 PM, Jeff Janes wrote:

On Mon, Mar 17, 2014 at 5:48 PM, Craig Ringer
<craig@2ndquadrant.com> wrote:

I wonder if doing large batches of

LOCK TABLE table1, table2, table3, ...

would help, instead of doing individual statements?

If I recall correctly, someone did submit a patch to do that. It
helped when dumping schema only, but not much when dumping data.

Not surprising at all. The huge time is incurred in taking the locks,
but if you are trying to use pg_upgrade in link mode to speed your
upgrade, you are totally hosed by the time it takes to grab those locks.

This patch applied to 9.3 substantially fixes the issue:
8<-----------------------
commit eeb6f37d89fc60c6449ca12ef9e91491069369cb
Author: Heikki Linnakangas <heikki.linnakangas@iki.fi>
Date: Thu Jun 21 15:01:17 2012 +0300

Add a small cache of locks owned by a resource owner in ResourceOwner.
8<-----------------------

On my 8.4 database, with 500,000 tables there were about 2.5 million
locks taken including toast tables and indexes during the schema dump.
Without the patch grabbing locks took many, many days with that many
objects to lock. With a backported version of the patch, one hour.

So if you have a problem due to many tables on an older than 9.3
version of Postgres, this is the direction to head (a custom patch
applied to your old version just long enough to get successfully
upgraded).

In a testing environment I restored my 8.1 DB with 300,000 tables to a
9.3 server (using my patched pg_dump).

Then I ran the original 9.3 pg_dump against the 9.3 DB again, and it
works reasonably well. So I can confirm the server-side improvements in
9.3 do work for my test case.

Still, when I finally get around to doing this in production I plan to
use my patched pg_dump rather than backporting the server fix to 8.1, as
I'd rather not touch our already-patched-for-something-else 8.1 server.

I can't wait to get my hands on 9.x replication features and other stuff :-)

-Jürgen
