Early locking option for parallel backup
Hello people,
I am sending a patch to improve parallel backup on larger databases.
THE PROBLEM
pg_dump was taking more than 24 hours to complete on one of my databases,
so I began to research alternatives. Parallel backup reduced the backup
time to a little less than an hour, but it failed almost every time because
of concurrent queries that take exclusive locks. It is difficult to
guarantee that my applications will not issue statements such as drop
table, alter table, truncate table, create index or drop index for an
hour, and I prefer not to build control mechanisms to that end if I can
work around it.
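For context, each parallel worker requests its table lock with NOWAIT, so
when a conflicting lock (or a queued conflicting request) is in the way,
the worker aborts almost immediately with an error along these lines (the
relation name is only an example):

    ERROR:  could not obtain lock on relation "orders"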
THE SOLUTION
The patch adds a "--lock-early" option, which makes pg_dump issue shared
locks on all tables in the backup TOC as soon as each parallel worker
starts. That way, the backup has a very small chance of failing, and when
it does fail, it fails in the first few seconds of the backup job. My
backup scripts (not included here) are aware of that and retry the backup
in case of failure.
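For reference, my invocation looks roughly like this (format, job count
and paths here are only examples):

    pg_dump -Fd -j 8 --lock-early -f /backup/mydb.dir mydb

With the option enabled, each worker sends a batch of statements of this
form in a single query right after it connects (table names are examples):

    lock table public.orders in access share mode nowait;
    lock table public.order_items in access share mode nowait;
    ...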
TESTS
I have been using this technique in production for over a year now and it
is working well for me.
Lucas
Attachment: pg_dump-lock-early.patch (text/x-patch)
diff --git a/src/bin/pg_dump/parallel.c b/src/bin/pg_dump/parallel.c
index 8b996f4..32de351 100644
--- a/src/bin/pg_dump/parallel.c
+++ b/src/bin/pg_dump/parallel.c
@@ -834,6 +834,48 @@
}
+
+/*
+ * This function issues shared locks on all tables to be dumped.
+ */
+static void
+LockEarly(ArchiveHandle *AH)
+{
+ TocEntry *tocEntry;
+ char* tableNamespace;
+ char* tableName;
+ PQExpBuffer query;
+ PGresult *res;
+
+ query = createPQExpBuffer();
+ resetPQExpBuffer(query);
+
+ for( tocEntry = AH->toc->next; tocEntry != AH->toc; tocEntry = tocEntry->next )
+ {
+
+ if( tocEntry->desc && strcmp( tocEntry->desc, "TABLE DATA") == 0 )
+ {
+ tableNamespace = tocEntry->namespace;
+ tableName = tocEntry->tag;
+
+ appendPQExpBuffer(query,
+ "lock table %s.%s in access share mode nowait;\n",
+ PQescapeIdentifier(AH->connection,tableNamespace,strlen(tableNamespace)),
+ PQescapeIdentifier(AH->connection,tableName,strlen(tableName))
+ );
+ }
+ }
+
+ res = PQexec( AH->connection, query->data );
+ if( !res || PQresultStatus(res) != PGRES_COMMAND_OK )
+ {
+ exit_horribly(modulename, "could not lock the tables to begin the backup job\n");
+ }
+
+ PQclear( res );
+ destroyPQExpBuffer(query);
+}
+
/*
* This function is called by both Unix and Windows variants to set up
* and run a worker process. Caller should exit the process (or thread)
@@ -867,6 +909,14 @@
*/
(AH->SetupWorkerPtr) ((Archive *) AH);
+ /*
+ * Issue shared locks if --lock-early option is issued at command line
+ */
+ if( ((Archive*)AH)->dopt->lock_early )
+ {
+ LockEarly(AH);
+ }
+
/*
* Execute commands until done.
*/
diff --git a/src/bin/pg_dump/pg_backup.h b/src/bin/pg_dump/pg_backup.h
index ce3100f..26e2580 100644
--- a/src/bin/pg_dump/pg_backup.h
+++ b/src/bin/pg_dump/pg_backup.h
@@ -158,6 +158,7 @@
int use_setsessauth;
int enable_row_security;
int load_via_partition_root;
+ int lock_early;
/* default, if no "inclusion" switches appear, is to dump everything */
bool include_everything;
diff --git a/src/bin/pg_dump/pg_dump.c b/src/bin/pg_dump/pg_dump.c
index 8733426..a9ce96f 100644
--- a/src/bin/pg_dump/pg_dump.c
+++ b/src/bin/pg_dump/pg_dump.c
@@ -362,6 +362,7 @@
{"no-unlogged-table-data", no_argument, &dopt.no_unlogged_table_data, 1},
{"no-subscriptions", no_argument, &dopt.no_subscriptions, 1},
{"no-sync", no_argument, NULL, 7},
+ {"lock-early", no_argument, &dopt.lock_early, 1},
{NULL, 0, NULL, 0}
};
@@ -603,6 +604,15 @@
if (dopt.if_exists && !dopt.outputClean)
exit_horribly(NULL, "option --if-exists requires option -c/--clean\n");
+ /* Basic checks for lock-early logic */
+ if( numWorkers <= 1 )
+ {
+ if( dopt.lock_early )
+ {
+ exit_horribly(NULL, "option --lock-early requires a parallel dump (-j/--jobs)\n");
+ }
+ }
+
/* Identify archive format to emit */
archiveFormat = parseArchiveFormat(format, &archiveMode);
@@ -924,6 +934,8 @@
printf(_(" -F, --format=c|d|t|p output file format (custom, directory, tar,\n"
" plain text (default))\n"));
printf(_(" -j, --jobs=NUM use this many parallel jobs to dump\n"));
+ printf(_(" --lock-early issue shared locks as early as possible for all\n"
+ " tables on all parallel backup workers\n" ));
printf(_(" -v, --verbose verbose mode\n"));
printf(_(" -V, --version output version information, then exit\n"));
printf(_(" -Z, --compress=0-9 compression level for compressed formats\n"));
On Sun, Nov 5, 2017 at 7:17 PM, Lucas <lucas75@gmail.com> wrote:
The patch adds a "--lock-early" option, which makes pg_dump issue shared
locks on all tables in the backup TOC as soon as each parallel worker
starts. That way, the backup has a very small chance of failing, and when
it does fail, it fails in the first few seconds of the backup job. My
backup scripts (not included here) are aware of that and retry the backup
in case of failure.
You should register your patch in the next open commitfest, which
will begin in January, if you are looking for feedback and review:
https://commitfest.postgresql.org/16/
--
Michael
On Sun, Nov 5, 2017 at 5:17 AM, Lucas <lucas75@gmail.com> wrote:
The patch adds a "--lock-early" option, which makes pg_dump issue shared
locks on all tables in the backup TOC as soon as each parallel worker
starts. That way, the backup has a very small chance of failing, and when
it does fail, it fails in the first few seconds of the backup job. My
backup scripts (not included here) are aware of that and retry the backup
in case of failure.
I wonder why we don't do this already ... and by default.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2017-11-05 17:38:39 -0500, Robert Haas wrote:
On Sun, Nov 5, 2017 at 5:17 AM, Lucas <lucas75@gmail.com> wrote:
The patch adds a "--lock-early" option, which makes pg_dump issue shared
locks on all tables in the backup TOC as soon as each parallel worker
starts. That way, the backup has a very small chance of failing, and when
it does fail, it fails in the first few seconds of the backup job. My
backup scripts (not included here) are aware of that and retry the backup
in case of failure.

I wonder why we don't do this already ... and by default.
Well, the current approach afaics requires #relations * 2 locks, whereas
acquiring them in every worker would scale that with the number of
workers. IIUC the problem here is that even though a lock is already
held by the main backend an independent locker's request will prevent
the on-demand lock by the dump worker from being granted. It seems to
me the correct fix here would be to somehow avoid the fairness logic in
the parallel dump case - although I don't quite know how to best do so.
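To make the sequence concrete (table name is only an example):

    leader:  lock table t1 in access share mode;          -- granted
    app:     alter table t1 add column c int;             -- queues, wants ACCESS EXCLUSIVE
    worker:  lock table t1 in access share mode nowait;   -- fails: must queue behind the ALTER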
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
Well, the current approach afaics requires #relations * 2 locks, whereas
acquiring them in every worker would scale that with the number of
workers.
Yeah, that's gonna be a problem with this proposal.
IIUC the problem here is that even though a lock is already
held by the main backend an independent locker's request will prevent
the on-demand lock by the dump worker from being granted. It seems to
me the correct fix here would be to somehow avoid the fairness logic in
the parallel dump case - although I don't quite know how to best do so.
I wonder if we couldn't somehow repurpose the work that was done for
parallel workers' locks. Lots of security-type issues to be handled
if we're to open that up to clients, but maybe it's solvable. For
instance, maybe only allowing it to clients sharing the same snapshot
would help.
regards, tom lane
On Mon, Nov 6, 2017 at 4:43 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
I wonder if we couldn't somehow repurpose the work that was done for
parallel workers' locks. Lots of security-type issues to be handled
if we're to open that up to clients, but maybe it's solvable. For
instance, maybe only allowing it to clients sharing the same snapshot
would help.
Interesting idea. There's a bunch of holes that would need to be
patched there; for instance, you can't have one session running DDL
while somebody else has AccessShareLock. Parallel query relies on the
parallel-mode restrictions to prevent that kind of thing from
happening, but it would be strange (and likely somewhat broken) to try
to enforce those here. It would be strange and probably bad if LOCK
TABLE a; LOCK TABLE b in one session and LOCK TABLE b; LOCK TABLE a in
another session failed to deadlock. In short, there's a big
difference between a single session using multiple processes and
multiple closely coordinated sessions.
Also, even if you did it, you still need a lot of PROCLOCKs. Workers
don't need to take all locks up front because they can be assured of
getting them later, but they've still got to lock the objects they
actually want to access. Group locking aims to prevent deadlocks
between cooperating processes; it is not a license to skip locking
altogether.
None of which is to say that the problems don't feel related somehow.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
On 2017-11-05 22:43:34 -0500, Tom Lane wrote:
IIUC the problem here is that even though a lock is already
held by the main backend an independent locker's request will prevent
the on-demand lock by the dump worker from being granted. It seems to
me the correct fix here would be to somehow avoid the fairness logic in
the parallel dump case - although I don't quite know how to best do so.

I wonder if we couldn't somehow repurpose the work that was done for
parallel workers' locks. Lots of security-type issues to be handled
if we're to open that up to clients, but maybe it's solvable. For
instance, maybe only allowing it to clients sharing the same snapshot
would help.
Yea, I'd been wondering the same.
I'm slightly worried that somehow tying multiple clients into parallel
mode would cause a bunch of problems - that's not really the purpose of
the code and a number of its assumptions aren't quite right for that.
I'm not sure it really buys us much in contrast to just allowing a
locker to specify that it's allowed to jump the lock queue for an ASL if
it has 'backup' rights or such. Or actually, just allow it as a general
option to LOCK; there are plenty of other operational cases where the current
"fair" behaviour is really annoying, e.g. when executing operational
DDL/DML and such.
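Something along these lines, say (purely hypothetical syntax, nothing of
the sort exists today):

    lock table t1 in access share mode nowait skip queue;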
Greetings,
Andres Freund
Lucas,
* Lucas (lucas75@gmail.com) wrote:
pg_dump was taking more than 24 hours to complete on one of my databases,
so I began to research alternatives. Parallel backup reduced the backup
time to a little less than an hour, but it failed almost every time because
of concurrent queries that take exclusive locks. It is difficult to
guarantee that my applications will not issue statements such as drop
table, alter table, truncate table, create index or drop index for an
hour, and I prefer not to build control mechanisms to that end if I can
work around it.
I certainly understand the value of pg_dump-based backups, but have you
considered doing file-based backups? That would avoid the need to do
any in-database locking at all, and would give you the ability to do
PITR too. Further, you could actually restore that backup to another
system and then do a pg_dump there to get a logical representation (and
this would test your physical database backup/restore process too...).
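For example, a plain file-based base backup of the whole cluster can be
taken with pg_basebackup (host, user and target directory below are just
placeholders), and combined with WAL archiving that is what gives you
PITR:

    pg_basebackup -h db.example.com -U backup_user -D /backup/base -F plain -X stream -P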
Thanks!
Stephen
On 05/11/2017 21:09, Andres Freund wrote:
On 2017-11-05 17:38:39 -0500, Robert Haas wrote:
On Sun, Nov 5, 2017 at 5:17 AM, Lucas <lucas75@gmail.com> wrote:
The patch adds a "--lock-early" option, which makes pg_dump issue shared
locks on all tables in the backup TOC as soon as each parallel worker
starts. That way, the backup has a very small chance of failing, and when
it does fail, it fails in the first few seconds of the backup job. My
backup scripts (not included here) are aware of that and retry the backup
in case of failure.

I wonder why we don't do this already ... and by default.
Well, the current approach afaics requires #relations * 2 locks, whereas
acquiring them in every worker would scale that with the number of
workers.
Yes, that is why I proposed it as an option. As an option it will not
affect anyone who does not want to use it.
IIUC the problem here is that even though a lock is already
held by the main backend an independent locker's request will prevent
the on-demand lock by the dump worker from being granted. It seems to
me the correct fix here would be to somehow avoid the fairness logic in
the parallel dump case - although I don't quite know how to best do so.
It seems natural to think of several connections sharing a synchronized
snapshot as the same connection. Then it may be reasonable to grant a
shared lock out of turn if any connection of the same shared snapshot
already has a granted lock on the same relation. Last year Tom mentioned
that there is already queue-jumping logic of that sort in the lock manager
for other purposes. Although it seems conceptually simple, I suspect the
implementation is not.

On the other hand, the lock-early option is very simple and has no
impact on anyone who does not want to use it.
---
Lucas
On 11/06/2017 12:30 PM, Stephen Frost wrote:
* Lucas (lucas75@gmail.com) wrote:
pg_dump was taking more than 24 hours to complete on one of my databases,
so I began to research alternatives. Parallel backup reduced the backup
time to a little less than an hour, but it failed almost every time because
of concurrent queries that take exclusive locks. It is difficult to
guarantee that my applications will not issue statements such as drop
table, alter table, truncate table, create index or drop index for an
hour, and I prefer not to build control mechanisms to that end if I can
work around it.

I certainly understand the value of pg_dump-based backups, but have you
considered doing file-based backups? That would avoid the need to do
any in-database locking at all, and would give you the ability to do
PITR too. Further, you could actually restore that backup to another
system and then do a pg_dump there to get a logical representation (and
this would test your physical database backup/restore process too...).
Yes, point-in-time recovery has the advantage of keeping the backup more
up to date, but it has the disadvantage of being more expensive and
complex. In my case, point-in-time recovery would require an upgrade of
10 TB of storage space, and my stakeholders have not approved this
investment yet.
I suspect that there are lots of users who use pg_dump as their primary
backup tool and who would benefit from a more reliable parallel backup.