[PoC PATCH] Parallel dump to /dev/null
Hi,
dumping a database to /dev/null via pg_dump is (AFAIK) one recommended
way to check for corruption. However, dumping to /dev/null is currently
not supported in directory mode, which makes it impossible to dump to
/dev/null in parallel.
I had a look at this, and it appears it would suffice to override the
few spots in pg_backup_directory.c which assemble filenames in the
target directory so that they resolve to '/dev/null' (or 'NUL' on Windows).
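For illustration, here is a simplified standalone sketch of the idea
(made-up names, not the patch itself; the real change lives in
setFilePath() and friends):

#include <stdio.h>
#include <string.h>

#ifdef _WIN32
#define DEVNULL "NUL"
#else
#define DEVNULL "/dev/null"
#endif

/*
 * Assemble the output path for one dump object: if the target
 * "directory" is the null device, every object resolves to the null
 * device itself.
 */
static void
set_file_path(char *buf, size_t buflen, const char *directory,
              const char *relative_filename)
{
    if (strcmp(directory, DEVNULL) == 0)
        snprintf(buf, buflen, "%s", DEVNULL);
    else
        snprintf(buf, buflen, "%s/%s", directory, relative_filename);
}

int
main(void)
{
    char path[1024];

    set_file_path(path, sizeof(path), DEVNULL, "3079.dat");
    printf("%s\n", path);       /* -> /dev/null (or NUL) */

    set_file_path(path, sizeof(path), "/backups/dump", "3079.dat");
    printf("%s\n", path);       /* -> /backups/dump/3079.dat */
    return 0;
}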
The attached proof-of-concept patch does that, and it seems to work; I'm
getting some nice speedups on a 170 GB test database:
$ time pg_dump -Fc -Z0 -f /dev/null TESTDB
real 32m45.120s
[...]
$ time pg_dump -Fd -j8 -Z0 -f /dev/null TESTDB
real 9m28.357s
Thoughts?
Michael
--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael.banck@credativ.de
credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
Attachments:
parallel-pgdump-dev-null.patch (text/x-diff, +38/-9)
On Thu, Feb 01, 2018 at 02:24:46PM +0100, Michael Banck wrote:
dumping a database to /dev/null via pg_dump is (AFAIK) one recommended
way to check for corruption. However, dumping to /dev/null is currently
not supported in directory mode, which makes it impossible to dump to
/dev/null in parallel.
FWIW, I use this trick as well in some test deployments. Now, those
deployments happen in places where I want the checks to warn me, so the
time it takes is not really an issue generally. Do folks here think
that speedups would be worth it? Say, to make the operation shorter on
production systems, for example. A lengthy pg_dump bloats data for a
longer time, so I can buy that shorter times may help, though I think
that hearing from other folks is necessary as well.
--
Michael
Hi,
On Friday, 2018-02-02 at 13:22 +0900, Michael Paquier wrote:
On Thu, Feb 01, 2018 at 02:24:46PM +0100, Michael Banck wrote:
dumping a database to /dev/null via pg_dump is (AFAIK) one recommended
way to check for corruption. However, dumping to /dev/null is currently
not supported in directory mode, which makes it impossible to dump to
/dev/null in parallel.
FWIW, I use this trick as well in some test deployments. Now, those
deployments happen in places where I want the checks to warn me, so the
time it takes is not really an issue generally. Do folks here think
that speedups would be worth it? Say, to make the operation shorter on
production systems, for example. A lengthy pg_dump bloats data for a
longer time, so I can buy that shorter times may help, though I think
that hearing from other folks is necessary as well.
Another use-case would be automatic restores of basebackups, where you
might want to dump to /dev/null afterwards to check for issues, as just
starting the basebackup wouldn't tell you. If you have lots of instances
and lots of basebackups to check, doing that in parallel might be
helpful.
Michael
--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael.banck@credativ.de
credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
Hi!
Michael, thanks for the patch!
On 2 Feb 2018, at 09:22, Michael Paquier <michael.paquier@gmail.com> wrote:
On Thu, Feb 01, 2018 at 02:24:46PM +0100, Michael Banck wrote:
dumping a database to /dev/null via pg_dump is (AFAIK) one recommended
way to check for corruption. However, dumping to /dev/null is currently
not supported in directory mode, which makes it impossible to dump to
/dev/null in parallel.
Do folks here think that speedups would be worth it?
I've seen backup verification via COPY TO /dev/null in production. These copies were parallelized on the side of the verification script.
Not sure this counts for feature or against it.
Best regards, Andrey Borodin.
Hi.
On 2 Feb 2018, at 13:25, Andrey Borodin <x4mmm@yandex-team.ru> wrote:
Do folks here think that speedups would be worth it?
I've seen backup verification via COPY TO /dev/null in production. These copies were parallelized on the side of the verification script.
Not sure this counts for feature or against it.
This counts for the feature, because the usage of COPY is involuntary. Having this in pg_dump would make our scripts for checking backups simpler.
--
May the force be with you…
https://simply.name
Hi,
On Thu, Feb 01, 2018 at 02:24:46PM +0100, Michael Banck wrote:
dumping a database to /dev/null via pg_dump is (AFAIK) one recommended
way to check for corruption. However, dumping to /dev/null is currently
not supported in directory mode, which makes it impossible to dump to
/dev/null in parallel.
I had a look at this, and it appears it would suffice to override the
few spots in pg_backup_directory.c which assemble filenames in the
target directory so that they resolve to '/dev/null' (or 'NUL' on Windows).
The attached proof-of-concept patch does that, and it seems to work; I'm
getting some nice speedups on a 170 GB test database:
$ time pg_dump -Fc -Z0 -f /dev/null TESTDB
real 32m45.120s
[...]
$ time pg_dump -Fd -j8 -Z0 -f /dev/null TESTDB
real 9m28.357s
I have added this patch to the next commitfest:
https://commitfest.postgresql.org/17/1576/
I also tried to add a TAP test, but as this patch produces no dump
output, I had to exclude it from all restore tests and then got an
off-by-one between the number of tests that were expected vs. those that
were run. I've attached it in case somebody wants to take a look, but I
hope that with Stephen's refactoring of the pg_dump TAP tests this might
become easier.
Michael
Attachments:
0001-WIP-pg_dump-TAP-testcase.patch (text/x-diff, +84/-6)
Hi,
On 2018-02-28 14:28:41 +0100, Michael Banck wrote:
I have added this patch to the next commitfest:
Given this is a new patch, submitted for the last commitfest, and not
completely trivial, I'd argue this is too late for v11.
Does anybody disagree?
- Andres
Hi,
On Thursday, 2018-03-01 at 01:28 -0800, Andres Freund wrote:
Hi,
On 2018-02-28 14:28:41 +0100, Michael Banck wrote:
I have added this patch to the next commitfest:
Given this is a new patch, submitted for the last commitfest, and not
completely trivial, I'd argue this is too late for v11.
I was under the impression that rule was rather about complicated and
invasive patches, not just any patch which isn't completely trivial.
I agree that the patch is not completely trivial, but it is rather small
(its diffstat is "3 files changed, 38 insertions(+), 9 deletions(-)"),
and it certainly is not highly invasive, or touching code all over the
place, but really only in a few places in pg_backup_directory.c.
Michael
--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael.banck@credativ.de
credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
On 2018-03-01 14:21:24 +0100, Michael Banck wrote:
Hi,
On Thursday, 2018-03-01 at 01:28 -0800, Andres Freund wrote:
Hi,
On 2018-02-28 14:28:41 +0100, Michael Banck wrote:
I have added this patch to the next commitfest:
Given this is a new patch, submitted for the last commitfest, and not
completely trivial, I'd argue this is too late for v11.
I was under the impression that rule was rather about complicated and
invasive patches, not just any patch which isn't completely trivial.
I think you're right this patch is a bit borderline for that. But we have
~220 pending CF entries, and fairly limited review and commit
capacity. Something's gotta give. A lot of those are patches that have
been waiting to be committed for a while. So things added first to the
last CF the day before it starts are prime candidates.
You probably can increase your chances by herding a few patches towards
being committable.
Greetings,
Andres Freund
strdup -> pg_strdup()
I wonder if, instead of doing strcmp() all over the place, we should
give this behavior a separate boolean flag in lclContext.
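Roughly along these lines (a sketch only, with names matching the
current patch, give or take):

#include <stdbool.h>
#include <string.h>

#define NULL_DEVICE "/dev/null" /* or whatever the macro ends up being */

typedef struct
{
    char   *directory;          /* output directory, or the null device */
    bool    discard;            /* true when output is thrown away */
    /* ... existing members elided ... */
} lclContext;

/* computed once at init time, instead of strcmp() at every call site */
static void
init_discard(lclContext *ctx)
{
    ctx->discard = (strcmp(ctx->directory, NULL_DEVICE) == 0);
}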
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi Alvaro,
On Thu, Mar 01, 2018 at 04:00:52PM -0300, Alvaro Herrera wrote:
strdup -> pg_strdup()
Fixed.
I wonder if, instead of doing strcmp() all over the place, we should
give this behavior a separate boolean flag in lclContext.
I mused a bit about what to name that flag and came up with `discard',
but maybe somebody has a better idea?
In any case, new patch attached which does this.
Michael
--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael.banck@credativ.de
credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
Attachments:
parallel-pgdump-dev-null-V2.patch (text/x-diff, +49/-9)
On Mon, Mar 5, 2018 at 1:49 PM, Michael Banck <michael.banck@credativ.de> wrote:
In any case, new patch attached which does this.
The null device is already defined in port.h, as DEVNULL. No need to
redefine it as NULL_DEVICE.
Alternatively, paths.h defines _PATH_DEVNULL.
Regards,
--
Andreas 'ads' Scherbaum
German PostgreSQL User Group
European PostgreSQL User Group - Board of Directors
Volunteer Regional Contact, Germany - PostgreSQL Project
Hi Andreas,
On Mon, Mar 05, 2018 at 03:15:19PM +0100, Andreas 'ads' Scherbaum wrote:
The null device is already defined in port.h, as DEVNULL. No need to
redefine it as NULL_DEVICE.
Thanks for pointing that out, a new patch removing NULL_DEVICE is
attached.
Michael
--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael.banck@credativ.de
credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
Attachments:
parallel-pgdump-dev-null-V3.patch (text/x-diff, +42/-9)
Please make ctx->discard a bool, not an int.
You can initialize it like this:
ctx->discard = strcmp(ctx->directory, DEVNULL) == 0;
Why do you change the directory creation mode from 0700 to 0711?
You have a cuddled "else" in the last hunk. Please uncuddle.
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Hi Alvaro,
On Monday, 2018-03-05 at 12:48 -0300, Alvaro Herrera wrote:
Please make ctx->discard a bool, not an int.
Ok, done so now. I forgot to mention that in the earlier resubmission,
but I had a look at the other pg_backup_*.c files and isSpecialScript
and hasSeek were ints as well, so I opted for that.
You can initialize it like this:
ctx->discard = strcmp(ctx->directory, DEVNULL) == 0;
Thanks, changed it thusly.
Why do you change the directory creation mode from 0700 to 0711?
Oh wow, that was either a mistype in vim or a very old test hack, thanks
for spotting it.
You have a cuddled "else" in the last hunk. Please uncuddle.
Done so now, hopefully.
Thanks again for the review, a new version is attached.
Michael
--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael.banck@credativ.de
credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
Attachments:
parallel-pgdump-dev-null-V4.patch (text/x-patch, +39/-9)
I made a few amendments (here's v5) and was ready to push, when I
noticed that _StartBlobs() does not seem to be doing the right thing.
Did you test this with a few large objects?
The reason I noticed is I wondered about the number of open() calls
(plus zlib function calls) we would save by keeping an FD open to
/dev/null, rather than opening it over and over for each object -- i.e.,
maybe not touch setFilePath() at all, if possible. That looks perhaps
too invasive, so maybe not. But do audit other callsites that may open
files.
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachments:
v5-0001-Allow-parallel-dump-to-go-to-dev-null.patch (text/plain, +41/-11)
Hi Alvaro,
On Monday, 2018-03-05 at 13:56 -0300, Alvaro Herrera wrote:
I made a few amendments (here's v5) and was ready to push, when I
noticed that _StartBlobs() does not seem to be doing the right thing.
Did you test this with a few large objects?
I did test it on the regression database, which leaves one or two large
objects behind:
mba@fock:~$ psql -h /tmp -p 65432 -d regression -c '\dl'
Large objects
ID | Owner | Description
-------+-------+------------------------------
3001 | mba | testing comments
36867 | mba | I Wandered Lonely as a Cloud
47229 | mba |
(3 rows)
What exactly is wrong with _StartBlobs()? It calls setFilePath() which
makes sure /dev/null is opened and not something else.
The reason I noticed is I wondered about the number of open() calls
(plus zlib function calls) we would save by keeping an FD open to
/dev/null, rather than opening it over and over for each object -- i.e.,
maybe not touch setFilePath() at all, if possible. That looks perhaps
too invasive, so maybe not. But do audit other callsites that may open
files.
I see your point; I've hacked together the attached additional PoC
patch, which keeps the dataFH open. The parallel case was tricky: I had
to add an additional flag to lclContext so that DEVNULL is only opened
once for data files, because I could not find a place to set it for
parallel workers, and it crashed otherwise.
This cuts down the number of /dev/null opens from 352 to 6 for a -j 2
dump of the regression database to /dev/null.
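In essence the idea is this (a standalone sketch with made-up names;
the attached patch works with the existing dataFH member instead):

#include <stdio.h>
#include <string.h>

#ifdef _WIN32
#define DEVNULL "NUL"
#else
#define DEVNULL "/dev/null"
#endif

static FILE *devnull_fh = NULL; /* shared handle, opened only once */

static FILE *
open_data_file(const char *path)
{
    if (strcmp(path, DEVNULL) == 0)
    {
        if (devnull_fh == NULL)
            devnull_fh = fopen(DEVNULL, "wb");
        return devnull_fh;
    }
    return fopen(path, "wb");
}

static void
close_data_file(FILE *fh)
{
    /* never close the shared null-device handle; it is reused */
    if (fh != NULL && fh != devnull_fh)
        fclose(fh);
}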
In my opinion, this is too involved and possibly error-prone for being
just an optimization, so I'd drop that part for v11 for now and maybe
revisit it later.
Michael
--
Michael Banck
Projektleiter / Senior Berater
Tel.: +49 2166 9901-171
Fax: +49 2166 9901-100
Email: michael.banck@credativ.de
credativ GmbH, HRB Mönchengladbach 12080
USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
Attachments:
parallel-pgdump-keep-dev-null-open.patch (text/x-patch, +64/-21)
Re: Alvaro Herrera 2018-03-05 <20180305165609.kq5y7uzy64o45vsg@alvherre.pgsql>
The reason I noticed is I wondered about the number of open() calls
(plus zlib function calls) we would save by keeping an FD open to
/dev/null, rather than opening it over and over for each object -- i.e.,
maybe not touch setFilePath() at all, if possible. That looks perhaps
too invasive, so maybe not. But do audit other callsites that may open
files.
In normal operation, the files are also opened individually, so
special-casing /dev/null seems overly specific to me (and I'd be very
surprised if it made any difference).
For the feature to be useful in practice, it needs documentation.
People won't expect -Fd -f /dev/null to work because /dev/null is not
a directory.
Otherwise, +1 from me.
Christoph
On Tue, Mar 20, 2018 at 09:54:07AM +0100, Christoph Berg wrote:
Otherwise, +1 from me.
I have been thinking about this patch lately, and on the contrary I am
voting -1 for the concepts behind this patch. pg_dump is by nature a
tool aimed at fetching data from the database, at putting it in a shape
wanted by the user, and optionally at saving the data and at making it
durable (since recently for the last part).
It is not a corruption detection tool. Even if it were a tool to detect
corruption, it is doing it wrong in two ways:
1) It scans tables using only sequential scans, so it basically never
checks any other AMs than heap.
2) It detects only one failure at a time and stops. Hence in order to
detect all corruptions, one needs to run pg_dump, repair or zero the
pages, and then rinse and repeat until a successful run is achieved.
This is a costly process, particularly on large relations, where a run
of pg_dump can take minutes, and the more pages there are, the more time
it takes to do the whole cleanup before being able to save as much data
as possible.
Now, why are people using pg_dump > /dev/null? Mainly the lack of
better tools, which would actually be able to detect corrupted pages in
one run, and not only heap pages. I honestly think that amcheck is
something that we should focus more on and that has more potential on
the matter, and that we are just complicating pg_dump to do something
it is not designed for, and would do it badly anyway.
--
Michael
Re: Michael Paquier 2018-03-20 <20180320135720.GE13368@paquier.xyz>
Now, why are people using pg_dump > /dev/null? Mainly the lack of
better tools, which would actually be able to detect corrupted pages in
one run, and not only heap pages. I honestly think that amcheck is
something that we should focus more on and that has more potential on
the matter, and that we are just complicating pg_dump to do something
it is not designed for, and would do it badly anyway.
Still, people are using it for that purpose now, and speeding it up
would just take a non-intrusive patch.
Also, if pg_dump is a backup tool, why does that mean it must not be
used for anything else?
Kind regards,
Christoph Berg
--
Senior Berater, Tel.: +49 2166 9901 187
credativ GmbH, HRB Mönchengladbach 12080, USt-ID-Nummer: DE204566209
Trompeterallee 108, 41189 Mönchengladbach
Geschäftsführung: Dr. Michael Meskes, Jörg Folz, Sascha Heuer
pgp fingerprint: 5C48 FE61 57F4 9179 5970 87C6 4C5A 6BAB 12D2 A7AE