backup tools ought to ensure created backups are durable
Hi,
As pointed out in
/messages/by-id/20160327232509.v5wgac5vskusedin@awork2.anarazel.de
our backup tools (i.e. pg_basebackup, pg_dump[all]) currently don't
make any effort to ensure their output is durable.
I think for backup tools of possibly critical data, that's pretty much
unacceptable.
There are cases where we can't ensure durability (e.g. pg_dump | gzip >
file), but it's out of our hands in that case.
Greetings,
Andres Freund
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Mon, Mar 28, 2016 at 8:30 AM, Andres Freund <andres@anarazel.de> wrote:
> As pointed out in
> /messages/by-id/20160327232509.v5wgac5vskusedin@awork2.anarazel.de
> our backup tools (i.e. pg_basebackup, pg_dump[all]) currently don't
> make any effort to ensure their output is durable.
> I think for backup tools of possibly critical data, that's pretty much
> unacceptable.
Definitely agreed: once a backup/dump has been taken and those
utilities exit, we had better ensure that the result is durably on disk.
For pg_basebackup and pg_dump, as everything except pg_dump/plain
requires a target directory for the location of the output result, we
really can make things better.
--
Michael
On Mon, Mar 28, 2016 at 3:11 AM, Michael Paquier <michael.paquier@gmail.com>
wrote:
> Definitely agreed: once a backup/dump has been taken and those
> utilities exit, we had better ensure that the result is durably on disk.
> For pg_basebackup and pg_dump, as everything except pg_dump/plain
> requires a target directory for the location of the output result, we
> really can make things better.
Definitely agreed on fixing it. But I don't think your summary is right.
pg_basebackup in tar mode can be sent to stdout and does not require a
directory. The same goes for pg_dump in any mode except directory. So we
can't just drive it off the mode; some more detailed checks are required.
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On 2016-03-28 11:35:57 +0200, Magnus Hagander wrote:
> Definitely agreed on fixing it. But I don't think your summary is right.
> pg_basebackup in tar mode can be sent to stdout and does not require a
> directory. The same goes for pg_dump in any mode except directory. So we
> can't just drive it off the mode; some more detailed checks are required.
if (!isatty(fileno(stdout))) ought to do the trick, afaics? And maybe add a
warning somewhere in the docs about the tools not fsyncing if you pipe
their output data somewhere?
Andres
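
The check suggested above could be sketched roughly as follows. This is only an illustration, not code from any patch in this thread: the helper name sync_output_stream is hypothetical, and it assumes a POSIX system. It skips terminals, flushes the stdio buffer, and then calls fsync() on the underlying descriptor. When the output is a pipe, fsync() is expected to fail (on Linux with EINVAL), which is the case the proposed documentation warning would cover.

```c
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a PostgreSQL function): make the data written
 * to a stdio stream durable, unless the stream is a terminal.
 *
 * Returns 0 on success or when the stream is a tty and the sync is
 * skipped, -1 on failure with errno set by fileno(), fflush() or fsync().
 */
static int
sync_output_stream(FILE *fp)
{
    int fd = fileno(fp);

    if (fd < 0)
        return -1;

    /* A terminal has nothing durable behind it, so skip the sync. */
    if (isatty(fd))
        return 0;

    /* Push stdio's user-space buffer to the kernel first. */
    if (fflush(fp) != 0)
        return -1;

    /* Ask the kernel to write the file's data to stable storage. */
    return fsync(fd);
}
```

A tool would call something like sync_output_stream(stdout) just before exiting and decide from the return value whether to warn.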
On Mon, Mar 28, 2016 at 3:12 PM, Andres Freund <andres@anarazel.de> wrote:
> if (!isatty(fileno(stdout))) ought to do the trick, afaics? And maybe add a
> warning somewhere in the docs about the tools not fsyncing if you pipe
> their output data somewhere?
That should work, yeah. And given that we already use that check in other
places, it seems it should be perfectly safe. As long as we only emit a
WARNING and don't abort if the fsync fails, we should be OK if people
intentionally store their backups on an fs that doesn't speak fsync (if
that exists), in which case I don't really think we even need a switch to
turn it off.
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On 3/28/16 11:03 AM, Magnus Hagander wrote:
> That should work, yeah. And given that we already use that check in other
> places, it seems it should be perfectly safe. As long as we only emit a
> WARNING and don't abort if the fsync fails, we should be OK if people
> intentionally store their backups on an fs that doesn't speak fsync (if
> that exists), in which case I don't really think we even need a switch to
> turn it off.
I'd even go so far as spitting out a warning any time we can't fsync
(maybe that's what you're suggesting?)
--
Jim Nasby, Data Architect, Blue Treble Consulting, Austin TX
Experts in Analytics, Data Architecture and PostgreSQL
Data in Trouble? Get it in Treble! http://BlueTreble.com
On Tue, Mar 29, 2016 at 8:46 AM, Jim Nasby <Jim.Nasby@bluetreble.com> wrote:
> I'd even go so far as spitting out a warning any time we can't fsync
> (maybe that's what you're suggesting?)
That is pretty much what I was suggesting, yes.
Though we might want to consolidate the warnings, for example in
pg_basebackup -Fp and pg_dump -Fd, into something like
"failed to fsync <n> files".
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
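
A consolidated warning of the kind described above might look roughly like this. It is a sketch under assumptions, not an actual patch: fsync_path and fsync_all are hypothetical names, the warning text just mirrors the wording in the message, and the O_RDWR-then-O_RDONLY open is one common way to handle both files and directories on POSIX systems.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: fsync a single path, which may be a regular file
 * or a directory. Returns 0 on success, -1 on failure with errno set.
 */
static int
fsync_path(const char *path)
{
    int fd;
    int rc;
    int saved_errno;

    /* Directories cannot be opened for writing, so fall back to read-only. */
    fd = open(path, O_RDWR);
    if (fd < 0 && errno == EISDIR)
        fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    rc = fsync(fd);
    saved_errno = errno;    /* close() may overwrite errno */
    close(fd);
    errno = saved_errno;
    return rc;
}

/*
 * Hypothetical helper: fsync every path, keep going after a failure, and
 * emit one consolidated warning instead of one per file. Returns the
 * number of paths that could not be synced.
 */
static int
fsync_all(const char *const *paths, int npaths)
{
    int failed = 0;

    for (int i = 0; i < npaths; i++)
    {
        if (fsync_path(paths[i]) != 0)
            failed++;
    }

    if (failed > 0)
        fprintf(stderr, "WARNING: failed to fsync %d files\n", failed);

    return failed;
}
```

The caller gets the failure count back but the backup itself is never aborted, which matches the "WARNING, not abort" behaviour discussed earlier in the thread.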
On 2016-03-29 10:06:20 +0200, Magnus Hagander wrote:
> That is pretty much what I was suggesting, yes.
> Though we might want to consolidate the warnings, for example in
> pg_basebackup -Fp and pg_dump -Fd, into something like
> "failed to fsync <n> files".
I'd just not output anything if ENOTSUPP or similar is returned, and not
bother with something as complex as collecting errors.
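
One way to read "ENOTSUPP or similar" is sketched below. The set of errno values is an assumption made for illustration: the message does not list them, and ENOTSUPP itself is, as far as I know, not exposed in userspace errno.h, so ENOTSUP/EOPNOTSUPP plus the EINVAL and EROFS values that Linux documents for fsync() on unsyncable objects stand in for it. The function names are hypothetical.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/*
 * Hypothetical policy helper: true when an fsync() errno suggests that the
 * target cannot be synced at all, as opposed to a sync that was attempted
 * and failed. The chosen errno set is an assumption for illustration.
 */
static bool
fsync_errno_is_unsupported(int err)
{
    switch (err)
    {
        case EINVAL:            /* e.g. a pipe, FIFO or socket */
        case EROFS:
#ifdef ENOTSUP
        case ENOTSUP:
#endif
#if defined(EOPNOTSUPP) && (!defined(ENOTSUP) || EOPNOTSUPP != ENOTSUP)
        case EOPNOTSUPP:
#endif
            return true;
        default:
            return false;
    }
}

/*
 * Hypothetical wrapper: fsync a descriptor, stay silent when syncing is
 * simply unsupported, and warn (without aborting) on any other failure.
 * Returns 0 if synced or unsupported, -1 if a warning was emitted.
 */
static int
fsync_quietly(int fd)
{
    if (fsync(fd) == 0)
        return 0;
    if (fsync_errno_is_unsupported(errno))
        return 0;
    fprintf(stderr, "WARNING: could not fsync: %s\n", strerror(errno));
    return -1;
}
```

This avoids collecting errors across files: each failure is judged on its own errno, and only failures that look like real I/O problems produce output.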
On Tue, Mar 29, 2016 at 10:12 AM, Andres Freund <andres@anarazel.de> wrote:
> I'd just not output anything if ENOTSUPP or similar is returned, and not
> bother with something as complex as collecting errors.
That'll work too, I guess. It won't necessarily make people aware of the
problem, but in the unlikely event they use an fs like that, they should be
aware of it already.
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/