pg_start_backup and pg_stop_backup Re: Re: [COMMITTERS] pgsql: Make CheckRequiredParameterValues() depend upon correct
On Wed, Apr 28, 2010 at 4:43 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
This doesn't contain any changes to pg_start_backup() yet, that's a
separate issue and still under discussion.
I'm thinking of changing pg_start_backup and pg_stop_backup so that
they just check that wal_level >= 'archive', and changing pg_stop_backup
so that it doesn't wait for archiving when archive_mode is OFF.
This change is very simple and enables us to take a base backup for SR
even if archive_mode is OFF. Thoughts?
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
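A minimal sketch of the proposal above, written in Python for illustration only. It is not PostgreSQL's C implementation; the function names are hypothetical, and the ordering of wal_level values (minimal, archive, hot_standby) is assumed from the 9.0 settings.

```python
# Illustrative model only; the real checks live in PostgreSQL's C code.
# wal_level values in increasing order of WAL detail (9.0 settings).
WAL_LEVELS = ["minimal", "archive", "hot_standby"]


def backup_allowed(wal_level):
    """Proposed precondition for pg_start_backup/pg_stop_backup:
    wal_level >= 'archive', with no check of archive_mode."""
    return WAL_LEVELS.index(wal_level) >= WAL_LEVELS.index("archive")


def stop_backup_waits_for_archiving(archive_mode):
    """Proposed pg_stop_backup behaviour: wait for WAL archiving
    only when archive_mode is on."""
    return bool(archive_mode)
```

Under this model, a server with wal_level = 'archive' and archive_mode off could start a backup, and pg_stop_backup would return without waiting.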
On Wed, 2010-04-28 at 19:40 +0900, Fujii Masao wrote:
On Wed, Apr 28, 2010 at 4:43 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
This doesn't contain any changes to pg_start_backup() yet, that's a
separate issue and still under discussion.
I'm thinking of changing pg_start_backup and pg_stop_backup so that
they just check that wal_level >= 'archive', and changing pg_stop_backup
so that it doesn't wait for archiving when archive_mode is OFF.
This change is very simple and enables us to take a base backup for SR
even if archive_mode is OFF. Thoughts?
Makes sense.
I'm wondering whether this could cause problems with people taking hot
backups that aren't aimed at SR. Perhaps we could have 2 new functions
whose names are more closely linked to the exact purpose:
pg_start_replication_copy() etc..
which then act exactly as you suggest.
--
Simon Riggs www.2ndQuadrant.com
On Wed, Apr 28, 2010 at 6:52 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
Makes sense.
I'm wondering whether this could cause problems with people taking hot
backups that aren't aimed at SR. Perhaps we could have 2 new functions
whose names are more closely linked to the exact purpose:
pg_start_replication_copy() etc..
which then act exactly as you suggest.
Hmm. That seems a bit complicated. Why can't we just let people use
the existing functions the way they always have?
...Robert
On Wed, 2010-04-28 at 06:56 -0400, Robert Haas wrote:
Hmm. That seems a bit complicated. Why can't we just let people use
the existing functions the way they always have?
We can, but I already gave a reason why we should not.
IIRC it was you that suggested changing the names of things if the
behaviour changes.
--
Simon Riggs www.2ndQuadrant.com
Robert Haas wrote:
Hmm. That seems a bit complicated. Why can't we just let people use
the existing functions the way they always have?
Well, it would be nice to allow using pg_start_backup() on the primary
when streaming replication is enabled, even if archiving isn't.
Otherwise the only way to get the base backup for the standby is to shut
down primary first, or use filesystem snapshot etc.
The straightforward way to enable that would be to allow
pg_start_backup() when wal_level >= 'archive', regardless of
archive_mode. However, I'm worried that someone might take an online
backup without archiving (and replication), not realizing that it's not
safe.
That risk is there already, though, if you restore from an online backup
and forget to create recovery.conf. It will start up in inconsistent
state. The proposed change would make it easier to make that mistake.
I'm not sure what to do about it, maybe throw a warning if you start up
a database and there's a backup_label file in the data directory.
Something like:
WARNING: database system was interrupted while backup was in progress
HINT: If you are restoring from an online backup, you must use a WAL
archive for the restore, or the database can be in inconsistent state
That would also occur if the primary database crashes while a backup is
being taken, in which case the warning can be ignored.
Or maybe we should check in pg_start_backup() that either archive_mode
or streaming replication (max_wal_senders > 0) is enabled.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
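The startup warning suggested above could be sketched roughly as follows. This is an illustrative Python model, not server code. The function name is hypothetical, and treating a missing recovery.conf as the trigger condition is an assumption drawn from the "forget to create recovery.conf" scenario in the message; the message text itself is copied from the proposal.

```python
import os

# Message text taken from the proposal in the thread.
WARNING_TEXT = (
    "WARNING: database system was interrupted while backup was in progress\n"
    "HINT: If you are restoring from an online backup, you must use a WAL "
    "archive for the restore, or the database can be in inconsistent state"
)


def backup_label_warning(data_dir):
    """Return the warning if backup_label exists in data_dir at startup.

    Assumption: the warning is only relevant when recovery.conf is
    absent, i.e. the server is about to do plain crash recovery.
    """
    has_label = os.path.exists(os.path.join(data_dir, "backup_label"))
    has_recovery_conf = os.path.exists(os.path.join(data_dir, "recovery.conf"))
    if has_label and not has_recovery_conf:
        return WARNING_TEXT
    return None
```

As noted in the message, the same condition also holds after a primary crashes mid-backup, so the sketch cannot tell the two cases apart.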
On Wed, Apr 28, 2010 at 8:28 PM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
Or maybe we should check in pg_start_backup() that either archive_mode
or streaming replication (max_wal_senders > 0) is enabled.
I agree that pg_start_backup should check not only wal_level but also that.
Regards,
--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center
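The stricter check agreed on here can be sketched as below. This is a hedged Python model of the rule as stated in the thread (wal_level >= 'archive', plus either archive_mode or max_wal_senders > 0); the function name is hypothetical and the real check would be written in C inside pg_start_backup().

```python
# wal_level values in increasing order of WAL detail (9.0 settings).
WAL_LEVELS = ["minimal", "archive", "hot_standby"]


def can_start_backup(wal_level, archive_mode, max_wal_senders):
    """Model of the proposed pg_start_backup() precondition."""
    # Enough information must be written to WAL for the backup to be usable.
    wal_ok = WAL_LEVELS.index(wal_level) >= WAL_LEVELS.index("archive")
    # Something must be able to consume the WAL: an archive or a standby.
    wal_consumer = bool(archive_mode) or max_wal_senders > 0
    return wal_ok and wal_consumer
```

For example, `can_start_backup("archive", False, 1)` is allowed (streaming replication without archiving), while `can_start_backup("archive", False, 0)` is rejected.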
On Wed, Apr 28, 2010 at 7:22 AM, Simon Riggs <simon@2ndquadrant.com> wrote:
We can, but I already gave a reason why we should not.
IIRC it was you that suggested changing the names of things if the
behaviour changes.
Absolutely, but I'm arguing that we shouldn't change the behavior in
the first place. At least as I understand it, even when not using
archive_mode, streaming replication, or hot standby, it's still
perfectly legal to use pg_start_backup() to take a hot backup. I
don't see why we would either (a) break that use case or (b) create
another function that does the same thing but with one extra error
check.
...Robert
Robert Haas wrote:
At least as I understand it, even when not using
archive_mode, streaming replication, or hot standby, it's still
perfectly legal to use pg_start_backup() to take a hot backup.
Nope. The correct procedure to take a hot backup is described in
http://www.postgresql.org/docs/8.4/interactive/continuous-archiving.html#BACKUP-TIPS.
It involves setting archive_mode=on, and archive_command to a shell
command that normally just returns true, except when backup is in
progress. You can't take a hot backup without archiving (or streaming)
at least temporarily. (except with filesystem-level snapshot capabilities).
Which is unfortunate, really. I wish we had a mode where the server
simply refrained from removing/recycling WAL segments while the backup
is running. You could then just:
1. pg_start_backup()
2. tar the data directory, except for pg_xlog
3. tar pg_xlog
4. pg_stop_backup().
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
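The four steps above can be sketched as a small Python routine. It illustrates only the wished-for mode: it is not safe on a server that may remove or recycle WAL segments while the backup runs. The `run_sql` callback, the archive file names, and the backup label are hypothetical; `pg_start_backup(label)` and `pg_stop_backup()` are the existing SQL functions.

```python
import os
import tarfile


def _skip_pg_xlog(info):
    """tarfile filter: drop data/pg_xlog and everything under it."""
    parts = info.name.split("/")
    return None if len(parts) > 1 and parts[1] == "pg_xlog" else info


def take_backup(data_dir, out_dir, run_sql):
    """Run the four steps; run_sql is a caller-supplied callable
    that executes one SQL statement on the primary."""
    run_sql("SELECT pg_start_backup('sketch')")           # step 1
    try:
        # Step 2: the data directory, except for pg_xlog.
        with tarfile.open(os.path.join(out_dir, "base.tar"), "w") as tar:
            tar.add(data_dir, arcname="data", filter=_skip_pg_xlog)
        # Step 3: pg_xlog, copied after the rest of the data directory.
        with tarfile.open(os.path.join(out_dir, "pg_xlog.tar"), "w") as tar:
            tar.add(os.path.join(data_dir, "pg_xlog"), arcname="data/pg_xlog")
    finally:
        run_sql("SELECT pg_stop_backup()")                # step 4
```

The `finally` clause is a design choice of the sketch, so that backup mode is ended even if one of the tar steps fails.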
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
Which is unfortunate, really. I wish we had a mode where the server
simply refrained from removing/recycling WAL segments while the backup
is running. You could then just:
1. pg_start_backup()
2. tar the data directory, except for pg_xlog
3. tar pg_xlog
4. pg_stop_backup().
I think there's a termination issue there --- the safe stop point
would (appear to be) past whatever WAL you'd copied during step 3.
Still, the possibility of adding modes such as this seems to me to be a
good argument for not inventing a new version of pg_start_backup/
pg_stop_backup every time.
regards, tom lane
On Wed, 2010-04-28 at 11:10 -0400, Robert Haas wrote:
IIRC it was you that suggested changing the names of things if the
behaviour changes.
Absolutely, but I'm arguing that we shouldn't change the behavior in
the first place. At least as I understand it...
I feel like you're just arguing against whatever I say - your reasoning
makes no sense. Masao would not have proposed it as a change if it
already worked like that, would he? Just reading the thread would tell
you that much. Plus, you clearly don't know how it works now, so not
sure why you're commenting at all; it's just minor stuff and a few ideas.
--
Simon Riggs www.2ndQuadrant.com
On Wed, Apr 28, 2010 at 11:25 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
Robert Haas wrote:
At least as I understand it, even when not using
archive_mode, streaming replication, or hot standby, it's still
perfectly legal to use pg_start_backup() to take a hot backup.
Nope. The correct procedure to take a hot backup is described in
http://www.postgresql.org/docs/8.4/interactive/continuous-archiving.html#BACKUP-TIPS.
It involves setting archive_mode=on, and archive_command to a shell
command that normally just returns true, except when backup is in
progress. You can't take a hot backup without archiving (or streaming)
at least temporarily. (except with filesystem-level snapshot capabilities).
Oh. Well, in that case the proposed change seems reasonable... but
what do you mean by "except with filesystem-level snapshot
capabilities"?
...Robert
On Wed, 2010-04-28 at 12:44 -0400, Robert Haas wrote:
what do you mean by "except with filesystem-level snapshot
capabilities"?
Like LVM, SANs or ZFS.
Joshua D. Drake
--
PostgreSQL.org Major Contributor
Command Prompt, Inc: http://www.commandprompt.com/ - 503.667.4564
Consulting, Training, Support, Custom Development, Engineering
Robert Haas wrote:
but
what do you mean by "except with filesystem-level snapshot
capabilities"?
If you have a filesystem that supports atomic snapshots, you can take a
snapshot of the filesystem the data directory resides on, and then copy
the data directory from the snapshot at your leisure, without
pg_start/stop_backup(). It is entirely invisible to PostgreSQL and works
just like copying the data directory after an immediate shutdown. The
server will perform crash recovery after restore.
Virtualization software, logical volume managers and SANs tend to have
such features, in addition to filesystems.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
Well, it would be nice to allow using pg_start_backup() on the primary
when streaming replication is enabled, even if archiving isn't.
Otherwise the only way to get the base backup for the standby is to shut
down primary first, or use filesystem snapshot etc.
I think I must be missing something: exactly how would you fire up a new
standby from such a base backup, if you weren't running archiving?
If you aren't archiving then there's no guarantee that you'll still have
a continuous WAL series starting from the start of the backup.
IOW I think that the requirement in pg_start_backup shouldn't be relaxed
without some more thought/work.
regards, tom lane
IOW I think that the requirement in pg_start_backup shouldn't be relaxed
without some more thought/work.
Yeah, I was talking to Bruce about that this AM, and it seems like a
feature we *need* to have ... for 9.1.
I'm sufficiently concerned about the amount of flux HS/SR is in right
now that I'd like to declare it "good enough" and move towards release.
Otherwise we'll tinker with it forever and there will be no 9.0.
"Release early, release often" *is* the OSS mantra, after all. The
question now isn't "Is binary replication perfect" but "is it *good
enough* for some substantial portion of our users". And I think the
answer to the latter question is, at this point, yes.
--
-- Josh Berkus
PostgreSQL Experts Inc.
http://www.pgexperts.com
Tom Lane wrote:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
Well, it would be nice to allow using pg_start_backup() on the primary
when streaming replication is enabled, even if archiving isn't.
Otherwise the only way to get the base backup for the standby is to shut
down primary first, or use filesystem snapshot etc.
I think I must be missing something: exactly how would you fire up a new
standby from such a base backup, if you weren't running archiving?
I was replying to Robert's thought on using pg_start/stop_backup() for
taking a hot backup. Not for bootstrapping a standby.
If you aren't archiving then there's no guarantee that you'll still have
a continuous WAL series starting from the start of the backup.
I wasn't really thinking of this use case, but you could set
wal_keep_segments "high enough". Not a configuration I would recommend
for high availability, but should be fine for setting up a streaming
replication standby for testing etc. If we don't allow
pg_start/stop_backup() with archive_mode=off and max_wal_senders>0,
there's no way to bootstrap a streaming replication standby without
archiving.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
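A rough way to reason about "high enough" is to estimate how many WAL segments the primary generates while the base backup is being taken and the standby connects. The sketch below is a back-of-the-envelope calculation, not a guarantee. It assumes the default 16 MB WAL segment size; the function name, the inputs, and the safety factor are hypothetical.

```python
import math

# Default WAL segment size; builds with a different size need another value.
WAL_SEGMENT_BYTES = 16 * 1024 * 1024


def estimate_wal_keep_segments(wal_bytes_per_sec, backup_seconds,
                               safety_factor=2.0):
    """Estimate a wal_keep_segments value covering the backup window.

    wal_bytes_per_sec: observed WAL generation rate on the primary.
    backup_seconds: expected time to copy the base backup and start
    the standby. Both are guesses supplied by the administrator.
    """
    needed = wal_bytes_per_sec * backup_seconds * safety_factor
    return max(1, math.ceil(needed / WAL_SEGMENT_BYTES))
```

For instance, 1 MiB/s of WAL over a one-hour window with the default factor of 2 gives `estimate_wal_keep_segments(1024 * 1024, 3600)`, which evaluates to 450 segments. A burst of write activity can still exceed any such estimate.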
On Wed, 2010-04-28 at 11:11 -0700, Josh Berkus wrote:
IOW I think that the requirement in pg_start_backup shouldn't be relaxed
without some more thought/work.
Yeah, I was talking to Bruce about that this AM, and it seems like a
feature we *need* to have ... for 9.1.
I'm sufficiently concerned about the amount of flux HS/SR is in right
now that I'd like to declare it "good enough" and move towards release.
Otherwise we'll tinker with it forever and there will be no 9.0.
"Release early, release often" *is* the OSS mantra, after all. The
question now isn't "Is binary replication perfect" but "is it *good
enough* for some substantial portion of our users". And I think the
answer to the latter question is, at this point, yes.
As of exactly today, my answer, for my piece of this is also "yes".
I'm not convinced that the same is true across the board. Some important
changes have happened in last few days and I see more coming.
--
Simon Riggs www.2ndQuadrant.com
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
Tom Lane wrote:
If you aren't archiving then there's no guarantee that you'll still have
a continuous WAL series starting from the start of the backup.
I wasn't really thinking of this use case, but you could set
wal_keep_segments "high enough".
Ah. Okay, that seems like a workable approach, at least for people with
reasonably predictable WAL loads. We could certainly improve on it
later to make it more bulletproof, but it's usable now --- if we relax
the error checks.
(wal_keep_segments can be changed without restarting, right?)
Not a configuration I would recommend
for high availability, but should be fine for setting up a streaming
replication standby for testing etc. If we don't allow
pg_start/stop_backup() with archive_mode=off and max_wal_senders>0,
there's no way to bootstrap a streaming replication standby without
archiving.
Right. +1 for weakening the tests, then. Is there any use in looking
at wal_keep_segments as part of this test?
regards, tom lane
Heikki Linnakangas wrote:
Tom Lane wrote:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
Well, it would be nice to allow using pg_start_backup() on the primary
when streaming replication is enabled, even if archiving isn't.
Otherwise the only way to get the base backup for the standby is to shut
down primary first, or use filesystem snapshot etc.
I think I must be missing something: exactly how would you fire up a new
standby from such a base backup, if you weren't running archiving?
I was replying to Robert's thought on using pg_start/stop_backup() for
taking a hot backup. Not for bootstrapping a standby.
Scratch that, I just reread what I wrote, and starting a streaming
replication standby from such a backup was exactly what I was describing.
If you aren't archiving then there's no guarantee that you'll still have
a continuous WAL series starting from the start of the backup.
I wasn't really thinking of this use case, but you could set
wal_keep_segments "high enough". Not a configuration I would recommend
for high availability, but should be fine for setting up a streaming
replication standby for testing etc. If we don't allow
pg_start/stop_backup() with archive_mode=off and max_wal_senders>0,
there's no way to bootstrap a streaming replication standby without
archiving.
This still makes sense.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com
Tom Lane wrote:
Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:
Tom Lane wrote:
If you aren't archiving then there's no guarantee that you'll still have
a continuous WAL series starting from the start of the backup.
I wasn't really thinking of this use case, but you could set
wal_keep_segments "high enough".
Ah. Okay, that seems like a workable approach, at least for people with
reasonably predictable WAL loads. We could certainly improve on it
later to make it more bulletproof, but it's usable now --- if we relax
the error checks.
Yeah, wal_keep_segments is wishy-washy in general, not only with backups.
(wal_keep_segments can be changed without restarting, right?)
It's PG_SIGHUP.
Not a configuration I would recommend
for high availability, but should be fine for setting up a streaming
replication standby for testing etc. If we don't allow
pg_start/stop_backup() with archive_mode=off and max_wal_senders>0,
there's no way to bootstrap a streaming replication standby without
archiving.
Right. +1 for weakening the tests, then. Is there any use in looking
at wal_keep_segments as part of this test?
I don't think so. There's no safe setting that would guarantee anything.
We could check for wal_keep_segments>0, but any small number is the same
in practice. We don't insist on wal_keep_segments>0 to allow WAL streaming
without archival in general, let's not treat taking the base backup
differently.
--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com