Netapp SnapCenter

Started by Paul Försteralmost 6 years ago37 messagesgeneral
Jump to latest
#1Paul Förster
paul.foerster@gmail.com

Hi,

first of all, it wasn't my glorious idea and I know that volume snapshots are no backups. :-)

Leaving that aside: Our company has decided to move away from Netapp SnapCreator to its SnapCenter. I was told there is some community plugin for PostgreSQL to use with SnapCenter but can't find any information that makes sense.

The only thing I found is:

https://automationstore.netapp.com/snappack-listNC.shtml

But it's from Oct. 3rd 2018, for PostgreSQL 9.x and 10.x and also requires JDK 1.8 or later.

If this works at all, I consider these version limits (i.e. development status) as ugly, because more or less in limbo.

Does anyone know of a solution for SnapCenter that works, is actively supported and can be used with current versions of PostgreSQL (i.e. 11.x, 12.x and soon to come 13.x)?

Any idea would be greatly appreciated. Thanks in advance.

Cheers,
Paul

#2Wolff, Ken L
ken.l.wolff@lmco.com
In reply to: Paul Förster (#1)
RE: Netapp SnapCenter

We got the following from one of our NetApp contacts regarding a similar question. I'd be interested to hear if anyone feels the steps mentioned are not accurate:

There is presently no NetApp supported plugin for PostgreSQL for SnapCenter. There is a community support plugin for PostgreSQL, which I have no experience with, and it hasn't been updated since 2018, so not sure if it's worth pursuing if it's not actively being maintained.

Documentation for backup and recovery of PostgreSQL is generally missing. It's not a SnapCenter issue, it's PostreSQL itself. It also doesn't need a lot of documentation because backups are easy. PostgreSQL doesn't need to be in a special mode for backups to work.

1. Place data files on volume 1
2. Place Logs on volume 2
3. Schedule ONTAP snapshots on volume 1 at 00:05
4. Schedule ONTAP snapshots on volume 2 at 00:10

There you go. All you really need is a data file snapshot before the log snapshot. During recovery, PostgreSQL will figure out what logs are required to make the database consistent.

Ken

-----Original Message-----
From: Paul Förster <paul.foerster@gmail.com>
Sent: Thursday, June 18, 2020 4:56 AM
To: pgsql-general <pgsql-general@postgresql.org>
Subject: EXTERNAL: Netapp SnapCenter

Hi,

first of all, it wasn't my glorious idea and I know that volume snapshots are no backups. :-)

Leaving that aside: Our company has decided to move away from Netapp SnapCreator to its SnapCenter. I was told there is some community plugin for PostgreSQL to use with SnapCenter but can't find any information that makes sense.

The only thing I found is:

https://automationstore.netapp.com/snappack-listNC.shtml

But it's from Oct. 3rd 2018, for PostgreSQL 9.x and 10.x and also requires JDK 1.8 or later.

If this works at all, I consider these version limits (i.e. development status) as ugly, because more or less in limbo.

Does anyone know of a solution for SnapCenter that works, is actively supported and can be used with current versions of PostgreSQL (i.e. 11.x, 12.x and soon to come 13.x)?

Any idea would be greatly appreciated. Thanks in advance.

Cheers,

Paul

#3Paul Förster
paul.foerster@gmail.com
In reply to: Wolff, Ken L (#2)
Re: Netapp SnapCenter

Hi Ken,

On 18. Jun, 2020, at 15:56, Wolff, Ken L <ken.l.wolff@lmco.com> wrote:
PostgreSQL doesn’t need to be in a special mode for backups to work.

this is curious. Doesn't the PostgreSQL cluster need to be set to backup mode to use SnapCenter?

The problem is, one can't test that and get a reliable answer, because you do 100 backups (snaps) for testing and restore those 100 backups, and all may be ok.

But what about the 1000th snap? Just in that millisecond of the snap, some important information is written to the volume on which the PostgreSQL cluster resides and for some reason, it needs to be restored later and this information is lost or may lead to corruption. This is why backup mode exists, after all. Really, no backup mode by a pre/post script when snapping?

This is very surprising to me.

Cheers,
Paul

#4Magnus Hagander
magnus@hagander.net
In reply to: Paul Förster (#3)
Re: Netapp SnapCenter

On Thu, Jun 18, 2020 at 4:07 PM Paul Förster <paul.foerster@gmail.com>
wrote:

Hi Ken,

On 18. Jun, 2020, at 15:56, Wolff, Ken L <ken.l.wolff@lmco.com> wrote:
PostgreSQL doesn’t need to be in a special mode for backups to work.

this is curious. Doesn't the PostgreSQL cluster need to be set to backup
mode to use SnapCenter?

I don't know specifically about SnapCenter, but for snapshots in general,
it does require backup mode *unless* all your data is on the same disk and
you have an atomic snapshot across that disk (in theory it can be on
different disk as well, as long as the snapshots in that case are atomic
across *all* those disks, not just individually, but that is unusual).

So the upthread suggestion of putting data and wal on different disk and
snapshoting them at different times is *NOT* safe. Unless the reference to
the directory for the logs means a directory where log files are copied out
with archive_command, and it's actually the log archive (in which case it
will work, but the recommendation is that the log archive should not be on
the same machine).

The problem is, one can't test that and get a reliable answer, because you

do 100 backups (snaps) for testing and restore those 100 backups, and all
may be ok.

But what about the 1000th snap? Just in that millisecond of the snap, some
important information is written to the volume on which the PostgreSQL
cluster resides and for some reason, it needs to be restored later and this
information is lost or may lead to corruption. This is why backup mode
exists, after all. Really, no backup mode by a pre/post script when
snapping?

The normal case is that snapshots are guaranteed to be atomic, and thus
this millisecond window cannot appear. But that usually only applies across
individual volumes. It's also worth noticing that taking the backups
without using the backup mode and archive_command means you cannot use them
for PITR, only for restore onto that specific snapshot.

While our documentation on backups in general definitely needs improvement,
this particular requirement is documented at
https://www.postgresql.org/docs/current/backup-file.html.

--
Magnus Hagander
Me: https://www.hagander.net/ <http://www.hagander.net/&gt;
Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/&gt;

#5Paul Förster
paul.foerster@gmail.com
In reply to: Magnus Hagander (#4)
Re: Netapp SnapCenter

Hi Magnus,

On 18. Jun, 2020, at 16:19, Magnus Hagander <magnus@hagander.net> wrote:
I don't know specifically about SnapCenter, but for snapshots in general, it does require backup mode *unless* all your data is on the same disk and you have an atomic snapshot across that disk (in theory it can be on different disk as well, as long as the snapshots in that case are atomic across *all* those disks, not just individually, but that is unusual).

according to what I know from our storage guys, Netapp does atomic snapshots for each volume. We have the database and its corresponding WAL files (pg_wal directory) on the same volume and the archived WALs (archive_command) on another. And the snapshots on those two volumes are not taken at the same time. Currently, the database is set to backup mode (using cron) and the storage guys have a window during which they can take the snapshots.

So the upthread suggestion of putting data and wal on different disk and snapshoting them at different times is *NOT* safe. Unless the reference to the directory for the logs means a directory where log files are copied out with archive_command, and it's actually the log archive (in which case it will work, but the recommendation is that the log archive should not be on the same machine).

as I said above, pg_wal is a directory in PGDATA at the default location and WALs are archived using the archive_command to a different volume. So I guess, we should be safe then.

The normal case is that snapshots are guaranteed to be atomic, and thus this millisecond window cannot appear. But that usually only applies across individual volumes. It's also worth noticing that taking the backups without using the backup mode and archive_command means you cannot use them for PITR, only for restore onto that specific snapshot.

While our documentation on backups in general definitely needs improvement, this particular requirement is documented at https://www.postgresql.org/docs/current/backup-file.html.

since we use backup mode, we should be good with PITR too. I didn't test that myself, but a workmate did and he said it worked nicely.

So bottom line, SnapCenter means for PostgreSQL just a plain volume snapshot like we currently do with SnapCreator, using backup mode scripts and "archive_command"-ing WALs to a diffent volume, i.e. no change in strategy. It's just that SnapCenter can't control backup mode as it can with Oracle.

Thanks for the invaluable input. Also, thanks Ken.

Cheers,
Paul

#6Stephen Frost
sfrost@snowman.net
In reply to: Paul Förster (#5)
Re: Netapp SnapCenter

Greetings,

* Paul Förster (paul.foerster@gmail.com) wrote:

On 18. Jun, 2020, at 16:19, Magnus Hagander <magnus@hagander.net> wrote:
I don't know specifically about SnapCenter, but for snapshots in general, it does require backup mode *unless* all your data is on the same disk and you have an atomic snapshot across that disk (in theory it can be on different disk as well, as long as the snapshots in that case are atomic across *all* those disks, not just individually, but that is unusual).

according to what I know from our storage guys, Netapp does atomic snapshots for each volume. We have the database and its corresponding WAL files (pg_wal directory) on the same volume and the archived WALs (archive_command) on another. And the snapshots on those two volumes are not taken at the same time. Currently, the database is set to backup mode (using cron) and the storage guys have a window during which they can take the snapshots.

If the entire database, all tablespaces, and pg_wal, are on the same
volume and the snapshot of the volume is atomic, then you don't actually
need to go through the start/stop backup- a snapshot being restored will
look just like a system crash and PG will just go back to the last
checkpoint and replay the WAL that's in pg_wal and it should reach
consistency and come up.

So the upthread suggestion of putting data and wal on different disk and snapshoting them at different times is *NOT* safe. Unless the reference to the directory for the logs means a directory where log files are copied out with archive_command, and it's actually the log archive (in which case it will work, but the recommendation is that the log archive should not be on the same machine).

as I said above, pg_wal is a directory in PGDATA at the default location and WALs are archived using the archive_command to a different volume. So I guess, we should be safe then.

Yes, that should be alright.

The normal case is that snapshots are guaranteed to be atomic, and thus this millisecond window cannot appear. But that usually only applies across individual volumes. It's also worth noticing that taking the backups without using the backup mode and archive_command means you cannot use them for PITR, only for restore onto that specific snapshot.

While our documentation on backups in general definitely needs improvement, this particular requirement is documented at https://www.postgresql.org/docs/current/backup-file.html.

since we use backup mode, we should be good with PITR too. I didn't test that myself, but a workmate did and he said it worked nicely.

So bottom line, SnapCenter means for PostgreSQL just a plain volume snapshot like we currently do with SnapCreator, using backup mode scripts and "archive_command"-ing WALs to a diffent volume, i.e. no change in strategy. It's just that SnapCenter can't control backup mode as it can with Oracle.

The one issue here is that if you're using the deprecated exxclusive
backup API, then PG will create a backup_label file in the data
directory. If the system reboots while that file exists, there's a good
chance that PG won't start up cleanly since, due to the file existing,
it thinks that it's restoring from a backup when it isn't.

Of course, if you're actually restoring from a backup, then that file is
absolutely critical to have in place, otherwise PG won't realize it's
being restored from a backup and you'll end up with a corrupted database
on restore.

Better is to use the newer non-exclusive API and arrange to collect the
necessary contents of the backup_label file from PG and store that with
the snapshot that you've taken.

Thanks,

Stephen

#7Ron
ronljohnsonjr@gmail.com
In reply to: Paul Förster (#5)
Re: Netapp SnapCenter

On 6/18/20 11:49 AM, Paul Förster wrote:

Hi Magnus,

On 18. Jun, 2020, at 16:19, Magnus Hagander <magnus@hagander.net> wrote:
I don't know specifically about SnapCenter, but for snapshots in general, it does require backup mode *unless* all your data is on the same disk and you have an atomic snapshot across that disk (in theory it can be on different disk as well, as long as the snapshots in that case are atomic across *all* those disks, not just individually, but that is unusual).

according to what I know from our storage guys, Netapp does atomic snapshots for each volume. We have the database and its corresponding WAL files (pg_wal directory) on the same volume and the archived WALs (archive_command) on another. And the snapshots on those two volumes are not taken at the same time. Currently, the database is set to backup mode (using cron) and the storage guys have a window during which they can take the snapshots.

So the upthread suggestion of putting data and wal on different disk and snapshoting them at different times is *NOT* safe. Unless the reference to the directory for the logs means a directory where log files are copied out with archive_command, and it's actually the log archive (in which case it will work, but the recommendation is that the log archive should not be on the same machine).

as I said above, pg_wal is a directory in PGDATA at the default location and WALs are archived using the archive_command to a different volume. So I guess, we should be safe then.

But it's trivial to make pg_xlog a separate mount point.

--
Angular momentum makes the world go 'round.

#8Paul Förster
paul.foerster@gmail.com
In reply to: Stephen Frost (#6)
Re: Netapp SnapCenter

Hi Stephen,

On 18. Jun, 2020, at 21:26, Stephen Frost <sfrost@snowman.net> wrote:
If the entire database, all tablespaces, and pg_wal, are on the same
volume and the snapshot of the volume is atomic, then you don't actually
need to go through the start/stop backup- a snapshot being restored will
look just like a system crash and PG will just go back to the last
checkpoint and replay the WAL that's in pg_wal and it should reach
consistency and come up.

we DID use tablespaces as an old habit (coming from Oracle :-( when everything PostgreSQL was new). When we realized what a bad idea that was, I reorganized them all away. So we don't use tablespaces anymore. Still, they used to be on the same volume, just a different directory. So even that would have been no problem in this particular case. :-)

Only Flyway doesn't like a DBA to reorg a tablespace if developers add a tablespace clause to their create table/index statements. But we made the developers remove those clauses.

The one issue here is that if you're using the deprecated exxclusive
backup API, then PG will create a backup_label file in the data
directory. If the system reboots while that file exists, there's a good
chance that PG won't start up cleanly since, due to the file existing,
it thinks that it's restoring from a backup when it isn't.

we don't. We use the non-exclusive backup mode according to:

25.3.3.1. Making A Non-Exclusive Low-Level Backup
https://www.postgresql.org/docs/current/continuous-archiving.html

But I must say that I'm really not a friend of being forced to keep the session open until the pg_stop_backup() call occurs. This really gave me a huge headache when writing the backup script. It's solved now but I consider this very ugly.

When I wrote our backup mode script I read the deprecation note about the exclusive mode backup. This is why I decided to go with non-exclusive to be ready when exclusive backup mode is finally removed some day. Yet, I don't see the reason. Everything has to be consistent. So a non-exclusive backup mode makes absolutely no sense to me. Either the whole database cluster is in backup mode or it is not. There's nothing in between.

This is unlike Oracle where you can set single tablespaces to backup mode and then just backup them separately. Still, I never saw a use for this with Oracle too.

Cheers,
Paul

#9Paul Förster
paul.foerster@gmail.com
In reply to: Ron (#7)
Re: Netapp SnapCenter

Hi Ron,

On 18. Jun, 2020, at 21:30, Ron <ronljohnsonjr@gmail.com> wrote:
But it's trivial to make pg_xlog a separate mount point.

technically yes, but that would mean it would go to a different volume and hence destroy atomicity of the Netapp snapshots.

Also, you'd have a huge administrative barrier against such things in our company.

Cheers,
Paul

#10Stephen Frost
sfrost@snowman.net
In reply to: Paul Förster (#8)
Re: Netapp SnapCenter

Greetings,

* Paul Förster (paul.foerster@gmail.com) wrote:

When I wrote our backup mode script I read the deprecation note about the exclusive mode backup. This is why I decided to go with non-exclusive to be ready when exclusive backup mode is finally removed some day. Yet, I don't see the reason. Everything has to be consistent. So a non-exclusive backup mode makes absolutely no sense to me. Either the whole database cluster is in backup mode or it is not. There's nothing in between.

Glad that you went with the non-exclusive method.

When it comes to 'backup mode', it's actually the case that there can be
multiple backups running concurrently because there isn't actually a
single 'cluster wide backup mode', really. Regarding the deprecated
methodology, there just isn't a way for the database on-disk image to
look exactly like a backup while also being able to survive a
crash/restart with only the WAL that's in the pg_wal directory. Perhaps
there's other things we could do but at some point it has to be accepted
that there's gotta be something changed in the data directory to
indicate that it's a backup and not just a crash, and that change needs
to happen *after* the backup/snapshot/whatever has been taken, otherwise
there's a window of risk where a crash/restart would fail.

Thanks,

Stephen

#11Paul Förster
paul.foerster@gmail.com
In reply to: Stephen Frost (#10)
Re: Netapp SnapCenter

Hi Stephen,

On 19. Jun, 2020, at 18:02, Stephen Frost <sfrost@snowman.net> wrote:
When it comes to 'backup mode', it's actually the case that there can be
multiple backups running concurrently because there isn't actually a
single 'cluster wide backup mode', really.

this is what I don't understand. Why would there be a reason to run multiple backups concurrently? I mean, using pg_start_backup() means I then have to backup (using whatever method) of the while PGDATA and not just some random file ${PGDATA}/base/13297/2685. And since I have to backup the whole of PGDATA anyway, why would I want to do that multiple times, even more so, concurrently?

I read the backup doc but I just can't grasp the idea of why multiple concurrent backup capabilities should yield any benefit at all.

Cheers,
Paul

#12Magnus Hagander
magnus@hagander.net
In reply to: Paul Förster (#11)
Re: Netapp SnapCenter

On Sat, Jun 20, 2020 at 9:04 AM Paul Förster <paul.foerster@gmail.com>
wrote:

Hi Stephen,

On 19. Jun, 2020, at 18:02, Stephen Frost <sfrost@snowman.net> wrote:
When it comes to 'backup mode', it's actually the case that there can be
multiple backups running concurrently because there isn't actually a
single 'cluster wide backup mode', really.

this is what I don't understand. Why would there be a reason to run
multiple backups concurrently? I mean, using pg_start_backup() means I then
have to backup (using whatever method) of the while PGDATA and not just
some random file ${PGDATA}/base/13297/2685. And since I have to backup the
whole of PGDATA anyway, why would I want to do that multiple times, even
more so, concurrently?

I read the backup doc but I just can't grasp the idea of why multiple
concurrent backup capabilities should yield any benefit at all.

One not uncommon case is for example being able to provision a new replica
while a backup is running. Since replicas are provisioned starting off a
base backup, being able to run that concurrently is very useful. Especially
if base backups take a long time to run.

--
Magnus Hagander
Me: https://www.hagander.net/ <http://www.hagander.net/&gt;
Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/&gt;

#13Paul Förster
paul.foerster@gmail.com
In reply to: Magnus Hagander (#12)
Re: Netapp SnapCenter

Hi Magnus,

On 21. Jun, 2020, at 21:35, Magnus Hagander <magnus@hagander.net> wrote:
One not uncommon case is for example being able to provision a new replica while a backup is running. Since replicas are provisioned starting off a base backup, being able to run that concurrently is very useful. Especially if base backups take a long time to run.

hmm, a backup shouldn't take so long as to delay a base backup significantly. But that's just my opinion. I may be a little snapshot biased, though, where it only takes a few seconds. :-)

Cheers,
Paul

#14Stephen Frost
sfrost@snowman.net
In reply to: Paul Förster (#13)
Re: Netapp SnapCenter

Greetings,

* Paul Förster (paul.foerster@gmail.com) wrote:

On 21. Jun, 2020, at 21:35, Magnus Hagander <magnus@hagander.net> wrote:
One not uncommon case is for example being able to provision a new replica while a backup is running. Since replicas are provisioned starting off a base backup, being able to run that concurrently is very useful. Especially if base backups take a long time to run.

hmm, a backup shouldn't take so long as to delay a base backup significantly. But that's just my opinion. I may be a little snapshot biased, though, where it only takes a few seconds. :-)

That's not the only case that I, at least, have heard of- folks aren't
really very happy with their backups fail when they could have just as
well completed, even if they're overlapping. Sure, it's better if
backups are scheduled such that they don't overlap, but that can be hard
to guarantee.

The thing about this is though that the new API avoids *other* issues,
like what happens if the system crashes during a backup (which is an
entirely common thing that happens, considering how long many backups
take...) and it does so in a relatively reasonable way while also
allowing concurrent backups, which is perhaps a relatively modest
benefit but isn't the main point of the different API.

Thanks,

Stephen

#15Paul Förster
paul.foerster@gmail.com
In reply to: Stephen Frost (#14)
Re: Netapp SnapCenter

Hi Stephen,

On 22. Jun, 2020, at 07:36, Stephen Frost <sfrost@snowman.net> wrote:
That's not the only case that I, at least, have heard of- folks aren't
really very happy with their backups fail when they could have just as
well completed, even if they're overlapping. Sure, it's better if
backups are scheduled such that they don't overlap, but that can be hard
to guarantee.

I see.

The thing about this is though that the new API avoids *other* issues,
like what happens if the system crashes during a backup (which is an
entirely common thing that happens, considering how long many backups
take...) and it does so in a relatively reasonable way while also
allowing concurrent backups, which is perhaps a relatively modest
benefit but isn't the main point of the different API.

that makes me curious about another thing. The output of pg_stop_backup() is to be stored. Otherwise the backup is useless. So far, so good. But what if the server crashes in the middle of the backup and pg_stop_back() hence is never reached? In this case, it obviously does not create any output.

Ok, you usually start the server, the database does a crash recovery and opens. Then, some time later, you do the usual backup and all is well. This is like 99.999% of all cases.

But what if you need to restore to the latest transaction while the database was running in backup mode during which the crash occurred. How does that work if no pg_stop_backup() output exists? Did I miss something here?

Cheers,
Paul

#16Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: Paul Förster (#15)
Re: Netapp SnapCenter

On 2020-06-22 08:02:06 +0200, Paul Förster wrote:

On 22. Jun, 2020, at 07:36, Stephen Frost <sfrost@snowman.net> wrote:
The thing about this is though that the new API avoids *other* issues,
like what happens if the system crashes during a backup (which is an
entirely common thing that happens, considering how long many backups
take...) and it does so in a relatively reasonable way while also
allowing concurrent backups, which is perhaps a relatively modest
benefit but isn't the main point of the different API.

that makes me curious about another thing. The output of
pg_stop_backup() is to be stored. Otherwise the backup is useless. So
far, so good. But what if the server crashes in the middle of the
backup and pg_stop_back() hence is never reached? In this case, it
obviously does not create any output.

Correct. Your backup will also almost certainly be missing some files,
so it won't be usable anyway.

Ok, you usually start the server, the database does a crash recovery
and opens. Then, some time later, you do the usual backup and all is
well. This is like 99.999% of all cases.

But what if you need to restore to the latest transaction while the
database was running in backup mode during which the crash occurred.
How does that work if no pg_stop_backup() output exists? Did I miss
something here?

Restore the previous backup and replay WALs from there (that assumes of
course that you are archiving WALs continuously, but if you don't, you
can't do PITR in general, so if you have that requirement you are doing
it).

hp

--
_ | Peter J. Holzer | Story must make more sense than reality.
|_|_) | |
| | | hjp@hjp.at | -- Charles Stross, "Creative writing
__/ | http://www.hjp.at/ | challenge!"

#17Magnus Hagander
magnus@hagander.net
In reply to: Paul Förster (#15)
Re: Netapp SnapCenter

On Mon, Jun 22, 2020 at 8:02 AM Paul Förster <paul.foerster@gmail.com>
wrote:

Hi Stephen,

On 22. Jun, 2020, at 07:36, Stephen Frost <sfrost@snowman.net> wrote:
That's not the only case that I, at least, have heard of- folks aren't
really very happy with their backups fail when they could have just as
well completed, even if they're overlapping. Sure, it's better if
backups are scheduled such that they don't overlap, but that can be hard
to guarantee.

I see.

Yeah, especially when your backups are a number of TB which makes them take
Some Time (TM) to complete...

The thing about this is though that the new API avoids *other* issues,

like what happens if the system crashes during a backup (which is an
entirely common thing that happens, considering how long many backups
take...) and it does so in a relatively reasonable way while also
allowing concurrent backups, which is perhaps a relatively modest
benefit but isn't the main point of the different API.

that makes me curious about another thing. The output of pg_stop_backup()
is to be stored. Otherwise the backup is useless. So far, so good. But what
if the server crashes in the middle of the backup and pg_stop_back() hence
is never reached? In this case, it obviously does not create any output.

Whenever the connection that ran pg_start_backup() disconnects without
calling pg_stop_backup(), the "state" of being "in backup mode" is "rolled
back" in the database. So similar to how a transaction you started with
BEGIN gets rolled back if you just disconnect without issuing COMMIT.

Your backup will of course be invalid in this case, but the database itself
will be fine. (And the inability to ensure this is exactly why the old
"exclusive mode" for backups is deprecated -- but the non-exclusive mode is
safe with this) So it is of course very important to check that the
pg_stop_backup() step completed successfully, and fail the entire backup if
it did not.

Ok, you usually start the server, the database does a crash recovery and

opens. Then, some time later, you do the usual backup and all is well. This
is like 99.999% of all cases.

But what if you need to restore to the latest transaction while the
database was running in backup mode during which the crash occurred. How
does that work if no pg_stop_backup() output exists? Did I miss something
here?

It does not work off *that* base backup. But if you start from the *prior*
be backup (one that did complete with a successful pg_stop_backup) then you
can still use the archived wal to recover to any point in time.

--
Magnus Hagander
Me: https://www.hagander.net/ <http://www.hagander.net/&gt;
Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/&gt;

#18Paul Förster
paul.foerster@gmail.com
In reply to: Peter J. Holzer (#16)
Re: Netapp SnapCenter

Hi Peter,

On 22. Jun, 2020, at 13:01, Peter J. Holzer <hjp-pgsql@hjp.at> wrote:
Restore the previous backup and replay WALs from there (that assumes of
course that you are archiving WALs continuously, but if you don't, you
can't do PITR in general, so if you have that requirement you are doing
it).

ok, that makes it a lot clearer to me. Thank you very much.

Cheers,
Paul

#19Paul Förster
paul.foerster@gmail.com
In reply to: Magnus Hagander (#17)
Re: Netapp SnapCenter

Hi Magnus,

On 22. Jun, 2020, at 13:08, Magnus Hagander <magnus@hagander.net> wrote:

It does not work off *that* base backup. But if you start from the *prior* be backup (one that did complete with a successful pg_stop_backup) then you can still use the archived wal to recover to any point in time.

ok, that's clear now. Thank you very much.

Cheers,
Paul

#20Stephen Frost
sfrost@snowman.net
In reply to: Paul Förster (#19)
Re: Netapp SnapCenter

Greetings,

* Paul Förster (paul.foerster@gmail.com) wrote:

On 22. Jun, 2020, at 13:08, Magnus Hagander <magnus@hagander.net> wrote:
It does not work off *that* base backup. But if you start from the *prior* be backup (one that did complete with a successful pg_stop_backup) then you can still use the archived wal to recover to any point in time.

ok, that's clear now. Thank you very much.

Right, and tools like pgbackrest will figure this kind of thing out for
you too- just give it the time you want to restore to and it'll figure
out the right backup to use.

Thanks,

Stephen

#21Wolff, Ken L
ken.l.wolff@lmco.com
In reply to: Stephen Frost (#20)
#22Paul Förster
paul.foerster@gmail.com
In reply to: Wolff, Ken L (#21)
#23Stephen Frost
sfrost@snowman.net
In reply to: Paul Förster (#22)
#24Wolff, Ken L
ken.l.wolff@lmco.com
In reply to: Stephen Frost (#23)
#25Paul Förster
paul.foerster@gmail.com
In reply to: Wolff, Ken L (#24)
#26Wolff, Ken L
ken.l.wolff@lmco.com
In reply to: Paul Förster (#25)
#27Paul Förster
paul.foerster@gmail.com
In reply to: Wolff, Ken L (#26)
#28David G. Johnston
david.g.johnston@gmail.com
In reply to: Paul Förster (#27)
#29Magnus Hagander
magnus@hagander.net
In reply to: Paul Förster (#27)
#30Paul Förster
paul.foerster@gmail.com
In reply to: Magnus Hagander (#29)
#31Paul Förster
paul.foerster@gmail.com
In reply to: Stephen Frost (#23)
#32Stephen Frost
sfrost@snowman.net
In reply to: Paul Förster (#30)
#33Paul Förster
paul.foerster@gmail.com
In reply to: Stephen Frost (#32)
#34Stephen Frost
sfrost@snowman.net
In reply to: Paul Förster (#33)
#35Paul Förster
paul.foerster@gmail.com
In reply to: Stephen Frost (#34)
#36Stephen Frost
sfrost@snowman.net
In reply to: Paul Förster (#35)
#37Paul Förster
paul.foerster@gmail.com
In reply to: Stephen Frost (#36)