Google Cloud Platform, snapshots and WAL
Hi everyone,
I have my PostgreSQL 9.5 server running on a VM instance on Google
Compute Engine (Google Cloud Platform) on Debian Jessie (8.3), and I
have another dedicated VM instance that, every night at 3.00, takes a
snapshot of the whole disk, without stopping the PG instance itself.
Snapshots are stored and kept by Google in an incremental way, and we
keep the last 2 weeks of history.
The question is: keeping two weeks' worth of pg_xlog files, I don't
think I still need a periodic pg_basebackup to perform PITR, do I?
Thanks in advance,
Moreno.
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On Mar 20, 2017, at 6:31 AM, Moreno Andreo <moreno.andreo@evolu-s.it> wrote:
> Hi everyone,
> I have my PostgreSQL 9.5 server running on a VM instance on Google Compute Engine (Google Cloud Platform) on Debian Jessie (8.3), and I have another dedicated VM instance that, every night at 3:00, takes a snapshot of the whole disk, without stopping the PG instance itself.
> Snapshots are stored and kept by Google in an incremental way, and we keep the last 2 weeks of history.
> The question is: keeping two weeks' worth of pg_xlog files, I don't think I still need a periodic pg_basebackup to perform PITR, do I?
You need a base backup to apply your WALs to. So long as you have one from after the start of your WAL stream, you should be good for PITR. That said, replaying 2 weeks of WAL files can take a long time. For that reason alone, it might well make sense to have more than a single base backup.
Also, I cannot stress enough how important it is to actually test your recovery strategy. Few things are worse than assuming you can recover, only to find out when you need to that you cannot. Do not let https://about.gitlab.com/2017/02/10/postmortem-of-database-outage-of-january-31/ happen to you.
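To make the base-backup-plus-WAL idea concrete, here is a minimal sketch of what continuous archiving looks like on PostgreSQL 9.5. The archive directory `/backup/wal`, the backup target `/backup/base`, and the recovery target time are placeholder assumptions, not values from this thread:

```
# postgresql.conf -- enable continuous WAL archiving
wal_level = archive
archive_mode = on
archive_command = 'test ! -f /backup/wal/%f && cp %p /backup/wal/%f'

# Take a base backup to replay WAL onto (run as the postgres user):
#   pg_basebackup -D /backup/base/$(date +%F) -Ft -z -X fetch

# recovery.conf in the restored data directory, to drive PITR:
restore_command = 'cp /backup/wal/%f %p'
recovery_target_time = '2017-03-20 02:55:00'
```

With one base backup per day, recovery never has to replay more than about a day of WAL, which addresses the replay-time concern above.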
On 20/03/2017 17:45, Ben Chobot wrote:
> On Mar 20, 2017, at 6:31 AM, Moreno Andreo <moreno.andreo@evolu-s.it> wrote:
>> Hi everyone,
>> I have my PostgreSQL 9.5 server running on a VM instance on Google Compute Engine (Google Cloud Platform) on Debian Jessie (8.3), and I have another dedicated VM instance that, every night at 3:00, takes a snapshot of the whole disk, without stopping the PG instance itself.
>> Snapshots are stored and kept by Google in an incremental way, and we keep the last 2 weeks of history.
>> The question is: keeping two weeks' worth of pg_xlog files, I don't think I still need a periodic pg_basebackup to perform PITR, do I?
>
> You need a base backup to apply your WALs to. So long as you have one from after the start of your WAL stream, you should be good for PITR.

Hmmm... I went back to the docs and noticed I had missed something. To achieve PITR, the cluster needs a checkpoint, which can be obtained with the pg_start_backup() function... so if I take a plain snapshot and then start a recovery (creating recovery.conf etc.), it will not even begin to recover, right?

Now I'm going to try two approaches:
1. (straightforward) Barman with a base backup and WAL archiving
2. (Google Cloud-oriented) a disk snapshot taken between pg_start_backup() and pg_stop_backup() (so the snapshot is taken just after the checkpoint), plus WAL archiving

I will report the results.

> That said, replaying 2 weeks of WAL files can take a long time. For that reason alone, it might well make sense to have more than a single base backup.

That's right. My (wrong) idea was to keep one snapshot per day plus all 14 days' worth of WALs; after the meeting with the Google specialist, I'm now inclined to take one base backup per day and store it on Nearline, making it expire after 14 days. Same for the WAL files.

> Also, I cannot stress enough how important it is to actually test your recovery strategy.

I totally agree... that's why I'm here. I don't want to be preparing a backup strategy when I already need to recover...

Thanks
Moreno.
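For approach 2, the exclusive backup API available in 9.5 can bracket the disk snapshot so it is crash-consistent for PITR. A sketch, where the disk name `pg-data-disk`, the zone, and the snapshot label are placeholder assumptions:

```
# Tell PostgreSQL a backup is starting; 'true' requests an immediate checkpoint
psql -U postgres -c "SELECT pg_start_backup('nightly-snapshot', true);"

# Take the Google Cloud disk snapshot while backup mode is active
gcloud compute disks snapshot pg-data-disk \
    --zone europe-west1-b --snapshot-names "pg-nightly-$(date +%F)"

# End backup mode; on 9.5 this also switches and archives the final WAL
# segment needed to make the snapshot recoverable
psql -U postgres -c "SELECT pg_stop_backup();"
```

Note that pg_start_backup() leaves a backup_label file in the data directory, so a restore from the snapshot plus the archived WAL can find its consistent starting point.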