streamlined standby procedure

Started by Csaba Nagy in February 2006; 19 messages
#1Csaba Nagy
nagy@ecircle-ag.com

Hi all,

I decided to start implementing a streamlined, WAL-shipping-based
standby building procedure. My aim is fairly simple: to make building a
standby as automated as possible.

The ultimate simplicity would be for me:
- install postgres on the standby machine;
- create a directory for the database files, containing
postgresql.conf, pg_hba.conf, and a standby.conf file;
- start up the postmaster with a "--build-standby" option;

All the rest should be done automatically by postgres.

The procedure would be similar to the one available today if you do it
manually. The main difference is that the standby postmaster would
connect to the primary server and fetch all table data and the WAL
record stream through normal database connections...

To facilitate this process, I thought: why not expose the WAL files
through a system view? Something along the lines of:

pg_wal (
    name text,
    walrecords blob,
    iscurrent boolean
)

Then anybody interested in the WAL record stream could easily find out
which is the current WAL file, and fetch any of the existing WAL files
by streaming the blob. Closed WAL files would be streamed completely,
and the current WAL file could be streamed in real time as it is
written... this would allow a standby that stays as up to date as
possible, since it could get the WAL records in real time.

To make it possible to reliably get closed WAL files, a WAL
subscription system could be created, where a subscriber (the standby)
signals the oldest WAL file it has not yet received. The primary
machine would keep all the WAL files extending back to the oldest
subscribed one. Then, each time the subscriber finishes processing a
WAL file, it can signal its interest in the next one. This could be
implemented by a table like:

pg_wal_subscription (
    subscriber text,
    name text
)

The subscribers would insert a record in this table, and update it to
the next WAL file after processing each one. The subscriber names
should be unique across subscribers; this would be managed by the admin
who sets up the subscribers. When a subscriber is no longer interested,
it can delete its subscription record. That could also be done by the
DBA if things go haywire...
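For illustration, the subscriber's side could be as simple as this (a
pseudocode sketch against the proposed table; the segment names are
made up):

```sql
-- Subscribe, naming the oldest segment this standby still needs:
INSERT INTO pg_wal_subscription (subscriber, name)
VALUES ('standby1', '000000010000000000000042');

-- After fully processing a segment, advance to the next one:
UPDATE pg_wal_subscription
   SET name = '000000010000000000000043'
 WHERE subscriber = 'standby1';

-- When the standby is decommissioned, release the retained segments:
DELETE FROM pg_wal_subscription WHERE subscriber = 'standby1';
```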

To build a standby based on log shipping it is also necessary to copy
over all the database files. That too could be done by exposing them
through some view, which in turn might take advantage of knowledge of
the table structure to compress the data to be transferred. The main
idea is to do all transfers through normal DB connections, so the only
configuration needed is to point the standby at the master machine...

So, all this said: I'm not too familiar with either C programming or
the postgres sources, but I'm willing to learn. And since the project
as a whole seems a bit too much to do in one piece, my first aim is to
expose the WAL records in a system view.

I would really appreciate any comments you have...

Thanks,
Csaba.

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Csaba Nagy (#1)
Re: streamlined standby procedure

Csaba Nagy <nagy@ecircle-ag.com> writes:

The procedure would be similar to the one available today if you do it
manually. The main difference is that the standby postmaster would
connect to the primary server and fetch all table data and the WAL
record stream through normal database connections...

This is pie-in-the-sky really. A design like that would mean that the
master could *never* recycle WAL files, because it could never know when
some slave would pop up demanding a copy of ancient history.

regards, tom lane

#3Csaba Nagy
nagy@ecircle-ag.com
In reply to: Tom Lane (#2)
Re: streamlined standby procedure

You obviously did not read further down :-)

I was proposing a subscription system, where the slave can specify the
oldest WAL file it is interested in, and keep that up to date as it
processes them.

That could of course cause trouble if a slave dies and doesn't update
its subscription, but that's no different from the current setup when
the archive_command starts failing constantly because the archive site
is down. In either case human intervention is needed. The
DB-connection-based approach has the advantage that you don't have to
restart the server if your slave location changes, and you can have
multiple slaves at the same time if you like, e.g. if you want to
smoothly move the slave over to another machine.

Cheers,
Csaba.


#4Tom Lane
tgl@sss.pgh.pa.us
In reply to: Csaba Nagy (#3)
Re: streamlined standby procedure

Csaba Nagy <nagy@ecircle-ag.com> writes:

You obviously did not read further down :-)
I was proposing a subscription system, where the slave can specify the
oldest WAL file it is interested in, and keep that up to date as it
processes them.

And how is that "system view" going to handle subscriptions?

regards, tom lane

#5Andrew Rawnsley
ronz@investoranalytics.com
In reply to: Csaba Nagy (#1)
Re: streamlined standby procedure

IMHO the #1 priority in the current PITR/WAL shipping system is to make the
standby able to tolerate being shut down and restarted, i.e. actually having
a true standby mode and not the current method of doing it only on startup.

While it is a trivial thing to fool postgres into staying in startup/restore
mode with a restore_command that blocks until more files are available, if
the machine needs to be shut down for whatever reason you have to go back to
the last image and replay to the present, which isn't always convenient. Nor
are you able to shut down the standby, copy it to a second instance to use
for testing/development/whatever, and restart the standby.

(Just to be clear - I _really_ like the flexibility of the customizable
archive and restore structure with PITR in PG, but the lack of a standby
mode always reminds me of whacking my forehead at 3am on the too-low doorway
into my son's bedroom...)


#6Csaba Nagy
nagy@ecircle-ag.com
In reply to: Tom Lane (#4)
Re: streamlined standby procedure

On Tue, 2006-02-07 at 16:45, Tom Lane wrote:

Csaba Nagy <nagy@ecircle-ag.com> writes:

You obviously did not read further down :-)
I was proposing a subscription system, where the slave can specify the
oldest WAL file it is interested in, and keep that up to date as it
processes them.

And how is that "system view" going to handle subscriptions?

Not THAT "system view" would handle the subscription... the view would
do exactly one thing: expose the WAL files that currently exist, and
only those. Of course it must take some kind of lock on the WAL file it
is currently streaming so that the file is not recycled, but other than
that the view need not be concerned with subscription issues.

The subscription system would be a table in which you can insert
(subscriber_id, oldest_WAL_file_I'm_interested_in) tuples. When
recycling WAL files, this table would be consulted, and only WAL files
older than the oldest entry in the subscription table would be allowed
to be recycled.

Slaves will update their subscription line after processing each WAL
file, setting it to the next WAL file name they need. So the oldest WAL
to be kept will actually be in sync with what the slaves really need.
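In pseudocode, the recycler's check would then be trivial (again
assuming the proposed pg_wal_subscription table existed):

```sql
-- Before recycling, find the oldest segment any subscriber still needs:
SELECT min(name) FROM pg_wal_subscription;
-- Segments older than this name may be recycled; segments at or after
-- it must be kept. If the table is empty, recycle as today.
```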

OK, now I start to see what you mean: if there's no subscription then
all WAL files are immediately recycled, and the view can only show one
entry, the current WAL. But actually that's OK; you can still see which
is the current WAL file, and subscribe starting with it.

Cheers,
Csaba.

#7Csaba Nagy
nagy@ecircle-ag.com
In reply to: Andrew Rawnsley (#5)
Re: streamlined standby procedure

On Tue, 2006-02-07 at 16:58, Andrew Rawnsley wrote:

IMHO the #1 priority in the current PITR/WAL shipping system is to make the
standby able to tolerate being shut down and restarted, i.e. actually having
a true standby mode and not the current method of doing it only on startup.

This fits nicely in what I would like to achieve, and it might be
actually a better start.

While it is a trivial thing to fool postgres into staying in startup/restore
mode with a restore_command that blocks until more files are available, if
the machine needs to be shut down for whatever reason you have to go back to
the last image and replay to the present, which isn't always convenient. Nor
are you able to shut down the standby, copy it to a second instance to use
for testing/development/whatever, and restart the standby.

Why would you shut down the standby to copy it? It would be nicer to be
able to build a "standby of the standby" ;-)
Even nicer would be a standby that allows read-only access, but I guess
that's a tough call.

Cheers,
Csaba.

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew Rawnsley (#5)
Re: streamlined standby procedure

Andrew Rawnsley <ronz@investoranalytics.com> writes:

IMHO the #1 priority in the current PITR/WAL shipping system is to make the
standby able to tolerate being shut down and restarted, i.e. actually having
a true standby mode and not the current method of doing it only on startup.

How is shutting down the standby a good idea? Seems like that will
block the master too --- or at least result in WAL log files piling up
rapidly. If the standby goes off-line, abandoning it and starting from
a fresh base backup when you are ready to restart it seems like the most
likely recovery path. For sure I don't see this as the "#1 priority".

regards, tom lane

#9Andrew Rawnsley
andrew.rawnsley@investoranalytics.com
In reply to: Tom Lane (#8)
Re: streamlined standby procedure

On 2/7/06 1:19 PM, "Tom Lane" <tgl@sss.pgh.pa.us> wrote:

Andrew Rawnsley <ronz@investoranalytics.com> writes:

IMHO the #1 priority in the current PITR/WAL shipping system is to make the
standby able to tolerate being shut down and restarted, i.e. actually having
a true standby mode and not the current method of doing it only on startup.

How is shutting down the standby a good idea? Seems like that will
block the master too --- or at least result in WAL log files piling up
rapidly. If the standby goes off-line, abandoning it and starting from
a fresh base backup when you are ready to restart it seems like the most
likely recovery path. For sure I don't see this as the "#1 priority".

regards, tom lane

I wasn't suggesting this in the context of Csaba's auto-ship plan (and,
to be clear, not #1 in the context of the entire database development,
just PITR).

For one, sometimes you have no choice about the standby being shut
down, though most of the time you can plan for it. As for Csaba's
question of why I would want to create a copy of a standby: it's the
easiest way to create development and testing snapshots at standby
locations, and to make paranoid operations people confident that your
standby procedures are working. I do it with my Oracle (pardon the 'O'
word) installations all the time, and I despise Oracle being able to do
something that PG can't.

I ship WAL logs around in batches, independent of the archive command,
to several locations. Either I:

A) let the logs 'pile up' on the standby (crap has to pile up
somewhere), and apply them should the standby be needed (which could
take some time should the 'pile' be large). The only way to keep the
recovery time short here is to re-image the database frequently and
ship it around. Not nice with big databases.

B) do the blocking restore command to continually apply the logs as
they get moved around. While this earns good cleverness points, it's a
rig. Fragile.

To me the question isn't 'How is shutting down the standby a good
idea?', it's 'How is shutting down the standby not a bad idea?'.
Different points of view, I suppose; in my situation the standby going
offline is not a catastrophic event like the primary going offline
would be; it's even a useful thing. If there were some rman-style tool,
as people have suggested, to auto-ship logs around, then yes, dealing
with an offline standby could be tricky (but it would need some
solution anyway). But hell, Slony and Mammoth can tolerate it; I just
would like log shipping to handle it too.

Maybe it isn't the #1 priority, but it's something I view as a
limitation, not just a missing feature. It's something I can't control.
As I originally mentioned, the customizable archive/restore feature is
great, superior to dealing with it in Oracle. But the standby mode
makes the Oracle setup more bulletproof.

--

Andrew Rawnsley
Chief Technology Officer
Investor Analytics, LLC
(740) 587-0114
http://www.investoranalytics.com

#10Simon Riggs
simon@2ndquadrant.com
In reply to: Csaba Nagy (#1)
Re: streamlined standby procedure

On Tue, 2006-02-07 at 16:11 +0100, Csaba Nagy wrote:

I decided to start implementing a streamlined WAL shipping based standby
building procedure. My aim is fairly simple: to be able to build a
standby as automated as possible.

What do you find difficult about the current method? That's got to be
the first discussion point.

Best Regards, Simon Riggs

#11Csaba Nagy
nagy@ecircle-ag.com
In reply to: Simon Riggs (#10)
Re: streamlined standby procedure

What do you find difficult about the current method? That's got to be
the first discussion point.

The main problem I have is the complexity of the setup.

It involves a lot of additional scripting which you have to get right
for it to actually be reliable. The documentation gives a rough idea of
how to do it, but it is quite some work to make it work, and you can't
really tell whether it is reliable...

Another issue is that unless you got the archive_command right in the
master server from the beginning, you will have to restart the server
once you decide to build your standby... the archive_command is a
start-up-time parameter. This could of course be alleviated by always
using a stub script as the archive command, which does nothing while
you have no standby, then modifying it to start archiving to the right
place once you start building one, or when you want to move the standby
to another machine. But this is also not documented, and you have to
figure it out for yourself.
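To sketch the idea (a hypothetical stub; the flag-file path, directory
names, and segment name below are made up for illustration), such a
stub would be wired in once as the archive_command and switched on
later just by creating a flag file:

```shell
# archive_stub: meant to be invoked by the server as
#   archive_command = 'archive_stub.sh "%p" "%f"'
archive_stub() {
  wal_path=$1; wal_name=$2
  # No standby configured yet: report success without doing anything,
  # so the server recycles the segment as usual.
  [ -f "$FLAG_FILE" ] || return 0
  # Standby enabled: copy the segment into the archive directory.
  cp "$wal_path" "$ARCHIVE_DIR/$wal_name"
}

# Demonstration with temporary stand-ins for the real locations:
tmp=$(mktemp -d)
FLAG_FILE="$tmp/standby.enabled"
ARCHIVE_DIR="$tmp/archive"
mkdir "$ARCHIVE_DIR"
echo "segment" > "$tmp/000000010000000000000007"

# Flag absent: nothing is archived.
archive_stub "$tmp/000000010000000000000007" 000000010000000000000007
ls "$ARCHIVE_DIR"

# Flag present: the segment lands in the archive, with no restart.
touch "$FLAG_FILE"
archive_stub "$tmp/000000010000000000000007" 000000010000000000000007
ls "$ARCHIVE_DIR"
```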

And responding to Tom's other post regarding a real standby mode, where
you could stop the standby and later resume it still in standby mode: I
would actually have a good use for it tonight :-)
We will migrate our application to a new version, which involves some
changes in the database. It would be nice to stop the standby BEFORE
making these changes; then, if the migration fails for some reason,
start up the standby and use it with our old application version, but
if the migration succeeds, start the standby up still as a standby and
have it resume standby operation... rebuilding it will take half a day
at least.
So a standby which can be isolated for a while would actually be
useful. OK, now that I think about it a bit more, I could achieve this
by fiddling with the restore_command so it stops delivering the logs
for a while. But again, it is not straightforward.

The whole point of this is that starting up a standby should be as
simple as pointing the standby machine at the primary server: without
shell-scripting gimmicks (which are OS-specific and therefore hard to
document in a generic way), without fiddling with the primary's
configuration (see archive_command), and without restarting the primary
if the archive command wasn't right in the first place. And it should
be easy to start up one more standby if needed, or to isolate one for a
while when doing risky work on the primary.

It's about user friendliness and flexibility. It's not that it can't do
the work right now, but it's really hard to do it...

Cheers,
Csaba.

#12Simon Riggs
simon@2ndquadrant.com
In reply to: Csaba Nagy (#11)
Re: streamlined standby procedure

On Wed, 2006-02-08 at 11:10 +0100, Csaba Nagy wrote:

Another issue is that unless you got the archive_command right in the
master server from the beginning, you will have to restart the server
once you decide to build your standby... the archive_command is a
start-up time parameter

Much of your difficulty seems to come from your thinking that this
parameter requires a restart. It doesn't - check it out.

The script need not be complex, you only need to put a wait loop in the
restore script so that it waits for the next log file.

Best Regards, Simon Riggs

#13Csaba Nagy
nagy@ecircle-ag.com
In reply to: Simon Riggs (#12)
Re: streamlined standby procedure

OK, this is news to me; I recall that when I last looked at the
configuration docs it was a start-up-time parameter, but I might be
wrong.

[looking up the docs]

OK, citing the 8.1 online docs:

17.5.3. Archiving

archive_command (string)

The shell command to execute to archive a completed segment of
the WAL file series. If this is an empty string (the default),
WAL archiving is disabled. Any %p in the string is replaced by
the absolute path of the file to archive, and any %f is replaced
by the file name only. Use %% to embed an actual % character in
the command. For more information see Section 23.3.1. This
option can only be set at server start or in the postgresql.conf
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
file.

It is important for the command to return a zero exit status if
and only if it succeeds. Examples:

archive_command = 'cp "%p" /mnt/server/archivedir/"%f"'
archive_command = 'copy "%p" /mnt/server/archivedir/"%f"' # Windows

It's at least confusing... it does say "or in the postgresql.conf
file" too, but I must have overlooked that... and the word "only" is
really confusing there.

[looking at: http://www.powerpostgresql.com/Downloads/annotated_conf_80.html]

OK, this is what confused me: the annotated conf file states it's a
start-up-time parameter.

Well, good to know it's not...

Actually, my PITR/standby-building needs are mostly solved by now, but
it sure wasn't an easy ride, and I really wonder whether there is any
readily available script bundle to do it for a Windows server...

Maybe a standby-building tutorial is all that is needed...

Cheers,
Csaba.


#14Marko Kreen
markokr@gmail.com
In reply to: Tom Lane (#8)
Re: streamlined standby procedure

On 2/7/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Andrew Rawnsley <ronz@investoranalytics.com> writes:

IMHO the #1 priority in the current PITR/WAL shipping system is to make the
standby able to tolerate being shut down and restarted, i.e. actually having
a true standby mode and not the current method of doing it only on startup.

How is shutting down the standby a good idea? Seems like that will
block the master too --- or at least result in WAL log files piling up
rapidly. If the standby goes off-line, abandoning it and starting from
a fresh base backup when you are ready to restart it seems like the most
likely recovery path. For sure I don't see this as the "#1 priority".

For regular recovery it is indeed unnecessary. But I would also put
this as the #1 TODO for the long-running hot-standby case. The
requirement to start all over makes the current setup rather
cumbersome.

And #2 would be running read-only queries while in recovery :)

--
marko

#15Jim C. Nasby
jnasby@pervasive.com
In reply to: Csaba Nagy (#13)
Re: streamlined standby procedure

On Thu, Feb 09, 2006 at 10:37:34AM +0100, Csaba Nagy wrote:

option can only be set at server start or in the postgresql.conf

Yeah, this is something that was actually discussed on -docs recently. I
believe -HEAD was changed so that every parameter that used to have that
text now says: option can be set in postgresql.conf or on the server
command line.

Maybe a standby-building-tutorial is all what is needed...

Having that would be very handy indeed. In fact, if you want to get
rough notes put together I'd be happy to edit it into a finished
product, though I'm not sure of the best place to put it. I could
certainly post it somewhere on pervasive-postgres.com if nothing else...
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

#16Jim C. Nasby
jnasby@pervasive.com
In reply to: Marko Kreen (#14)
Re: streamlined standby procedure

On Thu, Feb 09, 2006 at 04:44:20PM +0200, Marko Kreen wrote:

On 2/7/06, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Andrew Rawnsley <ronz@investoranalytics.com> writes:

IMHO the #1 priority in the current PITR/WAL shipping system is to make the
standby able to tolerate being shut down and restarted, i.e. actually having
a true standby mode and not the current method of doing it only on startup.

How is shutting down the standby a good idea? Seems like that will
block the master too --- or at least result in WAL log files piling up
rapidly. If the standby goes off-line, abandoning it and starting from
a fresh base backup when you are ready to restart it seems like the most
likely recovery path. For sure I don't see this as the "#1 priority".

For regular recovery it is indeed unnecessary. But I would also
put this as #1 TODO for long-running hot-standby case. The requirement
to start all over makes current setup rather cumbersome.

What happens right now when you want to bring the standby up? Do you
have to kill it out of recovery mode and re-start, forcing it to replay
WAL again anyway?

And #2 would be running read-only queries while in recovery :)

That would be damn handy :)
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461

#17Marko Kreen
markokr@gmail.com
In reply to: Jim C. Nasby (#16)
Re: streamlined standby procedure

On 2/9/06, Jim C. Nasby <jnasby@pervasive.com> wrote:

What happens right now when you want to bring the standby up? Do you
have to kill it out of recovery mode and re-start, forcing it to replay
WAL again anyway?

touch $LOGDIR/STOP ...

--
marko

#18Bruce Momjian
pgman@candle.pha.pa.us
In reply to: Jim C. Nasby (#15)
Re: streamlined standby procedure

I have updated the 8.1.X documentation to remove the word "only", which
is confusing.

---------------------------------------------------------------------------

Jim C. Nasby wrote:

On Thu, Feb 09, 2006 at 10:37:34AM +0100, Csaba Nagy wrote:

option can only be set at server start or in the postgresql.conf

Yeah, this is something that was actually discussed on -docs recently. I
believe -HEAD was changed so that every parameter that used to have that
text now says: option can be set in postgresql.conf or on the server
command line.

Maybe a standby-building-tutorial is all what is needed...

Having that would be very handy indeed. In fact, if you want to get
rough notes put together I'd be happy to edit it into a finished
product, though I'm not sure of the best place to put it. I could
certainly post it somewhere on pervasive-postgres.com if nothing else...
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461


-- 
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073
#19Jim C. Nasby
jnasby@pervasive.com
In reply to: Csaba Nagy (#1)
Re: streamlined standby procedure

On Tue, Feb 07, 2006 at 04:11:07PM +0100, Csaba Nagy wrote:

pg_wal (
name text,
walrecords blob,
iscurrent boolean
)

ISTM that a current_wal_file() function would be better than a boolean
on the view...

So, all this said, I'm not too familiar with either C programming or the
postgres sources, but I'm willing to learn. And the project as a whole
seems a bit too much to do it in one piece, so my first aim is to expose
the WAL records in a system view.

Seems a reasonable place to start. IMO there's a lot more info that
could be exposed through system views than is currently being done.
--
Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com
Pervasive Software http://pervasive.com work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461