Include WAL in base backup

Started by Magnus Haganderabout 15 years ago29 messageshackers
Jump to latest
#1Magnus Hagander
magnus@hagander.net

Here's a cutdown version of the idea about including WAL in the base
backup. What I initially wanted was to introduce a way to guarantee
that the required WAL (with some sort of cutoff of course) would be
available for the backup, but I ran out of time for that. We can
always add that later.

For now, you need to set wal_keep_segments to make it work properly,
but if you do the idea is that the tar file/stream generated in the
base backup will include all the required WAL files. That means that
you can start a postmaster directly against the directory extracted
from the tarball, and you no longer need to set up archive logging to
get a backup.

I've got some refactoring I want to do around the
SendBackupDirectory() function after this, but a review of the
functionality first would be good. And obviously, documentation is
still necessary.

The patch to pg_basebackup applies on top of the previously posted version.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Attachments:

basebackup_wal.patchtext/x-patch; charset=US-ASCII; name=basebackup_wal.patchDownload+91-23
#2Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Magnus Hagander (#1)
Re: Include WAL in base backup

Magnus Hagander <magnus@hagander.net> writes:

Here's a cutdown version of the idea about including WAL in the base
backup. What I initially wanted was to introduce a way to guarantee
that the required WAL (with some sort of cutoff of course) would be
available for the backup, but I ran out of time for that. We can
always add that later.

What if you start a concurrent process that begins streaming the WAL
segments just before you start the backup, and you stop it after having
stopped the backup. I would think that then, the local received files
would be complete. We would only need a program able to stream the WAL
segments and build WAL files from that… do you know about one? :)

For now, you need to set wal_keep_segments to make it work properly,

That's quite a big foot gun, isn't it? You would have to at least offer
an option to check for your backup or to call it broken when you miss
some WAL files on the server.

The only other safe option I know about that's not a pg_streamrecv
subprocess would be to require archiving for the duration of the base
backup, but I think we agreed that it would be nice being able to bypass
that.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

#3Magnus Hagander
magnus@hagander.net
In reply to: Dimitri Fontaine (#2)
Re: Include WAL in base backup

On Sat, Jan 15, 2011 at 23:32, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:

Magnus Hagander <magnus@hagander.net> writes:

Here's a cutdown version of the idea about including WAL in the base
backup. What I initially wanted was to introduce a way to guarantee
that the required WAL (with some sort of cutoff of course) would be
available for the backup, but I ran out of time for that. We can
always add that later.

What if you start a concurrent process that begins streaming the WAL
segments just before you start the backup, and you stop it after having
stopped the backup.  I would think that then, the local received files
would be complete.  We would only need a program able to stream the WAL
segments and build WAL files from that… do you know about one? :)

Sure, if you stream the backups "on the side", then you don't need
this feature. This is for "very simple filesystem backups", without
the need to set up streaming of archiving.

If you're streaming the WAL separately, you'd use pg_basebackup in
non-wal mode. That's why it's optional.

For now, you need to set wal_keep_segments to make it work properly,

That's quite a big foot gun, isn't it?  You would have to at least offer
an option to check for your backup or to call it broken when you miss
some WAL files on the server.

Oh, it will give you an ERROR if any of the required WAL files aren't
around anymore. So you will know that your backup is incomplete.

The only other safe option I know about that's not a pg_streamrecv
subprocess would be to require archiving for the duration of the base
backup, but I think we agreed that it would be nice being able to bypass
that.

No, the safe option I'm envisioning for this one is to just prevent
the server from recycling the logfiles until they have been sent off.
Witha cap so we don't prevent recycling forever if the client hangs.
This is the part I started working on, but didn't hav etime to finish
for the CF. I've not given up on it, it's just waiting for later.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#4Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Magnus Hagander (#3)
Re: Include WAL in base backup

Magnus Hagander <magnus@hagander.net> writes:

What if you start a concurrent process that begins streaming the WAL
segments just before you start the backup, and you stop it after having
stopped the backup.  I would think that then, the local received files
would be complete.  We would only need a program able to stream the WAL
segments and build WAL files from that… do you know about one? :)

Sure, if you stream the backups "on the side", then you don't need
this feature. This is for "very simple filesystem backups", without
the need to set up streaming of archiving.

What I meant is: why don't we provide an option to automate just that
behavior in pg_basebackup? It looks like a fork() then calling code you
already wrote.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

#5Magnus Hagander
magnus@hagander.net
In reply to: Dimitri Fontaine (#4)
Re: Include WAL in base backup

On Sun, Jan 16, 2011 at 18:45, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:

Magnus Hagander <magnus@hagander.net> writes:

What if you start a concurrent process that begins streaming the WAL
segments just before you start the backup, and you stop it after having
stopped the backup.  I would think that then, the local received files
would be complete.  We would only need a program able to stream the WAL
segments and build WAL files from that… do you know about one? :)

Sure, if you stream the backups "on the side", then you don't need
this feature. This is for "very simple filesystem backups", without
the need to set up streaming of archiving.

What I meant is: why don't we provide an option to automate just that
behavior in pg_basebackup?  It looks like a fork() then calling code you
already wrote.

Ah, I see. That's a good idea.

However, it's not quite that simple. "just adding a fork()" doesn't
work on all our platforms, and you need to deal with feedback and such
between them as well.

I think it's definitely something worth doing in the long run, but I
think we should start with the simpler way.

Oh, and this might be the use-case for integrating the streaming log
code as well :-) But if we plan to do that, perhaps we should pick a
different name for the binary? Or maybe just share code with another
one later..

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#6Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Magnus Hagander (#5)
Re: Include WAL in base backup

Magnus Hagander <magnus@hagander.net> writes:

However, it's not quite that simple. "just adding a fork()" doesn't
work on all our platforms, and you need to deal with feedback and such
between them as well.

I'd think client-side, we have an existing implementation with the
parallel pg_restore option. Don't know (yet) how easy it is to reuse
that code…

Oh, and this might be the use-case for integrating the streaming log
code as well :-) But if we plan to do that, perhaps we should pick a
different name for the binary? Or maybe just share code with another
one later..

You're talking about the pg_streamrecv binary? Then yes, my idea about
it is that it should become the default archive client we ship with
PostgreSQL. And grow into offering a way to be the default restore
command too. I'd see the current way that the standby can switch
between restoring from archive and from a live stream as a first step
into that direction.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

#7Magnus Hagander
magnus@hagander.net
In reply to: Dimitri Fontaine (#6)
Re: Include WAL in base backup

On Sun, Jan 16, 2011 at 20:13, Dimitri Fontaine <dimitri@2ndquadrant.fr> wrote:

Magnus Hagander <magnus@hagander.net> writes:

However, it's not quite that simple. "just adding a fork()" doesn't
work on all our platforms, and you need to deal with feedback and such
between them as well.

I'd think client-side, we have an existing implementation with the
parallel pg_restore option.  Don't know (yet) how easy it is to reuse
that code…

True.

But however we do it, it will be significantly more complex than just
including the WAL. And I want to make sure we get *something* done in
time for 9.1, and then improve upon it. If we can get the improvement
into 9.1 that's great, but if not it will have to wait until 9.2 - and
I don't want to leave us without anything for 9.1.

Oh, and this might be the use-case for integrating the streaming log
code as well :-) But if we plan to do that, perhaps we should pick a
different name for the binary? Or maybe just share code with another
one later..

You're talking about the pg_streamrecv binary?  Then yes, my idea about
it is that it should become the default archive client we ship with
PostgreSQL.  And grow into offering a way to be the default restore
command too.  I'd see the current way that the standby can switch
between restoring from archive and from a live stream as a first step
into that direction.

Right. I did put both the base backup and the wal streaming in the
same binary earlier, and the only shared code was the one to connect
to the db. So I split them apart again. This is the reason to put them
back together perhaps - or just have the ability for pg_basebackup to
fork()+exec() the wal streamer.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#8Dimitri Fontaine
dimitri@2ndQuadrant.fr
In reply to: Magnus Hagander (#7)
Re: Include WAL in base backup

Magnus Hagander <magnus@hagander.net> writes:

But however we do it, it will be significantly more complex than just
including the WAL. And I want to make sure we get *something* done in
time for 9.1, and then improve upon it. If we can get the improvement
into 9.1 that's great, but if not it will have to wait until 9.2 - and
I don't want to leave us without anything for 9.1.

+1. The only point I'm not clear on is the complexity, and I trust you
to cut off at the right point here… meanwhile, I'm still asking for this
little more until you say your plate's full :)

Right. I did put both the base backup and the wal streaming in the
same binary earlier, and the only shared code was the one to connect
to the db. So I split them apart again. This is the reason to put them
back together perhaps - or just have the ability for pg_basebackup to
fork()+exec() the wal streamer.

That would be awesome.

Then pg_streamrecv could somehow accept options that make it suitable
for use as an archive command, connecting to your (still?) third-party
daemon? At this point it'd be pg_walsender :)

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support

#9Stephen Frost
sfrost@snowman.net
In reply to: Magnus Hagander (#1)
Re: Include WAL in base backup

Greetings,

* Magnus Hagander (magnus@hagander.net) wrote:

For now, you need to set wal_keep_segments to make it work properly,
but if you do the idea is that the tar file/stream generated in the
base backup will include all the required WAL files.

Is there some reason to not ERROR outright if we're asked to provide WAL
and wal_keep_segments isn't set..? I'd rather do that than only ERROR
when a particular WAL is missing.. That could lead to transient backup
errors that an inexperienced sysadmin or DBA might miss or ignore.
They'll notice if it doesn't work the first time they try it and spits
out a hint about wal_keep_segments.

That means that
you can start a postmaster directly against the directory extracted
from the tarball, and you no longer need to set up archive logging to
get a backup.

Like that part.

I've got some refactoring I want to do around the
SendBackupDirectory() function after this, but a review of the
functionality first would be good. And obviously, documentation is
still necessary.

mkay, I'm not going to try to make this ready for committer, but will
provide my comments on it overall. Bit difficult to review when someone
else is reviewing the base patch too. :/

Here goes:

- I'm not a huge fan of the whole 'closetar' option, that feels really
rather wrong to me. Why not just open it and close it in
perform_base_backup(), unconditionally?

- I wonder if you're not getting to a level where you shold be using a
struct to pass the relevant information to perform_base_backup()
instead of adding more arguments on.. That's going to get unwieldy at
some point.

- Why not initialize logid and logseg like so?:

int logid = startptr.xlogid;
int logseg = startptr.xrecoff / XLogSegSize;

Then use those in your elog? Seems cleaner to me.

- A #define right in the middle of a while loop...? Really?

- The grammar changes strike me as.. odd. Typically, you would have an
'option' production that you can then have a list of and then let each
option be whatever the OR'd set of options is. Wouldn't the current
grammar require that you put the options in a specific order? That'd
blow.

@@ -687,7 +690,7 @@ BaseBackup()
 		 * once since it can be relocated, and it will be checked before we do
 		 * anything anyway.
 		 */
-		if (basedir != NULL && i > 0)
+		if (basedir != NULL && !PQgetisnull(res, i, 1))
 			verify_dir_is_empty_or_create(PQgetvalue(res, i, 1));
 	}

- Should the 'i > 0' conditional above still be there..?

So, that's my review from just reading the source code and the thread..
Hope it's useful, sorry it's not more. :/

Thanks,

Stephen

#10Stephen Frost
sfrost@snowman.net
In reply to: Stephen Frost (#9)
Re: Include WAL in base backup

Magnus,

* Stephen Frost (sfrost@snowman.net) wrote:

mkay, I'm not going to try to make this ready for committer, but will
provide my comments on it overall. Bit difficult to review when someone
else is reviewing the base patch too. :/

I went ahead and marked it as 'waiting on author', since I'd like
feedback on my comments, but I'll try to make time in the next few days
to apply the two patches and test it out, unless I hear otherwise.

Thanks,

Stephen

#11Magnus Hagander
magnus@hagander.net
In reply to: Stephen Frost (#9)
Re: Include WAL in base backup

On Thu, Jan 20, 2011 at 05:03, Stephen Frost <sfrost@snowman.net> wrote:

Greetings,

* Magnus Hagander (magnus@hagander.net) wrote:

For now, you need to set wal_keep_segments to make it work properly,
but if you do the idea is that the tar file/stream generated in the
base backup will include all the required WAL files.

Is there some reason to not ERROR outright if we're asked to provide WAL
and wal_keep_segments isn't set..?  I'd rather do that than only ERROR
when a particular WAL is missing..  That could lead to transient backup
errors that an inexperienced sysadmin or DBA might miss or ignore.
They'll notice if it doesn't work the first time they try it and spits
out a hint about wal_keep_segments.

Well, in a "smaller:ish" database you can easily do the full backup
before you run out of segments in the data directory even when you
haven't set wal_keep_segments. If we error out, we force extra work on
the user in the trivial case.

I'd rather not change that, but instead (as Fujii-san has also
mentioned is needed anyway) put some more effort into documenting in
which cases you need to set it.

I've got some refactoring I want to do around the
SendBackupDirectory() function after this, but a review of the
functionality first would be good. And obviously, documentation is
still necessary.

mkay, I'm not going to try to make this ready for committer, but will
provide my comments on it overall.  Bit difficult to review when someone
else is reviewing the base patch too. :/

Heh, yeah.

Here goes:

- I'm not a huge fan of the whole 'closetar' option, that feels really
 rather wrong to me.  Why not just open it and close it in
 perform_base_backup(), unconditionally?

Yeah, we could move the whole thing up there. Or, as I mentioned in an
IM conversation with Heikki, just get rid of SendBackupDirectory()
completely and inline it inside the loop in perform_base_backup().
Given that it's basically just 5 lines + a call to sendDir()..

- I wonder if you're not getting to a level where you shold be using a
 struct to pass the relevant information to perform_base_backup()
 instead of adding more arguments on..  That's going to get unwieldy at
 some point.

Yeah, probably.

We *could* pass the BaseBackupCmd struct from the parser all the way
in - or is that cheating too much on abstractions? It seems if we
don't, we're just going to hav ea copy of that struct without the
NodeTag member..

- Why not initialize logid and logseg like so?:

       int logid = startptr.xlogid;
       int logseg = startptr.xrecoff / XLogSegSize;

 Then use those in your elog?  Seems cleaner to me.

Hmm. Yes. Agreed.

- A #define right in the middle of a while loop...?  Really?

Haha, yeah, that was a typo. I didn't remember the name of the
variable so I stuck it there for testing and forgot it. It should be
ThisTimeLineID, and no #define at all.

- The grammar changes strike me as..  odd.  Typically, you would have an
 'option' production that you can then have a list of and then let each
 option be whatever the OR'd set of options is.  Wouldn't the current
 grammar require that you put the options in a specific order?  That'd
 blow.

It does require them in a specific order. The advantage is that it
makes the code easier. and it's not like end users are expected to run
them anyway...

Now, I'm no bison export, so it might be an easy fix. But the way I
could figure, to make them order indepdent I have to basically collect
them up together as a List instead of just as a struct, and then loop
through that list to build a struct later.

If someone who knows Bison better can tell me a neater way to do that,
I'll be happy to change :-)

@@ -687,7 +690,7 @@ BaseBackup()
                * once since it can be relocated, and it will be checked before we do
                * anything anyway.
                */
-               if (basedir != NULL && i > 0)
+               if (basedir != NULL && !PQgetisnull(res, i, 1))
                       verify_dir_is_empty_or_create(PQgetvalue(res, i, 1));
       }

- Should the 'i > 0' conditional above still be there..?

No. That's a cheat-check that assumes the base directory is always
sent first. Which is not true anymore - with this patch we always send
it *last* so we can include the WAL in it.

So, that's my review from just reading the source code and the thread..
Hope it's useful, sorry it's not more. :/

Thanks - it certainly is!

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#11)
Re: Include WAL in base backup

Magnus Hagander <magnus@hagander.net> writes:

- Why not initialize logid and logseg like so?:

� � � �int logid = startptr.xlogid;
� � � �int logseg = startptr.xrecoff / XLogSegSize;

�Then use those in your elog? �Seems cleaner to me.

Hmm. Yes. Agreed.

Marginal complaint here: int is the wrong type, I'm pretty sure.

regards, tom lane

#13Fujii Masao
masao.fujii@gmail.com
In reply to: Tom Lane (#12)
Re: Include WAL in base backup

On Fri, Jan 21, 2011 at 12:28 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Magnus Hagander <magnus@hagander.net> writes:

- Why not initialize logid and logseg like so?:

       int logid = startptr.xlogid;
       int logseg = startptr.xrecoff / XLogSegSize;

 Then use those in your elog?  Seems cleaner to me.

Hmm. Yes. Agreed.

Marginal complaint here: int is the wrong type, I'm pretty sure.

And, we should use XLByteToPrevSeg here instead of just =, I think.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#14Magnus Hagander
magnus@hagander.net
In reply to: Fujii Masao (#13)
Re: Include WAL in base backup

On Mon, Jan 24, 2011 at 08:45, Fujii Masao <masao.fujii@gmail.com> wrote:

On Fri, Jan 21, 2011 at 12:28 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Magnus Hagander <magnus@hagander.net> writes:

- Why not initialize logid and logseg like so?:

       int logid = startptr.xlogid;
       int logseg = startptr.xrecoff / XLogSegSize;

 Then use those in your elog?  Seems cleaner to me.

Hmm. Yes. Agreed.

Marginal complaint here: int is the wrong type, I'm pretty sure.

And, we should use XLByteToPrevSeg here instead of just =, I think.

Not XLByteToSeg?

(I admit I missed the existance of both those macros, though)

I plan to post a rebased version of this patch today or tomorrow, btw,
that should work against all the changes that have been applied to
git.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#15Fujii Masao
masao.fujii@gmail.com
In reply to: Magnus Hagander (#14)
Re: Include WAL in base backup

On Mon, Jan 24, 2011 at 4:47 PM, Magnus Hagander <magnus@hagander.net> wrote:

On Mon, Jan 24, 2011 at 08:45, Fujii Masao <masao.fujii@gmail.com> wrote:

On Fri, Jan 21, 2011 at 12:28 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Magnus Hagander <magnus@hagander.net> writes:

- Why not initialize logid and logseg like so?:

       int logid = startptr.xlogid;
       int logseg = startptr.xrecoff / XLogSegSize;

 Then use those in your elog?  Seems cleaner to me.

Hmm. Yes. Agreed.

Marginal complaint here: int is the wrong type, I'm pretty sure.

And, we should use XLByteToPrevSeg here instead of just =, I think.

Not XLByteToSeg?

Checking... yeah, you are right. We should use XLByteToSeg since
the REDO starting WAL record doesn't exist in the previous WAL
segment when the REDO starting location is a boundary byte.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#16Magnus Hagander
magnus@hagander.net
In reply to: Fujii Masao (#15)
Re: Include WAL in base backup

On Mon, Jan 24, 2011 at 09:03, Fujii Masao <masao.fujii@gmail.com> wrote:

On Mon, Jan 24, 2011 at 4:47 PM, Magnus Hagander <magnus@hagander.net> wrote:

On Mon, Jan 24, 2011 at 08:45, Fujii Masao <masao.fujii@gmail.com> wrote:

On Fri, Jan 21, 2011 at 12:28 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Magnus Hagander <magnus@hagander.net> writes:

- Why not initialize logid and logseg like so?:

       int logid = startptr.xlogid;
       int logseg = startptr.xrecoff / XLogSegSize;

 Then use those in your elog?  Seems cleaner to me.

Hmm. Yes. Agreed.

Marginal complaint here: int is the wrong type, I'm pretty sure.

And, we should use XLByteToPrevSeg here instead of just =, I think.

Not XLByteToSeg?

Checking... yeah, you are right. We should use XLByteToSeg since
the REDO starting WAL record doesn't exist in the previous WAL
segment when the REDO starting location is a boundary byte.

Here's an updated version of the patch:

* rebased on the current git master (including changing the switch
from -w to -x since -w was used now)
* includes some documentation
* use XLogByteToSeg and uint32 as mentioned here
* refactored to remove SendBackupDirectory(), removing the ugly closetar boolean

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Attachments:

basebackup_wal.patchtext/x-patch; charset=US-ASCII; name=basebackup_wal.patchDownload+148-35
#17Fujii Masao
masao.fujii@gmail.com
In reply to: Magnus Hagander (#16)
Re: Include WAL in base backup

On Tue, Jan 25, 2011 at 12:33 AM, Magnus Hagander <magnus@hagander.net> wrote:

Here's an updated version of the patch:

* rebased on the current git master (including changing the switch
from -w to -x since -w was used now)
* includes some documentation
* use XLogByteToSeg and uint32 as mentioned here
* refactored to remove SendBackupDirectory(), removing the ugly closetar boolean

I reviewed the patch. Here are comments.

+ {"xlog", no_argument, NULL, 'w'},

Typo: s/w/x

/*
* BASE_BACKUP [LABEL <label>] [PROGRESS] [FAST]
*/

In repl_gram.y, the above comment needs to be updated.

Both this patch and the multiple-backup one removes SendBackupDirectory
in the almost same way. So, how about committing that common part first?

+ XLByteToSeg(endptr, endlogid, endlogseg);

XLByteToPrevSeg should be used for the backup end location. Because
when it's a boundary byte, all the required WAL records exist in the
previous WAL segment. This is why pg_stop_backup uses XLByteToPrevSeg
for the backup end location.

+		elog(DEBUG1, "Going to send wal from %i.%i to %i.%i",
+			 logid, logseg,
+			 endlogid, endlogseg);

%u should be used instead of %i. Or how about logging the filename?

+ char xlogname[64];

How about using MAXFNAMELEN instead of 64?

+			XLogFileName(xlogname, ThisTimeLineID, logid, logseg);
+			sprintf(fn, "./pg_xlog/%s", xlogname);

How about using XLogFilePath? instead?

+			if (logid > endptr.xlogid ||
+				(logid == endptr.xlogid && logseg >= endptr.xrecoff / XLogSegSize))

Why don't you use endlogseg?

The estimated size of $PGDATA doesn't include WAL files, but the
actual one does.
This inconsistency messes up the progress report as follows:

33708/17323 kB (194%) 1/1 tablespaces

So, what about sparating the progress report into two parts: one for $PGDATA and
tablespaces, another for WAL files?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#18Magnus Hagander
magnus@hagander.net
In reply to: Fujii Masao (#17)
Re: Include WAL in base backup

On Tue, Jan 25, 2011 at 10:56, Fujii Masao <masao.fujii@gmail.com> wrote:

On Tue, Jan 25, 2011 at 12:33 AM, Magnus Hagander <magnus@hagander.net> wrote:

Here's an updated version of the patch:

* rebased on the current git master (including changing the switch
from -w to -x since -w was used now)
* includes some documentation
* use XLogByteToSeg and uint32 as mentioned here
* refactored to remove SendBackupDirectory(), removing the ugly closetar boolean

I reviewed the patch. Here are comments.

+               {"xlog", no_argument, NULL, 'w'},

Typo: s/w/x

Ah, oops. thanks.

/*
 * BASE_BACKUP [LABEL <label>] [PROGRESS] [FAST]
 */

In repl_gram.y, the above comment needs to be updated.

Same here - oops, thanks. It was also missing the quotes around <label>.

Both this patch and the multiple-backup one removes SendBackupDirectory
in the almost same way. So, how about committing that common part first?

I think they're small enough that we'll just commit it along with
whichever comes first and then have the other one merge with it.

+               XLByteToSeg(endptr, endlogid, endlogseg);

XLByteToPrevSeg should be used for the backup end location. Because
when it's a boundary byte, all the required WAL records exist in the
previous WAL segment. This is why pg_stop_backup uses XLByteToPrevSeg
for the backup end location.

Well, it's just for debugging output, really, but see below.

+               elog(DEBUG1, "Going to send wal from %i.%i to %i.%i",
+                        logid, logseg,
+                        endlogid, endlogseg);

%u should be used instead of %i. Or how about logging the filename?

I actually plan to remove that DEBUG output completely (sorry, forgot
to mention that), I'm not sure it's useful to have it around once we
apply. Or do you think it would be useful to keep?

+                       char    xlogname[64];

How about using MAXFNAMELEN instead of 64?

Agreed.

+                       XLogFileName(xlogname, ThisTimeLineID, logid, logseg);
+                       sprintf(fn, "./pg_xlog/%s", xlogname);

How about using XLogFilePath? instead?

Can't do that, because we need the "./" part when we call sendFile() -
it's structured around having that one, since we get it from the file
name looping.

+                       if (logid > endptr.xlogid ||
+                               (logid == endptr.xlogid && logseg >= endptr.xrecoff / XLogSegSize))

Why don't you use endlogseg?

Honestly? Because I thought I added endlogseg just for the debugging
output, didn't realize I had it down here.

But if I do, then I should not use the XLByteToPrevSeg() per your
suggestion above, right? Keep it the way it is.

The estimated size of $PGDATA doesn't include WAL files, but the
actual one does.
This inconsistency messes up the progress report as follows:

   33708/17323 kB (194%) 1/1 tablespaces

Yeah, that is definitely a potential problem. I think we're just going
to have to document it - and in a big database, it's not going to be
quite as bad...

So, what about sparating the progress report into two parts: one for $PGDATA and
tablespaces, another for WAL files?

We can't really do that. We need to send the progress report before we
begin the tar file, and we don't know how many xlog segments we will
have at that time. And we don't want to send pg_xlog as a separate tar
stream, because if we do we won't be able to get the single-tar-output
in the simple case - thus no piping etc. I actually had the xlog data
as it's own tar stream in the beginning, but Heikki convinced me of
the advantage of keeping it in the main one...

What we could do, I guess, is to code pg_basebackup to detect when it
starts receiving xlog files and then stop increasing the percentage at
that point, instead just doing a counter?

(the discussed changse above have been applied and pushed to github)

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#19Fujii Masao
masao.fujii@gmail.com
In reply to: Magnus Hagander (#18)
Re: Include WAL in base backup

On Tue, Jan 25, 2011 at 8:21 PM, Magnus Hagander <magnus@hagander.net> wrote:

+               elog(DEBUG1, "Going to send wal from %i.%i to %i.%i",
+                        logid, logseg,
+                        endlogid, endlogseg);

%u should be used instead of %i. Or how about logging the filename?

I actually plan to remove that DEBUG output completely (sorry, forgot
to mention that), I'm not sure it's useful to have it around once we
apply. Or do you think it would be useful to keep?

Nope. I'm OK to remove that.

+                       XLogFileName(xlogname, ThisTimeLineID, logid, logseg);
+                       sprintf(fn, "./pg_xlog/%s", xlogname);

How about using XLogFilePath? instead?

Can't do that, because we need the "./" part when we call sendFile() -
it's structured around having that one, since we get it from the file
name looping.

Understood.

+                       if (logid > endptr.xlogid ||
+                               (logid == endptr.xlogid && logseg >= endptr.xrecoff / XLogSegSize))

Why don't you use endlogseg?

Honestly? Because I thought I added endlogseg just for the debugging
output, didn't realize I had it down here.

But if I do, then I should not use the XLByteToPrevSeg() per your
suggestion above, right? Keep it the way it is.

Yes. If we check "logseg >= endlogseg", we should use XLByteToSeg.

So, what about sparating the progress report into two parts: one for $PGDATA and
tablespaces, another for WAL files?

We can't really do that. We need to send the progress report before we
begin the tar file, and we don't know how many xlog segments we will
have at that time. And we don't want to send pg_xlog as a separate tar
stream, because if we do we won't be able to get the single-tar-output
in the simple case - thus no piping etc. I actually had the xlog data
as it's own tar stream in the beginning, but Heikki convinced me of
the advantage of keeping it in the main one...

What we could do, I guess, is to code pg_basebackup to detect when it
starts receiving xlog files and then stop increasing the percentage at
that point, instead just doing a counter?

Umm.. not progressing the report during receiving WAL files seems
also odd. But I don't have another good idea... For now, I'm OK if the
behavior of the progress report is well documented.

(the discussed changse above have been applied and pushed to github)

Thanks! I'll test and review that.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#20Fujii Masao
masao.fujii@gmail.com
In reply to: Fujii Masao (#19)
Re: Include WAL in base backup

On Tue, Jan 25, 2011 at 10:28 PM, Fujii Masao <masao.fujii@gmail.com> wrote:

(the discussed changse above have been applied and pushed to github)

Thanks! I'll test and review that.

WAL file might get recycled or removed while walsender is reading it.
So the WAL file which pg_basebackup seemingly received successfully
might be actually invalid. Shouldn't we need to check that what we read
is valid as XLogRead does?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#21Magnus Hagander
magnus@hagander.net
In reply to: Fujii Masao (#20)
#22Magnus Hagander
magnus@hagander.net
In reply to: Magnus Hagander (#21)
#23Fujii Masao
masao.fujii@gmail.com
In reply to: Magnus Hagander (#22)
#24Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Fujii Masao (#23)
#25Fujii Masao
masao.fujii@gmail.com
In reply to: Heikki Linnakangas (#24)
#26Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Fujii Masao (#25)
#27Magnus Hagander
magnus@hagander.net
In reply to: Fujii Masao (#23)
#28Fujii Masao
masao.fujii@gmail.com
In reply to: Magnus Hagander (#27)
#29Robert Haas
robertmhaas@gmail.com
In reply to: Fujii Masao (#28)