pg_receivexlog and feedback message

Started by Magnus Hagander over 13 years ago, 15 messages
#1Magnus Hagander
magnus@hagander.net

Right now, pg_receivexlog sets:
replymsg->write = InvalidXLogRecPtr;
replymsg->flush = InvalidXLogRecPtr;
replymsg->apply = InvalidXLogRecPtr;

when it sends its status updates.

I'm thinking it should set replymsg->write = blockpos instead.

Why? That way you can see in pg_stat_replication what has actually
been received by pg_receivexlog - not just what we last sent. This can
be useful in combination with an archive_command that can block WAL
recycling until it has been saved to the standby. And it would be
useful as a general monitoring thing as well.

I think the original reason was that it shouldn't interfere with
synchronous replication - but it does take away a fairly useful
use case...

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#2Fujii Masao
masao.fujii@gmail.com
In reply to: Magnus Hagander (#1)
Re: pg_receivexlog and feedback message

On Tue, Jun 5, 2012 at 9:53 PM, Magnus Hagander <magnus@hagander.net> wrote:

Right now, pg_receivexlog sets:
                       replymsg->write = InvalidXLogRecPtr;
                       replymsg->flush = InvalidXLogRecPtr;
                       replymsg->apply = InvalidXLogRecPtr;

when it sends its status updates.

I'm thinking it should set replymsg->write = blockpos instead.

Why? That way you can see in pg_stat_replication what has actually
been received by pg_receivexlog - not just what we last sent. This can
be useful in combination with an archive_command that can block WAL
recycling until it has been saved to the standby. And it would be
useful as a general monitoring thing as well.

I think the original reason was that it shouldn't interfere with
synchronous replication - but it does take away a fairly useful
use case...

I think that not only replymsg->write but also ->flush should be set to
blockpos in pg_receivexlog. That would allow pg_receivexlog to behave
as a synchronous standby, so we can write WAL to both local and remote
synchronously. I believe there are some use cases for synchronous
pg_receivexlog.

OTOH, in pg_basebackup both replymsg->write and ->flush should stay set
to InvalidXLogRecPtr, to prevent pg_basebackup from behaving as a
synchronous standby.

Regards,

--
Fujii Masao

#3Magnus Hagander
magnus@hagander.net
In reply to: Fujii Masao (#2)
Re: pg_receivexlog and feedback message

On Tue, Jun 5, 2012 at 4:42 PM, Fujii Masao <masao.fujii@gmail.com> wrote:

I think that not only replymsg->write but also ->flush should be set to
blockpos in pg_receivexlog. That would allow pg_receivexlog to behave
as a synchronous standby, so we can write WAL to both local and remote
synchronously. I believe there are some use cases for synchronous
pg_receivexlog.

pg_receivexlog doesn't currently fsync() after every write. It only
fsync():s complete files. So we'd need to set ->flush only at the end
of a segment, right?

OTOH, in pg_basebackup both replymsg->write and ->flush should stay set
to InvalidXLogRecPtr, to prevent pg_basebackup from behaving as a
synchronous standby.

Oh, good point. So yeah, we'd need to make it a parameter to the function.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#4Fujii Masao
masao.fujii@gmail.com
In reply to: Magnus Hagander (#3)
Re: pg_receivexlog and feedback message

On Tue, Jun 5, 2012 at 11:44 PM, Magnus Hagander <magnus@hagander.net> wrote:

pg_receivexlog doesn't currently fsync() after every write. It only
fsync():s complete files. So we'd need to set ->flush only at the end
of a segment, right?

Yes.

Currently the status update is sent once per status interval. In sync
replication, a transaction has to wait for a while even after pg_receivexlog
has written or flushed the WAL data.

So should we add a new option which specifies whether pg_receivexlog
sends the status packet back as soon as it writes or flushes the WAL
data, like the walreceiver does?
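
The difference between the two behaviours can be sketched as a single
decision function. This is only an illustration: should_send_status() and the
immediate flag are hypothetical names, not existing pg_receivexlog code or
options, and the time values are in arbitrary units.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Decide whether a status packet should be sent now.
 *
 * interval   - status interval; 0 disables periodic updates
 * wrote_data - true if WAL data was written since the last check
 * immediate  - hypothetical option: reply as soon as data is written
 */
static bool
should_send_status(int64_t now, int64_t last_status, int64_t interval,
				   bool wrote_data, bool immediate)
{
	assert(interval >= 0);

	/* New behaviour: acknowledge right after writing, like a walreceiver. */
	if (immediate && wrote_data)
		return true;

	/* Current behaviour: only when the status interval has elapsed. */
	return interval > 0 && now - last_status >= interval;
}
```

Without the immediate flag, a synchronous commit could wait up to one full
status interval for its acknowledgement even though the data has already
been written.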

Regards,

--
Fujii Masao

#5Robert Haas
robertmhaas@gmail.com
In reply to: Magnus Hagander (#3)
Re: pg_receivexlog and feedback message

On Tue, Jun 5, 2012 at 10:44 AM, Magnus Hagander <magnus@hagander.net> wrote:

pg_receivexlog doesn't currently fsync() after every write. It only
fsync():s complete files. So we'd need to set ->flush only at the end
of a segment, right?

If you want to be able to use it as a synchronous standby, that's not
going to work very well. You could end up with pg_receivexlog waiting
for the end of the segment before it flushes; meanwhile, all the
clients are sitting there waiting for the flush to happen before they
do anything that could generate more WAL to fill the segment.

Unless you have a solution to that problem, I'd recommend setting
write (which should work with the new remote_write mode for sync rep)
but not setting flush.
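
The stall can be made concrete with a small sketch. It assumes flat 64-bit
WAL positions and a 16 MB segment size for simplicity (the code under
discussion uses a two-field XLogRecPtr); both function names are
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

#define SEG_SIZE ((uint64_t) 16 * 1024 * 1024)	/* assumed segment size */

/* Flush position if it only advances when a whole segment is complete. */
static uint64_t
flush_at_segment_end(uint64_t received)
{
	return received - (received % SEG_SIZE);
}

/* A waiting commit is released once the reported position reaches it. */
static bool
commit_released(uint64_t commit_lsn, uint64_t reported)
{
	return reported >= commit_lsn;
}
```

A commit record that ends in the middle of a segment is covered by the write
position as soon as it is received, but not by a flush position that only
moves at segment boundaries. If every client is waiting on that flush,
nothing generates the WAL needed to complete the segment.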

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#6Magnus Hagander
magnus@hagander.net
In reply to: Fujii Masao (#4)
Re: pg_receivexlog and feedback message

On Wed, Jun 6, 2012 at 8:26 PM, Fujii Masao <masao.fujii@gmail.com> wrote:

Currently the status update is sent once per status interval. In sync
replication, a transaction has to wait for a while even after pg_receivexlog
has written or flushed the WAL data.

So should we add a new option which specifies whether pg_receivexlog
sends the status packet back as soon as it writes or flushes the WAL
data, like the walreceiver does?

That might be useful, but I think that's 9.3 material at this point.

But I think we can get the "set the write location" in as a bugfix.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#7Fujii Masao
masao.fujii@gmail.com
In reply to: Magnus Hagander (#6)
Re: pg_receivexlog and feedback message

On Thu, Jun 7, 2012 at 5:05 AM, Magnus Hagander <magnus@hagander.net> wrote:

That might be useful, but I think that's 9.3 material at this point.

Fair enough. That's a new feature rather than a bugfix.

But I think we can get the "set the write location" in as a bugfix.

Also "set the flush location"? Sending the flush location back seems
helpful when using pg_receivexlog for WAL archiving purposes. By
seeing the flush location we can ensure that a WAL file has been archived
durably (IOW, the WAL file has been flushed in the remote archive area).

Regards,

--
Fujii Masao

#8Magnus Hagander
magnus@hagander.net
In reply to: Fujii Masao (#7)
Re: pg_receivexlog and feedback message

On Thursday, June 7, 2012, Fujii Masao wrote:

Also "set the flush location"? Sending the flush location back seems
helpful when using pg_receivexlog for WAL archiving purposes. By
seeing the flush location we can ensure that a WAL file has been archived
durably (IOW, the WAL file has been flushed in the remote archive area).

You can do that with the write location as well, as long as you round it
off to complete segments, can't you?

In fact that's exactly the use case that got me to realize we were missing
this :-)

//Magnus

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#9Fujii Masao
masao.fujii@gmail.com
In reply to: Magnus Hagander (#8)
Re: pg_receivexlog and feedback message

On Thu, Jun 7, 2012 at 6:25 PM, Magnus Hagander <magnus@hagander.net> wrote:

You can do that with the write location as well, as long as you round it
off to complete segments, can't you?

You mean to prevent pg_receivexlog from sending back the end of a WAL file
as the write location *before* it completes the WAL file? If so, yes. But
why do you want to keep the flush location invalid?

Regards,

--
Fujii Masao

#10Magnus Hagander
magnus@hagander.net
In reply to: Fujii Masao (#9)
Re: pg_receivexlog and feedback message

On Thursday, June 7, 2012, Fujii Masao wrote:

You mean to prevent pg_receivexlog from sending back the end of a WAL file
as the write location *before* it completes the WAL file? If so, yes. But
why do you want to keep the flush location invalid?

No. pg_receivexlog sends back the correct write location. Whoever does the
check (through pg_stat_replication) rounds down, so it only counts it once
pg_receivexlog has acknowledged receiving the whole file.
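
The rounding described here happens on the checking side, not in
pg_receivexlog. A minimal sketch, assuming flat 64-bit WAL positions, a
16 MB segment size, and zero-based segment numbers; segment_fully_received()
is a hypothetical name for whatever the monitoring script or archive_command
would implement:

```c
#include <stdbool.h>
#include <stdint.h>

#define WAL_SEG_SIZE ((uint64_t) 16 * 1024 * 1024)	/* assumed segment size */

/*
 * Given the write location reported for pg_receivexlog, decide whether
 * segment number segno has been received in full. Integer division
 * rounds the location down to the number of completed segments.
 */
static bool
segment_fully_received(uint64_t write_location, uint64_t segno)
{
	uint64_t	completed_segments = write_location / WAL_SEG_SIZE;

	return segno < completed_segments;
}
```

With a write location five bytes into the fourth segment, segments 0 to 2
count as received and segment 3 does not.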

I'm not against doing the flush location as well, I'm just worried about
feature-creep :-) But let's see how big a change that would turn out to
be...

//Magnus

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

#11Magnus Hagander
magnus@hagander.net
In reply to: Magnus Hagander (#10)
1 attachment(s)
Re: pg_receivexlog and feedback message

On Thu, Jun 7, 2012 at 12:40 PM, Magnus Hagander <magnus@hagander.net> wrote:

No. pg_receivexlog sends back the correct write location. Whoever does the
check (through pg_stat_replication) rounds down, so it only counts it once
pg_receivexlog has acknowledged receiving the whole file.

I'm not against doing the flush location as well, I'm just worried about
feature-creep :-) But let's see how big a change that would turn out to
be...

How about this?

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

Attachments:

receivelog_flushpos.patch (application/octet-stream)
diff --git a/src/bin/pg_basebackup/receivelog.c b/src/bin/pg_basebackup/receivelog.c
index a51a40e..cd71721 100644
--- a/src/bin/pg_basebackup/receivelog.c
+++ b/src/bin/pg_basebackup/receivelog.c
@@ -274,6 +274,7 @@ ReceiveXlogStream(PGconn *conn, XLogRecPtr startpos, uint32 timeline, char *sysi
 	int			walfile = -1;
 	int64		last_status = -1;
 	XLogRecPtr	blockpos = InvalidXLogRecPtr;
+	XLogRecPtr	flushedpos = InvalidXLogRecPtr;
 
 	if (sysidentifier != NULL)
 	{
@@ -359,8 +360,8 @@ ReceiveXlogStream(PGconn *conn, XLogRecPtr startpos, uint32 timeline, char *sysi
 			char		replybuf[sizeof(StandbyReplyMessage) + 1];
 			StandbyReplyMessage *replymsg = (StandbyReplyMessage *) (replybuf + 1);
 
-			replymsg->write = InvalidXLogRecPtr;
-			replymsg->flush = InvalidXLogRecPtr;
+			replymsg->write = blockpos;
+			replymsg->flush = flushedpos;
 			replymsg->apply = InvalidXLogRecPtr;
 			replymsg->sendTime = now;
 			replybuf[0] = 'r';
@@ -552,6 +553,19 @@ ReceiveXlogStream(PGconn *conn, XLogRecPtr startpos, uint32 timeline, char *sysi
 				walfile = -1;
 				xlogoff = 0;
 
+				/*
+				 * Set flushed position to the last byte in the previous
+				 * file. Per above we know that xrecoff%XLOG_SEG_SIZE=0
+				 */
+				flushedpos = blockpos;
+				if (flushedpos.xrecoff == 0)
+				{
+					flushedpos.xlogid--;
+					flushedpos.xrecoff = XLogFileSize-1;
+				}
+				else
+					flushedpos.xrecoff--;
+
 				if (stream_stop != NULL)
 				{
 					/*
#12Fujii Masao
masao.fujii@gmail.com
In reply to: Magnus Hagander (#11)
Re: pg_receivexlog and feedback message

On Sun, Jun 10, 2012 at 7:55 PM, Magnus Hagander <magnus@hagander.net> wrote:

How about this?

+				/*
+				 * Set flushed position to the last byte in the previous
+				 * file. Per above we know that xrecoff%XLOG_SEG_SIZE=0
+				 */
+				flushedpos = blockpos;
+				if (flushedpos.xrecoff == 0)
+				{
+					flushedpos.xlogid--;
+					flushedpos.xrecoff = XLogFileSize-1;
+				}
+				else
+					flushedpos.xrecoff--;

flushedpos.xrecoff doesn't need to be decremented by one.
If xrecoff % XLOG_SEG_SIZE = 0, the position should be the last
byte of previous (i.e., flushed) WAL file.

Regards,

--
Fujii Masao

#13Magnus Hagander
magnus@hagander.net
In reply to: Fujii Masao (#12)
Re: pg_receivexlog and feedback message

On Sun, Jun 10, 2012 at 4:02 PM, Fujii Masao <masao.fujii@gmail.com> wrote:

flushedpos.xrecoff doesn't need to be decremented by one.
If xrecoff % XLOG_SEG_SIZE = 0, the position should be the last
byte of previous (i.e., flushed) WAL file.

Hmm. I think I confused myself with "last byte written" vs "current
position". And we're dealing with current position here...

So it should just be flushedpos = blockpos and be done with it, right?
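
A sketch of that simplification, using a simplified stand-in for the
two-field XLogRecPtr and an assumed 16 MB segment size. The helper name is
hypothetical; in the patch this would just be an assignment at the point
where a completed segment is closed.

```c
#include <assert.h>
#include <stdint.h>

#define XLOG_SEG_SIZE ((uint32_t) 16 * 1024 * 1024)	/* assumed default */

/* Simplified stand-in for the server's type; not the real header. */
typedef struct
{
	uint32_t	xlogid;
	uint32_t	xrecoff;
} XLogRecPtr;

/*
 * Hypothetical helper, called right after a completed segment has been
 * fsync()ed and closed. blockpos is the current position, i.e. the next
 * byte to be written, so on a segment boundary it already marks the end
 * of the flushed data and needs no decrement.
 */
static XLogRecPtr
flushed_position_at_segment_end(XLogRecPtr blockpos)
{
	assert(blockpos.xrecoff % XLOG_SEG_SIZE == 0);
	return blockpos;
}
```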

Though before I commit anything with this, we need to decide what to
do wrt syncrep on that, per the other thread.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#14Fujii Masao
masao.fujii@gmail.com
In reply to: Magnus Hagander (#13)
Re: pg_receivexlog and feedback message

On Mon, Jun 11, 2012 at 10:04 PM, Magnus Hagander <magnus@hagander.net> wrote:

Hmm. I think I confused myself with "last byte written" vs "current
position". And we're dealing with current position here...

So it should just be flushedpos = blockpos and be done with it, right?

Yep.

Though before I commit anything with this, we need to decide what to
do wrt syncrep on that, per the other thread.

Yep.

Regards,

--
Fujii Masao

#15Magnus Hagander
magnus@hagander.net
In reply to: Fujii Masao (#14)
Re: pg_receivexlog and feedback message

On Mon, Jun 11, 2012 at 5:24 PM, Fujii Masao <masao.fujii@gmail.com> wrote:

Though before I commit anything with this, we need to decide what to
do wrt syncrep on that, per the other thread.

Yep.

Per the other thread, we decided to postpone this until 9.3. And also
figure out a better set of switches for pg_receivexlog to control it
with.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/