incorrect handling of the timeout in pg_receivexlog

Started by Fujii Masaoabout 14 years ago32 messageshackers
Jump to latest
#1Fujii Masao
masao.fujii@gmail.com

Hi,

When I compiled HEAD with --disable-integer-datetimes and tested
pg_receivexlog, I encountered unexpected replication timeout. As
far as I read the pg_receivexlog code, the cause of this problem is
that pg_receivexlog handles the standby message timeout incorrectly
in --disable-integer-datetimes. The attached patch fixes this problem.
Comments?

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

timeout_handling_v1.patchtext/x-diff; charset=US-ASCII; name=timeout_handling_v1.patchDownload+21-2
#2Fujii Masao
masao.fujii@gmail.com
In reply to: Fujii Masao (#1)
Re: incorrect handling of the timeout in pg_receivexlog

On Tue, Feb 7, 2012 at 2:58 PM, Fujii Masao <masao.fujii@gmail.com> wrote:

Hi,

When I compiled HEAD with --disable-integer-datetimes and tested
pg_receivexlog, I encountered unexpected replication timeout. As
far as I read the pg_receivexlog code, the cause of this problem is
that pg_receivexlog handles the standby message timeout incorrectly
in --disable-integer-datetimes. The attached patch fixes this problem.
Comments?

receivelog.c
-------
timeout.tv_sec = last_status + standby_message_timeout - now - 1;
if (timeout.tv_sec <= 0)
-------

Umm.. the above code also handles the timestamp incorrectly. ISTM that the
root cause of these problems is that receivelog.c uses TimestampTz. What
about changing receivelog.c so that it uses time_t instead of TimestampTz?
Which would make the code simpler, I think.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#3Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Fujii Masao (#2)
Re: incorrect handling of the timeout in pg_receivexlog

On 07.02.2012 09:03, Fujii Masao wrote:

On Tue, Feb 7, 2012 at 2:58 PM, Fujii Masao<masao.fujii@gmail.com> wrote:

When I compiled HEAD with --disable-integer-datetimes and tested
pg_receivexlog, I encountered unexpected replication timeout. As
far as I read the pg_receivexlog code, the cause of this problem is
that pg_receivexlog handles the standby message timeout incorrectly
in --disable-integer-datetimes. The attached patch fixes this problem.
Comments?

receivelog.c
-------
timeout.tv_sec = last_status + standby_message_timeout - now - 1;
if (timeout.tv_sec<= 0)
-------

Umm.. the above code also handles the timestamp incorrectly. ISTM that the
root cause of these problems is that receivelog.c uses TimestampTz.

Yep. While localGetCurrentTimestamp() returns a TimestampTz and handles
float timestamps correctly, the caller just assigns the result to a
int64 variable, assuming --enable-integer-datetimes.

What about changing receivelog.c so that it uses time_t instead of
TimestampTz? Which would make the code simpler, I think.

Hmm, that would reduce granularity to seconds. The --statusint option is
given in seconds, but it would be good to have more precision in the
calculations to avoid rounding errors.

But actually, if the purpose of the --statusint option is to avoid
disconnection because of exceeding the server's replication_timeout, one
second granularity just isn't enough to be begin with.
replication_timeout is given in milliseconds, so if you set
replication_timeout=900ms in the server, there is no way to make
pg_basebackup/pg_receivexlog to reply in time.

So, --statusint needs to be in milliseconds. And while we're at it, how
difficult would be to ask the server for the current value of
replication_timeout, and set --statusint automatically based on that? Or
perhaps mark replication_timeout as GUC_REPORT. It is rather fiddly that
depending on a server setting, you need to pass an option in the client
or it will just silently fail with no indication of what the problem is.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#4Magnus Hagander
magnus@hagander.net
In reply to: Heikki Linnakangas (#3)
Re: incorrect handling of the timeout in pg_receivexlog

On Tue, Feb 7, 2012 at 10:31, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

On 07.02.2012 09:03, Fujii Masao wrote:

On Tue, Feb 7, 2012 at 2:58 PM, Fujii Masao<masao.fujii@gmail.com>  wrote:

When I compiled HEAD with --disable-integer-datetimes and tested

pg_receivexlog, I encountered unexpected replication timeout. As
far as I read the pg_receivexlog code, the cause of this problem is
that pg_receivexlog handles the standby message timeout incorrectly
in --disable-integer-datetimes. The attached patch fixes this problem.
Comments?

receivelog.c
-------
       timeout.tv_sec = last_status + standby_message_timeout - now - 1;
       if (timeout.tv_sec<= 0)
-------

Umm.. the above code also handles the timestamp incorrectly. ISTM that the
root cause of these problems is that receivelog.c uses TimestampTz.

Yep. While localGetCurrentTimestamp() returns a TimestampTz and handles
float timestamps correctly, the caller just assigns the result to a int64
variable, assuming --enable-integer-datetimes.

Ugh. Indeed.

What about changing receivelog.c so that it uses time_t instead of
TimestampTz? Which would make the code simpler, I think.

Hmm, that would reduce granularity to seconds. The --statusint option is
given in seconds, but it would be good to have more precision in the
calculations to avoid rounding errors.

But actually, if the purpose of the --statusint option is to avoid
disconnection because of exceeding the server's replication_timeout, one
second granularity just isn't enough to be begin with. replication_timeout
is given in milliseconds, so if you set replication_timeout=900ms in the
server, there is no way to make pg_basebackup/pg_receivexlog to reply in
time.

So, --statusint needs to be in milliseconds. And while we're at it, how
difficult would be to ask the server for the current value of
replication_timeout, and set --statusint automatically based on that? Or
perhaps mark replication_timeout as GUC_REPORT. It is rather fiddly that
depending on a server setting, you need to pass an option in the client or
it will just silently fail with no indication of what the problem is.

We can't really ask for it easily, since we're on a replication
connection. Unless we add that to the walsender grammar, but that
would make it impossible to use the receivexlog stuff against a 9.1
server (which I think still works, though I haven't tested it in a
while).

Do we have a facility to make it a GUC_REPORT but only for walsender
connections? It seems like a very unnecessary thing to have it sent
out over every single connection, since it would only be useful in a
very small subset of them.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#5Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Magnus Hagander (#4)
Re: incorrect handling of the timeout in pg_receivexlog

On 07.02.2012 11:35, Magnus Hagander wrote:

On Tue, Feb 7, 2012 at 10:31, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

So, --statusint needs to be in milliseconds. And while we're at it, how
difficult would be to ask the server for the current value of
replication_timeout, and set --statusint automatically based on that? Or
perhaps mark replication_timeout as GUC_REPORT. It is rather fiddly that
depending on a server setting, you need to pass an option in the client or
it will just silently fail with no indication of what the problem is.

We can't really ask for it easily, since we're on a replication
connection. Unless we add that to the walsender grammar, but that
would make it impossible to use the receivexlog stuff against a 9.1
server (which I think still works, though I haven't tested it in a
while).

You could put a version-check there, and only send the command if
connected to a 9.2 server.

Do we have a facility to make it a GUC_REPORT but only for walsender
connections?

There's no such facility at the moment.

It seems like a very unnecessary thing to have it sent out over every
single connection, since it would only be useful in a very small
subset of them.

True, and conversely, many of the current GUC_REPORT settings don't
apply to replication clients at all. Like standard_conforming_strings
and DateStyle.

I think we need another flag for settings that should be sent to
replication clients. GUC_REPORT_REPLICATION? Some settings would have
both GUC_REPORT and GUC_REPORT_REPLICATION, while others would have only
one of them.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#6Magnus Hagander
magnus@hagander.net
In reply to: Heikki Linnakangas (#5)
Re: incorrect handling of the timeout in pg_receivexlog

On Tue, Feb 7, 2012 at 10:55, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

On 07.02.2012 11:35, Magnus Hagander wrote:

On Tue, Feb 7, 2012 at 10:31, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com>  wrote:

So, --statusint needs to be in milliseconds. And while we're at it, how

difficult would be to ask the server for the current value of
replication_timeout, and set --statusint automatically based on that? Or
perhaps mark replication_timeout as GUC_REPORT. It is rather fiddly that
depending on a server setting, you need to pass an option in the client
or
it will just silently fail with no indication of what the problem is.

We can't really ask for it easily, since we're on a replication
connection. Unless we add that to the walsender grammar, but that
would make it impossible to use the receivexlog stuff against a 9.1
server (which I think still works, though I haven't tested it in a
while).

You could put a version-check there, and only send the command if connected
to a 9.2 server.

True..

Do we have a facility to make it a GUC_REPORT but only for walsender
connections?

There's no such facility at the moment.

It seems like a very unnecessary thing to have it sent out over every
single connection, since it would only be useful in a very small
subset of them.

True, and conversely, many of the current GUC_REPORT settings don't apply to
replication clients at all. Like standard_conforming_strings and DateStyle.

Right. But it's less important there, since the replication
connections will in any reasonable configuration be only a tiny part
of the connections.

I think we need another flag for settings that should be sent to replication
clients. GUC_REPORT_REPLICATION? Some settings would have both GUC_REPORT
and GUC_REPORT_REPLICATION, while others would have only one of them.

Yup, seems like the cleanest solution.

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#7Fujii Masao
masao.fujii@gmail.com
In reply to: Magnus Hagander (#6)
Re: incorrect handling of the timeout in pg_receivexlog

On Tue, Feb 7, 2012 at 7:06 PM, Magnus Hagander <magnus@hagander.net> wrote:

On Tue, Feb 7, 2012 at 10:55, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

On 07.02.2012 11:35, Magnus Hagander wrote:

On Tue, Feb 7, 2012 at 10:31, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com>  wrote:

So, --statusint needs to be in milliseconds. And while we're at it, how

Attached patch does that and fixes the problem caused under
--disable-integer-datetimes.

difficult would be to ask the server for the current value of
replication_timeout, and set --statusint automatically based on that? Or
perhaps mark replication_timeout as GUC_REPORT. It is rather fiddly that
depending on a server setting, you need to pass an option in the client
or
it will just silently fail with no indication of what the problem is.

We can't really ask for it easily, since we're on a replication
connection. Unless we add that to the walsender grammar, but that
would make it impossible to use the receivexlog stuff against a 9.1
server (which I think still works, though I haven't tested it in a
while).

You could put a version-check there, and only send the command if connected
to a 9.2 server.

True..

Do we have a facility to make it a GUC_REPORT but only for walsender
connections?

There's no such facility at the moment.

It seems like a very unnecessary thing to have it sent out over every
single connection, since it would only be useful in a very small
subset of them.

True, and conversely, many of the current GUC_REPORT settings don't apply to
replication clients at all. Like standard_conforming_strings and DateStyle.

Right. But it's less important there, since the replication
connections will in any reasonable configuration be only a tiny part
of the connections.

I think we need another flag for settings that should be sent to replication
clients. GUC_REPORT_REPLICATION? Some settings would have both GUC_REPORT
and GUC_REPORT_REPLICATION, while others would have only one of them.

Yup, seems like the cleanest solution.

Agreed. But when replication_timeout is changed by SIGHUP in the server,
there would be a delay before pg_receivexlog receives the parameter
change packet and changes the standby message interval based on the
new value of replication_timeout. Which might cause unexpected replication
timeout. So the user-settable interval (i.e., --statusint) is still useful for
people who want to avoid such fragility, even if we will support the auto-
tuning of the interval.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

Attachments:

timeout_handling_v2.patchtext/x-diff; charset=US-ASCII; name=timeout_handling_v2.patchDownload+83-28
#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Heikki Linnakangas (#3)
Re: incorrect handling of the timeout in pg_receivexlog

Heikki Linnakangas <heikki.linnakangas@enterprisedb.com> writes:

On 07.02.2012 09:03, Fujii Masao wrote:

What about changing receivelog.c so that it uses time_t instead of
TimestampTz? Which would make the code simpler, I think.

Hmm, that would reduce granularity to seconds.

It also creates portability issues that I'd just as soon not deal with,
ie, time_t is not the same width on all platforms. (The integer vs
float TimestampTz issue is a kind of portability problem, but we've
already bought into the assumption that sender and receiver must be
built with the same choice, no?)

regards, tom lane

#9Heikki Linnakangas
heikki.linnakangas@enterprisedb.com
In reply to: Tom Lane (#8)
Re: incorrect handling of the timeout in pg_receivexlog

On 07.02.2012 16:55, Tom Lane wrote:

(The integer vs float TimestampTz issue is a kind of portability
problem, but we've already bought into the assumption that sender and
receiver must be built with the same choice, no?)

Hmm, true. In hindsight, I think that was a bad choice, but it's a bit
late to change that. pg_basebackup doesn't otherwise care about the
integer/float timestamps, but it does send a timestamp back to the
server. You won't be able to actually start up the database if the
config options don't match, but I think it would be good if
pg_basebackup still worked across platforms and versions. For example,
you might have a central backup server that calls pg_basebackup on
several database servers, running on different platforms.

In 9.0, the only field in the protocol that depends on timestamp format
is WalDataMessageHeader->sendTime. That goes from server to client, and
pg_basebackup/pg_receivexlog don't care about that. In 9.1 we introduced
StandbyReplyMessage->sendTime, which is sent from client to server, but
looking at the code it looks like the server doesn't use it for
anything. In 9.2, we added WalSndrMessage->sendTime, which is used by a
standby server to calculate how far behind the standby is.

I'm tempted to just change all of those TimestampTz fields to something
that's independent of integer/float timestamp setting, in 9.2. At a
quick glance, it seems that it wouldn't break anything.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

#10Magnus Hagander
magnus@hagander.net
In reply to: Heikki Linnakangas (#9)
Re: incorrect handling of the timeout in pg_receivexlog

On Tue, Feb 7, 2012 at 17:29, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

On 07.02.2012 16:55, Tom Lane wrote:

(The integer vs float TimestampTz issue is a kind of portability
problem, but we've already bought into the assumption that sender and
receiver must be built with the same choice, no?)

Hmm, true. In hindsight, I think that was a bad choice, but it's a bit late
to change that. pg_basebackup doesn't otherwise care about the integer/float
timestamps, but it does send a timestamp back to the server. You won't be
able to actually start up the database if the config options don't match,
but I think it would be good if pg_basebackup still worked across platforms
and versions. For example, you might have a central backup server that calls
pg_basebackup on several database servers, running on different platforms.

In 9.0, the only field in the protocol that depends on timestamp format is
WalDataMessageHeader->sendTime. That goes from server to client, and
pg_basebackup/pg_receivexlog don't care about that. In 9.1 we introduced
StandbyReplyMessage->sendTime, which is sent from client to server, but
looking at the code it looks like the server doesn't use it for anything. In
9.2, we added WalSndrMessage->sendTime, which is used by a standby server to
calculate how far behind the standby is.

I'm tempted to just change all of those TimestampTz fields to something
that's independent of integer/float timestamp setting, in 9.2. At a quick
glance, it seems that it wouldn't break anything.

In general, I think that would work. Since we can't replicate across
versions anyway.

Will it break using pg_basebackup 9.2 on a 9.1 server, though? that
would also be very useful in the scenario of the central server...

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#11Fujii Masao
masao.fujii@gmail.com
In reply to: Magnus Hagander (#10)
Re: incorrect handling of the timeout in pg_receivexlog

On Wed, Feb 8, 2012 at 1:33 AM, Magnus Hagander <magnus@hagander.net> wrote:

On Tue, Feb 7, 2012 at 17:29, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:

On 07.02.2012 16:55, Tom Lane wrote:

(The integer vs float TimestampTz issue is a kind of portability
problem, but we've already bought into the assumption that sender and
receiver must be built with the same choice, no?)

Hmm, true. In hindsight, I think that was a bad choice, but it's a bit late
to change that. pg_basebackup doesn't otherwise care about the integer/float
timestamps, but it does send a timestamp back to the server. You won't be
able to actually start up the database if the config options don't match,
but I think it would be good if pg_basebackup still worked across platforms
and versions. For example, you might have a central backup server that calls
pg_basebackup on several database servers, running on different platforms.

In 9.0, the only field in the protocol that depends on timestamp format is
WalDataMessageHeader->sendTime. That goes from server to client, and
pg_basebackup/pg_receivexlog don't care about that. In 9.1 we introduced
StandbyReplyMessage->sendTime, which is sent from client to server, but
looking at the code it looks like the server doesn't use it for anything. In
9.2, we added WalSndrMessage->sendTime, which is used by a standby server to
calculate how far behind the standby is.

I'm tempted to just change all of those TimestampTz fields to something
that's independent of integer/float timestamp setting, in 9.2. At a quick
glance, it seems that it wouldn't break anything.

Agreed. If we'll have not pushed such change into 9.2, we would break
something later.

In general, I think that would work. Since we can't replicate across
versions anyway.

Will it break using pg_basebackup 9.2 on a 9.1 server, though? that
would also be very useful in the scenario of the central server...

No unless I'm missing something. Because pg_basebackup doesn't use
any message which is defined in walprotocol.h if "-x stream" option is
not specified.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#12Fujii Masao
masao.fujii@gmail.com
In reply to: Fujii Masao (#11)
Re: incorrect handling of the timeout in pg_receivexlog

On Tue, Feb 28, 2012 at 6:08 PM, Fujii Masao <masao.fujii@gmail.com> wrote:

On Wed, Feb 8, 2012 at 1:33 AM, Magnus Hagander <magnus@hagander.net> wrote:

Will it break using pg_basebackup 9.2 on a 9.1 server, though? that
would also be very useful in the scenario of the central server...

No unless I'm missing something. Because pg_basebackup doesn't use
any message which is defined in walprotocol.h if "-x stream" option is
not specified.

No, this is not right at all :( Changing TimestampTz fields in 9.2 would break
that use case.

If we support that use case, pg_basebackup 9.2 must know which an integer
or a double is used for TimestampTz in 9.1 server. Otherwise pg_basebackup
cannot process a WAL data message proporly. But unfortunately there is no
way for pg_basebackup 9.2 to know that... 9.1 has no API to report the actual
datatype of its TimestampTz field.

One idea to support that use case is to add new command-line option into
pg_basebackup, which specifies the datatype of TimestampTz field. You can
use one pg_basebackup 9.2 executable on 9.1 server whether
--disable-integer-datetimes is specified or not. But I'm not really sure if it's
worth doing this, because ISTM that it's rare to build a server and a
client with
the different choice about TimestampTz datatype.

Regards,

--
Fujii Masao
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center

#13Magnus Hagander
magnus@hagander.net
In reply to: Fujii Masao (#12)
Re: incorrect handling of the timeout in pg_receivexlog

Argh. This thread appears to have been forgotten - sorry about that.

Given that we're taling about a potential protocol change, we really
should resolve this before we wrap beta, no?

On Thu, Mar 29, 2012 at 6:43 AM, Fujii Masao <masao.fujii@gmail.com> wrote:

On Tue, Feb 28, 2012 at 6:08 PM, Fujii Masao <masao.fujii@gmail.com> wrote:

On Wed, Feb 8, 2012 at 1:33 AM, Magnus Hagander <magnus@hagander.net> wrote:

Will it break using pg_basebackup 9.2 on a 9.1 server, though? that
would also be very useful in the scenario of the central server...

No unless I'm missing something. Because pg_basebackup doesn't use
any message which is defined in walprotocol.h if "-x stream" option is
not specified.

No, this is not right at all :( Changing TimestampTz fields in 9.2 would break
that use case.

If we support that use case, pg_basebackup 9.2 must know which an integer
or a double is used for TimestampTz in 9.1 server. Otherwise pg_basebackup
cannot process a WAL data message proporly. But unfortunately there is no
way for pg_basebackup 9.2 to know that... 9.1 has no API to report the actual
datatype of its TimestampTz field.

One idea to support that use case is to add new command-line option into
pg_basebackup, which specifies the datatype of TimestampTz field. You can
use one pg_basebackup 9.2 executable on 9.1 server whether
--disable-integer-datetimes is specified or not. But I'm not really sure if it's
worth doing this, because ISTM that it's rare to build a server and a
client with
the different choice about TimestampTz datatype.

I think that's survivable - but what will things look like if they do
mismatch? Can we detect "abnormal values" somewhere and at least abort
in a controlled fashion saying "sorry, wrong build flags"?

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#14Magnus Hagander
magnus@hagander.net
In reply to: Magnus Hagander (#13)
Re: incorrect handling of the timeout in pg_receivexlog

On Thu, May 10, 2012 at 3:04 PM, Magnus Hagander <magnus@hagander.net> wrote:

Argh. This thread appears to have been forgotten - sorry about that.

Given that we're taling about a potential protocol change, we really
should resolve this before we wrap beta, no?

Had a chat with Heikki about this, and we came to the conslusion that
we don't actually have to fix it befor ebeta. Because pg_basebackup is
going to have to consider 9.1 servers anyway, and we can just treat
9.2beta1 as being a 9.1 from this perspective.

We still have to fix it, but it' snot as urgent :-)

FWIW, the main plan we're onto is still to add the GUCs on new
connections to walsender, so we have something to work with...

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#15Magnus Hagander
magnus@hagander.net
In reply to: Magnus Hagander (#14)
Re: incorrect handling of the timeout in pg_receivexlog

On Thu, May 10, 2012 at 4:43 PM, Magnus Hagander <magnus@hagander.net> wrote:

On Thu, May 10, 2012 at 3:04 PM, Magnus Hagander <magnus@hagander.net> wrote:

Argh. This thread appears to have been forgotten - sorry about that.

Given that we're taling about a potential protocol change, we really
should resolve this before we wrap beta, no?

Had a chat with Heikki about this, and we came to the conslusion that
we don't actually have to fix it befor ebeta. Because pg_basebackup is
going to have to consider 9.1 servers anyway, and we can just treat
9.2beta1 as being a 9.1 from this perspective.

We still have to fix it, but it' snot as urgent :-)

FWIW, the main plan we're onto is still to add the GUCs on new
connections to walsender, so we have something to work with...

And taking this a step further - we *already* send these GUCs.
Previous references to us not doing that were incorrect :-)

So this should be a much easier fix than we thought. And can be done
entirely in pg_basebackup, meaning we don't need to worry about
beta...

--
 Magnus Hagander
 Me: http://www.hagander.net/
 Work: http://www.redpill-linpro.com/

#16Fujii Masao
masao.fujii@gmail.com
In reply to: Magnus Hagander (#15)
Re: incorrect handling of the timeout in pg_receivexlog

On Thu, May 10, 2012 at 11:51 PM, Magnus Hagander <magnus@hagander.net> wrote:

And taking this a step further - we *already* send these GUCs.
Previous references to us not doing that were incorrect :-)

So this should be a much easier fix than we thought. And can be done
entirely in pg_basebackup, meaning we don't need to worry about
beta...

Sounds good!

Regards,

--
Fujii Masao

#17Magnus Hagander
magnus@hagander.net
In reply to: Fujii Masao (#16)
Re: incorrect handling of the timeout in pg_receivexlog

On Thursday, May 10, 2012, Fujii Masao wrote:

On Thu, May 10, 2012 at 11:51 PM, Magnus Hagander <magnus@hagander.net<javascript:;>>
wrote:

And taking this a step further - we *already* send these GUCs.
Previous references to us not doing that were incorrect :-)

So this should be a much easier fix than we thought. And can be done
entirely in pg_basebackup, meaning we don't need to worry about
beta...

Sounds good!

Should we go down the easy way and just reject connections when the flag is
mismatching between the client and the server (trivial to do - see the
attached patch)? Or should we try to implement both floating point and
integer in pg_basebackup, and make it work in either case? (We'd still have
to reject it if pg_basebackup was compiled without support for int64 at
all, of course, but the large majority of cases will have integer
timestamps these days, but could be made to support both integer and float
for the trivial operations that pg_basebackup actually does).

How common *is* it to have a build that doesn't have integer timestamps
these days? Does any of the binary builds do that at all, for example? If
it's uncommon enough, I think we should just go with the easy way out...

//Magnus

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

Attachments:

validate_int_timestamps.patchtext/x-patch; charset=US-ASCII; name=validate_int_timestamps.patchDownload+24-0
#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#17)
Re: incorrect handling of the timeout in pg_receivexlog

Magnus Hagander <magnus@hagander.net> writes:

How common *is* it to have a build that doesn't have integer timestamps
these days? Does any of the binary builds do that at all, for example? If
it's uncommon enough, I think we should just go with the easy way out...

+1 for just rejecting a mismatch.

regards, tom lane

#19Fujii Masao
masao.fujii@gmail.com
In reply to: Tom Lane (#18)
Re: incorrect handling of the timeout in pg_receivexlog

On Sat, May 12, 2012 at 12:44 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Magnus Hagander <magnus@hagander.net> writes:

How common *is* it to have a build that doesn't have integer timestamps
these days? Does any of the binary builds do that at all, for example? If
it's uncommon enough, I think we should just go with the easy way out...

+1 for just rejecting a mismatch.

Agreed.

Regards,

--
Fujii Masao

#20Fujii Masao
masao.fujii@gmail.com
In reply to: Magnus Hagander (#17)
Re: incorrect handling of the timeout in pg_receivexlog

On Fri, May 11, 2012 at 11:43 PM, Magnus Hagander <magnus@hagander.net> wrote:

Should we go down the easy way and just reject connections when the flag is
mismatching between the client and the server (trivial to do - see the
attached patch)?

+ char *tmpparam;

You forgot to add "const" before "char", which causes a compile-time warning.

Regards,

--
Fujii Masao

#21Robert Haas
robertmhaas@gmail.com
In reply to: Fujii Masao (#20)
#22Fujii Masao
masao.fujii@gmail.com
In reply to: Robert Haas (#21)
#23Magnus Hagander
magnus@hagander.net
In reply to: Fujii Masao (#22)
#24Fujii Masao
masao.fujii@gmail.com
In reply to: Magnus Hagander (#23)
#25Magnus Hagander
magnus@hagander.net
In reply to: Fujii Masao (#24)
#26Fujii Masao
masao.fujii@gmail.com
In reply to: Magnus Hagander (#25)
#27Magnus Hagander
magnus@hagander.net
In reply to: Fujii Masao (#26)
#28Tom Lane
tgl@sss.pgh.pa.us
In reply to: Magnus Hagander (#27)
#29Fujii Masao
masao.fujii@gmail.com
In reply to: Magnus Hagander (#27)
#30Magnus Hagander
magnus@hagander.net
In reply to: Fujii Masao (#29)
#31Fujii Masao
masao.fujii@gmail.com
In reply to: Magnus Hagander (#30)
#32Magnus Hagander
magnus@hagander.net
In reply to: Fujii Masao (#31)