Reducing pg_ctl's reaction time

Started by Tom Lane · 29 messages · pgsql-hackers
#1 Tom Lane
tgl@sss.pgh.pa.us

I still have a bee in my bonnet about how slow the recovery TAP tests
are, and especially about how low the CPU usage is while they run,
suggesting that a lot of the wall clock time is being expended on
useless sleeps. Some analysis I did today found some low-hanging fruit
there: a significant part of the time is being spent in pg_ctl waiting
for postmaster start/stop operations. It waits in quanta of 1 second,
but the postmaster usually starts or stops in much less than that.
(In these tests, most of the shutdown checkpoints have little to do,
so that in many cases the postmaster stops in just a couple of msec.
Startup isn't very many msec either.)

The attached proposed patch adjusts pg_ctl to check every 100msec,
instead of every second, for the postmaster to be done starting or
stopping. This cuts the runtime of the recovery TAP tests from around
4m30s to around 3m10s on my machine, a good 30% savings. I experimented
with different check frequencies but there doesn't seem to be anything
to be gained by cutting the delay further (and presumably, it becomes
counterproductive at some point due to pg_ctl chewing too many cycles).
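The mechanics are simple enough to sketch. The following is an illustrative loop, not the actual pg_ctl code: `wait_for_postmaster` and `postmaster_is_ready` are hypothetical names (the real code probes via a libpq connection attempt), but it shows the 100 msec quantum being proposed:

```c
#include <assert.h>
#include <stdbool.h>
#include <unistd.h>

#define USEC_PER_SEC   1000000
#define WAITS_PER_SEC  10		/* check every 100 msec; must divide USEC_PER_SEC evenly */

/* Hypothetical stand-in for pg_ctl's connection probe: here we just
 * pretend the postmaster becomes ready after ~300 msec. */
static bool
postmaster_is_ready(int elapsed_waits)
{
	return elapsed_waits >= 3;
}

/* Wait up to wait_seconds for the postmaster, probing WAITS_PER_SEC
 * times per second instead of once per second. */
static bool
wait_for_postmaster(int wait_seconds)
{
	for (int i = 0; i < wait_seconds * WAITS_PER_SEC; i++)
	{
		if (postmaster_is_ready(i))
			return true;
		usleep(USEC_PER_SEC / WAITS_PER_SEC);	/* 100 msec, not 1 sec */
	}
	return false;
}
```

With a startup time well under a second, the old code still burned most of a full one-second quantum per operation; the finer quantum bounds the wasted wait at 100 msec.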

This change probably doesn't offer much real-world benefit, since few
people start/stop their postmasters often, and shutdown checkpoints are
seldom so cheap on production installations. Still, it doesn't seem
like it could hurt. The original choice to use once-per-second checks
was made for hardware a lot slower than what most of us use now; I do
not think it's been revisited since the first implementation of pg_ctl
in 1999 (cf 5b912b089).

Barring objections I'd like to push this soon.

regards, tom lane

Attachments:

less-wait-in-pg_ctl-waits.patch (text/x-diff; +61 −58)
#2 Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#1)
Re: Reducing pg_ctl's reaction time

On Mon, Jun 26, 2017 at 7:13 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

The attached proposed patch adjusts pg_ctl to check every 100msec,
instead of every second, for the postmaster to be done starting or
stopping. This cuts the runtime of the recovery TAP tests from around
4m30s to around 3m10s on my machine, a good 30% savings. I experimented
with different check frequencies but there doesn't seem to be anything
to be gained by cutting the delay further (and presumably, it becomes
counterproductive at some point due to pg_ctl chewing too many cycles).

Within 2~3% noise, I see similar improvements on my laptop with
non-parallel tests. That's nice.

+#define WAITS_PER_SEC 10 /* should divide 1000000 evenly */
As a matter of style, you could define 1000000 as well in a variable
and refer to the variable for the division.

Barring objections I'd like to push this soon.

This also makes failures with 001_stream_rep.pl pop up more easily
without the patch from the other recent thread applied, so this patch
had better not get in before anything from
/messages/by-id/8962.1498425057@sss.pgh.pa.us.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#2)
Re: Reducing pg_ctl's reaction time

Michael Paquier <michael.paquier@gmail.com> writes:

On Mon, Jun 26, 2017 at 7:13 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

The attached proposed patch adjusts pg_ctl to check every 100msec,
instead of every second, for the postmaster to be done starting or
stopping.

+#define WAITS_PER_SEC 10 /* should divide 1000000 evenly */

As a matter of style, you could define 1000000 as well in a variable
and refer to the variable for the division.

Good idea, done that way. (My initial thought was to use USECS_PER_SEC
from timestamp.h, but that's declared as int64 which would have
complicated matters, so I just made a new symbol.)

This also makes failures with 001_stream_rep.pl pop up more easily
without the patch from the other recent thread applied, so this patch
had better not get in before anything from
/messages/by-id/8962.1498425057@sss.pgh.pa.us.

Check. I pushed your fix for that first.

Thanks for the review!

regards, tom lane

#4 Jeff Janes
jeff.janes@gmail.com
In reply to: Tom Lane (#3)
Re: Reducing pg_ctl's reaction time

On Mon, Jun 26, 2017 at 12:15 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Michael Paquier <michael.paquier@gmail.com> writes:

On Mon, Jun 26, 2017 at 7:13 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

The attached proposed patch adjusts pg_ctl to check every 100msec,
instead of every second, for the postmaster to be done starting or
stopping.

+#define WAITS_PER_SEC 10 /* should divide 1000000 evenly */

As a matter of style, you could define 1000000 as well in a variable
and refer to the variable for the division.

Good idea, done that way. (My initial thought was to use USECS_PER_SEC
from timestamp.h, but that's declared as int64 which would have
complicated matters, so I just made a new symbol.)

This also makes failures with 001_stream_rep.pl pop up more easily
without the patch from the other recent thread applied, so this patch
had better not get in before anything from
/messages/by-id/8962.1498425057@sss.pgh.pa.us.

Check. I pushed your fix for that first.

Thanks for the review!

regards, tom lane

The 10 fold increase in log spam during long PITR recoveries is a bit
unfortunate.

9153 2017-06-26 12:55:40.243 PDT FATAL: the database system is starting up
9154 2017-06-26 12:55:40.345 PDT FATAL: the database system is starting up
9156 2017-06-26 12:55:40.447 PDT FATAL: the database system is starting up
9157 2017-06-26 12:55:40.550 PDT FATAL: the database system is starting up
...

I can live with it, but could we use an escalating wait time so it slows
back down to once a second after a while?

Cheers,

Jeff

#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jeff Janes (#4)
Re: Reducing pg_ctl's reaction time

Jeff Janes <jeff.janes@gmail.com> writes:

The 10 fold increase in log spam during long PITR recoveries is a bit
unfortunate.

9153 2017-06-26 12:55:40.243 PDT FATAL: the database system is starting up
9154 2017-06-26 12:55:40.345 PDT FATAL: the database system is starting up
9156 2017-06-26 12:55:40.447 PDT FATAL: the database system is starting up
9157 2017-06-26 12:55:40.550 PDT FATAL: the database system is starting up
...

I can live with it, but could we use an escalating wait time so it slows
back down to once a second after a while?

Sure, what do you think an appropriate behavior would be?

regards, tom lane

#6 Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#5)
Re: Reducing pg_ctl's reaction time

On 2017-06-26 16:19:16 -0400, Tom Lane wrote:

Jeff Janes <jeff.janes@gmail.com> writes:

The 10 fold increase in log spam during long PITR recoveries is a bit
unfortunate.

9153 2017-06-26 12:55:40.243 PDT FATAL: the database system is starting up
9154 2017-06-26 12:55:40.345 PDT FATAL: the database system is starting up
9156 2017-06-26 12:55:40.447 PDT FATAL: the database system is starting up
9157 2017-06-26 12:55:40.550 PDT FATAL: the database system is starting up
...

I can live with it, but could we use an escalating wait time so it slows
back down to once a second after a while?

Sure, what do you think an appropriate behavior would be?

It'd not be unreasonable to check pg_control first, and only after that
indicates readiness check via the protocol. Doesn't quite seem like
something backpatchable tho.

- Andres

#7 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#6)
Re: Reducing pg_ctl's reaction time

Andres Freund <andres@anarazel.de> writes:

On 2017-06-26 16:19:16 -0400, Tom Lane wrote:

Sure, what do you think an appropriate behavior would be?

It'd not be unreasonable to check pg_control first, and only after that
indicates readiness check via the protocol.

Hm, that's a thought. The problem here isn't the frequency of checks,
but the log spam.

Doesn't quite seem like something backpatchable tho.

I didn't back-patch the pg_ctl change anyway, so that's no issue.

regards, tom lane

#8 Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#7)
Re: Reducing pg_ctl's reaction time

On 2017-06-26 16:26:00 -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On 2017-06-26 16:19:16 -0400, Tom Lane wrote:

Sure, what do you think an appropriate behavior would be?

It'd not be unreasonable to check pg_control first, and only after that
indicates readiness check via the protocol.

Hm, that's a thought. The problem here isn't the frequency of checks,
but the log spam.

Right. I think to deal with hot-standby we'd probably have to add new
state to the control file however. We don't just want to treat the
server as ready once DB_IN_PRODUCTION is reached.

Arguably we could and should improve the logic when the server has
started, right now it's pretty messy because we never treat a standby as
up if hot_standby is disabled...

- Andres

#9 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#7)
Re: Reducing pg_ctl's reaction time

I wrote:

Andres Freund <andres@anarazel.de> writes:

It'd not be unreasonable to check pg_control first, and only after that
indicates readiness check via the protocol.

Hm, that's a thought. The problem here isn't the frequency of checks,
but the log spam.

Actually, that wouldn't help much as things stand, because you can't
tell from pg_control whether hot standby is active. Assuming that
we want "pg_ctl start" to come back as soon as connections are allowed,
it'd have to start probing when it sees DB_IN_ARCHIVE_RECOVERY, which
means Jeff still has a problem with long recovery sessions.

We could maybe address that by changing the set of states in pg_control
(or perhaps simpler, adding a "hot standby active" flag there). That
might have wider consequences than we really want to deal with post-beta1
though.

regards, tom lane

#10 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#8)
Re: Reducing pg_ctl's reaction time

Andres Freund <andres@anarazel.de> writes:

Arguably we could and should improve the logic when the server has
started, right now it's pretty messy because we never treat a standby as
up if hot_standby is disabled...

True. If you could tell the difference between "HS disabled" and "HS not
enabled yet" from pg_control, that would make pg_ctl's behavior with
cold-standby servers much cleaner. Maybe it *is* worth messing with the
contents of pg_control at this late hour.

My inclination for the least invasive fix is to leave the DBState
enum alone and add a separate hot-standby state field with three
values (disabled/not-yet-enabled/enabled). Then pg_ctl would start
probing the postmaster when it saw either DB_IN_PRODUCTION DBstate
or hot-standby-enabled. (It'd almost not have to probe the postmaster
at all, except there's a race condition that the startup process
will probably change the field a little before the postmaster gets
the word to open the gates.) On the other hand, if it saw
DB_IN_ARCHIVE_RECOVERY with hot standby disabled, it'd stop waiting.
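That design sketch can be put in code. The enums below are illustrative (the real DBState in pg_control.h has more values, and the new field's names are hypothetical), but they capture the proposed decision logic:

```c
#include <assert.h>

/* Existing pg_control DBState enum, abbreviated to the values
 * relevant here; the real enum has several more. */
typedef enum DBState
{
	DB_STARTUP = 0,
	DB_IN_ARCHIVE_RECOVERY,
	DB_IN_PRODUCTION
} DBState;

/* Hypothetical new three-valued hot-standby field for pg_control. */
typedef enum HotStandbyState
{
	HS_DISABLED,			/* hot_standby = off */
	HS_NOT_YET_ENABLED,		/* recovery running, connections not allowed yet */
	HS_ENABLED				/* read-only connections allowed */
} HotStandbyState;

/* pg_ctl: should we start probing the postmaster for readiness? */
static int
ready_to_probe(DBState state, HotStandbyState hs)
{
	return state == DB_IN_PRODUCTION || hs == HS_ENABLED;
}

/* pg_ctl: cold standby case --- stop waiting, connections will never
 * be accepted during this recovery. */
static int
stop_waiting(DBState state, HotStandbyState hs)
{
	return state == DB_IN_ARCHIVE_RECOVERY && hs == HS_DISABLED;
}
```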

Any objections to that design sketch? Do we need to distinguish
between master and slave servers in the when-to-stop-waiting logic?

regards, tom lane

#11 Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#10)
Re: Reducing pg_ctl's reaction time

Hi,

On 2017-06-26 16:49:07 -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

Arguably we could and should improve the logic when the server has
started, right now it's pretty messy because we never treat a standby as
up if hot_standby is disabled...

True. If you could tell the difference between "HS disabled" and "HS not
enabled yet" from pg_control, that would make pg_ctl's behavior with
cold-standby servers much cleaner. Maybe it *is* worth messing with the
contents of pg_control at this late hour.

I'm +0.5.

My inclination for the least invasive fix is to leave the DBState
enum alone and add a separate hot-standby state field with three
values (disabled/not-yet-enabled/enabled).

Yea, that seems sane.

Then pg_ctl would start
probing the postmaster when it saw either DB_IN_PRODUCTION DBstate
or hot-standby-enabled. (It'd almost not have to probe the postmaster
at all, except there's a race condition that the startup process
will probably change the field a little before the postmaster gets
the word to open the gates.) On the other hand, if it saw
DB_IN_ARCHIVE_RECOVERY with hot standby disabled, it'd stop waiting.

It'd be quite possible to address the race-condition by moving the
updating of the control file to postmaster, to the
CheckPostmasterSignal(PMSIGNAL_BEGIN_HOT_STANDBY) block. That'd require
updating the control file from postmaster, which'd be somewhat ugly.
Perhaps that indicates that field shouldn't be in pg_control, but in the
pid file?

Greetings,

Andres Freund

#12 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#11)
Re: Reducing pg_ctl's reaction time

Andres Freund <andres@anarazel.de> writes:

It'd be quite possible to address the race-condition by moving the
updating of the control file to postmaster, to the
CheckPostmasterSignal(PMSIGNAL_BEGIN_HOT_STANDBY) block. That'd require
updating the control file from postmaster, which'd be somewhat ugly.

No, I don't like that at all. Has race conditions against updates
coming from the startup process.

Perhaps that indicates that field shouldn't be in pg_control, but in the
pid file?

Yeah, that would be a different way to go at it. The postmaster would
probably just write the state of the hot_standby GUC to the file, and
pg_ctl would have to infer things from there.

regards, tom lane

#13 Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#12)
Re: Reducing pg_ctl's reaction time

On 2017-06-26 17:30:30 -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

It'd be quite possible to address the race-condition by moving the
updating of the control file to postmaster, to the
CheckPostmasterSignal(PMSIGNAL_BEGIN_HOT_STANDBY) block. That'd require
updating the control file from postmaster, which'd be somewhat ugly.

No, I don't like that at all. Has race conditions against updates
coming from the startup process.

You'd obviously have to take the appropriate locks. I think the issue
here is less race conditions, and more that architecturally we'd
interact with shmem too much.

Perhaps that indicates that field shouldn't be in pg_control, but in the
pid file?

Yeah, that would be a different way to go at it. The postmaster would
probably just write the state of the hot_standby GUC to the file, and
pg_ctl would have to infer things from there.

I'd actually say we should just mirror the existing
#ifdef USE_SYSTEMD
if (!EnableHotStandby)
sd_notify(0, "READY=1");
#endif
with corresponding pidfile updates - doesn't really seem necessary for
pg_ctl to do more?

Greetings,

Andres Freund

#14 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#13)
Re: Reducing pg_ctl's reaction time

Andres Freund <andres@anarazel.de> writes:

On 2017-06-26 17:30:30 -0400, Tom Lane wrote:

No, I don't like that at all. Has race conditions against updates
coming from the startup process.

You'd obviously have to take the appropriate locks. I think the issue
here is less race conditions, and more that architecturally we'd
interact with shmem too much.

Uh, we are *not* taking any locks in the postmaster.

Yeah, that would be a different way to go at it. The postmaster would
probably just write the state of the hot_standby GUC to the file, and
pg_ctl would have to infer things from there.

I'd actually say we should just mirror the existing
#ifdef USE_SYSTEMD
if (!EnableHotStandby)
sd_notify(0, "READY=1");
#endif
with corresponding pidfile updates - doesn't really seem necessary for
pg_ctl to do more?

Hm. Take that a bit further, and we could drop the connection probes
altogether --- just put the whole responsibility on the postmaster to
show in the pidfile whether it's ready for connections or not.
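Under that scheme, pg_ctl's whole readiness test reduces to a string comparison on a status line in postmaster.pid. A sketch under the assumptions discussed here (fixed-width status words so the line can be rewritten in place without shortening; the specific names are illustrative):

```c
#include <assert.h>
#include <string.h>

/* Status words the postmaster would write to postmaster.pid.
 * All the same length (8 bytes, space-padded) so a rewrite never
 * shortens the line --- see the AddToDataDirLockFile() discussion. */
#define PM_STATUS_STARTING	"starting"	/* still coming up */
#define PM_STATUS_STANDBY	"standby "	/* up, but not accepting connections */
#define PM_STATUS_READY		"ready   "	/* accepting connections */

/* pg_ctl side: no libpq probe needed, just inspect the status word
 * read from the pidfile. */
static int
pm_accepts_connections(const char *status)
{
	return strcmp(status, PM_STATUS_READY) == 0;
}
```

Besides dropping the libpq dependency, this sidesteps the can't-connect failure modes mentioned below: a packet filter can block a probe connection, but not a read of a local file.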

regards, tom lane

#15 Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#14)
Re: Reducing pg_ctl's reaction time

On 2017-06-26 17:38:03 -0400, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

On 2017-06-26 17:30:30 -0400, Tom Lane wrote:

No, I don't like that at all. Has race conditions against updates
coming from the startup process.

You'd obviously have to take the appropriate locks. I think the issue
here is less race conditions, and more that architecturally we'd
interact with shmem too much.

Uh, we are *not* taking any locks in the postmaster.

I'm not sure why you're 'Uh'ing, when my point pretty much is that we
do not want to do so?

Hm. Take that a bit further, and we could drop the connection probes
altogether --- just put the whole responsibility on the postmaster to
show in the pidfile whether it's ready for connections or not.

Yea, that seems quite appealing, both from an architectural, simplicity,
and log noise perspective. I wonder if there's some added reliability by
the connection probe though? Essentially wondering if it'd be worthwhile
to keep a single connection test at the end. I'm somewhat disinclined
though.

- Andres

#16 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#15)
Re: Reducing pg_ctl's reaction time

Andres Freund <andres@anarazel.de> writes:

On 2017-06-26 17:38:03 -0400, Tom Lane wrote:

Hm. Take that a bit further, and we could drop the connection probes
altogether --- just put the whole responsibility on the postmaster to
show in the pidfile whether it's ready for connections or not.

Yea, that seems quite appealing, both from an architectural, simplicity,
and log noise perspective. I wonder if there's some added reliability by
the connection probe though? Essentially wondering if it'd be worthwhile
to keep a single connection test at the end. I'm somewhat disinclined
though.

I agree --- part of the appeal of this idea is that there could be a net
subtraction of code from pg_ctl. (I think it wouldn't have to link libpq
anymore at all, though maybe I forgot something.) And you get rid of a
bunch of can't-connect failure modes, eg kernel packet filter in the way,
which probably outweighs any hypothetical reliability gain from confirming
the postmaster state the old way.

This would mean that v10 pg_ctl could not be used to start/stop older
postmasters, which is flexibility we tried to preserve in the past.
But I see that we already gave that up in quite a few code paths because
of their attempts to read pg_control, so I think it's a concern we can
ignore.

I'll draft something up later.

regards, tom lane

#17 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#16)
Re: Reducing pg_ctl's reaction time

I wrote:

Andres Freund <andres@anarazel.de> writes:

On 2017-06-26 17:38:03 -0400, Tom Lane wrote:

Hm. Take that a bit further, and we could drop the connection probes
altogether --- just put the whole responsibility on the postmaster to
show in the pidfile whether it's ready for connections or not.

Yea, that seems quite appealing, both from an architectural, simplicity,
and log noise perspective. I wonder if there's some added reliability by
the connection probe though? Essentially wondering if it'd be worthwhile
to keep a single connection test at the end. I'm somewhat disinclined
though.

I agree --- part of the appeal of this idea is that there could be a net
subtraction of code from pg_ctl. (I think it wouldn't have to link libpq
anymore at all, though maybe I forgot something.) And you get rid of a
bunch of can't-connect failure modes, eg kernel packet filter in the way,
which probably outweighs any hypothetical reliability gain from confirming
the postmaster state the old way.

Here's a draft patch for that. I quite like the results --- this seems
way simpler and more reliable than what pg_ctl has done up to now.
However, it's certainly arguable that this is too much change for an
optional post-beta patch. If we decide that it has to wait for v11,
I'd address Jeff's complaint by hacking the loop behavior in
test_postmaster_connection, which'd be ugly but not many lines of code.

Note that I followed the USE_SYSTEMD patch's lead as to where to report
postmaster state changes. Arguably, in the standby-server case, we
should not report the postmaster is ready until we've reached consistency.
But that would require additional signaling from the startup process
to the postmaster, so it seems like a separate change if we want it.

Thoughts?

regards, tom lane

Attachments:

new-pg_ctl-wait-implementation.patch (text/x-diff; +304 −278)
#18 Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#17)
Re: Reducing pg_ctl's reaction time

Hi,

On 2017-06-27 14:59:18 -0400, Tom Lane wrote:

Here's a draft patch for that. I quite like the results --- this seems
way simpler and more reliable than what pg_ctl has done up to now.

Yea, I like that too.

However, it's certainly arguable that this is too much change for an
optional post-beta patch.

Yea, I think there's a valid case to be made for that. I'm still
inclined to go along with this, it seems we're otherwise just adding
complexity we're going to remove in a bit again.

If we decide that it has to wait for v11,
I'd address Jeff's complaint by hacking the loop behavior in
test_postmaster_connection, which'd be ugly but not many lines of code.

Basically increasing the wait time over time?

Note that I followed the USE_SYSTEMD patch's lead as to where to report
postmaster state changes. Arguably, in the standby-server case, we
should not report the postmaster is ready until we've reached consistency.
But that would require additional signaling from the startup process
to the postmaster, so it seems like a separate change if we want it.

I suspect we're going to want to add more states to this over time, but
as you say, there's no need to hurry.

+	/*
+	 * Report postmaster status in the postmaster.pid file, to allow pg_ctl to
+	 * see what's happening.  Note that all strings written to the status line
+	 * must be the same length, per comments for AddToDataDirLockFile().  We
+	 * currently make them all 8 bytes, padding with spaces.
+	 */
+	AddToDataDirLockFile(LOCK_FILE_LINE_PM_STATUS, "starting");

The 8-space thing in multiple places is a bit ugly. How about having a
a bunch of constants declared in one place? Alternatively make it
something like: status: $c, where $c is a one character code for the
various states?

@@ -1149,8 +1149,9 @@ TouchSocketLockFiles(void)
*
* Note: because we don't truncate the file, if we were to rewrite a line
* with less data than it had before, there would be garbage after the last
- * line.  We don't ever actually do that, so not worth adding another kernel
- * call to cover the possibility.
+ * line.  While we could fix that by adding a truncate call, that would make
+ * the file update non-atomic, which we'd rather avoid.  Therefore, callers
+ * should endeavor never to shorten a line once it's been written.
*/
void
AddToDataDirLockFile(int target_line, const char *str)
@@ -1193,19 +1194,26 @@ AddToDataDirLockFile(int target_line, co
srcptr = srcbuffer;
for (lineno = 1; lineno < target_line; lineno++)
{
-		if ((srcptr = strchr(srcptr, '\n')) == NULL)
-		{
-			elog(LOG, "incomplete data in \"%s\": found only %d newlines while trying to add line %d",
-				 DIRECTORY_LOCK_FILE, lineno - 1, target_line);
-			close(fd);
-			return;
-		}
-		srcptr++;
+		char	   *eol = strchr(srcptr, '\n');
+
+		if (eol == NULL)
+			break;				/* not enough lines in file yet */
+		srcptr = eol + 1;
}
memcpy(destbuffer, srcbuffer, srcptr - srcbuffer);
destptr = destbuffer + (srcptr - srcbuffer);
/*
+	 * Fill in any missing lines before the target line, in case lines are
+	 * added to the file out of order.
+	 */
+	for (; lineno < target_line; lineno++)
+	{
+		if (destptr < destbuffer + sizeof(destbuffer))
+			*destptr++ = '\n';
+	}
+
+	/*
* Write or rewrite the target line.
*/
snprintf(destptr, destbuffer + sizeof(destbuffer) - destptr,
"%s\n", str);

Not this patches fault, and shouldn't be changed now, but this is a
fairly weird way to manage and update this file.

Greetings,

Andres Freund

#19 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#18)
Re: Reducing pg_ctl's reaction time

Andres Freund <andres@anarazel.de> writes:

On 2017-06-27 14:59:18 -0400, Tom Lane wrote:

If we decide that it has to wait for v11,
I'd address Jeff's complaint by hacking the loop behavior in
test_postmaster_connection, which'd be ugly but not many lines of code.

Basically increasing the wait time over time?

I was thinking of just dropping back to once-per-second after a couple
of seconds, with something along the lines of this in place of the
existing sleep at the bottom of the loop:

    if (i >= 2 * WAITS_PER_SEC)
    {
        pg_usleep(USEC_PER_SEC);
        i += WAITS_PER_SEC - 1;
    }
    else
        pg_usleep(USEC_PER_SEC / WAITS_PER_SEC);

The 8-space thing in multiple places is a bit ugly. How about having a
a bunch of constants declared in one place? Alternatively make it
something like: status: $c, where $c is a one character code for the
various states?

Yeah, we could add the string values as macros in miscadmin.h, perhaps.
I don't like the single-character idea --- if we do expand the number of
things reported this way, that could get restrictive.

regards, tom lane

#20 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#18)
Re: Reducing pg_ctl's reaction time

Andres Freund <andres@anarazel.de> writes:

On 2017-06-27 14:59:18 -0400, Tom Lane wrote:

However, it's certainly arguable that this is too much change for an
optional post-beta patch.

Yea, I think there's a valid case to be made for that. I'm still
inclined to go along with this, it seems we're otherwise just adding
complexity we're going to remove in a bit again.

I'm not hearing anyone speaking against doing this now, so I'm going
to go ahead with it.

The 8-space thing in multiple places is a bit ugly. How about having a
a bunch of constants declared in one place?

While looking this over again, I got worried about the fact that pg_ctl
is #including "miscadmin.h". That's a pretty low-level backend header
and it wouldn't be surprising at all if somebody tried to put stuff in
it that wouldn't compile frontend-side. I think we should take the
opportunity, as long as we're touching this stuff, to split the #defines
that describe the contents of postmaster.pid into a separate header file.
Maybe "utils/pidfile.h" ?

regards, tom lane

#21 Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#20)
#22 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#21)
#23 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#22)
#24 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#23)
#25 Ants Aasma
ants.aasma@cybertec.at
In reply to: Tom Lane (#20)
#26 Jeff Janes
jeff.janes@gmail.com
In reply to: Tom Lane (#17)
#27 Andres Freund
andres@anarazel.de
In reply to: Jeff Janes (#26)
#28 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jeff Janes (#26)
#29 Jeff Janes
jeff.janes@gmail.com
In reply to: Tom Lane (#28)