pgsql: Use asynchronous connect API in libpqwalreceiver
Use asynchronous connect API in libpqwalreceiver
This makes the connection attempt from CREATE SUBSCRIPTION and from
WalReceiver interruptable by the user in case the libpq connection is
hanging. The previous coding required immediate shutdown (SIGQUIT) of
PostgreSQL in that situation.
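(For context: making the connect interruptible means replacing the
blocking PQconnectdbParams() call with PQconnectStartParams() plus a
PQconnectPoll() loop that can check for interrupts between waits. A
minimal sketch of the idea follows; it is illustrative only, not the
committed code, and the helper name connect_interruptibly is
hypothetical:)

	static PGconn *
	connect_interruptibly(const char *const *keys, const char *const *vals)
	{
		PGconn	   *conn = PQconnectStartParams(keys, vals, /* expand_dbname */ 0);
		PostgresPollingStatusType status = PGRES_POLLING_WRITING;

		if (conn == NULL || PQstatus(conn) == CONNECTION_BAD)
			return conn;		/* caller reports the error */

		while (status != PGRES_POLLING_OK && status != PGRES_POLLING_FAILED)
		{
			int			events = (status == PGRES_POLLING_READING)
								 ? WL_SOCKET_READABLE : WL_SOCKET_WRITEABLE;

			/* Sleep until the socket is ready, but stay responsive. */
			(void) WaitLatchOrSocket(MyLatch, WL_LATCH_SET | events,
									 PQsocket(conn), 0);
			ResetLatch(MyLatch);
			CHECK_FOR_INTERRUPTS();	/* the interruptibility this commit adds */

			status = PQconnectPoll(conn);
		}
		return conn;
	}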
From: Petr Jelinek <petr.jelinek@2ndquadrant.com>
Tested-by: Thom Brown <thom@linux.com>
Branch
------
master
Details
-------
http://git.postgresql.org/pg/commitdiff/1e8a850094478a2036891fa3d4ce769bce411ee3
Modified Files
--------------
src/backend/postmaster/pgstat.c | 4 +-
.../libpqwalreceiver/libpqwalreceiver.c | 49 +++++++++++++++++++++-
src/include/pgstat.h | 2 +-
3 files changed, 50 insertions(+), 5 deletions(-)
Peter Eisentraut <peter_e@gmx.net> writes:
Use asynchronous connect API in libpqwalreceiver
Buildfarm member bowerbird has been failing in the pg_rewind test since
this patch went in. It looks like it's failing to complete connections
from the standby, which suggests that something platform-specific is
missing from this commit, but I dunno what.
regards, tom lane
On 3/3/17 19:16, Tom Lane wrote:
Peter Eisentraut <peter_e@gmx.net> writes:
Use asynchronous connect API in libpqwalreceiver
Buildfarm member bowerbird has been failing in the pg_rewind test since
this patch went in. It looks like it's failing to complete connections
from the standby, which suggests that something platform-specific is
missing from this commit, but I dunno what.
Hmm, I wonder how widely tested the async connection API is on Windows
at all. I only see bowerbird and jacana running bin-check on Windows.
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
On 3/3/17 19:16, Tom Lane wrote:
Peter Eisentraut <peter_e@gmx.net> writes:
Use asynchronous connect API in libpqwalreceiver
Buildfarm member bowerbird has been failing in the pg_rewind test since
this patch went in. It looks like it's failing to complete connections
from the standby, which suggests that something platform-specific is
missing from this commit, but I dunno what.
Hmm, I wonder how widely tested the async connection API is on Windows
at all. I only see bowerbird and jacana running bin-check on Windows.
Yeah, I was wondering if this is just exposing a pre-existing bug.
However, the "normal" path operates by repeatedly invoking PQconnectPoll
(cf. connectDBComplete) so it's not immediately obvious how such a bug
would've escaped detection.
regards, tom lane
On 04/03/17 05:11, Tom Lane wrote:
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
On 3/3/17 19:16, Tom Lane wrote:
Peter Eisentraut <peter_e@gmx.net> writes:
Use asynchronous connect API in libpqwalreceiver
Buildfarm member bowerbird has been failing in the pg_rewind test since
this patch went in. It looks like it's failing to complete connections
from the standby, which suggests that something platform-specific is
missing from this commit, but I dunno what.
Hmm, I wonder how widely tested the async connection API is on Windows
at all. I only see bowerbird and jacana running bin-check on Windows.
Yeah, I was wondering if this is just exposing a pre-existing bug.
However, the "normal" path operates by repeatedly invoking PQconnectPoll
(cf. connectDBComplete) so it's not immediately obvious how such a bug
would've escaped detection.
I can see one difference, though (I didn't see this code before), and
that is that connectDBComplete starts by waiting for the socket to become
writable and only then calls PQconnectPoll, while my patch starts with
the PQconnectPoll call. And I see the following comment in
connectDBStart:
/*
* The code for processing CONNECTION_NEEDED state is in PQconnectPoll(),
* so that it can easily be re-executed if needed again during the
* asynchronous startup process. However, we must run it once here,
* because callers expect a success return from this routine to mean that
* we are in PGRES_POLLING_WRITING connection state.
*/
So I guess I implemented it wrong in a subtle way that breaks on Windows.
If that's the case, the attached should fix it, but I have no way of
testing it on Windows; I can only say that it still works on my machine,
so at least it hopefully does not make things worse.
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment: 0001-Reorder-the-asynchronous-libpq-calls-for-replication.patch (text/plain)
From 7478c898954099c79c0a34743d2292f740120009 Mon Sep 17 00:00:00 2001
From: Petr Jelinek <pjmodos@pjmodos.net>
Date: Sat, 4 Mar 2017 07:43:17 +0100
Subject: [PATCH] Reorder the asynchronous libpq calls for replication
connection
---
.../replication/libpqwalreceiver/libpqwalreceiver.c | 19 +++++++++++--------
1 file changed, 11 insertions(+), 8 deletions(-)
diff --git a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
index 048d2aa..b24eb3f 100644
--- a/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
+++ b/src/backend/replication/libpqwalreceiver/libpqwalreceiver.c
@@ -113,7 +113,7 @@ libpqrcv_connect(const char *conninfo, bool logical, const char *appname,
char **err)
{
WalReceiverConn *conn;
- PostgresPollingStatusType status;
+ PostgresPollingStatusType status = PGRES_POLLING_WRITING;
const char *keys[5];
const char *vals[5];
int i = 0;
@@ -155,12 +155,14 @@ libpqrcv_connect(const char *conninfo, bool logical, const char *appname,
return NULL;
}
- /* Poll connection. */
- do
+ /*
+ * Poll connection until we have OK or FAILED status.
+ *
+ * Note the initial state after PQconnectStartParams is
+ * PGRES_POLLING_WRITING.
+ */
+ while (status != PGRES_POLLING_OK && status != PGRES_POLLING_FAILED)
{
- /* Determine current state of the connection. */
- status = PQconnectPoll(conn->streamConn);
-
/* Sleep a bit if waiting for socket. */
if (status == PGRES_POLLING_READING ||
status == PGRES_POLLING_WRITING)
@@ -189,8 +191,9 @@ libpqrcv_connect(const char *conninfo, bool logical, const char *appname,
CHECK_FOR_INTERRUPTS();
}
- /* Otherwise loop until we have OK or FAILED status. */
- } while (status != PGRES_POLLING_OK && status != PGRES_POLLING_FAILED);
+ /* Determine new state of the connection. */
+ status = PQconnectPoll(conn->streamConn);
+ }
if (PQstatus(conn->streamConn) != CONNECTION_OK)
{
--
2.7.4
On 3/4/17 01:45, Petr Jelinek wrote:
I can see one difference, though (I didn't see this code before), and
that is that connectDBComplete starts by waiting for the socket to become
writable and only then calls PQconnectPoll, while my patch starts with
the PQconnectPoll call. And I see the following comment in
connectDBStart:
/*
* The code for processing CONNECTION_NEEDED state is in PQconnectPoll(),
* so that it can easily be re-executed if needed again during the
* asynchronous startup process. However, we must run it once here,
* because callers expect a success return from this routine to mean that
* we are in PGRES_POLLING_WRITING connection state.
*/
Yes, the libpq documentation also says that.
If that's the case, the attached should fix it, but I have no way of
testing it on Windows; I can only say that it still works on my machine,
so at least it hopefully does not make things worse.
Committed that. Let's see how it goes.
I rewrote the while loop as a for loop so that the control logic isn't
spread out over three screens.
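(The committed for loop is not reproduced in this thread, but the shape
Peter describes would be roughly the following sketch, with all of the
loop control gathered into one place; this is an illustration, not the
actual commit:)

	for (status = PGRES_POLLING_WRITING;
		 status != PGRES_POLLING_OK && status != PGRES_POLLING_FAILED;
		 status = PQconnectPoll(conn->streamConn))
	{
		/* wait for the socket, readable or writable as last requested */
	}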
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 3/6/17 09:38, Peter Eisentraut wrote:
On 3/4/17 01:45, Petr Jelinek wrote:
If that's the case, the attached should fix it, but I have no way of
testing it on Windows; I can only say that it still works on my machine,
so at least it hopefully does not make things worse.
Committed that. Let's see how it goes.
So that didn't work. Now we probably need someone to dig into that host
directly.
Andrew, could you help?
--
Peter Eisentraut http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 03/08/2017 08:33 AM, Peter Eisentraut wrote:
On 3/6/17 09:38, Peter Eisentraut wrote:
On 3/4/17 01:45, Petr Jelinek wrote:
If that's the case, the attached should fix it, but I have no way of
testing it on Windows; I can only say that it still works on my machine,
so at least it hopefully does not make things worse.
Committed that. Let's see how it goes.
So that didn't work. Now we probably need someone to dig into that host
directly.
Andrew, could you help?
I'll take a look when I get a chance. Might be a few days.
cheers
andrew
--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On 03/03/2017 11:11 PM, Tom Lane wrote:
Peter Eisentraut <peter.eisentraut@2ndquadrant.com> writes:
On 3/3/17 19:16, Tom Lane wrote:
Peter Eisentraut <peter_e@gmx.net> writes:
Use asynchronous connect API in libpqwalreceiver
Buildfarm member bowerbird has been failing in the pg_rewind test since
this patch went in. It looks like it's failing to complete connections
from the standby, which suggests that something platform-specific is
missing from this commit, but I dunno what.
Hmm, I wonder how widely tested the async connection API is on Windows
at all. I only see bowerbird and jacana running bin-check on Windows.
Yeah, I was wondering if this is just exposing a pre-existing bug.
However, the "normal" path operates by repeatedly invoking PQconnectPoll
(cf. connectDBComplete) so it's not immediately obvious how such a bug
would've escaped detection.
(After a long period of fruitless empirical testing I turned to the code)
Maybe I'm missing something, but connectDBComplete() handles a return of
PGRESS_POLLING_OK as a success while connectDBStart() seems not to. I
don't find anywhere in our code other than libpqwalreceiver that
actually uses that interface, so it's not surprising if it's now
failing. So my bet is it is indeed a long-standing bug.
cheers
andrew
--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
On 03/03/2017 11:11 PM, Tom Lane wrote:
Yeah, I was wondering if this is just exposing a pre-existing bug.
However, the "normal" path operates by repeatedly invoking PQconnectPoll
(cf. connectDBComplete) so it's not immediately obvious how such a bug
would've escaped detection.
(After a long period of fruitless empirical testing I turned to the code)
Maybe I'm missing something, but connectDBComplete() handles a return of
PGRES_POLLING_OK as a success while connectDBStart() seems not to. I
don't find anywhere in our code other than libpqwalreceiver that
actually uses that interface, so it's not surprising if it's now
failing. So my bet is it is indeed a long-standing bug.
Meh ... that argument doesn't hold water, because the old code here called
PQconnectdbParams which is just PQconnectStartParams then
connectDBComplete. So the problem cannot be in connectDBStart; that's
common to both paths. It has to be some discrepancy between what
connectDBComplete does and what the new loop in libpqwalreceiver is doing.
The original loop coding in 1e8a85009 was not very close to the documented
spec for PQconnectPoll at all, and while e434ad39a made it closer, it's
still not really the same: connectDBComplete doesn't call PQconnectPoll
until the socket is known read-ready or write-ready. The walreceiver loop
does not guarantee that, but would make an additional call after any
random other wakeup. It's not very clear why bowerbird, and only
bowerbird, would be seeing such wakeups --- but I'm having a really hard
time seeing any other explanation for the change in behavior. (I wonder
whether bowerbird is telling us that WaitLatchOrSocket can sometimes
return prematurely on Windows.)
I'm also pretty sure that the ResetLatch call is in the wrong place, which
could lead to missed wakeups, though that's the opposite of the immediate
problem.
I'll try correcting these things and we'll see if it gets any better.
regards, tom lane
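(The contract Tom refers to, per the libpq documentation for
PQconnectPoll: after PQconnectStartParams succeeds, behave as if the
state were PGRES_POLLING_WRITING, and call PQconnectPoll again only once
the socket is ready in the indicated direction. A simplified client-side
sketch using plain select(), with error handling reduced to a bare
minimum; this is not the actual connectDBComplete source:)

	PostgresPollingStatusType status = PGRES_POLLING_WRITING;

	while (status != PGRES_POLLING_OK && status != PGRES_POLLING_FAILED)
	{
		int			sock = PQsocket(conn);
		fd_set		rfds,
					wfds;

		FD_ZERO(&rfds);
		FD_ZERO(&wfds);
		if (status == PGRES_POLLING_READING)
			FD_SET(sock, &rfds);
		else
			FD_SET(sock, &wfds);

		/* Block until the socket is ready in the requested direction. */
		if (select(sock + 1, &rfds, &wfds, NULL, NULL) < 0)
			break;				/* real code would report the error */

		status = PQconnectPoll(conn);
	}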
On 15/03/17 17:55, Tom Lane wrote:
Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
On 03/03/2017 11:11 PM, Tom Lane wrote:
Yeah, I was wondering if this is just exposing a pre-existing bug.
However, the "normal" path operates by repeatedly invoking PQconnectPoll
(cf. connectDBComplete) so it's not immediately obvious how such a bug
would've escaped detection.
(After a long period of fruitless empirical testing I turned to the code)
Maybe I'm missing something, but connectDBComplete() handles a return of
PGRES_POLLING_OK as a success while connectDBStart() seems not to. I
don't find anywhere in our code other than libpqwalreceiver that
actually uses that interface, so it's not surprising if it's now
failing. So my bet is it is indeed a long-standing bug.
Meh ... that argument doesn't hold water, because the old code here called
PQconnectdbParams which is just PQconnectStartParams then
connectDBComplete. So the problem cannot be in connectDBStart; that's
common to both paths. It has to be some discrepancy between what
connectDBComplete does and what the new loop in libpqwalreceiver is doing.
The original loop coding in 1e8a85009 was not very close to the documented
spec for PQconnectPoll at all, and while e434ad39a made it closer, it's
still not really the same: connectDBComplete doesn't call PQconnectPoll
until the socket is known read-ready or write-ready. The walreceiver loop
does not guarantee that, but would make an additional call after any
random other wakeup. It's not very clear why bowerbird, and only
bowerbird, would be seeing such wakeups --- but I'm having a really hard
time seeing any other explanation for the change in behavior. (I wonder
whether bowerbird is telling us that WaitLatchOrSocket can sometimes
return prematurely on Windows.)
I'm also pretty sure that the ResetLatch call is in the wrong place, which
could lead to missed wakeups, though that's the opposite of the immediate
problem.
I'll try correcting these things and we'll see if it gets any better.
Looks like that didn't help either.
I set up my own Windows machine and can reproduce the issue. I played
around a bit and could not really find a fix other than adding WL_TIMEOUT
and a short timeout to WaitLatchOrSocket (it otherwise waits a very long
time in WaitLatchOrSocket before failing).
So I wonder if this is the same issue that caused us to use different
coding for WaitLatchOrSocket in pgstat.c (lines ~3918-3940).
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On 2017-03-16 13:00:54 +0100, Petr Jelinek wrote:
On 15/03/17 17:55, Tom Lane wrote:
Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
On 03/03/2017 11:11 PM, Tom Lane wrote:
Yeah, I was wondering if this is just exposing a pre-existing bug.
However, the "normal" path operates by repeatedly invoking PQconnectPoll
(cf. connectDBComplete) so it's not immediately obvious how such a bug
would've escaped detection.
(After a long period of fruitless empirical testing I turned to the code)
Maybe I'm missing something, but connectDBComplete() handles a return of
PGRES_POLLING_OK as a success while connectDBStart() seems not to. I
don't find anywhere in our code other than libpqwalreceiver that
actually uses that interface, so it's not surprising if it's now
failing. So my bet is it is indeed a long-standing bug.
Meh ... that argument doesn't hold water, because the old code here called
PQconnectdbParams which is just PQconnectStartParams then
connectDBComplete. So the problem cannot be in connectDBStart; that's
common to both paths. It has to be some discrepancy between what
connectDBComplete does and what the new loop in libpqwalreceiver is doing.
The original loop coding in 1e8a85009 was not very close to the documented
spec for PQconnectPoll at all, and while e434ad39a made it closer, it's
still not really the same: connectDBComplete doesn't call PQconnectPoll
until the socket is known read-ready or write-ready. The walreceiver loop
does not guarantee that, but would make an additional call after any
random other wakeup. It's not very clear why bowerbird, and only
bowerbird, would be seeing such wakeups --- but I'm having a really hard
time seeing any other explanation for the change in behavior. (I wonder
whether bowerbird is telling us that WaitLatchOrSocket can sometimes
return prematurely on Windows.)
I'm also pretty sure that the ResetLatch call is in the wrong place, which
could lead to missed wakeups, though that's the opposite of the immediate
problem.
I'll try correcting these things and we'll see if it gets any better.
Looks like that didn't help either.
I set up my own Windows machine and can reproduce the issue. I played
around a bit and could not really find a fix other than adding WL_TIMEOUT
and a short timeout to WaitLatchOrSocket (it otherwise waits a very long
time in WaitLatchOrSocket before failing).
Hm. Could you use Process Explorer or such to see the exact events
happening? Seeing that might help us to nail this down.
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
On 2017-03-16 13:00:54 +0100, Petr Jelinek wrote:
Looks like that didn't help either.
I set up my own Windows machine and can reproduce the issue. I played
around a bit and could not really find a fix other than adding WL_TIMEOUT
and a short timeout to WaitLatchOrSocket (it otherwise waits a very long
time in WaitLatchOrSocket before failing).
Hm. Could you use Process Explorer or such to see the exact events
happening? Seeing that might help us to nail this down.
At this point it's hard to escape the conclusion that we're looking at
some sort of bug in the Windows version of the WaitEventSet infrastructure
--- or at least, the WaitEventSet-based waits for socket events are
evidently behaving differently from the implementation on the libpq side,
which is based on pqWaitTimed and thence pqSocketCheck and thence
pqSocketPoll.
Noticing that bowerbird builds with openssl, I'm a bit suspicious that
the issue might have something to do with the fact that pqSocketCheck
has a special case for SSL, which I don't think WaitEventSet knows about.
But I don't think SSL is actually active in the buildfarm run, so it's
hard to see how we get to that scenario exactly --- or even less why it
would only matter on Windows.
regards, tom lane
On 03/08/2017 08:29 PM, Andrew Dunstan wrote:
On 03/08/2017 08:33 AM, Peter Eisentraut wrote:
On 3/6/17 09:38, Peter Eisentraut wrote:
On 3/4/17 01:45, Petr Jelinek wrote:
If that's the case, the attached should fix it, but I have no way of
testing it on Windows; I can only say that it still works on my machine,
so at least it hopefully does not make things worse.
Committed that. Let's see how it goes.
So that didn't work. Now we probably need someone to dig into that host
directly.
Andrew, could you help?
I'll take a look when I get a chance. Might be a few days.
I have confirmed on jacana Petr's observation that adding a timeout to
the WaitLatchOrSocket cures the problem.
cheers
andrew
--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
I have confirmed on jacana Petr's observation that adding a timeout to
the WaitLatchOrSocket cures the problem.
Does anyone have a theory as to why that cures the problem?
What length of timeout is being suggested here? Would a long timeout
perhaps create a performance issue? That is, I'm wondering if the
wait actually sleeps to the end of the timeout in the problem case(s).
regards, tom lane
On 17/03/17 16:07, Tom Lane wrote:
Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
I have confirmed on jacana Petr's observation that adding a timeout to
the WaitLatchOrSocket cures the problem.
Does anyone have a theory as to why that cures the problem?
I now have a theory and even a PoC patch for it.
The long wait in WaitLatchOrSocket happens after the connection state
switches from CONNECTION_STARTED to CONNECTION_MADE; in both cases the
return value of PQconnectPoll is PGRES_POLLING_WRITING, so we wait for
FD_WRITE.
Now the documentation for WSAEventSelect says "The FD_WRITE network
event is handled slightly differently. An FD_WRITE network event is
recorded when a socket is first connected with a call to the connect,
ConnectEx, WSAConnect, WSAConnectByList, or WSAConnectByName function or
when a socket is accepted with accept, AcceptEx, or WSAAccept function
and then after a send fails with WSAEWOULDBLOCK and buffer space becomes
available. Therefore, an application can assume that sends are possible
starting from the first FD_WRITE network event setting and lasting until
a send returns WSAEWOULDBLOCK. After such a failure the application will
find out that sends are again possible when an FD_WRITE network event is
recorded and the associated event object is set."
But while PQconnectPoll does connect() before setting the connection
status to CONNECTION_STARTED and returning PGRES_POLLING_WRITING (so the
FD_WRITE eventually happens), it does not do any writes in the code block
that switches to CONNECTION_MADE and returns PGRES_POLLING_WRITING. That
means the FD_WRITE event is never recorded, as per the quoted
documentation.
Then what remains is why it works in libpq. If you look at
pgwin32_select, which is eventually called by the libpq code, it actually
handles the situation by trying an empty WSASend on any socket that is
supposed to wait for FD_WRITE, and only calling WaitForMultipleObjectsEx
when the WSASend fails with WSAEWOULDBLOCK; when the WSASend succeeds it
immediately returns OK.
So I did something similar in the attached for WaitEventSetWaitBlock()
and it indeed solves the issue on my Windows test machine. The
coding/placement probably isn't the best (and there are no comments), but
maybe somebody will find a proper place for this now that we know the
cause.
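(Extracted into a standalone form, the writability probe borrowed from
pgwin32_select looks roughly like this; WSASend and WSAGetLastError are
the real Winsock calls, but the wrapper function is a hypothetical
illustration, not the patch itself:)

	/* Return true if the socket is already writable, per the
	 * zero-length WSASend trick used by pgwin32_select. */
	static bool
	socket_is_writable(SOCKET sock)
	{
		char		c;
		WSABUF		buf;
		DWORD		sent;

		buf.buf = &c;
		buf.len = 0;			/* zero-length send: probes writability only */

		if (WSASend(sock, &buf, 1, &sent, 0, NULL, NULL) == 0)
			return true;		/* send accepted; no need to wait for FD_WRITE */
		if (WSAGetLastError() == WSAEWOULDBLOCK)
			return false;		/* not writable yet; wait for FD_WRITE */
		return false;			/* real code would report the error */
	}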
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Attachment: fix-wait-for-writable-socket-on-windows.patch (text/x-patch)
diff --git a/src/backend/storage/ipc/latch.c b/src/backend/storage/ipc/latch.c
index ea7f930..300b866 100644
--- a/src/backend/storage/ipc/latch.c
+++ b/src/backend/storage/ipc/latch.c
@@ -1401,6 +1401,37 @@ WaitEventSetWaitBlock(WaitEventSet *set, int cur_timeout,
WaitEventAdjustWin32(set, cur_event);
cur_event->reset = false;
}
+
+ if (cur_event->events & WL_SOCKET_WRITEABLE)
+ {
+ char c;
+ WSABUF buf;
+ DWORD sent;
+ int r;
+
+ buf.buf = &c;
+ buf.len = 0;
+
+ r = WSASend(cur_event->fd, &buf, 1, &sent, 0, NULL, NULL);
+ if (r == 0)
+ {
+ occurred_events->pos = cur_event->pos;
+ occurred_events->user_data = cur_event->user_data;
+ occurred_events->events = WL_SOCKET_WRITEABLE;
+ occurred_events->fd = cur_event->fd;
+ return 1;
+ }
+ else
+ {
+ if (WSAGetLastError() != WSAEWOULDBLOCK)
+ /*
+ * Not completed, and not just "would block", so an error
+ * occurred
+ */
+ elog(ERROR, "failed writability check on socket: error code %u",
+ WSAGetLastError());
+ }
+ }
}
Petr Jelinek <petr.jelinek@2ndquadrant.com> writes:
Now the documentation for WSAEventSelect says "The FD_WRITE network
event is handled slightly differently. An FD_WRITE network event is
recorded when a socket is first connected with a call to the connect,
ConnectEx, WSAConnect, WSAConnectByList, or WSAConnectByName function or
when a socket is accepted with accept, AcceptEx, or WSAAccept function
and then after a send fails with WSAEWOULDBLOCK and buffer space becomes
available. Therefore, an application can assume that sends are possible
starting from the first FD_WRITE network event setting and lasting until
a send returns WSAEWOULDBLOCK. After such a failure the application will
find out that sends are again possible when an FD_WRITE network event is
recorded and the associated event object is set."
But while PQconnectPoll does connect() before setting the connection
status to CONNECTION_STARTED and returning PGRES_POLLING_WRITING (so the
FD_WRITE eventually happens), it does not do any writes in the code block
that switches to CONNECTION_MADE and returns PGRES_POLLING_WRITING. That
means the FD_WRITE event is never recorded, as per the quoted
documentation.
Ah-hah! Great detective work.
Then what remains is why it works in libpq. If you look at
pgwin32_select, which is eventually called by the libpq code, it actually
handles the situation by trying an empty WSASend on any socket that is
supposed to wait for FD_WRITE, and only calling WaitForMultipleObjectsEx
when the WSASend fails with WSAEWOULDBLOCK; when the WSASend succeeds it
immediately returns OK.
So I did something similar in the attached for WaitEventSetWaitBlock()
and it indeed solves the issue on my Windows test machine. The
coding/placement probably isn't the best (and there are no comments), but
maybe somebody will find a proper place for this now that we know the
cause.
Yeah, I'm afraid we had better do something more or less like this.
It's interesting to speculate about whether WaitEventSet could keep
additional state that would avoid the need for a dummy send() every
time, but that seems like material for new development not a bug fix.
In any case, a quick review of the code suggests that there are no
performance-critical places where we wait for WL_SOCKET_WRITEABLE
unless we've already detected that we have to wait for the network;
so the dummy send() isn't going to do anything except consume a
few more CPU cycles before we get to the point of blocking. It may
not be worth worrying about.
I'll add some comments and push this. Does everyone agree that
this had better get back-patched, too?
regards, tom lane
On 17/03/17 17:28, Tom Lane wrote:
Petr Jelinek <petr.jelinek@2ndquadrant.com> writes:
Now the documentation for WSAEventSelect says "The FD_WRITE network
event is handled slightly differently. An FD_WRITE network event is
recorded when a socket is first connected with a call to the connect,
ConnectEx, WSAConnect, WSAConnectByList, or WSAConnectByName function or
when a socket is accepted with accept, AcceptEx, or WSAAccept function
and then after a send fails with WSAEWOULDBLOCK and buffer space becomes
available. Therefore, an application can assume that sends are possible
starting from the first FD_WRITE network event setting and lasting until
a send returns WSAEWOULDBLOCK. After such a failure the application will
find out that sends are again possible when an FD_WRITE network event is
recorded and the associated event object is set."
But while PQconnectPoll does connect() before setting the connection
status to CONNECTION_STARTED and returning PGRES_POLLING_WRITING (so the
FD_WRITE eventually happens), it does not do any writes in the code block
that switches to CONNECTION_MADE and returns PGRES_POLLING_WRITING. That
means the FD_WRITE event is never recorded, as per the quoted
documentation.
Ah-hah! Great detective work.
Then what remains is why it works in libpq. If you look at
pgwin32_select, which is eventually called by the libpq code, it actually
handles the situation by trying an empty WSASend on any socket that is
supposed to wait for FD_WRITE, and only calling WaitForMultipleObjectsEx
when the WSASend fails with WSAEWOULDBLOCK; when the WSASend succeeds it
immediately returns OK.
So I did something similar in the attached for WaitEventSetWaitBlock()
and it indeed solves the issue on my Windows test machine. The
coding/placement probably isn't the best (and there are no comments), but
maybe somebody will find a proper place for this now that we know the
cause.
Yeah, I'm afraid we had better do something more or less like this.
It's interesting to speculate about whether WaitEventSet could keep
additional state that would avoid the need for a dummy send() every
time, but that seems like material for new development not a bug fix.
In any case, a quick review of the code suggests that there are no
performance-critical places where we wait for WL_SOCKET_WRITEABLE
unless we've already detected that we have to wait for the network;
so the dummy send() isn't going to do anything except consume a
few more CPU cycles before we get to the point of blocking. It may
not be worth worrying about.
Thanks, now that I look at it again I think we might need to loop over
the occurred_events and returned_events and not return on the first find,
since theoretically there can be multiple sockets if this gets called
directly and not via WaitLatchOrSocket(). I don't have the mental power
to finish it up right now though, sorry for that.
I'll add some comments and push this. Does everyone agree that
this had better get back-patched, too?
+1
--
Petr Jelinek http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
Petr Jelinek <petr.jelinek@2ndquadrant.com> writes:
On 17/03/17 17:28, Tom Lane wrote:
Yeah, I'm afraid we had better do something more or less like this.
It's interesting to speculate about whether WaitEventSet could keep
additional state that would avoid the need for a dummy send() every
time, but that seems like material for new development not a bug fix.
Thanks, now that I look at it again I think we might need to loop over
the occurred_events and returned_events and not return on the first find,
since theoretically there can be multiple sockets if this gets called
directly and not via WaitLatchOrSocket(). I don't have the mental power
to finish it up right now though, sorry for that.
I think WaitEventSet is only required to identify at least one reason
for ending the wait; it is not required to identify all of them.
(Even if it tried to do so, the information could be out of date by
the time it returns.) So I'm not particularly fussed about that,
even if we had code that waited on write-ready for multiple sockets
which I don't believe we do.
regards, tom lane
On 03/17/2017 12:28 PM, Tom Lane wrote:
Petr Jelinek <petr.jelinek@2ndquadrant.com> writes:
Now the documentation for WSAEventSelect says "The FD_WRITE network
event is handled slightly differently. An FD_WRITE network event is
recorded when a socket is first connected with a call to the connect,
ConnectEx, WSAConnect, WSAConnectByList, or WSAConnectByName function or
when a socket is accepted with accept, AcceptEx, or WSAAccept function
and then after a send fails with WSAEWOULDBLOCK and buffer space becomes
available. Therefore, an application can assume that sends are possible
starting from the first FD_WRITE network event setting and lasting until
a send returns WSAEWOULDBLOCK. After such a failure the application will
find out that sends are again possible when an FD_WRITE network event is
recorded and the associated event object is set."
But while PQconnectPoll does connect() before setting the connection
status to CONNECTION_STARTED and returning PGRES_POLLING_WRITING (so the
FD_WRITE eventually happens), it does not do any writes in the code block
that switches to CONNECTION_MADE and returns PGRES_POLLING_WRITING. That
means the FD_WRITE event is never recorded, as per the quoted
documentation.
Ah-hah! Great detective work.
Then what remains is why it works in libpq. If you look at
pgwin32_select, which is eventually called by the libpq code, it actually
handles the situation by trying an empty WSASend on any socket that is
supposed to wait for FD_WRITE, and only calling WaitForMultipleObjectsEx
when the WSASend fails with WSAEWOULDBLOCK; when the WSASend succeeds it
immediately returns OK.
So I did something similar in the attached for WaitEventSetWaitBlock()
and it indeed solves the issue on my Windows test machine. The
coding/placement probably isn't the best (and there are no comments), but
maybe somebody will find a proper place for this now that we know the
cause.
Yeah, I'm afraid we had better do something more or less like this.
It's interesting to speculate about whether WaitEventSet could keep
additional state that would avoid the need for a dummy send() every
time, but that seems like material for new development not a bug fix.
In any case, a quick review of the code suggests that there are no
performance-critical places where we wait for WL_SOCKET_WRITEABLE
unless we've already detected that we have to wait for the network;
so the dummy send() isn't going to do anything except consume a
few more CPU cycles before we get to the point of blocking. It may
not be worth worrying about.
I'll add some comments and push this. Does everyone agree that
this had better get back-patched, too?
Confirmed that this fixes the problem on jacana.
+1 for backpatching.
Does that mean we could remove the special logic in pgstat.c?
cheers
andrew
--
Andrew Dunstan https://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Andrew Dunstan <andrew.dunstan@2ndquadrant.com> writes:
On 03/17/2017 12:28 PM, Tom Lane wrote:
I'll add some comments and push this. Does everyone agree that
this had better get back-patched, too?
Confirmed that this fixes the problem on jacana.
+1 for backpatching.
I've pushed this into 9.6, but the WaitEventSet code doesn't exist before
that. It might be that we should apply a similar fix to the predecessor
code, but I'm not really excited about it given that we have no indication
that the bug can be hit in the older branches. Anyway, lacking a Windows
machine to test on, I have no interest in trying to make such a patch
blindly.
Does that mean we could remove the special logic in pgstat.c?
If you mean the bit near line 3930 about
* Windows, at least in its Windows Server 2003 R2 incarnation,
* sometimes loses FD_READ events. Waking up and retrying the recv()
* fixes that, so don't sleep indefinitely. This is a crock of the
* first water, but until somebody wants to debug exactly what's
* happening there, this is the best we can do.
that doesn't seem related --- that's about missing FD_READ events, not
FD_WRITE.
regards, tom lane
I wrote:
Petr Jelinek <petr.jelinek@2ndquadrant.com> writes:
Thanks, now that I look at it again I think we might need to loop over
the occurred_events and returned_events and not return on the first find,
since theoretically there can be multiple sockets if this gets called
directly and not via WaitLatchOrSocket(). I don't have the mental power
to finish it up right now though, sorry for that.
I think WaitEventSet is only required to identify at least one reason
for ending the wait; it is not required to identify all of them.
Also, I see the header comment for the Windows version of
WaitEventSetWaitBlock says it really only ever reports one event anyway.
So there seems no need to work harder than this. Pushed with cosmetic
adjustments.
regards, tom lane