psql not responding to SIGINT upon db reconnection

Started by Tristan Partinover 2 years ago43 messages
#1Tristan Partin
tristan@neon.tech
1 attachment(s)

Neon provides a quick start mechanism for psql using the following
workflow:

$ psql -h pg.neon.tech
NOTICE: Welcome to Neon!
Authenticate by visiting:
https://console.neon.tech/psql_session/xxx

Upon navigating to the link, the user selects their database to work
with. The psql process is then connected to a Postgres database at Neon.

psql provides a way to reconnect to the database from within itself: \c.
If, after connecting to a database such as the process previously
mentioned, the user starts a reconnection to the database from within
psql, the process will refuse to interrupt the reconnection attempt
after sending SIGINT to it.

tristan=> \c
NOTICE: Welcome to Neon!
Authenticate by visiting:
https://console.neon.tech/psql_session/yyy

^C
^C^C
^C^C

I am not really sure if this was a problem on Windows machines, but the
attached patch should support it if it does. If this was never a problem
on Windows, it might be best to check for any regressions.

This was originally discussed in the a GitHub issue[0]https://github.com/neondatabase/neon/issues/1449 at Neon.

[0]: https://github.com/neondatabase/neon/issues/1449

--
Tristan Partin
Neon (https://neon.tech)

Attachments:

v1-0001-Allow-SIGINT-to-cancel-psql-database-reconnection.patchtext/x-patch; charset=utf-8; name=v1-0001-Allow-SIGINT-to-cancel-psql-database-reconnection.patchDownload
From b9ccfc3c84a25b8616fd40495954bb6f77788e28 Mon Sep 17 00:00:00 2001
From: Tristan Partin <tristan@neon.tech>
Date: Mon, 24 Jul 2023 11:12:59 -0500
Subject: [PATCH v1] Allow SIGINT to cancel psql database reconnections

After installing the SIGINT handler in psql, SIGINT can no longer cancel
database reconnections. For instance, if the user starts a reconnection
and then needs to do some form of interaction (ie psql is polling),
there is no way to cancel the reconnection process currently.

Restore the default SIGINT handler while polling in PQconnectdbParams(),
and put the custom handler back afterward.
---
 src/bin/psql/command.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 6733f008fd..e40413a229 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -40,6 +40,7 @@
 #include "large_obj.h"
 #include "libpq-fe.h"
 #include "libpq/pqcomm.h"
+#include "libpq/pqsignal.h"
 #include "mainloop.h"
 #include "portability/instr_time.h"
 #include "pqexpbuffer.h"
@@ -3530,8 +3531,11 @@ do_connect(enum trivalue reuse_previous_specification,
 	{
 		const char **keywords = pg_malloc((nconnopts + 1) * sizeof(*keywords));
 		const char **values = pg_malloc((nconnopts + 1) * sizeof(*values));
+		int			rc;
 		int			paramnum = 0;
 		PQconninfoOption *ci;
+		struct sigaction oldact;
+		struct sigaction newact = { 0 };
 
 		/*
 		 * Copy non-default settings into the PQconnectdbParams parameter
@@ -3576,9 +3580,24 @@ do_connect(enum trivalue reuse_previous_specification,
 		keywords[paramnum] = NULL;
 		values[paramnum] = NULL;
 
+		/*
+		 * Restore the default SIGINT behavior while within libpq. Otherwise, we
+		 * can never exit from polling for the database connection. Failure to
+		 * restore is non-fatal.
+		 */
+		newact.sa_handler = SIG_DFL;
+		rc = sigaction(SIGINT, &newact, &oldact);
+
 		/* Note we do not want libpq to re-expand the dbname parameter */
 		n_conn = PQconnectdbParams(keywords, values, false);
 
+		/*
+		 * Only attempt to restore the old handler if we successfully overwrote
+		 * it initially.
+		 */
+		if (rc == 0)
+			sigaction(SIGINT, &oldact, NULL);
+
 		pg_free(keywords);
 		pg_free(values);
 
-- 
Tristan Partin
Neon (https://neon.tech)

#2Gurjeet Singh
gurjeet@singh.im
In reply to: Tristan Partin (#1)
Re: psql not responding to SIGINT upon db reconnection

On Mon, Jul 24, 2023 at 9:26 AM Tristan Partin <tristan@neon.tech> wrote:

attached patch

+        /*
+         * Restore the default SIGINT behavior while within libpq.
Otherwise, we
+         * can never exit from polling for the database connection. Failure to
+         * restore is non-fatal.
+         */
+        newact.sa_handler = SIG_DFL;
+        rc = sigaction(SIGINT, &newact, &oldact);

There's no action taken if rc != 0. It doesn't seem right to
continue as if everything's fine when the handler registration fails.
At least a warning is warranted, so that the user reports such
failures to the community.

Best regards,
Gurjeet
http://Gurje.et

#3Tristan Partin
tristan@neon.tech
In reply to: Gurjeet Singh (#2)
Re: psql not responding to SIGINT upon db reconnection

On Mon Jul 24, 2023 at 12:00 PM CDT, Gurjeet Singh wrote:

On Mon, Jul 24, 2023 at 9:26 AM Tristan Partin <tristan@neon.tech> wrote:

attached patch

+        /*
+         * Restore the default SIGINT behavior while within libpq.
Otherwise, we
+         * can never exit from polling for the database connection. Failure to
+         * restore is non-fatal.
+         */
+        newact.sa_handler = SIG_DFL;
+        rc = sigaction(SIGINT, &newact, &oldact);

There's no action taken if rc != 0. It doesn't seem right to
continue as if everything's fine when the handler registration fails.
At least a warning is warranted, so that the user reports such
failures to the community.

If we fail to go back to the default handler, then we just get the
behavior we currently have. I am not sure logging a message like "Failed
to restore default SIGINT handler" is that useful to the user. It isn't
actionable, and at the end of the day isn't going to affect them for the
most part. They also aren't even aware that default handler was ever
overridden in the first place. I'm more than happy to add a debug log if
it is the blocker to getting this patch accepted however.

--
Tristan Partin
Neon (https://neon.tech)

#4Tristan Partin
tristan@neon.tech
In reply to: Tristan Partin (#1)
1 attachment(s)
Re: psql not responding to SIGINT upon db reconnection

v2 is attached which fixes a grammatical issue in a comment.

--
Tristan Partin
Neon (https://neon.tech)

Attachments:

v2-0001-Allow-SIGINT-to-cancel-psql-database-reconnection.patchtext/x-patch; charset=utf-8; name=v2-0001-Allow-SIGINT-to-cancel-psql-database-reconnection.patchDownload
From b9ccfc3c84a25b8616fd40495954bb6f77788e28 Mon Sep 17 00:00:00 2001
From: Tristan Partin <tristan@neon.tech>
Date: Mon, 24 Jul 2023 11:12:59 -0500
Subject: [PATCH v2] Allow SIGINT to cancel psql database reconnections

After installing the SIGINT handler in psql, SIGINT can no longer cancel
database reconnections. For instance, if the user starts a reconnection
and then needs to do some form of interaction (ie psql is polling),
there is no way to cancel the reconnection process currently.

Restore the default SIGINT handler while polling in PQconnectdbParams(),
and put the custom handler back afterward.
---
 src/bin/psql/command.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 6733f008fd..e40413a229 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -40,6 +40,7 @@
 #include "large_obj.h"
 #include "libpq-fe.h"
 #include "libpq/pqcomm.h"
+#include "libpq/pqsignal.h"
 #include "mainloop.h"
 #include "portability/instr_time.h"
 #include "pqexpbuffer.h"
@@ -3530,8 +3531,11 @@ do_connect(enum trivalue reuse_previous_specification,
 	{
 		const char **keywords = pg_malloc((nconnopts + 1) * sizeof(*keywords));
 		const char **values = pg_malloc((nconnopts + 1) * sizeof(*values));
+		int			rc;
 		int			paramnum = 0;
 		PQconninfoOption *ci;
+		struct sigaction oldact;
+		struct sigaction newact = { 0 };
 
 		/*
 		 * Copy non-default settings into the PQconnectdbParams parameter
@@ -3576,9 +3580,24 @@ do_connect(enum trivalue reuse_previous_specification,
 		keywords[paramnum] = NULL;
 		values[paramnum] = NULL;
 
+		/*
+		 * Restore the default SIGINT behavior while within libpq. Otherwise, we
+		 * can never exit from polling for the database connection. Failure to
+		 * restore is non-fatal.
+		 */
+		newact.sa_handler = SIG_DFL;
+		rc = sigaction(SIGINT, &newact, &oldact);
+
 		/* Note we do not want libpq to re-expand the dbname parameter */
 		n_conn = PQconnectdbParams(keywords, values, false);
 
+		/*
+		 * Only attempt to restore the old handler if we successfully overwrote
+		 * it initially.
+		 */
+		if (rc == 0)
+			sigaction(SIGINT, &oldact, NULL);
+
 		pg_free(keywords);
 		pg_free(values);
 
-- 
Tristan Partin
Neon (https://neon.tech)

#5Tristan Partin
tristan@neon.tech
In reply to: Tristan Partin (#4)
1 attachment(s)
Re: psql not responding to SIGINT upon db reconnection

v3 is attached which fixes up some code comments I added which I hadn't
attached to the commit already, sigh.

--
Tristan Partin
Neon (https://neon.tech)

Attachments:

v3-0001-Allow-SIGINT-to-cancel-psql-database-reconnection.patchtext/x-patch; charset=utf-8; name=v3-0001-Allow-SIGINT-to-cancel-psql-database-reconnection.patchDownload
From 7f9554944911c77aa1a1900537a91e1e7bd75d93 Mon Sep 17 00:00:00 2001
From: Tristan Partin <tristan@neon.tech>
Date: Mon, 24 Jul 2023 11:12:59 -0500
Subject: [PATCH v3] Allow SIGINT to cancel psql database reconnections

After installing the SIGINT handler in psql, SIGINT can no longer cancel
database reconnections. For instance, if the user starts a reconnection
and then needs to do some form of interaction (ie psql is polling),
there is no way to cancel the reconnection process currently.

Restore the default SIGINT handler while polling in PQconnectdbParams(),
and put the custom handler back afterward.
---
 src/bin/psql/command.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 6733f008fd..11dedbb4c6 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -40,6 +40,7 @@
 #include "large_obj.h"
 #include "libpq-fe.h"
 #include "libpq/pqcomm.h"
+#include "libpq/pqsignal.h"
 #include "mainloop.h"
 #include "portability/instr_time.h"
 #include "pqexpbuffer.h"
@@ -3530,8 +3531,11 @@ do_connect(enum trivalue reuse_previous_specification,
 	{
 		const char **keywords = pg_malloc((nconnopts + 1) * sizeof(*keywords));
 		const char **values = pg_malloc((nconnopts + 1) * sizeof(*values));
+		int			rc;
 		int			paramnum = 0;
 		PQconninfoOption *ci;
+		struct sigaction oldact;
+		struct sigaction newact = { 0 };
 
 		/*
 		 * Copy non-default settings into the PQconnectdbParams parameter
@@ -3576,9 +3580,24 @@ do_connect(enum trivalue reuse_previous_specification,
 		keywords[paramnum] = NULL;
 		values[paramnum] = NULL;
 
+		/*
+		 * Restore the default SIGINT behavior while within libpq. Otherwise,
+		 * SIGINT can never exit from polling for the database connection.
+		 * Failure to restore the default is non-fatal.
+		 */
+		newact.sa_handler = SIG_DFL;
+		rc = sigaction(SIGINT, &newact, &oldact);
+
 		/* Note we do not want libpq to re-expand the dbname parameter */
 		n_conn = PQconnectdbParams(keywords, values, false);
 
+		/*
+		 * Only attempt to restore the old handler if we successfully overrode
+		 * it initially.
+		 */
+		if (rc == 0)
+			sigaction(SIGINT, &oldact, NULL);
+
 		pg_free(keywords);
 		pg_free(values);
 
-- 
Tristan Partin
Neon (https://neon.tech)

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tristan Partin (#5)
Re: psql not responding to SIGINT upon db reconnection

"Tristan Partin" <tristan@neon.tech> writes:

v3 is attached which fixes up some code comments I added which I hadn't
attached to the commit already, sigh.

I don't care for this patch at all. You're bypassing the pqsignal
abstraction layer that the rest of psql goes through, and the behavior
you're implementing isn't very nice. People do not expect ^C to
kill psql - it should just stop the \c attempt and leave you as you
were.

Admittedly, getting PQconnectdbParams to return control on SIGINT
isn't too practical. But you could probably replace that with a loop
around PQconnectPoll and test for CancelRequested in the loop.

regards, tom lane

#7Tristan Partin
tristan@neon.tech
In reply to: Tom Lane (#6)
1 attachment(s)
Re: psql not responding to SIGINT upon db reconnection

On Mon Jul 24, 2023 at 12:43 PM CDT, Tom Lane wrote:

"Tristan Partin" <tristan@neon.tech> writes:

v3 is attached which fixes up some code comments I added which I hadn't
attached to the commit already, sigh.

I don't care for this patch at all. You're bypassing the pqsignal
abstraction layer that the rest of psql goes through, and the behavior
you're implementing isn't very nice. People do not expect ^C to
kill psql - it should just stop the \c attempt and leave you as you
were.

Admittedly, getting PQconnectdbParams to return control on SIGINT
isn't too practical. But you could probably replace that with a loop
around PQconnectPoll and test for CancelRequested in the loop.

That sounds like a much better solution. Attached you will find a v4
that implements your suggestion. Please let me know if there is
something that I missed. I can confirm that the patch works.

$ ./build/src/bin/psql/psql -h pg.neon.tech
NOTICE: Welcome to Neon!
Authenticate by visiting:
https://console.neon.tech/psql_session/xxx

NOTICE: Connecting to database.
psql (17devel, server 15.3)
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)
Type "help" for help.

tristan957=> \c
NOTICE: Welcome to Neon!
Authenticate by visiting:
https://console.neon.tech/psql_session/yyy

^Cconnection to server at "pg.neon.tech" (3.18.6.96), port 5432 failed:
Previous connection kept
tristan957=>

--
Tristan Partin
Neon (https://neon.tech)

Attachments:

v4-0001-Allow-SIGINT-to-cancel-psql-database-reconnection.patchtext/x-patch; charset=utf-8; name=v4-0001-Allow-SIGINT-to-cancel-psql-database-reconnection.patchDownload
From 73a95bc537f205cdac567f3f1860b08cf8323e17 Mon Sep 17 00:00:00 2001
From: Tristan Partin <tristan@neon.tech>
Date: Mon, 24 Jul 2023 11:12:59 -0500
Subject: [PATCH v4] Allow SIGINT to cancel psql database reconnections

After installing the SIGINT handler in psql, SIGINT can no longer cancel
database reconnections. For instance, if the user starts a reconnection
and then needs to do some form of interaction (ie psql is polling),
there is no way to cancel the reconnection process currently.

Use PQconnectStartParams() in order to insert a CancelRequested check
into the polling loop.
---
 src/bin/psql/command.c | 25 ++++++++++++++++++++++++-
 1 file changed, 24 insertions(+), 1 deletion(-)

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 6733f008fd..bfecc25801 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -40,6 +40,7 @@
 #include "large_obj.h"
 #include "libpq-fe.h"
 #include "libpq/pqcomm.h"
+#include "libpq/pqsignal.h"
 #include "mainloop.h"
 #include "portability/instr_time.h"
 #include "pqexpbuffer.h"
@@ -3577,11 +3578,33 @@ do_connect(enum trivalue reuse_previous_specification,
 		values[paramnum] = NULL;
 
 		/* Note we do not want libpq to re-expand the dbname parameter */
-		n_conn = PQconnectdbParams(keywords, values, false);
+		n_conn = PQconnectStartParams(keywords, values, false);
 
 		pg_free(keywords);
 		pg_free(values);
 
+		while (true) {
+			int		socket;
+
+			if (CancelRequested)
+				break;
+
+			socket = PQsocket(n_conn);
+			if (socket == -1)
+				break;
+
+			switch (PQconnectPoll(n_conn)) {
+			case PGRES_POLLING_OK:
+			case PGRES_POLLING_FAILED:
+				break;
+			case PGRES_POLLING_READING:
+			case PGRES_POLLING_WRITING:
+				continue;
+			case PGRES_POLLING_ACTIVE:
+				pg_unreachable();
+			}
+		}
+
 		if (PQstatus(n_conn) == CONNECTION_OK)
 			break;
 
-- 
Tristan Partin
Neon (https://neon.tech)

#8Shlok Kyal
shlok.kyal.oss@gmail.com
In reply to: Tristan Partin (#7)
Re: psql not responding to SIGINT upon db reconnection

Hi,

That sounds like a much better solution. Attached you will find a v4
that implements your suggestion. Please let me know if there is
something that I missed. I can confirm that the patch works.

$ ./build/src/bin/psql/psql -h pg.neon.tech
NOTICE: Welcome to Neon!
Authenticate by visiting:
https://console.neon.tech/psql_session/xxx

NOTICE: Connecting to database.
psql (17devel, server 15.3)
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)
Type "help" for help.

tristan957=> \c
NOTICE: Welcome to Neon!
Authenticate by visiting:
https://console.neon.tech/psql_session/yyy

^Cconnection to server at "pg.neon.tech" (3.18.6.96), port 5432 failed:
Previous connection kept
tristan957=>

I went through CFbot and found out that some tests are timing out.
Links:
https://cirrus-ci.com/task/6735437444677632
https://cirrus-ci.com/task/4536414189125632
https://cirrus-ci.com/task/5046587584413696
https://cirrus-ci.com/task/6172487491256320
https://cirrus-ci.com/task/5609537537835008

Some tests which are timing out are as follows:
[00:48:49.243] Summary of Failures:
[00:48:49.243]
[00:48:49.243] 4/270 postgresql:regress / regress/regress TIMEOUT 1000.01s
[00:48:49.243] 6/270 postgresql:pg_upgrade / pg_upgrade/002_pg_upgrade
TIMEOUT 1000.02s
[00:48:49.243] 34/270 postgresql:recovery /
recovery/027_stream_regress TIMEOUT 1000.02s
[00:48:49.243] 48/270 postgresql:plpgsql / plpgsql/regress TIMEOUT 1000.02s
[00:48:49.243] 49/270 postgresql:plperl / plperl/regress TIMEOUT 1000.01s
[00:48:49.243] 61/270 postgresql:dblink / dblink/regress TIMEOUT 1000.03s
[00:48:49.243] 89/270 postgresql:postgres_fdw / postgres_fdw/regress
TIMEOUT 1000.01s
[00:48:49.243] 93/270 postgresql:test_decoding / test_decoding/regress
TIMEOUT 1000.02s
[00:48:49.243] 110/270 postgresql:test_extensions /
test_extensions/regress TIMEOUT 1000.03s
[00:48:49.243] 123/270 postgresql:unsafe_tests / unsafe_tests/regress
TIMEOUT 1000.02s
[00:48:49.243] 152/270 postgresql:pg_dump / pg_dump/010_dump_connstr
TIMEOUT 1000.03s

Just want to make sure you are aware of the issue.

Thanks
Shlok Kumar Kyal

#9Tristan Partin
tristan@neon.tech
In reply to: Shlok Kyal (#8)
Re: psql not responding to SIGINT upon db reconnection

On Thu Nov 2, 2023 at 4:03 AM CDT, Shlok Kyal wrote:

Hi,

That sounds like a much better solution. Attached you will find a v4
that implements your suggestion. Please let me know if there is
something that I missed. I can confirm that the patch works.

$ ./build/src/bin/psql/psql -h pg.neon.tech
NOTICE: Welcome to Neon!
Authenticate by visiting:
https://console.neon.tech/psql_session/xxx

NOTICE: Connecting to database.
psql (17devel, server 15.3)
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, compression: off)
Type "help" for help.

tristan957=> \c
NOTICE: Welcome to Neon!
Authenticate by visiting:
https://console.neon.tech/psql_session/yyy

^Cconnection to server at "pg.neon.tech" (3.18.6.96), port 5432 failed:
Previous connection kept
tristan957=>

I went through CFbot and found out that some tests are timing out.
Links:
https://cirrus-ci.com/task/6735437444677632
https://cirrus-ci.com/task/4536414189125632
https://cirrus-ci.com/task/5046587584413696
https://cirrus-ci.com/task/6172487491256320
https://cirrus-ci.com/task/5609537537835008

Some tests which are timing out are as follows:
[00:48:49.243] Summary of Failures:
[00:48:49.243]
[00:48:49.243] 4/270 postgresql:regress / regress/regress TIMEOUT 1000.01s
[00:48:49.243] 6/270 postgresql:pg_upgrade / pg_upgrade/002_pg_upgrade
TIMEOUT 1000.02s
[00:48:49.243] 34/270 postgresql:recovery /
recovery/027_stream_regress TIMEOUT 1000.02s
[00:48:49.243] 48/270 postgresql:plpgsql / plpgsql/regress TIMEOUT 1000.02s
[00:48:49.243] 49/270 postgresql:plperl / plperl/regress TIMEOUT 1000.01s
[00:48:49.243] 61/270 postgresql:dblink / dblink/regress TIMEOUT 1000.03s
[00:48:49.243] 89/270 postgresql:postgres_fdw / postgres_fdw/regress
TIMEOUT 1000.01s
[00:48:49.243] 93/270 postgresql:test_decoding / test_decoding/regress
TIMEOUT 1000.02s
[00:48:49.243] 110/270 postgresql:test_extensions /
test_extensions/regress TIMEOUT 1000.03s
[00:48:49.243] 123/270 postgresql:unsafe_tests / unsafe_tests/regress
TIMEOUT 1000.02s
[00:48:49.243] 152/270 postgresql:pg_dump / pg_dump/010_dump_connstr
TIMEOUT 1000.03s

Just want to make sure you are aware of the issue.

Hi Shlok!

Thanks for taking a look. I will check these failures out to see if
I can reproduce.

--
Tristan Partin
Neon (https://neon.tech)

#10Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Tristan Partin (#9)
Re: psql not responding to SIGINT upon db reconnection

On 06/11/2023 19:16, Tristan Partin wrote:

That sounds like a much better solution. Attached you will find a v4
that implements your suggestion. Please let me know if there is
something that I missed. I can confirm that the patch works.

This patch is missing a select(). It will busy loop until the connection
is established or cancelled.

Shouldn't we also clear CancelRequested after we have cancelled the
attempt? Otherwise, any subsequent attempts will immediately fail too.

Should we use 'cancel_pressed' here rather than CancelRequested? To be
honest, I don't understand the difference, so that's a genuine question.
There was an attempt at unifying them in the past but it was reverted in
commit 5d43c3c54d.

--
Heikki Linnakangas
Neon (https://neon.tech)

#11Tristan Partin
tristan@neon.tech
In reply to: Heikki Linnakangas (#10)
Re: psql not responding to SIGINT upon db reconnection

On Thu Nov 16, 2023 at 8:33 AM CST, Heikki Linnakangas wrote:

On 06/11/2023 19:16, Tristan Partin wrote:

That sounds like a much better solution. Attached you will find a v4
that implements your suggestion. Please let me know if there is
something that I missed. I can confirm that the patch works.

This patch is missing a select(). It will busy loop until the connection
is established or cancelled.

If I add a wait (select, poll, etc.), then I can't control-C during the
blocking call, so it doesn't really solve the problem. On Linux, we have
signalfds which seem like the perfect solution to this problem, "wait on
the pq socket or SIGINT." But that doesn't translate well to other
operating systems :(.

tristan957=> \c
NOTICE: Welcome to Neon!
Authenticate by visiting:
https://console.neon.tech/psql_session/XXXX

^CTerminated

You can see here that I can't terminate the command. Where you see
"Terminated" is me running `kill $(pgrep psql)` in another terminal.

Shouldn't we also clear CancelRequested after we have cancelled the
attempt? Otherwise, any subsequent attempts will immediately fail too.

After switching to cancel_pressed, I don't think so. See this bit of
code in the psql main loop:

/* main loop to get queries and execute them */
while (successResult == EXIT_SUCCESS)
{
/*
* Clean up after a previous Control-C
*/
if (cancel_pressed)
{
if (!pset.cur_cmd_interactive)
{
/*
* You get here if you stopped a script with Ctrl-C.
*/
successResult = EXIT_USER;
break;
}

cancel_pressed = false;
}

Should we use 'cancel_pressed' here rather than CancelRequested? To be
honest, I don't understand the difference, so that's a genuine question.
There was an attempt at unifying them in the past but it was reverted in
commit 5d43c3c54d.

The more I look at this, the more I don't understand... I need to look
at this harder, but wanted to get this email out. Switched to
cancel_pressed though. Thanks for pointing it out. Going to wait to send
a v2 while more discussion occurs.

--
Tristan Partin
Neon (https://neon.tech)

#12Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Tristan Partin (#11)
Re: psql not responding to SIGINT upon db reconnection

On 22/11/2023 19:29, Tristan Partin wrote:

On Thu Nov 16, 2023 at 8:33 AM CST, Heikki Linnakangas wrote:

On 06/11/2023 19:16, Tristan Partin wrote:

That sounds like a much better solution. Attached you will find a v4
that implements your suggestion. Please let me know if there is
something that I missed. I can confirm that the patch works.

This patch is missing a select(). It will busy loop until the connection
is established or cancelled.

If I add a wait (select, poll, etc.), then I can't control-C during the
blocking call, so it doesn't really solve the problem.

Hmm, they should return with EINTR on signal. At least on Linux; I'm not
sure how portable that is. See signal(7) man page, section "Interruption
of system calls and library functions by signal handlers". You could
also use a timeout like 5 s to ensure that you wake up and notice that
the signal was received eventually, even if it doesn't interrupt the
blocking call.

--
Heikki Linnakangas
Neon (https://neon.tech)

#13Tristan Partin
tristan@neon.tech
In reply to: Heikki Linnakangas (#12)
1 attachment(s)
Re: psql not responding to SIGINT upon db reconnection

On Wed Nov 22, 2023 at 3:00 PM CST, Heikki Linnakangas wrote:

On 22/11/2023 19:29, Tristan Partin wrote:

On Thu Nov 16, 2023 at 8:33 AM CST, Heikki Linnakangas wrote:

On 06/11/2023 19:16, Tristan Partin wrote:

That sounds like a much better solution. Attached you will find a v4
that implements your suggestion. Please let me know if there is
something that I missed. I can confirm that the patch works.

This patch is missing a select(). It will busy loop until the connection
is established or cancelled.

If I add a wait (select, poll, etc.), then I can't control-C during the
blocking call, so it doesn't really solve the problem.

Hmm, they should return with EINTR on signal. At least on Linux; I'm not
sure how portable that is. See signal(7) man page, section "Interruption
of system calls and library functions by signal handlers". You could
also use a timeout like 5 s to ensure that you wake up and notice that
the signal was received eventually, even if it doesn't interrupt the
blocking call.

Ha, you're right. I had this working yesterday, but convinced myself it
didn't. I had a do while loop wrapping the blocking call. Here is a v4,
which seems to pass the tests that were pointed out to be failing
earlier.

Noticed that I copy-pasted pqSocketPoll() into the psql code. I think
this may be controversial. Not sure what the best solution to the issue
is. I will pay attention to the buildfarm animals when they pick this
up.

--
Tristan Partin
Neon (https://neon.tech)

Attachments:

v4-0001-Allow-SIGINT-to-cancel-psql-database-reconnection.patchtext/x-patch; charset=utf-8; name=v4-0001-Allow-SIGINT-to-cancel-psql-database-reconnection.patchDownload
From 1b4303df1200c561cf478f9c7037ff940f6cd741 Mon Sep 17 00:00:00 2001
From: Tristan Partin <tristan@neon.tech>
Date: Mon, 24 Jul 2023 11:12:59 -0500
Subject: [PATCH v4] Allow SIGINT to cancel psql database reconnections

After installing the SIGINT handler in psql, SIGINT can no longer cancel
database reconnections. For instance, if the user starts a reconnection
and then needs to do some form of interaction (ie psql is polling),
there is no way to cancel the reconnection process currently.

Use PQconnectStartParams() in order to insert a CancelRequested check
into the polling loop.
---
 src/bin/psql/command.c | 129 ++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 128 insertions(+), 1 deletion(-)

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 82cc091568..588eed9351 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -11,6 +11,11 @@
 #include <time.h>
 #include <pwd.h>
 #include <utime.h>
+#if HAVE_POLL
+#include <poll.h>
+#else
+#include <sys/select.h>
+#endif
 #ifndef WIN32
 #include <sys/stat.h>			/* for stat() */
 #include <sys/time.h>			/* for setitimer() */
@@ -40,6 +45,7 @@
 #include "large_obj.h"
 #include "libpq-fe.h"
 #include "libpq/pqcomm.h"
+#include "libpq/pqsignal.h"
 #include "mainloop.h"
 #include "portability/instr_time.h"
 #include "pqexpbuffer.h"
@@ -3251,6 +3257,90 @@ copy_previous_query(PQExpBuffer query_buf, PQExpBuffer previous_buf)
 	return false;
 }
 
+/*
+ * Check a file descriptor for read and/or write data, possibly waiting.
+ * If neither forRead nor forWrite are set, immediately return a timeout
+ * condition (without waiting).  Return >0 if condition is met, 0
+ * if a timeout occurred, -1 if an error or interrupt occurred.
+ *
+ * Timeout is infinite if end_time is -1.  Timeout is immediate (no blocking)
+ * if end_time is 0 (or indeed, any time before now).
+ */
+static int
+pqSocketPoll(int sock, int forRead, int forWrite, time_t end_time)
+{
+	/* We use poll(2) if available, otherwise select(2) */
+#ifdef HAVE_POLL
+	struct pollfd input_fd;
+	int			timeout_ms;
+
+	if (!forRead && !forWrite)
+		return 0;
+
+	input_fd.fd = sock;
+	input_fd.events = POLLERR;
+	input_fd.revents = 0;
+
+	if (forRead)
+		input_fd.events |= POLLIN;
+	if (forWrite)
+		input_fd.events |= POLLOUT;
+
+	/* Compute appropriate timeout interval */
+	if (end_time == ((time_t) -1))
+		timeout_ms = -1;
+	else
+	{
+		time_t		now = time(NULL);
+
+		if (end_time > now)
+			timeout_ms = (end_time - now) * 1000;
+		else
+			timeout_ms = 0;
+	}
+
+	return poll(&input_fd, 1, timeout_ms);
+#else							/* !HAVE_POLL */
+
+	fd_set		input_mask;
+	fd_set		output_mask;
+	fd_set		except_mask;
+	struct timeval timeout;
+	struct timeval *ptr_timeout;
+
+	if (!forRead && !forWrite)
+		return 0;
+
+	FD_ZERO(&input_mask);
+	FD_ZERO(&output_mask);
+	FD_ZERO(&except_mask);
+	if (forRead)
+		FD_SET(sock, &input_mask);
+
+	if (forWrite)
+		FD_SET(sock, &output_mask);
+	FD_SET(sock, &except_mask);
+
+	/* Compute appropriate timeout interval */
+	if (end_time == ((time_t) -1))
+		ptr_timeout = NULL;
+	else
+	{
+		time_t		now = time(NULL);
+
+		if (end_time > now)
+			timeout.tv_sec = end_time - now;
+		else
+			timeout.tv_sec = 0;
+		timeout.tv_usec = 0;
+		ptr_timeout = &timeout;
+	}
+
+	return select(sock + 1, &input_mask, &output_mask,
+				  &except_mask, ptr_timeout);
+#endif							/* HAVE_POLL */
+}
+
 /*
  * Ask the user for a password; 'username' is the username the
  * password is for, if one has been explicitly specified.
@@ -3324,6 +3414,7 @@ do_connect(enum trivalue reuse_previous_specification,
 	bool		keep_password = true;
 	bool		has_connection_string;
 	bool		reuse_previous;
+	bool		for_read;
 
 	has_connection_string = dbname ?
 		recognized_connection_string(dbname) : false;
@@ -3614,11 +3705,47 @@ do_connect(enum trivalue reuse_previous_specification,
 		values[paramnum] = NULL;
 
 		/* Note we do not want libpq to re-expand the dbname parameter */
-		n_conn = PQconnectdbParams(keywords, values, false);
+		n_conn = PQconnectStartParams(keywords, values, false);
 
 		pg_free(keywords);
 		pg_free(values);
 
+		for_read = false;
+		while (true) {
+			int		rc;
+			int		sock;
+
+			if (cancel_pressed)
+				break;
+
+			sock = PQsocket(n_conn);
+			if (sock == -1)
+				break;
+
+			rc = pqSocketPoll(sock, for_read, !for_read, (time_t)-1);
+			Assert(rc != 0); /* Timeout is impossible. */
+			if (rc == -1)
+			{
+				success = false;
+				break;
+			}
+
+			switch (PQconnectPoll(n_conn)) {
+			case PGRES_POLLING_OK:
+			case PGRES_POLLING_FAILED:
+				goto finish;
+			case PGRES_POLLING_READING:
+				for_read = true;
+				continue;
+			case PGRES_POLLING_WRITING:
+				for_read = false;
+				continue;
+			case PGRES_POLLING_ACTIVE:
+				pg_unreachable();
+			}
+		}
+
+	finish:
 		if (PQstatus(n_conn) == CONNECTION_OK)
 			break;
 
-- 
Tristan Partin
Neon (https://neon.tech)

#14Heikki Linnakangas
hlinnaka@iki.fi
In reply to: Tristan Partin (#13)
Re: psql not responding to SIGINT upon db reconnection

On 22/11/2023 23:29, Tristan Partin wrote:

Ha, you're right. I had this working yesterday, but convinced myself it
didn't. I had a do while loop wrapping the blocking call. Here is a v4,
which seems to pass the tests that were pointed out to be failing
earlier.

Thanks! This suffers from a classic race condition:

+			if (cancel_pressed)
+				break;
+
+			sock = PQsocket(n_conn);
+			if (sock == -1)
+				break;
+
+			rc = pqSocketPoll(sock, for_read, !for_read, (time_t)-1);
+			Assert(rc != 0); /* Timeout is impossible. */
+			if (rc == -1)
+			{
+				success = false;
+				break;
+			}

If a signal arrives between the "if (cancel_pressed)" check and
pqSocketPoll(), pgSocketPoll will "miss" the signal and block
indefinitely. There are three solutions to this:

1. Use a timeout, so that you wake up periodically to check for any
missed signals. Easy solution, the downsides are that you will not react
as quickly if the signal is missed, and you waste some cycles by waking
up unnecessarily.

2. The Self-pipe trick: https://cr.yp.to/docs/selfpipe.html. We also use
that in src/backend/storage/ipc/latch.c. It's portable, but somewhat
complicated.

3. Use pselect() or ppoll(). They were created specifically to address
this problem. Not fully portable, I think it's missing on Windows at least.

Maybe use pselect() if it's available, and select() with a timeout if
it's not.

--
Heikki Linnakangas
Neon (https://neon.tech)

#15Tristan Partin
tristan@neon.tech
In reply to: Heikki Linnakangas (#14)
1 attachment(s)
Re: psql not responding to SIGINT upon db reconnection

On Thu Nov 23, 2023 at 3:19 AM CST, Heikki Linnakangas wrote:

On 22/11/2023 23:29, Tristan Partin wrote:

Ha, you're right. I had this working yesterday, but convinced myself it
didn't. I had a do while loop wrapping the blocking call. Here is a v4,
which seems to pass the tests that were pointed out to be failing
earlier.

Thanks! This suffers from a classic race condition:

+			if (cancel_pressed)
+				break;
+
+			sock = PQsocket(n_conn);
+			if (sock == -1)
+				break;
+
+			rc = pqSocketPoll(sock, for_read, !for_read, (time_t)-1);
+			Assert(rc != 0); /* Timeout is impossible. */
+			if (rc == -1)
+			{
+				success = false;
+				break;
+			}

If a signal arrives between the "if (cancel_pressed)" check and
pqSocketPoll(), pgSocketPoll will "miss" the signal and block
indefinitely. There are three solutions to this:

1. Use a timeout, so that you wake up periodically to check for any
missed signals. Easy solution, the downsides are that you will not react
as quickly if the signal is missed, and you waste some cycles by waking
up unnecessarily.

2. The Self-pipe trick: https://cr.yp.to/docs/selfpipe.html. We also use
that in src/backend/storage/ipc/latch.c. It's portable, but somewhat
complicated.

3. Use pselect() or ppoll(). They were created specifically to address
this problem. Not fully portable, I think it's missing on Windows at least.

Maybe use pselect() if it's available, and select() with a timeout if
it's not.

First I have ever heard of this race condition, and now I will commit it
to durable storage :). I went ahead and did option #3 that you proposed.
On Windows, I have a 1 second timeout for the select/poll. That seemed
like a good balance of user experience and spinning the CPU. But I am
open to other values. I don't have a Windows VM, but maybe I should set
one up...

I am not completely in love with the code I have written. Lots of
conditional compilation which makes it hard to read. Looking forward to
another round of review to see what y'all think.

For what it's worth, I did try #2, but since psql installs its own
SIGINT handler, the code became really hairy.

--
Tristan Partin
Neon (https://neon.tech)

Attachments:

v5-0001-Allow-SIGINT-to-cancel-psql-database-reconnection.patchtext/x-patch; charset=utf-8; name=v5-0001-Allow-SIGINT-to-cancel-psql-database-reconnection.patchDownload
From 4fdccf9211ac78bbd7430488d515646f9e9cce9b Mon Sep 17 00:00:00 2001
From: Tristan Partin <tristan@neon.tech>
Date: Mon, 24 Jul 2023 11:12:59 -0500
Subject: [PATCH v5] Allow SIGINT to cancel psql database reconnections

After installing the SIGINT handler in psql, SIGINT can no longer cancel
database reconnections. For instance, if the user starts a reconnection
and then needs to do some form of interaction (ie psql is polling),
there is no way to cancel the reconnection process currently.

Use PQconnectStartParams() in order to insert a CancelRequested check
into the polling loop.
---
 meson.build                |   1 +
 src/bin/psql/command.c     | 222 ++++++++++++++++++++++++++++++++++++-
 src/include/pg_config.h.in |   3 +
 src/tools/msvc/Solution.pm |   1 +
 4 files changed, 226 insertions(+), 1 deletion(-)

diff --git a/meson.build b/meson.build
index ee58ee7a06..2d63485c53 100644
--- a/meson.build
+++ b/meson.build
@@ -2440,6 +2440,7 @@ func_checks = [
   ['posix_fadvise'],
   ['posix_fallocate'],
   ['ppoll'],
+  ['pselect'],
   ['pstat'],
   ['pthread_barrier_wait', {'dependencies': [thread_dep]}],
   ['pthread_is_threaded_np', {'dependencies': [thread_dep]}],
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 82cc091568..c325fedb51 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -11,6 +11,11 @@
 #include <time.h>
 #include <pwd.h>
 #include <utime.h>
+#if HAVE_POLL
+#include <poll.h>
+#else
+#include <sys/select.h>
+#endif
 #ifndef WIN32
 #include <sys/stat.h>			/* for stat() */
 #include <sys/time.h>			/* for setitimer() */
@@ -40,6 +45,7 @@
 #include "large_obj.h"
 #include "libpq-fe.h"
 #include "libpq/pqcomm.h"
+#include "libpq/pqsignal.h"
 #include "mainloop.h"
 #include "portability/instr_time.h"
 #include "pqexpbuffer.h"
@@ -3251,6 +3257,169 @@ copy_previous_query(PQExpBuffer query_buf, PQExpBuffer previous_buf)
 	return false;
 }
 
+/*
+ * Check a file descriptor for read and/or write data, possibly waiting.
+ * If neither forRead nor forWrite are set, immediately return a timeout
+ * condition (without waiting).  Return >0 if condition is met, 0
+ * if a timeout occurred, -1 if an error or interrupt occurred.
+ *
+ * Timeout is infinite if end_time is -1.  Timeout is immediate (no blocking)
+ * if end_time is 0 (or indeed, any time before now).
+ */
+static int
+pqSocketPoll(int sock, int forRead, int forWrite, time_t end_time)
+{
+	/*
+	 * We use functions in the following order if available:
+	 *   - ppoll(2) OR pselect(2)
+	 *   - poll(2) OR select(2)
+	 */
+#ifdef HAVE_POLL
+	struct pollfd input_fd;
+#ifdef HAVE_PPOLL
+	int				rc;
+	sigset_t		emptyset;
+	sigset_t		blockset;
+	sigset_t		origset;
+	struct timespec timeout;
+	struct timespec *ptr_timeout;
+#else
+	int			timeout_ms;
+#endif
+
+	if (!forRead && !forWrite)
+		return 0;
+
+	input_fd.fd = sock;
+	input_fd.events = POLLERR;
+	input_fd.revents = 0;
+
+	if (forRead)
+		input_fd.events |= POLLIN;
+	if (forWrite)
+		input_fd.events |= POLLOUT;
+
+	/* Compute appropriate timeout interval */
+#ifdef HAVE_PPOLL
+	sigemptyset(&blockset);
+	sigaddset(&blockset, SIGINT);
+	sigprocmask(SIG_BLOCK, &blockset, &origset);
+
+	if (end_time == ((time_t) -1))
+		ptr_timeout = NULL;
+	else
+	{
+		timeout.tv_sec = end_time;
+		timeout.tv_nsec = 0;
+		ptr_timeout = &timeout;
+	}
+#else
+	if (end_time == ((time_t) -1))
+		timeout_ms = -1;
+	else
+	{
+		time_t		now = time(NULL);
+
+		if (end_time > now)
+			timeout_ms = (end_time - now) * 1000;
+		else
+			timeout_ms = 0;
+	}
+#endif
+
+#ifdef HAVE_PPOLL
+	sigemptyset(&emptyset);
+	if (cancel_pressed)
+	{
+		sigprocmask(SIG_SETMASK, &origset, NULL);
+		return 1;
+	}
+
+	rc = ppoll(&input_fd, 1, ptr_timeout, &emptyset);
+	sigprocmask(SIG_SETMASK, &origset, NULL);
+	return rc;
+#else
+	return poll(&input_fd, 1, timeout_ms);
+#endif
+#else							/* !HAVE_POLL */
+
+	fd_set		input_mask;
+	fd_set		output_mask;
+	fd_set		except_mask;
+#ifdef HAVE_PSELECT
+	int 			rc;
+	sigset_t		emptyset;
+	sigset_t		blockset;
+	sigset_t		origset;
+	struct timespec timeout;
+	struct timespec *ptr_timeout;
+#else
+	struct timeval timeout;
+	struct timeval *ptr_timeout;
+#endif
+
+	if (!forRead && !forWrite)
+		return 0;
+
+	FD_ZERO(&input_mask);
+	FD_ZERO(&output_mask);
+	FD_ZERO(&except_mask);
+	if (forRead)
+		FD_SET(sock, &input_mask);
+
+	if (forWrite)
+		FD_SET(sock, &output_mask);
+	FD_SET(sock, &except_mask);
+
+	/* Compute appropriate timeout interval */
+#ifdef HAVE_PSELECT
+	sigemptyset(&blockset);
+	sigaddset(&blockset, SIGINT);
+	sigprocmask(SIG_BLOCK, &blockset, &origset);
+
+	if (end_time == ((time_t) -1))
+		ptr_timeout = NULL;
+	else
+	{
+		timeout.tv_sec = end_time;
+		timeout.tv_nsec = 0;
+		ptr_timeout = &timeout;
+	}
+#else
+	if (end_time == ((time_t) -1))
+		ptr_timeout = NULL;
+	else
+	{
+		time_t		now = time(NULL);
+
+		if (end_time > now)
+			timeout.tv_sec = end_time - now;
+		else
+			timeout.tv_sec = 0;
+		timeout.tv_usec = 0;
+		ptr_timeout = &timeout;
+	}
+#endif
+
+#ifdef HAVE_PSELECT
+	sigemptyset(&emptyset);
+	if (cancel_pressed)
+	{
+		sigprocmask(SIG_SETMASK, &origset, NULL);
+		return 1;
+	}
+
+	rc = pselect(sock + 1, &input_mask, &output_mask,
+				 &except_mask, ptr_timeout, &emptyset);
+	sigprocmask(SIG_SETMASK, &origset, NULL);
+	return rc;
+#else
+	return select(sock + 1, &input_mask, &output_mask,
+				  &except_mask, ptr_timeout);
+#endif
+#endif							/* HAVE_POLL */
+}
+
 /*
  * Ask the user for a password; 'username' is the username the
  * password is for, if one has been explicitly specified.
@@ -3324,6 +3493,7 @@ do_connect(enum trivalue reuse_previous_specification,
 	bool		keep_password = true;
 	bool		has_connection_string;
 	bool		reuse_previous;
+	bool		for_read;
 
 	has_connection_string = dbname ?
 		recognized_connection_string(dbname) : false;
@@ -3614,11 +3784,61 @@ do_connect(enum trivalue reuse_previous_specification,
 		values[paramnum] = NULL;
 
 		/* Note we do not want libpq to re-expand the dbname parameter */
-		n_conn = PQconnectdbParams(keywords, values, false);
+		n_conn = PQconnectStartParams(keywords, values, false);
 
 		pg_free(keywords);
 		pg_free(values);
 
+		for_read = false;
+		while (true) {
+			int		rc;
+			int		sock;
+			time_t	timeout;
+
+			if (cancel_pressed)
+				break;
+
+			sock = PQsocket(n_conn);
+			if (sock == -1)
+				break;
+
+			/*
+			 * We use ppoll(2)/pselect(2) to account for the race condition in
+			 * which SIGINT is sent after checking cancel_pressed. But on
+			 * platforms that don't have either function, we can just spin the
+			 * CPU a bit polling, so set the timeout to 1 second if we don't
+			 * have the aforementioned functions, otherwise set timeout to
+			 * a negative value indicating we will sit and wait forever.
+			 */
+#if defined(HAVE_PPOLL) || defined(HAVE_PSELECT)
+			timeout = -1;
+#else
+			timeout = 1;
+#endif
+
+			rc = pqSocketPoll(sock, for_read, !for_read, timeout);
+			if (rc == -1)
+			{
+				success = false;
+				break;
+			}
+
+			switch (PQconnectPoll(n_conn)) {
+			case PGRES_POLLING_OK:
+			case PGRES_POLLING_FAILED:
+				goto finish;
+			case PGRES_POLLING_READING:
+				for_read = true;
+				continue;
+			case PGRES_POLLING_WRITING:
+				for_read = false;
+				continue;
+			case PGRES_POLLING_ACTIVE:
+				pg_unreachable();
+			}
+		}
+
+	finish:
 		if (PQstatus(n_conn) == CONNECTION_OK)
 			break;
 
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index d8a2985567..f9fd7d0de7 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -333,6 +333,9 @@
 /* Define to 1 if you have the `ppoll' function. */
 #undef HAVE_PPOLL
 
+/* Define to 1 if you have the `pselect' function. */
+#undef HAVE_PSELECT
+
 /* Define if you have POSIX threads libraries and header files. */
 #undef HAVE_PTHREAD
 
diff --git a/src/tools/msvc/Solution.pm b/src/tools/msvc/Solution.pm
index 98a5b5d872..d035f44f73 100644
--- a/src/tools/msvc/Solution.pm
+++ b/src/tools/msvc/Solution.pm
@@ -308,6 +308,7 @@ sub GenerateFiles
 		HAVE_POSIX_FADVISE => undef,
 		HAVE_POSIX_FALLOCATE => undef,
 		HAVE_PPOLL => undef,
+		HAVE_PSELECT => undef,
 		HAVE_PTHREAD => undef,
 		HAVE_PTHREAD_BARRIER_WAIT => undef,
 		HAVE_PTHREAD_IS_THREADED_NP => undef,
-- 
Tristan Partin
Neon (https://neon.tech)

#16Tristan Partin
tristan@neon.tech
In reply to: Tristan Partin (#15)
1 attachment(s)
Re: psql not responding to SIGINT upon db reconnection

On Wed Nov 29, 2023 at 11:48 AM CST, Tristan Partin wrote:

I am not completely in love with the code I have written. Lots of
conditional compilation which makes it hard to read. Looking forward to
another round of review to see what y'all think.

Ok. Here is a patch which just uses select(2) with a timeout of 1s or
pselect(2) if it is available. I also moved the state machine processing
into its own function.

Thanks for your comments thus far.

--
Tristan Partin
Neon (https://neon.tech)

Attachments:

v6-0001-Allow-SIGINT-to-cancel-psql-database-reconnection.patchtext/x-patch; charset=utf-8; name=v6-0001-Allow-SIGINT-to-cancel-psql-database-reconnection.patchDownload
From b22f286d3733d6d5ec2a924682679f6884b3600c Mon Sep 17 00:00:00 2001
From: Tristan Partin <tristan@neon.tech>
Date: Mon, 24 Jul 2023 11:12:59 -0500
Subject: [PATCH v6] Allow SIGINT to cancel psql database reconnections

After installing the SIGINT handler in psql, SIGINT can no longer cancel
database reconnections. For instance, if the user starts a reconnection
and then needs to do some form of interaction (ie psql is polling),
there is no way to cancel the reconnection process currently.

Use PQconnectStartParams() in order to insert a CancelRequested check
into the polling loop.
---
 meson.build                |   1 +
 src/bin/psql/command.c     | 224 ++++++++++++++++++++++++++++++++++++-
 src/include/pg_config.h.in |   3 +
 src/tools/msvc/Solution.pm |   1 +
 4 files changed, 228 insertions(+), 1 deletion(-)

diff --git a/meson.build b/meson.build
index ee58ee7a06..2d63485c53 100644
--- a/meson.build
+++ b/meson.build
@@ -2440,6 +2440,7 @@ func_checks = [
   ['posix_fadvise'],
   ['posix_fallocate'],
   ['ppoll'],
+  ['pselect'],
   ['pstat'],
   ['pthread_barrier_wait', {'dependencies': [thread_dep]}],
   ['pthread_is_threaded_np', {'dependencies': [thread_dep]}],
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 82cc091568..e87401b790 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -11,6 +11,11 @@
 #include <time.h>
 #include <pwd.h>
 #include <utime.h>
+#if HAVE_POLL
+#include <poll.h>
+#else
+#include <sys/select.h>
+#endif
 #ifndef WIN32
 #include <sys/stat.h>			/* for stat() */
 #include <sys/time.h>			/* for setitimer() */
@@ -40,6 +45,7 @@
 #include "large_obj.h"
 #include "libpq-fe.h"
 #include "libpq/pqcomm.h"
+#include "libpq/pqsignal.h"
 #include "mainloop.h"
 #include "portability/instr_time.h"
 #include "pqexpbuffer.h"
@@ -3251,6 +3257,169 @@ copy_previous_query(PQExpBuffer query_buf, PQExpBuffer previous_buf)
 	return false;
 }
 
+/*
+ * Check a file descriptor for read and/or write data, possibly waiting.
+ * If neither forRead nor forWrite are set, immediately return a timeout
+ * condition (without waiting).  Return >0 if condition is met, 0
+ * if a timeout occurred, -1 if an error or interrupt occurred.
+ *
+ * Timeout is infinite if end_time is -1.  Timeout is immediate (no blocking)
+ * if end_time is 0 (or indeed, any time before now).
+ */
+static int
+pqSocketPoll(int sock, int forRead, int forWrite, time_t end_time)
+{
+	/*
+	 * We use functions in the following order if available:
+	 *   - ppoll(2) OR pselect(2)
+	 *   - poll(2) OR select(2)
+	 */
+#ifdef HAVE_POLL
+	struct pollfd input_fd;
+#ifdef HAVE_PPOLL
+	int			rc;
+	sigset_t	emptyset;
+	sigset_t	blockset;
+	sigset_t	origset;
+	struct timespec timeout;
+	struct timespec *ptr_timeout;
+#else
+	int			timeout_ms;
+#endif
+
+	if (!forRead && !forWrite)
+		return 0;
+
+	input_fd.fd = sock;
+	input_fd.events = POLLERR;
+	input_fd.revents = 0;
+
+	if (forRead)
+		input_fd.events |= POLLIN;
+	if (forWrite)
+		input_fd.events |= POLLOUT;
+
+	/* Compute appropriate timeout interval */
+#ifdef HAVE_PPOLL
+	sigemptyset(&blockset);
+	sigaddset(&blockset, SIGINT);
+	sigprocmask(SIG_BLOCK, &blockset, &origset);
+
+	if (end_time == ((time_t) -1))
+		ptr_timeout = NULL;
+	else
+	{
+		timeout.tv_sec = end_time;
+		timeout.tv_nsec = 0;
+		ptr_timeout = &timeout;
+	}
+#else
+	if (end_time == ((time_t) -1))
+		timeout_ms = -1;
+	else
+	{
+		time_t		now = time(NULL);
+
+		if (end_time > now)
+			timeout_ms = (end_time - now) * 1000;
+		else
+			timeout_ms = 0;
+	}
+#endif
+
+#ifdef HAVE_PPOLL
+	sigemptyset(&emptyset);
+	if (cancel_pressed)
+	{
+		sigprocmask(SIG_SETMASK, &origset, NULL);
+		return 1;
+	}
+
+	rc = ppoll(&input_fd, 1, ptr_timeout, &emptyset);
+	sigprocmask(SIG_SETMASK, &origset, NULL);
+	return rc;
+#else
+	return poll(&input_fd, 1, timeout_ms);
+#endif
+#else							/* !HAVE_POLL */
+
+	fd_set		input_mask;
+	fd_set		output_mask;
+	fd_set		except_mask;
+#ifdef HAVE_PSELECT
+	int			rc;
+	sigset_t	emptyset;
+	sigset_t	blockset;
+	sigset_t	origset;
+	struct timespec timeout;
+	struct timespec *ptr_timeout;
+#else
+	struct timeval timeout;
+	struct timeval *ptr_timeout;
+#endif
+
+	if (!forRead && !forWrite)
+		return 0;
+
+	FD_ZERO(&input_mask);
+	FD_ZERO(&output_mask);
+	FD_ZERO(&except_mask);
+	if (forRead)
+		FD_SET(sock, &input_mask);
+
+	if (forWrite)
+		FD_SET(sock, &output_mask);
+	FD_SET(sock, &except_mask);
+
+	/* Compute appropriate timeout interval */
+#ifdef HAVE_PSELECT
+	sigemptyset(&blockset);
+	sigaddset(&blockset, SIGINT);
+	sigprocmask(SIG_BLOCK, &blockset, &origset);
+
+	if (end_time == ((time_t) -1))
+		ptr_timeout = NULL;
+	else
+	{
+		timeout.tv_sec = end_time;
+		timeout.tv_nsec = 0;
+		ptr_timeout = &timeout;
+	}
+#else
+	if (end_time == ((time_t) -1))
+		ptr_timeout = NULL;
+	else
+	{
+		time_t		now = time(NULL);
+
+		if (end_time > now)
+			timeout.tv_sec = end_time - now;
+		else
+			timeout.tv_sec = 0;
+		timeout.tv_usec = 0;
+		ptr_timeout = &timeout;
+	}
+#endif
+
+#ifdef HAVE_PSELECT
+	sigemptyset(&emptyset);
+	if (cancel_pressed)
+	{
+		sigprocmask(SIG_SETMASK, &origset, NULL);
+		return 1;
+	}
+
+	rc = pselect(sock + 1, &input_mask, &output_mask,
+				 &except_mask, ptr_timeout, &emptyset);
+	sigprocmask(SIG_SETMASK, &origset, NULL);
+	return rc;
+#else
+	return select(sock + 1, &input_mask, &output_mask,
+				  &except_mask, ptr_timeout);
+#endif
+#endif							/* HAVE_POLL */
+}
+
 /*
  * Ask the user for a password; 'username' is the username the
  * password is for, if one has been explicitly specified.
@@ -3324,6 +3493,7 @@ do_connect(enum trivalue reuse_previous_specification,
 	bool		keep_password = true;
 	bool		has_connection_string;
 	bool		reuse_previous;
+	bool		for_read;
 
 	has_connection_string = dbname ?
 		recognized_connection_string(dbname) : false;
@@ -3614,11 +3784,63 @@ do_connect(enum trivalue reuse_previous_specification,
 		values[paramnum] = NULL;
 
 		/* Note we do not want libpq to re-expand the dbname parameter */
-		n_conn = PQconnectdbParams(keywords, values, false);
+		n_conn = PQconnectStartParams(keywords, values, false);
 
 		pg_free(keywords);
 		pg_free(values);
 
+		for_read = false;
+		while (true)
+		{
+			int			rc;
+			int			sock;
+			time_t		timeout;
+
+			if (cancel_pressed)
+				break;
+
+			sock = PQsocket(n_conn);
+			if (sock == -1)
+				break;
+
+			/*
+			 * We use ppoll(2)/pselect(2) to account for the race condition in
+			 * which SIGINT is sent after checking cancel_pressed. But on
+			 * platforms that don't have either function, we can just spin the
+			 * CPU a bit polling, so set the timeout to 1 second if we don't
+			 * have the aforementioned functions, otherwise set timeout to a
+			 * negative value indicating we will sit and wait forever.
+			 */
+#if defined(HAVE_PPOLL) || defined(HAVE_PSELECT)
+			timeout = -1;
+#else
+			timeout = 1;
+#endif
+
+			rc = pqSocketPoll(sock, for_read, !for_read, timeout);
+			if (rc == -1)
+			{
+				success = false;
+				break;
+			}
+
+			switch (PQconnectPoll(n_conn))
+			{
+				case PGRES_POLLING_OK:
+				case PGRES_POLLING_FAILED:
+					goto finish;
+				case PGRES_POLLING_READING:
+					for_read = true;
+					continue;
+				case PGRES_POLLING_WRITING:
+					for_read = false;
+					continue;
+				case PGRES_POLLING_ACTIVE:
+					pg_unreachable();
+			}
+		}
+
+finish:
 		if (PQstatus(n_conn) == CONNECTION_OK)
 			break;
 
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index d8a2985567..f9fd7d0de7 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -333,6 +333,9 @@
 /* Define to 1 if you have the `ppoll' function. */
 #undef HAVE_PPOLL
 
+/* Define to 1 if you have the `pselect' function. */
+#undef HAVE_PSELECT
+
 /* Define if you have POSIX threads libraries and header files. */
 #undef HAVE_PTHREAD
 
diff --git a/src/tools/msvc/Solution.pm b/src/tools/msvc/Solution.pm
index 98a5b5d872..d035f44f73 100644
--- a/src/tools/msvc/Solution.pm
+++ b/src/tools/msvc/Solution.pm
@@ -308,6 +308,7 @@ sub GenerateFiles
 		HAVE_POSIX_FADVISE => undef,
 		HAVE_POSIX_FALLOCATE => undef,
 		HAVE_PPOLL => undef,
+		HAVE_PSELECT => undef,
 		HAVE_PTHREAD => undef,
 		HAVE_PTHREAD_BARRIER_WAIT => undef,
 		HAVE_PTHREAD_IS_THREADED_NP => undef,
-- 
Tristan Partin
Neon (https://neon.tech)

#17Robert Haas
robertmhaas@gmail.com
In reply to: Tristan Partin (#16)
Re: psql not responding to SIGINT upon db reconnection

On Tue, Dec 5, 2023 at 1:35 PM Tristan Partin <tristan@neon.tech> wrote:

On Wed Nov 29, 2023 at 11:48 AM CST, Tristan Partin wrote:

I am not completely in love with the code I have written. Lots of
conditional compilation which makes it hard to read. Looking forward to
another round of review to see what y'all think.

Ok. Here is a patch which just uses select(2) with a timeout of 1s or
pselect(2) if it is available. I also moved the state machine processing
into its own function.

Hmm, this adds a function called pqSocketPoll to psql/command.c. But
there already is such a function in libpq/fe-misc.c. It's not quite
the same, though. Having the same function in two different modules
with subtly different definitions seems like it's probably not the
right idea.

Also, this seems awfully complicated for something that's supposed to
(checks notes) wait for a file descriptor to become ready for I/O for
up to 1 second. It's 160 lines of code in pqSocketPoll and another 50
in the caller. If this is really the simplest way to do this, we
really need to rethink our libpq APIs or, uh, something. I wonder if
we could make this simpler by, say:

- always use select
- always use a 1-second timeout
- if it returns faster because the race condition doesn't happen, cool
- if it doesn't, well then you get to wait for a second, oh well

I don't feel strongly that that's the right way to go and if Heikki or
some other committer wants to go with this more complex conditional
approach, that's fine. But to me it seems like a lot.

--
Robert Haas
EDB: http://www.enterprisedb.com

#18Tristan Partin
tristan@neon.tech
In reply to: Robert Haas (#17)
1 attachment(s)
Re: psql not responding to SIGINT upon db reconnection

On Fri Jan 5, 2024 at 12:24 PM CST, Robert Haas wrote:

On Tue, Dec 5, 2023 at 1:35 PM Tristan Partin <tristan@neon.tech> wrote:

On Wed Nov 29, 2023 at 11:48 AM CST, Tristan Partin wrote:

I am not completely in love with the code I have written. Lots of
conditional compilation which makes it hard to read. Looking forward to
another round of review to see what y'all think.

Ok. Here is a patch which just uses select(2) with a timeout of 1s or
pselect(2) if it is available. I also moved the state machine processing
into its own function.

Hmm, this adds a function called pqSocketPoll to psql/command.c. But
there already is such a function in libpq/fe-misc.c. It's not quite
the same, though. Having the same function in two different modules
with subtly different definitions seems like it's probably not the
right idea.

Yep, not tied to the function name. Happy to rename as anyone suggests.

Also, this seems awfully complicated for something that's supposed to
(checks notes) wait for a file descriptor to become ready for I/O for
up to 1 second. It's 160 lines of code in pqSocketPoll and another 50
in the caller. If this is really the simplest way to do this, we
really need to rethink our libpq APIs or, uh, something. I wonder if
we could make this simpler by, say:

- always use select
- always use a 1-second timeout
- if it returns faster because the race condition doesn't happen, cool
- if it doesn't, well then you get to wait for a second, oh well

I don't feel strongly that that's the right way to go and if Heikki or
some other committer wants to go with this more complex conditional
approach, that's fine. But to me it seems like a lot.

I think the way to go is to expose some variation of libpq's
pqSocketPoll(), which I would be happy to put together a patch for.
Making frontends, psql in this case, have to reimplement the polling
logic doesn't strike me as fruitful, which is essentially what I have
done.

Thanks for your input!

But also wait a second. In my last email, I said:

Ok. Here is a patch which just uses select(2) with a timeout of 1s or
pselect(2) if it is available. I also moved the state machine processing
into its own function.

This is definitely not the patch I meant to send. What the? Here is what
I meant to send, but I stand by my comment that we should just expose
a variation of pqSocketPoll().

--
Tristan Partin
Neon (https://neon.tech)

Attachments:

v7-0001-Allow-SIGINT-to-cancel-psql-database-reconnection.patchtext/x-patch; charset=utf-8; name=v7-0001-Allow-SIGINT-to-cancel-psql-database-reconnection.patchDownload
From 5e19b26111708b654c7b49c3b9fd7dc18b94c0e7 Mon Sep 17 00:00:00 2001
From: Tristan Partin <tristan@neon.tech>
Date: Mon, 24 Jul 2023 11:12:59 -0500
Subject: [PATCH v7 1/2] Allow SIGINT to cancel psql database reconnections

After installing the SIGINT handler in psql, SIGINT can no longer cancel
database reconnections. For instance, if the user starts a reconnection
and then needs to do some form of interaction (ie psql is polling),
there is no way to cancel the reconnection process currently.

Use PQconnectStartParams() in order to insert a CancelRequested check
into the polling loop.
---
 meson.build                |   1 +
 src/bin/psql/command.c     | 156 ++++++++++++++++++++++++++++++++++++-
 src/include/pg_config.h.in |   3 +
 src/tools/msvc/Solution.pm |   1 +
 4 files changed, 160 insertions(+), 1 deletion(-)

diff --git a/meson.build b/meson.build
index ee58ee7a06..2d63485c53 100644
--- a/meson.build
+++ b/meson.build
@@ -2440,6 +2440,7 @@ func_checks = [
   ['posix_fadvise'],
   ['posix_fallocate'],
   ['ppoll'],
+  ['pselect'],
   ['pstat'],
   ['pthread_barrier_wait', {'dependencies': [thread_dep]}],
   ['pthread_is_threaded_np', {'dependencies': [thread_dep]}],
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 82cc091568..3a76623b05 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -11,6 +11,7 @@
 #include <time.h>
 #include <pwd.h>
 #include <utime.h>
+#include <sys/select.h>
 #ifndef WIN32
 #include <sys/stat.h>			/* for stat() */
 #include <sys/time.h>			/* for setitimer() */
@@ -40,6 +41,7 @@
 #include "large_obj.h"
 #include "libpq-fe.h"
 #include "libpq/pqcomm.h"
+#include "libpq/pqsignal.h"
 #include "mainloop.h"
 #include "portability/instr_time.h"
 #include "pqexpbuffer.h"
@@ -3298,6 +3300,157 @@ param_is_newly_set(const char *old_val, const char *new_val)
 	return false;
 }
 
+/*
+ * Check a file descriptor for read and/or write data, possibly waiting.
+ * If neither forRead nor forWrite are set, immediately return a timeout
+ * condition (without waiting).  Return >0 if condition is met, 0
+ * if a timeout occurred, -1 if an error or interrupt occurred.
+ *
+ * Timeout is infinite if end_time is -1.  Timeout is immediate (no blocking)
+ * if end_time is 0 (or indeed, any time before now).
+ *
+ * Uses the select(2) family of functions because it is available on every
+ * platform. It is unlikely that psql would be holding enough file descriptors
+ * that would necessitate using poll(2) or ppoll(2) for example.
+ */
+static int
+pqSocketPoll(int sock, int forRead, int forWrite, time_t end_time)
+{
+	/*
+	 * We use functions in the following order if available: pselect(2) or,
+	 * select(2).
+	 */
+	fd_set		input_mask;
+	fd_set		output_mask;
+	fd_set		except_mask;
+#ifdef HAVE_PSELECT
+	int			rc;
+	sigset_t	emptyset;
+	sigset_t	blockset;
+	sigset_t	origset;
+	struct timespec timeout;
+	struct timespec *ptr_timeout;
+#else
+	struct timeval timeout;
+	struct timeval *ptr_timeout;
+#endif
+
+	if (!forRead && !forWrite)
+		return 0;
+
+	FD_ZERO(&input_mask);
+	FD_ZERO(&output_mask);
+	FD_ZERO(&except_mask);
+	if (forRead)
+		FD_SET(sock, &input_mask);
+
+	if (forWrite)
+		FD_SET(sock, &output_mask);
+	FD_SET(sock, &except_mask);
+
+	/* Compute appropriate timeout interval */
+#ifdef HAVE_PSELECT
+	sigemptyset(&blockset);
+	sigaddset(&blockset, SIGINT);
+	sigprocmask(SIG_BLOCK, &blockset, &origset);
+
+	if (end_time == ((time_t) -1))
+		ptr_timeout = NULL;
+	else
+	{
+		timeout.tv_sec = end_time;
+		timeout.tv_nsec = 0;
+		ptr_timeout = &timeout;
+	}
+#else
+	if (end_time == ((time_t) -1))
+		ptr_timeout = NULL;
+	else
+	{
+		time_t		now = time(NULL);
+
+		if (end_time > now)
+			timeout.tv_sec = end_time - now;
+		else
+			timeout.tv_sec = 0;
+		timeout.tv_usec = 0;
+		ptr_timeout = &timeout;
+	}
+#endif
+
+#ifdef HAVE_PSELECT
+	sigemptyset(&emptyset);
+	if (cancel_pressed)
+	{
+		sigprocmask(SIG_SETMASK, &origset, NULL);
+		return 1;
+	}
+
+	rc = pselect(sock + 1, &input_mask, &output_mask,
+				 &except_mask, ptr_timeout, &emptyset);
+	sigprocmask(SIG_SETMASK, &origset, NULL);
+	return rc;
+#else
+	return select(sock + 1, &input_mask, &output_mask,
+				  &except_mask, ptr_timeout);
+#endif
+}
+
+static void
+process_connection_state_machine(PGconn *conn)
+{
+	bool		for_read = false;
+
+	while (true)
+	{
+		int			rc;
+		int			sock;
+		time_t		timeout;
+
+		if (cancel_pressed)
+			break;
+
+		sock = PQsocket(conn);
+		if (sock == -1)
+			break;
+
+		/*
+		 * We use pselect(2) to account for the race condition in which SIGINT
+		 * is sent after checking cancel_pressed. But on platforms that don't
+		 * have either function, we can just spin the CPU a bit polling, so
+		 * set the timeout to 1 second if we don't have the aforementioned
+		 * function. Otherwise, set timeout to a negative value indicating we
+		 * will sit and wait forever.
+		 */
+#if defined(HAVE_PSELECT)
+		timeout = -1;
+#else
+		timeout = 1;
+#endif
+
+		rc = pqSocketPoll(sock, for_read, !for_read, timeout);
+		if (rc == -1)
+			return;
+
+		switch (PQconnectPoll(conn))
+		{
+			case PGRES_POLLING_OK:
+			case PGRES_POLLING_FAILED:
+				return;
+			case PGRES_POLLING_READING:
+				for_read = true;
+				continue;
+			case PGRES_POLLING_WRITING:
+				for_read = false;
+				continue;
+			case PGRES_POLLING_ACTIVE:
+				pg_unreachable();
+		}
+	}
+
+	pg_unreachable();
+}
+
 /*
  * do_connect -- handler for \connect
  *
@@ -3614,11 +3767,12 @@ do_connect(enum trivalue reuse_previous_specification,
 		values[paramnum] = NULL;
 
 		/* Note we do not want libpq to re-expand the dbname parameter */
-		n_conn = PQconnectdbParams(keywords, values, false);
+		n_conn = PQconnectStartParams(keywords, values, false);
 
 		pg_free(keywords);
 		pg_free(values);
 
+		process_connection_state_machine(n_conn);
 		if (PQstatus(n_conn) == CONNECTION_OK)
 			break;
 
diff --git a/src/include/pg_config.h.in b/src/include/pg_config.h.in
index d8a2985567..f9fd7d0de7 100644
--- a/src/include/pg_config.h.in
+++ b/src/include/pg_config.h.in
@@ -333,6 +333,9 @@
 /* Define to 1 if you have the `ppoll' function. */
 #undef HAVE_PPOLL
 
+/* Define to 1 if you have the `pselect' function. */
+#undef HAVE_PSELECT
+
 /* Define if you have POSIX threads libraries and header files. */
 #undef HAVE_PTHREAD
 
diff --git a/src/tools/msvc/Solution.pm b/src/tools/msvc/Solution.pm
index 98a5b5d872..d035f44f73 100644
--- a/src/tools/msvc/Solution.pm
+++ b/src/tools/msvc/Solution.pm
@@ -308,6 +308,7 @@ sub GenerateFiles
 		HAVE_POSIX_FADVISE => undef,
 		HAVE_POSIX_FALLOCATE => undef,
 		HAVE_PPOLL => undef,
+		HAVE_PSELECT => undef,
 		HAVE_PTHREAD => undef,
 		HAVE_PTHREAD_BARRIER_WAIT => undef,
 		HAVE_PTHREAD_IS_THREADED_NP => undef,
-- 
Tristan Partin
Neon (https://neon.tech)

#19Robert Haas
robertmhaas@gmail.com
In reply to: Tristan Partin (#18)
Re: psql not responding to SIGINT upon db reconnection

On Mon, Jan 8, 2024 at 1:03 AM Tristan Partin <tristan@neon.tech> wrote:

I think the way to go is to expose some variation of libpq's
pqSocketPoll(), which I would be happy to put together a patch for.
Making frontends, psql in this case, have to reimplement the polling
logic doesn't strike me as fruitful, which is essentially what I have
done.

I encourage further exploration of this line of attack. I fear that if
I were to commit something like what you've posted up until now,
people would complain that that code was too ugly to live, and I'd
have a hard time telling them that they're wrong.

--
Robert Haas
EDB: http://www.enterprisedb.com

#20Tristan Partin
tristan@neon.tech
In reply to: Robert Haas (#19)
Re: psql not responding to SIGINT upon db reconnection

On Fri Jan 12, 2024 at 10:45 AM CST, Robert Haas wrote:

On Mon, Jan 8, 2024 at 1:03 AM Tristan Partin <tristan@neon.tech> wrote:

I think the way to go is to expose some variation of libpq's
pqSocketPoll(), which I would be happy to put together a patch for.
Making frontends, psql in this case, have to reimplement the polling
logic doesn't strike me as fruitful, which is essentially what I have
done.

I encourage further exploration of this line of attack. I fear that if
I were to commit something like what you've posted up until now,
people would complain that that code was too ugly to live, and I'd
have a hard time telling them that they're wrong.

Completely agree. Let me look into this. Perhaps I can get something up
next week or the week after.

--
Tristan Partin
Neon (https://neon.tech)

#21Tristan Partin
tristan@neon.tech
In reply to: Tristan Partin (#20)
2 attachment(s)
Re: psql not responding to SIGINT upon db reconnection

On Fri Jan 12, 2024 at 11:13 AM CST, Tristan Partin wrote:

On Fri Jan 12, 2024 at 10:45 AM CST, Robert Haas wrote:

On Mon, Jan 8, 2024 at 1:03 AM Tristan Partin <tristan@neon.tech> wrote:

I think the way to go is to expose some variation of libpq's
pqSocketPoll(), which I would be happy to put together a patch for.
Making frontends, psql in this case, have to reimplement the polling
logic doesn't strike me as fruitful, which is essentially what I have
done.

I encourage further exploration of this line of attack. I fear that if
I were to commit something like what you've posted up until now,
people would complain that that code was too ugly to live, and I'd
have a hard time telling them that they're wrong.

Completely agree. Let me look into this. Perhaps I can get something up
next week or the week after.

Not next week, but here is a respin. I've exposed pqSocketPoll as
PQsocketPoll and am just using that. You can see the diff is so much
smaller, which is great!

In order to fight the race condition, I am just using a 1 second timeout
instead of trying to integrate pselect or ppoll. We could add
a PQsocketPPoll() to support those use cases, but I am not sure how
available pselect and ppoll are. I guess on Windows we don't have
pselect. I don't think using the pipe trick that Heikki mentioned
earlier is suitable to expose via an API in libpq, but someone else
might have a different opinion.

Maybe this is good enough until someone complains? Most people would
probably just chalk any latency between keypress and cancellation as
network latency and not a hardcoded 1 second.

Thanks for your feedback Robert!

--
Tristan Partin
Neon (https://neon.tech)

Attachments:

v8-0001-Expose-PQsocketPoll-for-frontends.patchtext/x-patch; charset=utf-8; name=v8-0001-Expose-PQsocketPoll-for-frontends.patchDownload
From 4fa6db1900a355a171d3e16019d02f3a415764d0 Mon Sep 17 00:00:00 2001
From: Tristan Partin <tristan@neon.tech>
Date: Tue, 30 Jan 2024 15:40:57 -0600
Subject: [PATCH v8 1/2] Expose PQsocketPoll for frontends

PQsocketPoll should help with state machine processing when connecting
to a database asynchronously via PQconnectStart().
---
 doc/src/sgml/libpq.sgml         | 5 ++++-
 src/interfaces/libpq/fe-misc.c  | 7 +++----
 src/interfaces/libpq/libpq-fe.h | 4 ++++
 3 files changed, 11 insertions(+), 5 deletions(-)

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index d0d5aefadc..aa26c2cc8d 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -358,7 +358,10 @@ PostgresPollingStatusType PQconnectPoll(PGconn *conn);
        Loop thus: If <function>PQconnectPoll(conn)</function> last returned
        <symbol>PGRES_POLLING_READING</symbol>, wait until the socket is ready to
        read (as indicated by <function>select()</function>, <function>poll()</function>, or
-       similar system function).
+       similar system function).  Note that <function>PQsocketPoll</function>
+       can help reduce boilerplate by abstracting the setup of
+       <function>select()</function>, or <function>poll()</function> if it is
+       available on your system.
        Then call <function>PQconnectPoll(conn)</function> again.
        Conversely, if <function>PQconnectPoll(conn)</function> last returned
        <symbol>PGRES_POLLING_WRITING</symbol>, wait until the socket is ready
diff --git a/src/interfaces/libpq/fe-misc.c b/src/interfaces/libpq/fe-misc.c
index 47a28b0a3a..ee917d375d 100644
--- a/src/interfaces/libpq/fe-misc.c
+++ b/src/interfaces/libpq/fe-misc.c
@@ -55,7 +55,6 @@ static int	pqPutMsgBytes(const void *buf, size_t len, PGconn *conn);
 static int	pqSendSome(PGconn *conn, int len);
 static int	pqSocketCheck(PGconn *conn, int forRead, int forWrite,
 						  time_t end_time);
-static int	pqSocketPoll(int sock, int forRead, int forWrite, time_t end_time);
 
 /*
  * PQlibVersion: return the libpq version number
@@ -1059,7 +1058,7 @@ pqSocketCheck(PGconn *conn, int forRead, int forWrite, time_t end_time)
 
 	/* We will retry as long as we get EINTR */
 	do
-		result = pqSocketPoll(conn->sock, forRead, forWrite, end_time);
+		result = PQsocketPoll(conn->sock, forRead, forWrite, end_time);
 	while (result < 0 && SOCK_ERRNO == EINTR);
 
 	if (result < 0)
@@ -1083,8 +1082,8 @@ pqSocketCheck(PGconn *conn, int forRead, int forWrite, time_t end_time)
  * Timeout is infinite if end_time is -1.  Timeout is immediate (no blocking)
  * if end_time is 0 (or indeed, any time before now).
  */
-static int
-pqSocketPoll(int sock, int forRead, int forWrite, time_t end_time)
+int
+PQsocketPoll(int sock, int forRead, int forWrite, time_t end_time)
 {
 	/* We use poll(2) if available, otherwise select(2) */
 #ifdef HAVE_POLL
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index defc415fa3..11a7fd32b6 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -21,6 +21,7 @@ extern "C"
 #endif
 
 #include <stdio.h>
+#include <time.h>
 
 /*
  * postgres_ext.h defines the backend's externally visible types,
@@ -644,6 +645,9 @@ extern int	lo_export(PGconn *conn, Oid lobjId, const char *filename);
 /* Get the version of the libpq library in use */
 extern int	PQlibVersion(void);
 
+/* Poll a file descriptor for reading and/or writing with a timeout */
+extern int	PQsocketPoll(int sock, int forRead, int forWrite, time_t end_time);
+
 /* Determine length of multibyte encoded char at *s */
 extern int	PQmblen(const char *s, int encoding);
 
-- 
Tristan Partin
Neon (https://neon.tech)

v8-0002-Allow-SIGINT-to-cancel-psql-database-reconnection.patchtext/x-patch; charset=utf-8; name=v8-0002-Allow-SIGINT-to-cancel-psql-database-reconnection.patchDownload
From e4d61216b822852940631ada65b903b65780fb71 Mon Sep 17 00:00:00 2001
From: Tristan Partin <tristan@neon.tech>
Date: Tue, 30 Jan 2024 15:53:10 -0600
Subject: [PATCH v8 2/2] Allow SIGINT to cancel psql database reconnections

After installing the SIGINT handler in psql, SIGINT can no longer cancel
database reconnections. For instance, if the user starts a reconnection
and then needs to do some form of interaction (ie psql is polling),
there is no way to cancel the reconnection process currently.

Use PQconnectStartParams() in order to insert a cancel_pressed check
into the polling loop.
---
 leak.supp                        |  2 ++
 src/bin/psql/command.c           | 43 +++++++++++++++++++++++++++++++-
 src/interfaces/libpq/exports.txt |  1 +
 3 files changed, 45 insertions(+), 1 deletion(-)
 create mode 100644 leak.supp

diff --git a/leak.supp b/leak.supp
new file mode 100644
index 0000000000..f3fc27a1d8
--- /dev/null
+++ b/leak.supp
@@ -0,0 +1,2 @@
+leak:save_ps_display_args
+leak:parse_psql_options
diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 5c906e4806..1900657324 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -3279,6 +3279,46 @@ param_is_newly_set(const char *old_val, const char *new_val)
 	return false;
 }
 
+static void
+process_connection_state_machine(PGconn *conn)
+{
+	bool		for_read = false;
+
+	while (true)
+	{
+		int			rc;
+		int			sock;
+
+		if (cancel_pressed)
+			break;
+
+		sock = PQsocket(conn);
+		if (sock == -1)
+			break;
+
+		rc = PQsocketPoll(sock, for_read, !for_read, 1);
+		if (rc == -1)
+			return;
+
+		switch (PQconnectPoll(conn))
+		{
+			case PGRES_POLLING_OK:
+			case PGRES_POLLING_FAILED:
+				return;
+			case PGRES_POLLING_READING:
+				for_read = true;
+				continue;
+			case PGRES_POLLING_WRITING:
+				for_read = false;
+				continue;
+			case PGRES_POLLING_ACTIVE:
+				pg_unreachable();
+		}
+	}
+
+	pg_unreachable();
+}
+
 /*
  * do_connect -- handler for \connect
  *
@@ -3595,11 +3635,12 @@ do_connect(enum trivalue reuse_previous_specification,
 		values[paramnum] = NULL;
 
 		/* Note we do not want libpq to re-expand the dbname parameter */
-		n_conn = PQconnectdbParams(keywords, values, false);
+		n_conn = PQconnectStartParams(keywords, values, false);
 
 		pg_free(keywords);
 		pg_free(values);
 
+		process_connection_state_machine(n_conn);
 		if (PQstatus(n_conn) == CONNECTION_OK)
 			break;
 
diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt
index 088592deb1..7a7a6ddbab 100644
--- a/src/interfaces/libpq/exports.txt
+++ b/src/interfaces/libpq/exports.txt
@@ -193,3 +193,4 @@ PQsendClosePrepared       190
 PQsendClosePortal         191
 PQchangePassword          192
 PQsendPipelineSync        193
+PQsocketPoll              194
-- 
Tristan Partin
Neon (https://neon.tech)

#22Jelte Fennema-Nio
postgres@jeltef.nl
In reply to: Tristan Partin (#21)
Re: psql not responding to SIGINT upon db reconnection

On Tue, 30 Jan 2024 at 23:20, Tristan Partin <tristan@neon.tech> wrote:

Not next week, but here is a respin. I've exposed pqSocketPoll as
PQsocketPoll and am just using that. You can see the diff is so much
smaller, which is great!

The exports.txt change should be made part of patch 0001, also docs
are missing for the newly exposed function. PQsocketPoll does look
like quite a nice API to expose from libpq.

#23Tristan Partin
tristan@neon.tech
In reply to: Jelte Fennema-Nio (#22)
Re: psql not responding to SIGINT upon db reconnection

On Tue Jan 30, 2024 at 4:42 PM CST, Jelte Fennema-Nio wrote:

On Tue, 30 Jan 2024 at 23:20, Tristan Partin <tristan@neon.tech> wrote:

Not next week, but here is a respin. I've exposed pqSocketPoll as
PQsocketPoll and am just using that. You can see the diff is so much
smaller, which is great!

The exports.txt change should be made part of patch 0001, also docs
are missing for the newly exposed function. PQsocketPoll does look
like quite a nice API to expose from libpq.

I was looking for documentation of PQsocket(), but didn't find any
standalone (unless I completely missed it). So I just copied how
PQsocket() is documented in PQconnectPoll(). I am happy to document it
separately if you think it would be useful.

My bad on the exports.txt change being in the wrong commit. Git
things... I will fix it on the next re-spin after resolving the previous
paragraph.

--
Tristan Partin
Neon (https://neon.tech)

#24Jelte Fennema-Nio
postgres@jeltef.nl
In reply to: Tristan Partin (#23)
Re: psql not responding to SIGINT upon db reconnection

On Wed, 31 Jan 2024 at 19:07, Tristan Partin <tristan@neon.tech> wrote:

I was looking for documentation of PQsocket(), but didn't find any
standalone (unless I completely missed it). So I just copied how
PQsocket() is documented in PQconnectPoll(). I am happy to document it
separately if you think it would be useful.

PQsocket its documentation is here:
https://www.postgresql.org/docs/16/libpq-status.html#LIBPQ-PQSOCKET

I think PQsocketPoll should probably have its API documentation
(describing arguments and return value at minimum) in this section of
the docs: https://www.postgresql.org/docs/16/libpq-async.html
And I think the example in the PQconnnectPoll API docs could benefit
from having PQsocketPoll used in that example.

#25Robert Haas
robertmhaas@gmail.com
In reply to: Tristan Partin (#23)
Re: psql not responding to SIGINT upon db reconnection

On Wed, Jan 31, 2024 at 1:07 PM Tristan Partin <tristan@neon.tech> wrote:

I was looking for documentation of PQsocket(), but didn't find any
standalone (unless I completely missed it). So I just copied how
PQsocket() is documented in PQconnectPoll(). I am happy to document it
separately if you think it would be useful.

As Jelte said back at the end of January, I think you just completely
missed it. The relevant part of libpq.sgml starts like this:

<varlistentry id="libpq-PQsocket">
<term><function>PQsocket</function><indexterm><primary>PQsocket</primary></indexterm></term>

As far as I know, we document all of the exported libpq functions in
that SGML file.

I think there's no real reason why we couldn't get at least 0001 and
maybe also 0002 into this release, but only if you move quickly on
this. Feature freeze is approaching rapidly.

Modulo the documentation changes, I think 0001 is pretty much ready to
go. 0002 needs comments. I'm also not so sure about the name
process_connection_state_machine(); it seems a little verbose. How
about something like wait_until_connected()? And maybe put it below
do_connect() instead of above.

--
Robert Haas
EDB: http://www.enterprisedb.com

#26Tristan Partin
tristan@neon.tech
In reply to: Robert Haas (#25)
2 attachment(s)
Re: psql not responding to SIGINT upon db reconnection

On Fri Mar 22, 2024 at 9:59 AM CDT, Robert Haas wrote:

On Wed, Jan 31, 2024 at 1:07 PM Tristan Partin <tristan@neon.tech> wrote:

I was looking for documentation of PQsocket(), but didn't find any
standalone (unless I completely missed it). So I just copied how
PQsocket() is documented in PQconnectPoll(). I am happy to document it
separately if you think it would be useful.

As Jelte said back at the end of January, I think you just completely
missed it. The relevant part of libpq.sgml starts like this:

<varlistentry id="libpq-PQsocket">
<term><function>PQsocket</function><indexterm><primary>PQsocket</primary></indexterm></term>

As far as I know, we document all of the exported libpq functions in
that SGML file.

I think there's no real reason why we couldn't get at least 0001 and
maybe also 0002 into this release, but only if you move quickly on
this. Feature freeze is approaching rapidly.

Modulo the documentation changes, I think 0001 is pretty much ready to
go. 0002 needs comments. I'm also not so sure about the name
process_connection_state_machine(); it seems a little verbose. How
about something like wait_until_connected()? And maybe put it below
do_connect() instead of above.

Robert, Jelte:

Sorry for taking a while to get back to y'all. I have taken your
feedback into consideration for v9. This is my first time writing
Postgres docs, so I'm ready for another round of editing :).

Robert, could you point out some places where comments would be useful
in 0002? I did rename the function and moved it as suggested, thanks! In
turn, I also realized I forgot a prototype, so also added it.

This patchset has also been rebased on master, so the libpq export
number for PQsocketPoll was bumped.

--
Tristan Partin
Neon (https://neon.tech)

Attachments:

v9-0001-Expose-PQsocketPoll-for-frontends.patchtext/x-patch; charset=utf-8; name=v9-0001-Expose-PQsocketPoll-for-frontends.patchDownload
From 7f8bf7250ecf79f65c129ccc42643c36bc53f882 Mon Sep 17 00:00:00 2001
From: Tristan Partin <tristan@neon.tech>
Date: Tue, 30 Jan 2024 15:40:57 -0600
Subject: [PATCH v9 1/2] Expose PQsocketPoll for frontends

PQsocketPoll should help with state machine processing when connecting
to a database asynchronously via PQconnectStart().
---
 doc/src/sgml/libpq.sgml          | 38 +++++++++++++++++++++++++++++++-
 src/interfaces/libpq/exports.txt |  1 +
 src/interfaces/libpq/fe-misc.c   |  7 +++---
 src/interfaces/libpq/libpq-fe.h  |  4 ++++
 4 files changed, 45 insertions(+), 5 deletions(-)

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index d3e87056f2c..10f191f6b88 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -262,6 +262,39 @@ PGconn *PQsetdb(char *pghost,
      </listitem>
     </varlistentry>
 
+    <varlistentry id="libpq-PQsocketPoll">
+     <term><function>PQsocketPoll</function><indexterm><primary>PQsocketPoll</primary></indexterm></term>
+     <listitem>
+      <para>
+       <indexterm><primary>nonblocking connection</primary></indexterm>
+       Poll a connection&apos;s underlying socket descriptor retrieved with <xref linkend="libpq-PQsocket"/>.
+<synopsis>
+int PQsocketPoll(int sock, int forRead, int forWrite, time_t end_time);
+</synopsis>
+      </para>
+
+      <para>
+       This function takes care of the setup for monitoring a file descriptor. The underlying
+       function is either <function>poll(2)</function> or <function>select(2)</function>,
+       depending on platform support. The primary use of this function is working through the state
+       machine instantiated by <xref linkend="libpq-PQconnectStartParams"/>.
+      </para>
+
+      <para>
+       The function returns a value greater than <literal>0</literal> if the specified condition
+       is met, <literal>0</literal> if a timeout occurred, or <literal>-1</literal> if an error
+       or interrupt occurred. In the event <literal>forRead</literal> and
+       <literal>forWrite</literal> are not set, the function immediately retuns a timeout
+       condition.
+      </para>
+
+      <para>
+       <literal>end_time</literal> is the time in the future in seconds starting from the UNIX
+       epoch in which you would like the function to return if the condition is not met.
+      </para>
+     </listitem>
+    </varlistentry>
+
     <varlistentry id="libpq-PQconnectStartParams">
      <term><function>PQconnectStartParams</function><indexterm><primary>PQconnectStartParams</primary></indexterm></term>
      <term><function>PQconnectStart</function><indexterm><primary>PQconnectStart</primary></indexterm></term>
@@ -358,7 +391,10 @@ PostgresPollingStatusType PQconnectPoll(PGconn *conn);
        Loop thus: If <function>PQconnectPoll(conn)</function> last returned
        <symbol>PGRES_POLLING_READING</symbol>, wait until the socket is ready to
        read (as indicated by <function>select()</function>, <function>poll()</function>, or
-       similar system function).
+       similar system function).  Note that <function>PQsocketPoll</function>
+       can help reduce boilerplate by abstracting the setup of
+       <function>select(2)</function> or <function>poll(2)</function> if it is
+       available on your system.
        Then call <function>PQconnectPoll(conn)</function> again.
        Conversely, if <function>PQconnectPoll(conn)</function> last returned
        <symbol>PGRES_POLLING_WRITING</symbol>, wait until the socket is ready
diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt
index 9fbd3d34074..1e48d37677d 100644
--- a/src/interfaces/libpq/exports.txt
+++ b/src/interfaces/libpq/exports.txt
@@ -202,3 +202,4 @@ PQcancelSocket            199
 PQcancelErrorMessage      200
 PQcancelReset             201
 PQcancelFinish            202
+PQsocketPoll              203
diff --git a/src/interfaces/libpq/fe-misc.c b/src/interfaces/libpq/fe-misc.c
index f2fc78a481c..f562cd8d344 100644
--- a/src/interfaces/libpq/fe-misc.c
+++ b/src/interfaces/libpq/fe-misc.c
@@ -55,7 +55,6 @@ static int	pqPutMsgBytes(const void *buf, size_t len, PGconn *conn);
 static int	pqSendSome(PGconn *conn, int len);
 static int	pqSocketCheck(PGconn *conn, int forRead, int forWrite,
 						  time_t end_time);
-static int	pqSocketPoll(int sock, int forRead, int forWrite, time_t end_time);
 
 /*
  * PQlibVersion: return the libpq version number
@@ -1059,7 +1058,7 @@ pqSocketCheck(PGconn *conn, int forRead, int forWrite, time_t end_time)
 
 	/* We will retry as long as we get EINTR */
 	do
-		result = pqSocketPoll(conn->sock, forRead, forWrite, end_time);
+		result = PQsocketPoll(conn->sock, forRead, forWrite, end_time);
 	while (result < 0 && SOCK_ERRNO == EINTR);
 
 	if (result < 0)
@@ -1083,8 +1082,8 @@ pqSocketCheck(PGconn *conn, int forRead, int forWrite, time_t end_time)
  * Timeout is infinite if end_time is -1.  Timeout is immediate (no blocking)
  * if end_time is 0 (or indeed, any time before now).
  */
-static int
-pqSocketPoll(int sock, int forRead, int forWrite, time_t end_time)
+int
+PQsocketPoll(int sock, int forRead, int forWrite, time_t end_time)
 {
 	/* We use poll(2) if available, otherwise select(2) */
 #ifdef HAVE_POLL
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 09b485bd2bc..98c27415281 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -21,6 +21,7 @@ extern "C"
 #endif
 
 #include <stdio.h>
+#include <time.h>
 
 /*
  * postgres_ext.h defines the backend's externally visible types,
@@ -670,6 +671,9 @@ extern int	lo_export(PGconn *conn, Oid lobjId, const char *filename);
 /* Get the version of the libpq library in use */
 extern int	PQlibVersion(void);
 
+/* Poll a file descriptor for reading and/or writing with a timeout */
+extern int	PQsocketPoll(int sock, int forRead, int forWrite, time_t end_time);
+
 /* Determine length of multibyte encoded char at *s */
 extern int	PQmblen(const char *s, int encoding);
 
-- 
Tristan Partin
Neon (https://neon.tech)

v9-0002-Allow-SIGINT-to-cancel-psql-database-reconnection.patchtext/x-patch; charset=utf-8; name=v9-0002-Allow-SIGINT-to-cancel-psql-database-reconnection.patchDownload
From 779e642df8013b3fc8c63f9faf370cbbf2950392 Mon Sep 17 00:00:00 2001
From: Tristan Partin <tristan@neon.tech>
Date: Tue, 30 Jan 2024 15:53:10 -0600
Subject: [PATCH v9 2/2] Allow SIGINT to cancel psql database reconnections

After installing the SIGINT handler in psql, SIGINT can no longer cancel
database reconnections. For instance, if the user starts a reconnection
and then needs to do some form of interaction (ie psql is polling),
there is no way to cancel the reconnection process currently.

Use PQconnectStartParams() in order to insert a cancel_pressed check
into the polling loop.
---
 src/bin/psql/command.c | 50 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 49 insertions(+), 1 deletion(-)

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 9b0fa041f73..00a20e47328 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -159,6 +159,7 @@ static void discard_query_text(PsqlScanState scan_state, ConditionalStack cstack
 static bool copy_previous_query(PQExpBuffer query_buf, PQExpBuffer previous_buf);
 static bool do_connect(enum trivalue reuse_previous_specification,
 					   char *dbname, char *user, char *host, char *port);
+static void wait_until_connected(PGconn *conn);
 static bool do_edit(const char *filename_arg, PQExpBuffer query_buf,
 					int lineno, bool discard_on_quit, bool *edited);
 static bool do_shell(const char *command);
@@ -3595,11 +3596,12 @@ do_connect(enum trivalue reuse_previous_specification,
 		values[paramnum] = NULL;
 
 		/* Note we do not want libpq to re-expand the dbname parameter */
-		n_conn = PQconnectdbParams(keywords, values, false);
+		n_conn = PQconnectStartParams(keywords, values, false);
 
 		pg_free(keywords);
 		pg_free(values);
 
+		wait_until_connected(n_conn);
 		if (PQstatus(n_conn) == CONNECTION_OK)
 			break;
 
@@ -3748,6 +3750,52 @@ do_connect(enum trivalue reuse_previous_specification,
 	return true;
 }
 
+static void
+wait_until_connected(PGconn *conn)
+{
+	bool		for_read = false;
+
+	while (true)
+	{
+		int			rc;
+		int			sock;
+		time_t		timeout;
+
+		/*
+		 * On every iteration of the state machine, let's check if the user has
+		 * requested a cancellation of the connection process.
+		 */
+		if (cancel_pressed)
+			break;
+
+		sock = PQsocket(conn);
+		if (sock == -1)
+			break;
+
+		/* Use a 1 second timeout */
+		timeout = time(NULL) + 1;
+		rc = PQsocketPoll(sock, for_read, !for_read, timeout);
+		if (rc == -1)
+			return;
+
+		switch (PQconnectPoll(conn))
+		{
+			case PGRES_POLLING_OK:
+			case PGRES_POLLING_FAILED:
+				return;
+			case PGRES_POLLING_READING:
+				for_read = true;
+				continue;
+			case PGRES_POLLING_WRITING:
+				for_read = false;
+				continue;
+			case PGRES_POLLING_ACTIVE:
+				pg_unreachable();
+		}
+	}
+
+	pg_unreachable();
+}
 
 void
 connection_warnings(bool in_startup)
-- 
Tristan Partin
Neon (https://neon.tech)

#27Robert Haas
robertmhaas@gmail.com
In reply to: Tristan Partin (#26)
Re: psql not responding to SIGINT upon db reconnection

On Fri, Mar 22, 2024 at 1:05 PM Tristan Partin <tristan@neon.tech> wrote:

Sorry for taking a while to get back to y'all. I have taken your
feedback into consideration for v9. This is my first time writing
Postgres docs, so I'm ready for another round of editing :).

Yeah, that looks like it needs some wordsmithing yet. I can take a
crack at that at some point, but here are a few notes:

- "takes care of" and "working through the state machine" seem quite
vague to me.
- the meanings of forRead and forWrite don't seem to be documented.
- "retuns" is a typo.

Robert, could you point out some places where comments would be useful
in 0002? I did rename the function and moved it as suggested, thanks! In
turn, I also realized I forgot a prototype, so also added it.

Well, for starters, I'd give the function a header comment.

Telling me that a 1 second timeout is a 1 second timeout is less
useful than telling me why we've picked a 1 second timeout. Presumably
the answer is "so we can respond to cancel interrupts in a timely
fashion", but I'd make that explicit.

It might be worth a comment saying that PQsocket() shouldn't be
hoisted out of the loop because it could be a multi-host connection
string and the socket might change under us. Unless that's not true,
in which case we should hoist the PQsocket() call out of the loop.

I think it would also be worth a comment saying that we don't need to
report errors here, as the caller will do that; we just need to wait
until the connection is known to have either succeeded or failed, or
until the user presses cancel.

--
Robert Haas
EDB: http://www.enterprisedb.com

#28Tristan Partin
tristan@neon.tech
In reply to: Robert Haas (#27)
2 attachment(s)
Re: psql not responding to SIGINT upon db reconnection

On Fri Mar 22, 2024 at 12:17 PM CDT, Robert Haas wrote:

On Fri, Mar 22, 2024 at 1:05 PM Tristan Partin <tristan@neon.tech> wrote:

Sorry for taking a while to get back to y'all. I have taken your
feedback into consideration for v9. This is my first time writing
Postgres docs, so I'm ready for another round of editing :).

Yeah, that looks like it needs some wordsmithing yet. I can take a
crack at that at some point, but here are a few notes:

- "takes care of" and "working through the state machine" seem quite
vague to me.
- the meanings of forRead and forWrite don't seem to be documented.
- "retuns" is a typo.

Robert, could you point out some places where comments would be useful
in 0002? I did rename the function and moved it as suggested, thanks! In
turn, I also realized I forgot a prototype, so also added it.

Well, for starters, I'd give the function a header comment.

Telling me that a 1 second timeout is a 1 second timeout is less
useful than telling me why we've picked a 1 second timeout. Presumably
the answer is "so we can respond to cancel interrupts in a timely
fashion", but I'd make that explicit.

It might be worth a comment saying that PQsocket() shouldn't be
hoisted out of the loop because it could be a multi-host connection
string and the socket might change under us. Unless that's not true,
in which case we should hoist the PQsocket() call out of the loop.

I think it would also be worth a comment saying that we don't need to
report errors here, as the caller will do that; we just need to wait
until the connection is known to have either succeeded or failed, or
until the user presses cancel.

This is good feedback, thanks. I have added comments where you
suggested. I reworded the PQsocketPoll docs to hopefully meet your
expectations. I am using the term "connection sequence" which is from
the PQconnectStartParams docs, so hopefully people will be able to make
that connection. I wrote documentation for "forRead" and "forWrite" as
well.

I had a question about parameter naming. Right now I have a mix of
camel-case and snake-case in the function signature since that is what
I inherited. Should I change that to be consistent? If so, which case
would you like?

Thanks for your continued reviews.

--
Tristan Partin
Neon (https://neon.tech)

Attachments:

v10-0002-Allow-SIGINT-to-cancel-psql-database-reconnectio.patchtext/x-patch; charset=utf-8; name=v10-0002-Allow-SIGINT-to-cancel-psql-database-reconnectio.patchDownload
From dc1eb3866975680b55451b79b40f109e3104eee8 Mon Sep 17 00:00:00 2001
From: Tristan Partin <tristan@neon.tech>
Date: Tue, 30 Jan 2024 15:53:10 -0600
Subject: [PATCH v10 2/2] Allow SIGINT to cancel psql database reconnections

After installing the SIGINT handler in psql, SIGINT can no longer cancel
database reconnections. For instance, if the user starts a reconnection
and then needs to do some form of interaction (ie psql is polling),
there is no way to cancel the reconnection process currently.

Use PQconnectStartParams() in order to insert a cancel_pressed check
into the polling loop.
---
 src/bin/psql/command.c | 72 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 71 insertions(+), 1 deletion(-)

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 9b0fa041f73..1e00b0d4869 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -159,6 +159,7 @@ static void discard_query_text(PsqlScanState scan_state, ConditionalStack cstack
 static bool copy_previous_query(PQExpBuffer query_buf, PQExpBuffer previous_buf);
 static bool do_connect(enum trivalue reuse_previous_specification,
 					   char *dbname, char *user, char *host, char *port);
+static void wait_until_connected(PGconn *conn);
 static bool do_edit(const char *filename_arg, PQExpBuffer query_buf,
 					int lineno, bool discard_on_quit, bool *edited);
 static bool do_shell(const char *command);
@@ -3595,11 +3596,12 @@ do_connect(enum trivalue reuse_previous_specification,
 		values[paramnum] = NULL;
 
 		/* Note we do not want libpq to re-expand the dbname parameter */
-		n_conn = PQconnectdbParams(keywords, values, false);
+		n_conn = PQconnectStartParams(keywords, values, false);
 
 		pg_free(keywords);
 		pg_free(values);
 
+		wait_until_connected(n_conn);
 		if (PQstatus(n_conn) == CONNECTION_OK)
 			break;
 
@@ -3748,6 +3750,74 @@ do_connect(enum trivalue reuse_previous_specification,
 	return true;
 }
 
+/*
+ * Processes the connection sequence described by PQconnectStartParams(). Don't
+ * worry about reporting errors in this function. Our caller will check the
+ * connection's status, and report appropriately.
+ */
+static void
+wait_until_connected(PGconn *conn)
+{
+	bool		forRead = false;
+
+	while (true)
+	{
+		int			rc;
+		int			sock;
+		time_t		end_time;
+
+		/*
+		 * On every iteration of the connection sequence, let's check if the
+		 * user has requested a cancellation.
+		 */
+		if (cancel_pressed)
+			break;
+
+		/*
+		 * Do not assume that the socket remains the same across
+		 * PQconnectPoll() calls.
+		 */
+		sock = PQsocket(conn);
+		if (sock == -1)
+			break;
+
+		/*
+		 * If the user sends SIGINT between the cancel_pressed check, and
+		 * polling of the socket, it will not be recognized. Instead, we will
+		 * just wait until the next step in the connection sequence or forever,
+		 * which might require users to send SIGTERM or SIGQUIT.
+		 *
+		 * Some solutions would include the "self-pipe trick," using
+		 * pselect(2) and ppoll(2), or using a timeout.
+		 *
+		 * The self-pipe trick requires a bit of code to setup. pselect(2) and
+		 * ppoll(2) are not on all the platforms we support. The simplest
+		 * solution happens to just be adding a timeout, so let's wait for 1
+		 * second and check cancel_pressed again.
+		 */
+		end_time = time(NULL) + 1;
+		rc = PQsocketPoll(sock, forRead, !forRead, end_time);
+		if (rc == -1)
+			return;
+
+		switch (PQconnectPoll(conn))
+		{
+			case PGRES_POLLING_OK:
+			case PGRES_POLLING_FAILED:
+				return;
+			case PGRES_POLLING_READING:
+				forRead = true;
+				continue;
+			case PGRES_POLLING_WRITING:
+				forRead = false;
+				continue;
+			case PGRES_POLLING_ACTIVE:
+				pg_unreachable();
+		}
+	}
+
+	pg_unreachable();
+}
 
 void
 connection_warnings(bool in_startup)
-- 
Tristan Partin
Neon (https://neon.tech)

v10-0001-Expose-PQsocketPoll-for-frontends.patchtext/x-patch; charset=utf-8; name=v10-0001-Expose-PQsocketPoll-for-frontends.patchDownload
From dc45e95ab443d973845dd840600d6719dcd4cae2 Mon Sep 17 00:00:00 2001
From: Tristan Partin <tristan@neon.tech>
Date: Tue, 30 Jan 2024 15:40:57 -0600
Subject: [PATCH v10 1/2] Expose PQsocketPoll for frontends

PQsocketPoll should help with state machine processing when connecting
to a database asynchronously via PQconnectStart().
---
 doc/src/sgml/libpq.sgml          | 43 +++++++++++++++++++++++++++++++-
 src/interfaces/libpq/exports.txt |  1 +
 src/interfaces/libpq/fe-misc.c   |  7 +++---
 src/interfaces/libpq/libpq-fe.h  |  4 +++
 4 files changed, 50 insertions(+), 5 deletions(-)

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index d3e87056f2c..774b57ba70b 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -262,6 +262,44 @@ PGconn *PQsetdb(char *pghost,
      </listitem>
     </varlistentry>
 
+    <varlistentry id="libpq-PQsocketPoll">
+     <term><function>PQsocketPoll</function><indexterm><primary>PQsocketPoll</primary></indexterm></term>
+     <listitem>
+      <para>
+       <indexterm><primary>nonblocking connection</primary></indexterm>
+       Poll a connection&apos;s underlying socket descriptor retrieved with <xref linkend="libpq-PQsocket"/>.
+<synopsis>
+int PQsocketPoll(int sock, int forRead, int forWrite, time_t end_time);
+</synopsis>
+      </para>
+
+      <para>
+       This function sets up polling of a file descriptor. The underlying function is either
+       <function>poll(2)</function> or <function>select(2)</function>, depending on platform
+       support. The primary use of this function is iterating through the connection sequence
+       described in the documentation of <xref linkend="libpq-PQconnectStartParams"/>. If
+       <parameter>forRead</parameter> is specified, the function waits for the socket to be ready
+       for reading. If <parameter>forWrite</parameter> is specified, the function waits for the
+       socket to be ready for write. See <literal>POLLIN</literal> and <literal>POLLOUT</literal>
+       from <function>poll(2)</function>, or <parameter>readfds</parameter> and
+       <parameter>writefds</parameter> from <function>select(2)</function> for more information.
+      </para>
+
+      <para>
+       The function returns a value greater than <literal>0</literal> if the specified condition
+       is met, <literal>0</literal> if a timeout occurred, or <literal>-1</literal> if an error
+       or interrupt occurred. In the event <literal>forRead</literal> and
+       <literal>forWrite</literal> are not set, the function immediately returns a timeout
+       condition.
+      </para>
+
+      <para>
+       <literal>end_time</literal> is the time in the future in seconds starting from the UNIX
+       epoch in which you would like the function to return if the condition is not met.
+      </para>
+     </listitem>
+    </varlistentry>
+
     <varlistentry id="libpq-PQconnectStartParams">
      <term><function>PQconnectStartParams</function><indexterm><primary>PQconnectStartParams</primary></indexterm></term>
      <term><function>PQconnectStart</function><indexterm><primary>PQconnectStart</primary></indexterm></term>
@@ -358,7 +396,10 @@ PostgresPollingStatusType PQconnectPoll(PGconn *conn);
        Loop thus: If <function>PQconnectPoll(conn)</function> last returned
        <symbol>PGRES_POLLING_READING</symbol>, wait until the socket is ready to
        read (as indicated by <function>select()</function>, <function>poll()</function>, or
-       similar system function).
+       similar system function).  Note that <function>PQsocketPoll</function>
+       can help reduce boilerplate by abstracting the setup of
+       <function>select(2)</function> or <function>poll(2)</function> if it is
+       available on your system.
        Then call <function>PQconnectPoll(conn)</function> again.
        Conversely, if <function>PQconnectPoll(conn)</function> last returned
        <symbol>PGRES_POLLING_WRITING</symbol>, wait until the socket is ready
diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt
index 9fbd3d34074..1e48d37677d 100644
--- a/src/interfaces/libpq/exports.txt
+++ b/src/interfaces/libpq/exports.txt
@@ -202,3 +202,4 @@ PQcancelSocket            199
 PQcancelErrorMessage      200
 PQcancelReset             201
 PQcancelFinish            202
+PQsocketPoll              203
diff --git a/src/interfaces/libpq/fe-misc.c b/src/interfaces/libpq/fe-misc.c
index f2fc78a481c..f562cd8d344 100644
--- a/src/interfaces/libpq/fe-misc.c
+++ b/src/interfaces/libpq/fe-misc.c
@@ -55,7 +55,6 @@ static int	pqPutMsgBytes(const void *buf, size_t len, PGconn *conn);
 static int	pqSendSome(PGconn *conn, int len);
 static int	pqSocketCheck(PGconn *conn, int forRead, int forWrite,
 						  time_t end_time);
-static int	pqSocketPoll(int sock, int forRead, int forWrite, time_t end_time);
 
 /*
  * PQlibVersion: return the libpq version number
@@ -1059,7 +1058,7 @@ pqSocketCheck(PGconn *conn, int forRead, int forWrite, time_t end_time)
 
 	/* We will retry as long as we get EINTR */
 	do
-		result = pqSocketPoll(conn->sock, forRead, forWrite, end_time);
+		result = PQsocketPoll(conn->sock, forRead, forWrite, end_time);
 	while (result < 0 && SOCK_ERRNO == EINTR);
 
 	if (result < 0)
@@ -1083,8 +1082,8 @@ pqSocketCheck(PGconn *conn, int forRead, int forWrite, time_t end_time)
  * Timeout is infinite if end_time is -1.  Timeout is immediate (no blocking)
  * if end_time is 0 (or indeed, any time before now).
  */
-static int
-pqSocketPoll(int sock, int forRead, int forWrite, time_t end_time)
+int
+PQsocketPoll(int sock, int forRead, int forWrite, time_t end_time)
 {
 	/* We use poll(2) if available, otherwise select(2) */
 #ifdef HAVE_POLL
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 09b485bd2bc..8d3c5c6f662 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -21,6 +21,7 @@ extern "C"
 #endif
 
 #include <stdio.h>
+#include <time.h>
 
 /*
  * postgres_ext.h defines the backend's externally visible types,
@@ -670,6 +671,9 @@ extern int	lo_export(PGconn *conn, Oid lobjId, const char *filename);
 /* Get the version of the libpq library in use */
 extern int	PQlibVersion(void);
 
+/* Poll a socket for reading and/or writing with an optional timeout */
+extern int	PQsocketPoll(int sock, int forRead, int forWrite, time_t end_time);
+
 /* Determine length of multibyte encoded char at *s */
 extern int	PQmblen(const char *s, int encoding);
 
-- 
Tristan Partin
Neon (https://neon.tech)

#29Robert Haas
robertmhaas@gmail.com
In reply to: Tristan Partin (#28)
Re: psql not responding to SIGINT upon db reconnection

On Fri, Mar 22, 2024 at 4:58 PM Tristan Partin <tristan@neon.tech> wrote:

I had a question about parameter naming. Right now I have a mix of
camel-case and snake-case in the function signature since that is what
I inherited. Should I change that to be consistent? If so, which case
would you like?

Uh... PostgreSQL is kind of the wild west in that regard. The thing to
do is look for nearby precedents, but that doesn't help much here
because in the very same file, libpq-fe.h, we have:

extern int PQsetResultAttrs(PGresult *res, int numAttributes,
PGresAttDesc *attDescs);
extern int PQsetvalue(PGresult *res, int tup_num, int field_num,
char *value, int len);

Since the existing naming is consistent with one of those two styles,
I'd probably just leave it be.

+       The function returns a value greater than <literal>0</literal>
if the specified condition
+       is met, <literal>0</literal> if a timeout occurred, or
<literal>-1</literal> if an error
+       or interrupt occurred. In the event <literal>forRead</literal> and

We either need to tell people how to find out which error it was, or
if that's not possible and we can't reasonably make it possible, we
need to tell them why they shouldn't care. Because there's nothing
more delightful than someone who shows up and says "hey, I tried to do
XYZ, and I got an error," as if that were sufficient information for
me to do something useful.

+       <literal>end_time</literal> is the time in the future in
seconds starting from the UNIX
+       epoch in which you would like the function to return if the
condition is not met.

This sentence seems a bit contorted to me, like maybe Yoda wrote it. I
was about to try to rephrase it and maybe split it in two when I
wondered why we need to document how time_t works at all. Can't we
just say something like "If end_time is not -1, it specifies the time
at which this function should stop waiting for the condition to be
met" -- and maybe move it to the end of the first paragraph, so it's
before where we list the meanings of the return values?

--
Robert Haas
EDB: http://www.enterprisedb.com

#30Tristan Partin
tristan@neon.tech
In reply to: Robert Haas (#29)
2 attachment(s)
Re: psql not responding to SIGINT upon db reconnection

On Mon Mar 25, 2024 at 1:44 PM CDT, Robert Haas wrote:

On Fri, Mar 22, 2024 at 4:58 PM Tristan Partin <tristan@neon.tech> wrote:

I had a question about parameter naming. Right now I have a mix of
camel-case and snake-case in the function signature since that is what
I inherited. Should I change that to be consistent? If so, which case
would you like?

Uh... PostgreSQL is kind of the wild west in that regard. The thing to
do is look for nearby precedents, but that doesn't help much here
because in the very same file, libpq-fe.h, we have:

extern int PQsetResultAttrs(PGresult *res, int numAttributes,
PGresAttDesc *attDescs);
extern int PQsetvalue(PGresult *res, int tup_num, int field_num,
char *value, int len);

Since the existing naming is consistent with one of those two styles,
I'd probably just leave it be.

+       The function returns a value greater than <literal>0</literal>
if the specified condition
+       is met, <literal>0</literal> if a timeout occurred, or
<literal>-1</literal> if an error
+       or interrupt occurred. In the event <literal>forRead</literal> and

We either need to tell people how to find out which error it was, or
if that's not possible and we can't reasonably make it possible, we
need to tell them why they shouldn't care. Because there's nothing
more delightful than someone who shows up and says "hey, I tried to do
XYZ, and I got an error," as if that were sufficient information for
me to do something useful.

+       <literal>end_time</literal> is the time in the future in
seconds starting from the UNIX
+       epoch in which you would like the function to return if the
condition is not met.

This sentence seems a bit contorted to me, like maybe Yoda wrote it. I
was about to try to rephrase it and maybe split it in two when I
wondered why we need to document how time_t works at all. Can't we
just say something like "If end_time is not -1, it specifies the time
at which this function should stop waiting for the condition to be
met" -- and maybe move it to the end of the first paragraph, so it's
before where we list the meanings of the return values?

Incorporated feedback, I have :).

--
Tristan Partin
Neon (https://neon.tech)

Attachments:

v11-0001-Expose-PQsocketPoll-for-frontends.patchtext/x-patch; charset=utf-8; name=v11-0001-Expose-PQsocketPoll-for-frontends.patchDownload
From 14c794d9699f7b9f27d1a4ec026f748c6b7d8853 Mon Sep 17 00:00:00 2001
From: Tristan Partin <tristan@neon.tech>
Date: Tue, 30 Jan 2024 15:40:57 -0600
Subject: [PATCH v11 1/2] Expose PQsocketPoll for frontends

PQsocketPoll should help with state machine processing when connecting
to a database asynchronously via PQconnectStart().
---
 doc/src/sgml/libpq.sgml          | 40 +++++++++++++++++++++++++++++++-
 src/interfaces/libpq/exports.txt |  1 +
 src/interfaces/libpq/fe-misc.c   |  7 +++---
 src/interfaces/libpq/libpq-fe.h  |  4 ++++
 4 files changed, 47 insertions(+), 5 deletions(-)

diff --git a/doc/src/sgml/libpq.sgml b/doc/src/sgml/libpq.sgml
index d3e87056f2c..e69feacfe6a 100644
--- a/doc/src/sgml/libpq.sgml
+++ b/doc/src/sgml/libpq.sgml
@@ -262,6 +262,41 @@ PGconn *PQsetdb(char *pghost,
      </listitem>
     </varlistentry>
 
+    <varlistentry id="libpq-PQsocketPoll">
+     <term><function>PQsocketPoll</function><indexterm><primary>PQsocketPoll</primary></indexterm></term>
+     <listitem>
+      <para>
+       <indexterm><primary>nonblocking connection</primary></indexterm>
+       Poll a connection&apos;s underlying socket descriptor retrieved with <xref linkend="libpq-PQsocket"/>.
+<synopsis>
+int PQsocketPoll(int sock, int forRead, int forWrite, time_t end_time);
+</synopsis>
+      </para>
+
+      <para>
+       This function sets up polling of a file descriptor. The underlying function is either
+       <function>poll(2)</function> or <function>select(2)</function>, depending on platform
+       support. The primary use of this function is iterating through the connection sequence
+       described in the documentation of <xref linkend="libpq-PQconnectStartParams"/>. If
+       <parameter>forRead</parameter> is specified, the function waits for the socket to be ready
+       for reading. If <parameter>forWrite</parameter> is specified, the function waits for the
+       socket to be ready for write. See <literal>POLLIN</literal> and <literal>POLLOUT</literal>
+       from <function>poll(2)</function>, or <parameter>readfds</parameter> and
+       <parameter>writefds</parameter> from <function>select(2)</function> for more information. If
+       <parameter>end_time</parameter> is not <literal>-1</literal>, it specifies the time at which
+       this function should stop waiting for the condition to be met.
+      </para>
+
+      <para>
+       The function returns a value greater than <literal>0</literal> if the specified condition
+       is met, <literal>0</literal> if a timeout occurred, or <literal>-1</literal> if an error
+       occurred. The error can be retrieved by checking the <literal>errno(3)</literal> value. In
+       the event <literal>forRead</literal> and <literal>forWrite</literal> are not set, the
+       function immediately returns a timeout condition.
+      </para>
+     </listitem>
+    </varlistentry>
+
     <varlistentry id="libpq-PQconnectStartParams">
      <term><function>PQconnectStartParams</function><indexterm><primary>PQconnectStartParams</primary></indexterm></term>
      <term><function>PQconnectStart</function><indexterm><primary>PQconnectStart</primary></indexterm></term>
@@ -358,7 +393,10 @@ PostgresPollingStatusType PQconnectPoll(PGconn *conn);
        Loop thus: If <function>PQconnectPoll(conn)</function> last returned
        <symbol>PGRES_POLLING_READING</symbol>, wait until the socket is ready to
        read (as indicated by <function>select()</function>, <function>poll()</function>, or
-       similar system function).
+       similar system function).  Note that <function>PQsocketPoll</function>
+       can help reduce boilerplate by abstracting the setup of
+       <function>select(2)</function> or <function>poll(2)</function> if it is
+       available on your system.
        Then call <function>PQconnectPoll(conn)</function> again.
        Conversely, if <function>PQconnectPoll(conn)</function> last returned
        <symbol>PGRES_POLLING_WRITING</symbol>, wait until the socket is ready
diff --git a/src/interfaces/libpq/exports.txt b/src/interfaces/libpq/exports.txt
index 9fbd3d34074..1e48d37677d 100644
--- a/src/interfaces/libpq/exports.txt
+++ b/src/interfaces/libpq/exports.txt
@@ -202,3 +202,4 @@ PQcancelSocket            199
 PQcancelErrorMessage      200
 PQcancelReset             201
 PQcancelFinish            202
+PQsocketPoll              203
diff --git a/src/interfaces/libpq/fe-misc.c b/src/interfaces/libpq/fe-misc.c
index f2fc78a481c..f562cd8d344 100644
--- a/src/interfaces/libpq/fe-misc.c
+++ b/src/interfaces/libpq/fe-misc.c
@@ -55,7 +55,6 @@ static int	pqPutMsgBytes(const void *buf, size_t len, PGconn *conn);
 static int	pqSendSome(PGconn *conn, int len);
 static int	pqSocketCheck(PGconn *conn, int forRead, int forWrite,
 						  time_t end_time);
-static int	pqSocketPoll(int sock, int forRead, int forWrite, time_t end_time);
 
 /*
  * PQlibVersion: return the libpq version number
@@ -1059,7 +1058,7 @@ pqSocketCheck(PGconn *conn, int forRead, int forWrite, time_t end_time)
 
 	/* We will retry as long as we get EINTR */
 	do
-		result = pqSocketPoll(conn->sock, forRead, forWrite, end_time);
+		result = PQsocketPoll(conn->sock, forRead, forWrite, end_time);
 	while (result < 0 && SOCK_ERRNO == EINTR);
 
 	if (result < 0)
@@ -1083,8 +1082,8 @@ pqSocketCheck(PGconn *conn, int forRead, int forWrite, time_t end_time)
  * Timeout is infinite if end_time is -1.  Timeout is immediate (no blocking)
  * if end_time is 0 (or indeed, any time before now).
  */
-static int
-pqSocketPoll(int sock, int forRead, int forWrite, time_t end_time)
+int
+PQsocketPoll(int sock, int forRead, int forWrite, time_t end_time)
 {
 	/* We use poll(2) if available, otherwise select(2) */
 #ifdef HAVE_POLL
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 09b485bd2bc..8d3c5c6f662 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -21,6 +21,7 @@ extern "C"
 #endif
 
 #include <stdio.h>
+#include <time.h>
 
 /*
  * postgres_ext.h defines the backend's externally visible types,
@@ -670,6 +671,9 @@ extern int	lo_export(PGconn *conn, Oid lobjId, const char *filename);
 /* Get the version of the libpq library in use */
 extern int	PQlibVersion(void);
 
+/* Poll a socket for reading and/or writing with an optional timeout */
+extern int	PQsocketPoll(int sock, int forRead, int forWrite, time_t end_time);
+
 /* Determine length of multibyte encoded char at *s */
 extern int	PQmblen(const char *s, int encoding);
 
-- 
Tristan Partin
Neon (https://neon.tech)

v11-0002-Allow-SIGINT-to-cancel-psql-database-reconnectio.patchtext/x-patch; charset=utf-8; name=v11-0002-Allow-SIGINT-to-cancel-psql-database-reconnectio.patchDownload
From 6e3b08aa89e7a0268c3ade83c708adf1e1637ffa Mon Sep 17 00:00:00 2001
From: Tristan Partin <tristan@neon.tech>
Date: Tue, 30 Jan 2024 15:53:10 -0600
Subject: [PATCH v11 2/2] Allow SIGINT to cancel psql database reconnections

After installing the SIGINT handler in psql, SIGINT can no longer cancel
database reconnections. For instance, if the user starts a reconnection
and then needs to do some form of interaction (ie psql is polling),
there is no way to cancel the reconnection process currently.

Use PQconnectStartParams() in order to insert a cancel_pressed check
into the polling loop.
---
 src/bin/psql/command.c | 72 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 71 insertions(+), 1 deletion(-)

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index 9b0fa041f73..1e00b0d4869 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -159,6 +159,7 @@ static void discard_query_text(PsqlScanState scan_state, ConditionalStack cstack
 static bool copy_previous_query(PQExpBuffer query_buf, PQExpBuffer previous_buf);
 static bool do_connect(enum trivalue reuse_previous_specification,
 					   char *dbname, char *user, char *host, char *port);
+static void wait_until_connected(PGconn *conn);
 static bool do_edit(const char *filename_arg, PQExpBuffer query_buf,
 					int lineno, bool discard_on_quit, bool *edited);
 static bool do_shell(const char *command);
@@ -3595,11 +3596,12 @@ do_connect(enum trivalue reuse_previous_specification,
 		values[paramnum] = NULL;
 
 		/* Note we do not want libpq to re-expand the dbname parameter */
-		n_conn = PQconnectdbParams(keywords, values, false);
+		n_conn = PQconnectStartParams(keywords, values, false);
 
 		pg_free(keywords);
 		pg_free(values);
 
+		wait_until_connected(n_conn);
 		if (PQstatus(n_conn) == CONNECTION_OK)
 			break;
 
@@ -3748,6 +3750,74 @@ do_connect(enum trivalue reuse_previous_specification,
 	return true;
 }
 
+/*
+ * Processes the connection sequence described by PQconnectStartParams(). Don't
+ * worry about reporting errors in this function. Our caller will check the
+ * connection's status, and report appropriately.
+ */
+static void
+wait_until_connected(PGconn *conn)
+{
+	bool		forRead = false;
+
+	while (true)
+	{
+		int			rc;
+		int			sock;
+		time_t		end_time;
+
+		/*
+		 * On every iteration of the connection sequence, let's check if the
+		 * user has requested a cancellation.
+		 */
+		if (cancel_pressed)
+			break;
+
+		/*
+		 * Do not assume that the socket remains the same across
+		 * PQconnectPoll() calls.
+		 */
+		sock = PQsocket(conn);
+		if (sock == -1)
+			break;
+
+		/*
+		 * If the user sends SIGINT between the cancel_pressed check, and
+		 * polling of the socket, it will not be recognized. Instead, we will
+		 * just wait until the next step in the connection sequence or forever,
+		 * which might require users to send SIGTERM or SIGQUIT.
+		 *
+		 * Some solutions would include the "self-pipe trick," using
+		 * pselect(2) and ppoll(2), or using a timeout.
+		 *
+		 * The self-pipe trick requires a bit of code to setup. pselect(2) and
+		 * ppoll(2) are not on all the platforms we support. The simplest
+		 * solution happens to just be adding a timeout, so let's wait for 1
+		 * second and check cancel_pressed again.
+		 */
+		end_time = time(NULL) + 1;
+		rc = PQsocketPoll(sock, forRead, !forRead, end_time);
+		if (rc == -1)
+			return;
+
+		switch (PQconnectPoll(conn))
+		{
+			case PGRES_POLLING_OK:
+			case PGRES_POLLING_FAILED:
+				return;
+			case PGRES_POLLING_READING:
+				forRead = true;
+				continue;
+			case PGRES_POLLING_WRITING:
+				forRead = false;
+				continue;
+			case PGRES_POLLING_ACTIVE:
+				pg_unreachable();
+		}
+	}
+
+	pg_unreachable();
+}
 
 void
 connection_warnings(bool in_startup)
-- 
Tristan Partin
Neon (https://neon.tech)

#31Robert Haas
robertmhaas@gmail.com
In reply to: Tristan Partin (#30)
Re: psql not responding to SIGINT upon db reconnection

On Mon, Apr 1, 2024 at 12:04 PM Tristan Partin <tristan@neon.tech> wrote:

This sentence seems a bit contorted to me, like maybe Yoda wrote it.

Incorporated feedback, I have :).

Committed it, I did. My thanks for working on this issue, I extend.

--
Robert Haas
EDB: http://www.enterprisedb.com

#32Tristan Partin
tristan@neon.tech
In reply to: Robert Haas (#31)
Re: psql not responding to SIGINT upon db reconnection

On Tue Apr 2, 2024 at 9:32 AM CDT, Robert Haas wrote:

On Mon, Apr 1, 2024 at 12:04 PM Tristan Partin <tristan@neon.tech> wrote:

This sentence seems a bit contorted to me, like maybe Yoda wrote it.

Incorporated feedback, I have :).

Committed it, I did. My thanks for working on this issue, I extend.

Thanks to everyone for their reviews! Patch is in a much better place
than when it started.

--
Tristan Partin
Neon (https://neon.tech)

#33Jelte Fennema-Nio
postgres@jeltef.nl
In reply to: Robert Haas (#31)
2 attachment(s)
Re: psql not responding to SIGINT upon db reconnection

On Tue, 2 Apr 2024 at 16:33, Robert Haas <robertmhaas@gmail.com> wrote:

Committed it, I did. My thanks for working on this issue, I extend.

Looking at the committed version of this patch, the pg_unreachable
calls seemed weird to me. 1 is actually incorrect, thus possibly
resulting in undefined behaviour. And for the other call an imho
better fix would be to remove the now 21 year unused enum variant,
instead of introducing its only reference in the whole codebase.

Attached are two trivial patches, feel free to remove both of the
pg_unreachable calls.

Attachments:

v11-0002-Remove-PGRES_POLLING_ACTIVE.patchtext/x-patch; charset=US-ASCII; name=v11-0002-Remove-PGRES_POLLING_ACTIVE.patchDownload
From 7f4279bb939e2d2707fdd0f471893d734959ab91 Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio <jelte.fennema@microsoft.com>
Date: Wed, 3 Apr 2024 15:21:52 +0200
Subject: [PATCH v11 2/2] Remove PGRES_POLLING_ACTIVE

This enum variant has been unused for at least 21 years
44aba280207740d0956160c0288e61f28f024a71. It was left for backwards
compatibility for "awhile". I think that time has come.
---
 src/bin/psql/command.c          | 2 --
 src/interfaces/libpq/libpq-fe.h | 2 --
 2 files changed, 4 deletions(-)

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index dc3cc7c10b7..9a163cf22cd 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -3811,8 +3811,6 @@ wait_until_connected(PGconn *conn)
 			case PGRES_POLLING_WRITING:
 				forRead = false;
 				continue;
-			case PGRES_POLLING_ACTIVE:
-				pg_unreachable();
 		}
 	}
 }
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 8d3c5c6f662..2459b0cf5e1 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -91,8 +91,6 @@ typedef enum
 	PGRES_POLLING_READING,		/* These two indicate that one may	  */
 	PGRES_POLLING_WRITING,		/* use select before polling again.   */
 	PGRES_POLLING_OK,
-	PGRES_POLLING_ACTIVE		/* unused; keep for awhile for backwards
-								 * compatibility */
 } PostgresPollingStatusType;
 
 typedef enum
-- 
2.34.1

v11-0001-Fix-actually-reachable-pg_unreachable-call.patchtext/x-patch; charset=US-ASCII; name=v11-0001-Fix-actually-reachable-pg_unreachable-call.patchDownload
From 29100578b5f0b4cec68ee1b8c572c4f280335da3 Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio <jelte.fennema@microsoft.com>
Date: Wed, 3 Apr 2024 15:19:04 +0200
Subject: [PATCH v11 1/2] Fix actually reachable pg_unreachable call

In cafe1056558fe07cdc52b95205588fcd80870362 a call to pg_unreachable was
introduced that was actually reachable, because there were break
statements in the loop. This addresses that by both changing the break
statements to return and removing the call to pg_unreachable, which
seemed of dubious necessity anyway.
---
 src/bin/psql/command.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index c005624e9c3..dc3cc7c10b7 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -3771,7 +3771,7 @@ wait_until_connected(PGconn *conn)
 		 * user has requested a cancellation.
 		 */
 		if (cancel_pressed)
-			break;
+			return;
 
 		/*
 		 * Do not assume that the socket remains the same across
@@ -3779,7 +3779,7 @@ wait_until_connected(PGconn *conn)
 		 */
 		sock = PQsocket(conn);
 		if (sock == -1)
-			break;
+			return;
 
 		/*
 		 * If the user sends SIGINT between the cancel_pressed check, and
@@ -3815,8 +3815,6 @@ wait_until_connected(PGconn *conn)
 				pg_unreachable();
 		}
 	}
-
-	pg_unreachable();
 }
 
 void

base-commit: 936e3fa3787a51397280c1081587586e83c20399
-- 
2.34.1

#34Tristan Partin
tristan@neon.tech
In reply to: Jelte Fennema-Nio (#33)
Re: psql not responding to SIGINT upon db reconnection

On Wed Apr 3, 2024 at 8:32 AM CDT, Jelte Fennema-Nio wrote:

On Tue, 2 Apr 2024 at 16:33, Robert Haas <robertmhaas@gmail.com> wrote:

Committed it, I did. My thanks for working on this issue, I extend.

Looking at the committed version of this patch, the pg_unreachable
calls seemed weird to me. 1 is actually incorrect, thus possibly
resulting in undefined behaviour. And for the other call an imho
better fix would be to remove the now 21 year unused enum variant,
instead of introducing its only reference in the whole codebase.

Attached are two trivial patches, feel free to remove both of the
pg_unreachable calls.

Patches look good. Sorry about causing you to do some work.

--
Tristan Partin
Neon (https://neon.tech)

#35Tom Lane
tgl@sss.pgh.pa.us
In reply to: Jelte Fennema-Nio (#33)
Re: psql not responding to SIGINT upon db reconnection

Jelte Fennema-Nio <postgres@jeltef.nl> writes:

Looking at the committed version of this patch, the pg_unreachable
calls seemed weird to me. 1 is actually incorrect, thus possibly
resulting in undefined behaviour. And for the other call an imho
better fix would be to remove the now 21 year unused enum variant,
instead of introducing its only reference in the whole codebase.

If we do the latter, we will almost certainly get pushback from
distros who check for library ABI breaks. I fear the comment
suggesting that we could remove it someday is too optimistic.

regards, tom lane

#36Jelte Fennema-Nio
postgres@jeltef.nl
In reply to: Tom Lane (#35)
2 attachment(s)
Re: psql not responding to SIGINT upon db reconnection

On Wed, 3 Apr 2024 at 16:31, Tom Lane <tgl@sss.pgh.pa.us> wrote:

If we do the latter, we will almost certainly get pushback from
distros who check for library ABI breaks. I fear the comment
suggesting that we could remove it someday is too optimistic.

Alright, changed patch 0002 to keep the variant. But remove it from
the recently added switch/case. And also updated the comment to remove
the "for awhile".

Attachments:

v12-0001-Fix-actually-reachable-pg_unreachable-call.patchtext/x-patch; charset=US-ASCII; name=v12-0001-Fix-actually-reachable-pg_unreachable-call.patchDownload
From 29100578b5f0b4cec68ee1b8c572c4f280335da3 Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio <jelte.fennema@microsoft.com>
Date: Wed, 3 Apr 2024 15:19:04 +0200
Subject: [PATCH v12 1/2] Fix actually reachable pg_unreachable call

In cafe1056558fe07cdc52b95205588fcd80870362 a call to pg_unreachable was
introduced that was actually reachable, because there were break
statements in the loop. This addresses that by both changing the break
statements to return and removing the call to pg_unreachable, which
seemed of dubious necessity anyway.
---
 src/bin/psql/command.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index c005624e9c3..dc3cc7c10b7 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -3771,7 +3771,7 @@ wait_until_connected(PGconn *conn)
 		 * user has requested a cancellation.
 		 */
 		if (cancel_pressed)
-			break;
+			return;
 
 		/*
 		 * Do not assume that the socket remains the same across
@@ -3779,7 +3779,7 @@ wait_until_connected(PGconn *conn)
 		 */
 		sock = PQsocket(conn);
 		if (sock == -1)
-			break;
+			return;
 
 		/*
 		 * If the user sends SIGINT between the cancel_pressed check, and
@@ -3815,8 +3815,6 @@ wait_until_connected(PGconn *conn)
 				pg_unreachable();
 		}
 	}
-
-	pg_unreachable();
 }
 
 void

base-commit: 936e3fa3787a51397280c1081587586e83c20399
-- 
2.34.1

v12-0002-Remove-PGRES_POLLING_ACTIVE-case.patchtext/x-patch; charset=US-ASCII; name=v12-0002-Remove-PGRES_POLLING_ACTIVE-case.patchDownload
From f71634361a0c9ba3416c75fc3ff6d55536f134c4 Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio <jelte.fennema@microsoft.com>
Date: Wed, 3 Apr 2024 15:21:52 +0200
Subject: [PATCH v12 2/2] Remove PGRES_POLLING_ACTIVE case

This enum variant has been unused for at least 21 years
44aba280207740d0956160c0288e61f28f024a71. It's been there only for
backwards compatibility of the ABI, so no need to check for it.

Also update the comment on PGRES_POLLING_ACTIVE, since we're stuck with
it forever due to ABI compatibility guarantees.
---
 src/bin/psql/command.c          | 2 --
 src/interfaces/libpq/libpq-fe.h | 3 +--
 2 files changed, 1 insertion(+), 4 deletions(-)

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index dc3cc7c10b7..9a163cf22cd 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -3811,8 +3811,6 @@ wait_until_connected(PGconn *conn)
 			case PGRES_POLLING_WRITING:
 				forRead = false;
 				continue;
-			case PGRES_POLLING_ACTIVE:
-				pg_unreachable();
 		}
 	}
 }
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 8d3c5c6f662..c184e853889 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -91,8 +91,7 @@ typedef enum
 	PGRES_POLLING_READING,		/* These two indicate that one may	  */
 	PGRES_POLLING_WRITING,		/* use select before polling again.   */
 	PGRES_POLLING_OK,
-	PGRES_POLLING_ACTIVE		/* unused; keep for awhile for backwards
-								 * compatibility */
+	PGRES_POLLING_ACTIVE		/* unused; keep for backwards compatibility */
 } PostgresPollingStatusType;
 
 typedef enum
-- 
2.34.1

#37Tristan Partin
tristan@neon.tech
In reply to: Jelte Fennema-Nio (#36)
Re: psql not responding to SIGINT upon db reconnection

On Wed Apr 3, 2024 at 9:50 AM CDT, Jelte Fennema-Nio wrote:

On Wed, 3 Apr 2024 at 16:31, Tom Lane <tgl@sss.pgh.pa.us> wrote:

If we do the latter, we will almost certainly get pushback from
distros who check for library ABI breaks. I fear the comment
suggesting that we could remove it someday is too optimistic.

Alright, changed patch 0002 to keep the variant. But remove it from
the recently added switch/case. And also updated the comment to remove
the "for awhile".

Removing from the switch statement causes a warning:

[1/2] Compiling C object src/bin/psql/psql.p/command.c.o
../src/bin/psql/command.c: In function ‘wait_until_connected’:
../src/bin/psql/command.c:3803:17: warning: enumeration value ‘PGRES_POLLING_ACTIVE’ not handled in switch [-Wswitch]
3803 | switch (PQconnectPoll(conn))
| ^~~~~~

--
Tristan Partin
Neon (https://neon.tech)

#38Jelte Fennema-Nio
postgres@jeltef.nl
In reply to: Tristan Partin (#37)
2 attachment(s)
Re: psql not responding to SIGINT upon db reconnection

On Wed, 3 Apr 2024 at 16:55, Tristan Partin <tristan@neon.tech> wrote:

Removing from the switch statement causes a warning:

[1/2] Compiling C object src/bin/psql/psql.p/command.c.o
../src/bin/psql/command.c: In function ‘wait_until_connected’:
../src/bin/psql/command.c:3803:17: warning: enumeration value ‘PGRES_POLLING_ACTIVE’ not handled in switch [-Wswitch]
3803 | switch (PQconnectPoll(conn))
| ^~~~~~

Ofcourse... fixed now

Attachments:

v13-0001-Fix-actually-reachable-pg_unreachable-call.patchapplication/x-patch; name=v13-0001-Fix-actually-reachable-pg_unreachable-call.patchDownload
From 29100578b5f0b4cec68ee1b8c572c4f280335da3 Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio <jelte.fennema@microsoft.com>
Date: Wed, 3 Apr 2024 15:19:04 +0200
Subject: [PATCH v13 1/2] Fix actually reachable pg_unreachable call

In cafe1056558fe07cdc52b95205588fcd80870362 a call to pg_unreachable was
introduced that was actually reachable, because there were break
statements in the loop. This addresses that by both changing the break
statements to return and removing the call to pg_unreachable, which
seemed of dubious necessity anyway.
---
 src/bin/psql/command.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index c005624e9c3..dc3cc7c10b7 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -3771,7 +3771,7 @@ wait_until_connected(PGconn *conn)
 		 * user has requested a cancellation.
 		 */
 		if (cancel_pressed)
-			break;
+			return;
 
 		/*
 		 * Do not assume that the socket remains the same across
@@ -3779,7 +3779,7 @@ wait_until_connected(PGconn *conn)
 		 */
 		sock = PQsocket(conn);
 		if (sock == -1)
-			break;
+			return;
 
 		/*
 		 * If the user sends SIGINT between the cancel_pressed check, and
@@ -3815,8 +3815,6 @@ wait_until_connected(PGconn *conn)
 				pg_unreachable();
 		}
 	}
-
-	pg_unreachable();
 }
 
 void

base-commit: 936e3fa3787a51397280c1081587586e83c20399
-- 
2.34.1

v13-0002-Remove-PGRES_POLLING_ACTIVE-case.patchapplication/x-patch; name=v13-0002-Remove-PGRES_POLLING_ACTIVE-case.patchDownload
From 76cef25162b44adc20172afee47836ca765d3b5c Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio <jelte.fennema@microsoft.com>
Date: Wed, 3 Apr 2024 15:21:52 +0200
Subject: [PATCH v13 2/2] Remove PGRES_POLLING_ACTIVE case

This enum variant has been unused for at least 21 years
44aba280207740d0956160c0288e61f28f024a71. It's been there only for
backwards compatibility of the ABI, so no need to check for it.

Also update the comment on PGRES_POLLING_ACTIVE, since we're stuck with
it forever due to ABI compatibility guarantees.
---
 src/bin/psql/command.c          | 7 ++-----
 src/interfaces/libpq/libpq-fe.h | 3 +--
 2 files changed, 3 insertions(+), 7 deletions(-)

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index dc3cc7c10b7..5b31d08bf70 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -3802,17 +3802,14 @@ wait_until_connected(PGconn *conn)
 
 		switch (PQconnectPoll(conn))
 		{
-			case PGRES_POLLING_OK:
-			case PGRES_POLLING_FAILED:
-				return;
 			case PGRES_POLLING_READING:
 				forRead = true;
 				continue;
 			case PGRES_POLLING_WRITING:
 				forRead = false;
 				continue;
-			case PGRES_POLLING_ACTIVE:
-				pg_unreachable();
+			default:
+				return;
 		}
 	}
 }
diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 8d3c5c6f662..c184e853889 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -91,8 +91,7 @@ typedef enum
 	PGRES_POLLING_READING,		/* These two indicate that one may	  */
 	PGRES_POLLING_WRITING,		/* use select before polling again.   */
 	PGRES_POLLING_OK,
-	PGRES_POLLING_ACTIVE		/* unused; keep for awhile for backwards
-								 * compatibility */
+	PGRES_POLLING_ACTIVE		/* unused; keep for backwards compatibility */
 } PostgresPollingStatusType;
 
 typedef enum
-- 
2.34.1

#39Tristan Partin
tristan@neon.tech
In reply to: Jelte Fennema-Nio (#38)
Re: psql not responding to SIGINT upon db reconnection

On Wed Apr 3, 2024 at 10:05 AM CDT, Jelte Fennema-Nio wrote:

On Wed, 3 Apr 2024 at 16:55, Tristan Partin <tristan@neon.tech> wrote:

Removing from the switch statement causes a warning:

[1/2] Compiling C object src/bin/psql/psql.p/command.c.o
../src/bin/psql/command.c: In function ‘wait_until_connected’:
../src/bin/psql/command.c:3803:17: warning: enumeration value ‘PGRES_POLLING_ACTIVE’ not handled in switch [-Wswitch]
3803 | switch (PQconnectPoll(conn))
| ^~~~~~

Ofcourse... fixed now

I think patch 2 makes it worse. The value in -Wswitch is that when new
enum variants are added, the developer knows the locations to update.
Adding a default case makes -Wswitch pointless.

Patch 1 is still good. The comment change in patch 2 is good too!

--
Tristan Partin
Neon (https://neon.tech)

#40Robert Haas
robertmhaas@gmail.com
In reply to: Tristan Partin (#39)
Re: psql not responding to SIGINT upon db reconnection

On Wed, Apr 3, 2024 at 11:17 AM Tristan Partin <tristan@neon.tech> wrote:

I think patch 2 makes it worse. The value in -Wswitch is that when new
enum variants are added, the developer knows the locations to update.
Adding a default case makes -Wswitch pointless.

Patch 1 is still good. The comment change in patch 2 is good too!

It seems to me that 0001 should either remove the pg_unreachable()
call or change the break to a return, but not both. The commit message
tries to justify doing both by saying that the pg_unreachable() call
doesn't have much value, but there's not much value in omitting
pg_unreachable() from unreachable places, either, so I'm not
convinced.

I agree with Tristan's analysis of 0002.

--
Robert Haas
EDB: http://www.enterprisedb.com

#41Jelte Fennema-Nio
postgres@jeltef.nl
In reply to: Robert Haas (#40)
2 attachment(s)
Re: psql not responding to SIGINT upon db reconnection

On Wed, 3 Apr 2024 at 21:49, Robert Haas <robertmhaas@gmail.com> wrote:

It seems to me that 0001 should either remove the pg_unreachable()
call or change the break to a return, but not both.

Updated the patch to only remove the pg_unreachable call (and keep the breaks).

but there's not much value in omitting
pg_unreachable() from unreachable places, either, so I'm not
convinced.

But I don't agree with this. Having pg_unreachable in places where it
brings no perf benefit has two main downsides:

1. It's extra lines of code
2. If a programmer/reviewer is not careful about maintaining this
unreachable invariant (now or in the future), undefined behavior can
be introduced quite easily.

Also, I'd expect any optimizing compiler to know that the code after
the loop is already unreachable if there are no break/goto statements
in its body.

I agree with Tristan's analysis of 0002.

Updated 0002, to only change the comment. But honestly I don't think
using "default" really introduces many chances for future bugs in this
case, since it seems very unlikely we'll ever add new variants to the
PostgresPollingStatusType enum.

Attachments:

v14-0001-Fix-actually-reachable-pg_unreachable-call.patchtext/x-patch; charset=US-ASCII; name=v14-0001-Fix-actually-reachable-pg_unreachable-call.patchDownload
From b0ccfb6fca3e4ae9e1e62ac3b6a4c5f428b56e04 Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio <jelte.fennema@microsoft.com>
Date: Wed, 3 Apr 2024 15:19:04 +0200
Subject: [PATCH v14 1/2] Fix actually reachable pg_unreachable call

In cafe1056558fe07cdc52b95205588fcd80870362 a call to pg_unreachable was
introduced that was actually reachable, because there were break
statements in the loop. This addresses that by removing the call to
pg_unreachable, which seemed of dubious necessity anyway.
---
 src/bin/psql/command.c | 2 --
 1 file changed, 2 deletions(-)

diff --git a/src/bin/psql/command.c b/src/bin/psql/command.c
index c005624e9c3..479f9f2be59 100644
--- a/src/bin/psql/command.c
+++ b/src/bin/psql/command.c
@@ -3815,8 +3815,6 @@ wait_until_connected(PGconn *conn)
 				pg_unreachable();
 		}
 	}
-
-	pg_unreachable();
 }
 
 void

base-commit: 936e3fa3787a51397280c1081587586e83c20399
-- 
2.34.1

v14-0002-Change-comment-on-PGRES_POLLING_ACTIVE.patchtext/x-patch; charset=US-ASCII; name=v14-0002-Change-comment-on-PGRES_POLLING_ACTIVE.patchDownload
From 799eb6989d481ab617c473b8acbbfc1a405c3c7d Mon Sep 17 00:00:00 2001
From: Jelte Fennema-Nio <jelte.fennema@microsoft.com>
Date: Wed, 3 Apr 2024 15:21:52 +0200
Subject: [PATCH v14 2/2] Change comment on PGRES_POLLING_ACTIVE

Updates the comment on PGRES_POLLING_ACTIVE, since we're stuck with
it forever due to ABI compatibility guarantees.
---
 src/interfaces/libpq/libpq-fe.h | 3 +--
 1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/src/interfaces/libpq/libpq-fe.h b/src/interfaces/libpq/libpq-fe.h
index 8d3c5c6f662..c184e853889 100644
--- a/src/interfaces/libpq/libpq-fe.h
+++ b/src/interfaces/libpq/libpq-fe.h
@@ -91,8 +91,7 @@ typedef enum
 	PGRES_POLLING_READING,		/* These two indicate that one may	  */
 	PGRES_POLLING_WRITING,		/* use select before polling again.   */
 	PGRES_POLLING_OK,
-	PGRES_POLLING_ACTIVE		/* unused; keep for awhile for backwards
-								 * compatibility */
+	PGRES_POLLING_ACTIVE		/* unused; keep for backwards compatibility */
 } PostgresPollingStatusType;
 
 typedef enum
-- 
2.34.1

#42Robert Haas
robertmhaas@gmail.com
In reply to: Jelte Fennema-Nio (#41)
Re: psql not responding to SIGINT upon db reconnection

On Thu, Apr 4, 2024 at 6:43 AM Jelte Fennema-Nio <postgres@jeltef.nl> wrote:

But I don't agree with this. Having pg_unreachable in places where it
brings no perf benefit has two main downsides:

1. It's extra lines of code
2. If a programmer/reviewer is not careful about maintaining this
unreachable invariant (now or in the future), undefined behavior can
be introduced quite easily.

Also, I'd expect any optimizing compiler to know that the code after
the loop is already unreachable if there are no break/goto statements
in its body.

I'm not super-well-informed about the effects of pg_unreachable(), but
I feel like it's a little strange to complain that it's an extra line
of code. Presumably, it translates into no executable instructions,
and might even reduce the size of the generated code. Now, it could
still be a cognitive burden on human readers, but hopefully it should
do the exact opposite: it should make the code easier to understand by
documenting that the author thought that a certain point in the code
could not be reached. In that sense, it's like a code comment.

Likewise, it certainly is true that one must be careful to put
pg_unreachable() only in places that aren't reachable, and to add or
remove it as appropriate if the set of unreachable places changes. But
it's also true that one must be careful to put any other thing that
looks like a function call only in places where that thing is
appropriate.

I agree with Tristan's analysis of 0002.

Updated 0002, to only change the comment. But honestly I don't think
using "default" really introduces many chances for future bugs in this
case, since it seems very unlikely we'll ever add new variants to the
PostgresPollingStatusType enum.

That might be true, but we've avoided using default: for switches over
lots of other enum types in lots of other places in PostgreSQL, and
overall it's saved us from lots of bugs. I'm not a stickler about
this; if there's some reason not to do it in some particular case,
that's fine with me. But where it's possible to do it without any real
disadvantage, I think it's a healthy practice.

I have committed these versions of the patches.

--
Robert Haas
EDB: http://www.enterprisedb.com

#43Tristan Partin
tristan@neon.tech
In reply to: Robert Haas (#42)
Re: psql not responding to SIGINT upon db reconnection

Thanks Jelte and Robert for the extra effort to correct my mistake.
I apologize. Bad copy-paste from a previous revision of the patch at
some point.

--
Tristan Partin
Neon (https://neon.tech)