Minor code de-duplication in fe-connect.c

Started by Gurjeet Singhover 2 years ago6 messages

gurjeet@singh.im

over 2 years ago

1 attachment(s)

Commit [1]Support connection load balancing in libpq 7f5b19817eaf38e70ad1153db4e644ee9456853e implements Fisher-Yates shuffling algorithm to shuffle
connection addresses, in two places.

The attached patch moves the duplicated code to a function, and calls
it in those 2 places.

[1]: Support connection load balancing in libpq 7f5b19817eaf38e70ad1153db4e644ee9456853e
7f5b19817eaf38e70ad1153db4e644ee9456853e

Best regards,
Gurjeet https://Gurje.et
Postgres Contributors Team, http://aws.amazon.com

Attachments:

v0-0001-Reduce-code-duplication-in-fe-connect.c.patchapplication/octet-stream; name=v0-0001-Reduce-code-duplication-in-fe-connect.c.patchDownload

From 5e2621abc697d4cad962fa3ed13f91f0eb043681 Mon Sep 17 00:00:00 2001
From: Gurjeet Singh <gurjeet@singh.im>
Date: Thu, 20 Apr 2023 22:06:39 -0700
Subject: [PATCH v0] Reduce code duplication in fe-connect.c

Alternate commit title: Move address-shuffling code to a function

The address-shuffling algorithm implemented by commit 7f5b198 was
duplicated, so place it in a function and reuse.
---
 src/interfaces/libpq/fe-connect.c | 61 +++++++++++++++----------------
 1 file changed, 30 insertions(+), 31 deletions(-)

diff --git a/src/interfaces/libpq/fe-connect.c b/src/interfaces/libpq/fe-connect.c
index fcd3d0d9a3..8c277f5529 100644
--- a/src/interfaces/libpq/fe-connect.c
+++ b/src/interfaces/libpq/fe-connect.c
@@ -1058,6 +1058,31 @@ libpq_prng_init(PGconn *conn)
 	pg_prng_seed(&conn->prng_state, rseed);
 }
 
+/*
+ * Shuffles the addresses of the connection.
+ */
+static void
+shuffleAddresses(PGConn *conn)
+{
+	/*
+	 * This is the "inside-out" variant of the Fisher-Yates shuffle
+	 * algorithm. Notionally, we append each new value to the array
+	 * and then swap it with a randomly-chosen array element (possibly
+	 * including itself, else we fail to generate permutations with
+	 * the last integer last).  The swap step can be optimized by
+	 * combining it with the insertion.
+	 */
+	for (int i = 1; i < conn->naddr; i++)
+	{
+		AddrInfo	temp;
+		int			j = pg_prng_uint64_range(&conn->prng_state, 0, i);
+
+		temp		  = conn->addr[j];
+		conn->addr[j] = conn->addr[i];
+		conn->addr[i] = temp;
+	}
+}
+
 /*
  *		connectOptions2
  *
@@ -1711,26 +1736,14 @@ connectOptions2(PGconn *conn)
 	else
 		conn->load_balance_type = LOAD_BALANCE_DISABLE;
 
+	/*
+	 * If random load balancing is enabled we shuffle the addresses.
+	 */
 	if (conn->load_balance_type == LOAD_BALANCE_RANDOM)
 	{
 		libpq_prng_init(conn);
 
-		/*
-		 * This is the "inside-out" variant of the Fisher-Yates shuffle
-		 * algorithm. Notionally, we append each new value to the array and
-		 * then swap it with a randomly-chosen array element (possibly
-		 * including itself, else we fail to generate permutations with the
-		 * last integer last).  The swap step can be optimized by combining it
-		 * with the insertion.
-		 */
-		for (i = 1; i < conn->nconnhost; i++)
-		{
-			int			j = pg_prng_uint64_range(&conn->prng_state, 0, i);
-			pg_conn_host temp = conn->connhost[j];
-
-			conn->connhost[j] = conn->connhost[i];
-			conn->connhost[i] = temp;
-		}
+		shuffleAddresses(conn);
 	}
 
 	/*
@@ -2745,25 +2758,11 @@ keep_going:						/* We will come back to here until there is
 		 */
 		if (conn->load_balance_type == LOAD_BALANCE_RANDOM)
 		{
-			/*
-			 * This is the "inside-out" variant of the Fisher-Yates shuffle
-			 * algorithm. Notionally, we append each new value to the array
-			 * and then swap it with a randomly-chosen array element (possibly
-			 * including itself, else we fail to generate permutations with
-			 * the last integer last).  The swap step can be optimized by
-			 * combining it with the insertion.
-			 *
 			 * We don't need to initialize conn->prng_state here, because that
 			 * already happened in connectOptions2.
 			 */
-			for (int i = 1; i < conn->naddr; i++)
-			{
-				int			j = pg_prng_uint64_range(&conn->prng_state, 0, i);
-				AddrInfo	temp = conn->addr[j];
 
-				conn->addr[j] = conn->addr[i];
-				conn->addr[i] = temp;
-			}
+			shuffleAddresses(conn);
 		}
 
 		reset_connection_state_machine = true;
-- 
2.35.1

Daniel Gustafsson

daniel@yesql.se

over 2 years ago

In reply to: Gurjeet Singh (#1)

Re: Minor code de-duplication in fe-connect.c

On 21 Apr 2023, at 07:29, Gurjeet Singh <gurjeet@singh.im> wrote:

Commit [1] implements Fisher-Yates shuffling algorithm to shuffle
connection addresses, in two places.

The attached patch moves the duplicated code to a function, and calls
it in those 2 places.

The reason I left it like this when reviewing and committing is that I think it
makes for more readable code. The amount of lines saved is pretty small, and
"shuffle" isn't an exact term so by reading the code it isn't immediate clear
what such a function would do. By having the shuffle algorithm where it's used
it's clear what the code does and what the outcome is. If others disagree I
can go ahead and refactor of course, but I personally would not deem it a net
win in code quality.

--
Daniel Gustafsson

Robert Haas

robertmhaas@gmail.com

over 2 years ago

In reply to: Daniel Gustafsson (#2)

Re: Minor code de-duplication in fe-connect.c

On Fri, Apr 21, 2023 at 8:25 AM Daniel Gustafsson <daniel@yesql.se> wrote:

The reason I left it like this when reviewing and committing is that I think it
makes for more readable code. The amount of lines saved is pretty small, and
"shuffle" isn't an exact term so by reading the code it isn't immediate clear
what such a function would do. By having the shuffle algorithm where it's used
it's clear what the code does and what the outcome is. If others disagree I
can go ahead and refactor of course, but I personally would not deem it a net
win in code quality.

I think we should avoid nitpicking stuff like this. I likely would
have used a subroutine if I'd done it myself, but I definitely
wouldn't have submitted a patch to change whatever the last person did
without some tangible reason for so doing. It's not a good use of
reviewer and committer time to litigate things like this.

--
Robert Haas
EDB: http://www.enterprisedb.com

Gurjeet Singh

gurjeet@singh.im

over 2 years ago

In reply to: Robert Haas (#3)

Re: Minor code de-duplication in fe-connect.c

On Fri, Apr 21, 2023 at 7:47 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Apr 21, 2023 at 8:25 AM Daniel Gustafsson <daniel@yesql.se> wrote:

The reason I left it like this when reviewing and committing is that I think it
makes for more readable code. The amount of lines saved is pretty small, and
"shuffle" isn't an exact term so by reading the code it isn't immediate clear
what such a function would do. By having the shuffle algorithm where it's used
it's clear what the code does and what the outcome is. If others disagree I
can go ahead and refactor of course, but I personally would not deem it a net
win in code quality.

I think we should avoid nitpicking stuff like this. I likely would
have used a subroutine if I'd done it myself, but I definitely
wouldn't have submitted a patch to change whatever the last person did
without some tangible reason for so doing.

Code duplication, and the possibility that a change in one location
(bugfix or otherwise) does not get applied to the other locations, is
a good enough reason to submit a patch, IMHO.

It's not a good use of
reviewer and committer time to litigate things like this.

Postgres has a very high bar for code quality, and for documentation.
It is a major reason that attracts people to, and keeps them in the
Postgres ecosystem, as users and contributors. If anything, we should
encourage folks to point out such inconsistencies in code and docs
that keep the quality high.

This is not a attack on any one commit, or author/committer of the
commit; sorry if it appeared that way. I was merely reviewing the
commit that introduced a nice libpq feature. This patch is merely a
minor improvement to the code that I think deserves a consideration.
It's not a litigation, by any means.

Best regards,
Gurjeet https://Gurje.et
Postgres Contributors Team, http://aws.amazon.com

Daniel Gustafsson

daniel@yesql.se

over 2 years ago

In reply to: Gurjeet Singh (#4)

Re: Minor code de-duplication in fe-connect.c

On 21 Apr 2023, at 18:38, Gurjeet Singh <gurjeet@singh.im> wrote:

On Fri, Apr 21, 2023 at 7:47 AM Robert Haas <robertmhaas@gmail.com> wrote:

On Fri, Apr 21, 2023 at 8:25 AM Daniel Gustafsson <daniel@yesql.se> wrote:

The reason I left it like this when reviewing and committing is that I think it
makes for more readable code. The amount of lines saved is pretty small, and
"shuffle" isn't an exact term so by reading the code it isn't immediate clear
what such a function would do. By having the shuffle algorithm where it's used
it's clear what the code does and what the outcome is. If others disagree I
can go ahead and refactor of course, but I personally would not deem it a net
win in code quality.

I think we should avoid nitpicking stuff like this. I likely would
have used a subroutine if I'd done it myself, but I definitely
wouldn't have submitted a patch to change whatever the last person did
without some tangible reason for so doing.

Code duplication, and the possibility that a change in one location
(bugfix or otherwise) does not get applied to the other locations, is
a good enough reason to submit a patch, IMHO.

It's not a good use of
reviewer and committer time to litigate things like this.

Postgres has a very high bar for code quality, and for documentation.
It is a major reason that attracts people to, and keeps them in the
Postgres ecosystem, as users and contributors. If anything, we should
encourage folks to point out such inconsistencies in code and docs
that keep the quality high.

This is not a attack on any one commit, or author/committer of the
commit; sorry if it appeared that way. I was merely reviewing the
commit that introduced a nice libpq feature. This patch is merely a
minor improvement to the code that I think deserves a consideration.
It's not a litigation, by any means.

I didn't actually read the patch earlier, but since Robert gave a +1 to the
idea of refactoring I had a look. I have a few comments:

+static void
+shuffleAddresses(PGConn *conn)
This fails to compile since the struct is typedefed PGconn.

- /*
- * This is the "inside-out" variant of the Fisher-Yates shuffle
- * algorithm. Notionally, we append each new value to the array
- * and then swap it with a randomly-chosen array element (possibly
- * including itself, else we fail to generate permutations with
- * the last integer last). The swap step can be optimized by
- * combining it with the insertion.
- *
* We don't need to initialize conn->prng_state here, because that
* already happened in connectOptions2.
*/
This also fails to compile since it removes the starting /* marker of the
comment resulting in a comment starting on *.

- for (i = 1; i < conn->nconnhost; i++)
- for (int i = 1; i < conn->naddr; i++)
You are replacing these loops for shuffling an array with a function that does
this:
+ for (int i = 1; i < conn->naddr; i++)
This is not the same thing, one is shuffling the hosts and the other is
shuffling the addresses resolved for the hosts (which may be a n:m
relationship). If you had compiled and run the tests you would have seen that
the libpq tests now fail with this applied, as would be expected.

Shuffling the arrays can of course be made into a subroutine, but what to
shuffle and where needs to be passed in to it and at that point I doubt the
code readability is much improved over the current coding.

--
Daniel Gustafsson

Gurjeet Singh

gurjeet@singh.im

over 2 years ago

In reply to: Daniel Gustafsson (#5)

Re: Minor code de-duplication in fe-connect.c

On Mon, Apr 24, 2023 at 5:14 AM Daniel Gustafsson <daniel@yesql.se> wrote:

On 21 Apr 2023, at 18:38, Gurjeet Singh <gurjeet@singh.im> wrote:

On Fri, Apr 21, 2023 at 7:47 AM Robert Haas <robertmhaas@gmail.com>

wrote:

On Fri, Apr 21, 2023 at 8:25 AM Daniel Gustafsson <daniel@yesql.se>

wrote:

The reason I left it like this when reviewing and committing is that I

think it

makes for more readable code. The amount of lines saved is pretty

small, and

"shuffle" isn't an exact term so by reading the code it isn't

immediate clear

what such a function would do. By having the shuffle algorithm where

it's used

it's clear what the code does and what the outcome is. If others

disagree I

can go ahead and refactor of course, but I personally would not deem

it a net

win in code quality.

I think we should avoid nitpicking stuff like this. I likely would
have used a subroutine if I'd done it myself, but I definitely
wouldn't have submitted a patch to change whatever the last person did
without some tangible reason for so doing.

Code duplication, and the possibility that a change in one location
(bugfix or otherwise) does not get applied to the other locations, is
a good enough reason to submit a patch, IMHO.

It's not a good use of
reviewer and committer time to litigate things like this.

Postgres has a very high bar for code quality, and for documentation.
It is a major reason that attracts people to, and keeps them in the
Postgres ecosystem, as users and contributors. If anything, we should
encourage folks to point out such inconsistencies in code and docs
that keep the quality high.

This is not a attack on any one commit, or author/committer of the
commit; sorry if it appeared that way. I was merely reviewing the
commit that introduced a nice libpq feature. This patch is merely a
minor improvement to the code that I think deserves a consideration.
It's not a litigation, by any means.

I didn't actually read the patch earlier, but since Robert gave a +1 to the
idea of refactoring I had a look. I have a few comments:
+static void
+shuffleAddresses(PGConn *conn)
This fails to compile since the struct is typedefed PGconn.
- /*
- * This is the "inside-out" variant of the Fisher-Yates shuffle
- * algorithm. Notionally, we append each new value to the array
- * and then swap it with a randomly-chosen array element (possibly
- * including itself, else we fail to generate permutations with
- * the last integer last). The swap step can be optimized by
- * combining it with the insertion.
- *
* We don't need to initialize conn->prng_state here, because that
* already happened in connectOptions2.
*/
This also fails to compile since it removes the starting /* marker of the
comment resulting in a comment starting on *.
-       for (i = 1; i < conn->nconnhost; i++)
-       for (int i = 1; i < conn->naddr; i++)
You are replacing these loops for shuffling an array with a function that
does
this:
+       for (int i = 1; i < conn->naddr; i++)
This is not the same thing, one is shuffling the hosts and the other is
shuffling the addresses resolved for the hosts (which may be a n:m
relationship).  If you had compiled and run the tests you would have seen
that
the libpq tests now fail with this applied, as would be expected.
Shuffling the arrays can of course be made into a subroutine, but what to
shuffle and where needs to be passed in to it and at that point I doubt the
code readability is much improved over the current coding.

Sorry about the errors. This seems to be the older version of the patch
that I had generated before fixing these mistakes. I do remember
encountering the compiler errors and revising the code to fix these,
especially the upper vs. lower case and the partial removal of the coment.
Away from my keyboard, so please expect the newer patch some time later.

Best regards,
Gurjeet
http://Gurje.et