Simplify the logic a bit (src/bin/scripts/reindexdb.c)

Started by Ranier Vilela11 months ago18 messages
#1Ranier Vilela
ranier.vf@gmail.com
1 attachment(s)

Hi.

Coverity complained about possible dereference null pointer
in *reindex_one_database* function.
That's not really true.
But the logic is unnecessarily complicated.

Let's simplify it to humans and machines.

patch attached.

Best regards,
Ranier Vilela

Attachments:

simplifies-reindex-one-database-reindexdb.patchapplication/octet-stream; name=simplifies-reindex-one-database-reindexdb.patchDownload
diff --git a/src/bin/scripts/reindexdb.c b/src/bin/scripts/reindexdb.c
index 665864fd22..6036959a78 100644
--- a/src/bin/scripts/reindexdb.c
+++ b/src/bin/scripts/reindexdb.c
@@ -365,18 +365,10 @@ reindex_one_database(ConnParams *cparams, ReindexType type,
 				indices_tables_list = get_parallel_object_list(conn, process_type,
 															   user_list, echo);
 
-				/*
-				 * Bail out if nothing to process.  'user_list' was modified
-				 * in-place, so check if it has at least one cell.
-				 */
-				if (user_list->head == NULL)
+				/* Bail out if nothing to process */
+				if (indices_tables_list == NULL)
 					return;
 
-				/*
-				 * Assuming 'user_list' is not empty, 'indices_tables_list'
-				 * shouldn't be empty as well.
-				 */
-				Assert(indices_tables_list != NULL);
 				indices_tables_cell = indices_tables_list->head;
 
 				break;
#2Álvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Ranier Vilela (#1)
Re: Simplify the logic a bit (src/bin/scripts/reindexdb.c)

On 2025-Feb-13, Ranier Vilela wrote:

Hi.

Coverity complained about possible dereference null pointer
in *reindex_one_database* function.
That's not really true.
But the logic is unnecessarily complicated.

Hmm, this code looks quite suspect, but I wonder if instead of (what
looks more or less like) a straight revert of cc0e7ebd304a as you
propose, a better fix wouldn't be to split get_parallel_object_list in
two: get_parallel_table_list for the DATABASE and SCHEMA cases, and
get_parallel_tabidx_list (or whatever) for the INDEX case. In the first
case we just return a list of values, but in the latter case we also
meddle with the input list which becomes an output list ...

--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
¡Ay, ay, ay! Con lo mucho que yo lo quería (bis)
se fue de mi vera ... se fue para siempre, pa toíta ... pa toíta la vida
¡Ay Camarón! ¡Ay Camarón! (Paco de Lucía)

#3Ranier Vilela
ranier.vf@gmail.com
In reply to: Álvaro Herrera (#2)
Re: Simplify the logic a bit (src/bin/scripts/reindexdb.c)

Hi Álvaro.

Em qui., 13 de fev. de 2025 às 18:38, Álvaro Herrera <
alvherre@alvh.no-ip.org> escreveu:

On 2025-Feb-13, Ranier Vilela wrote:

Hi.

Coverity complained about possible dereference null pointer
in *reindex_one_database* function.
That's not really true.
But the logic is unnecessarily complicated.

Hmm, this code looks quite suspect, but I wonder if instead of (what
looks more or less like) a straight revert of cc0e7ebd304a as you
propose, a better fix wouldn't be to split get_parallel_object_list in
two: get_parallel_table_list for the DATABASE and SCHEMA cases, and
get_parallel_tabidx_list (or whatever) for the INDEX case. In the first
case we just return a list of values, but in the latter case we also
meddle with the input list which becomes an output list ...

Sure, I'll try to do it.

best regards,
Ranier Vilela

#4Ranier Vilela
ranier.vf@gmail.com
In reply to: Ranier Vilela (#3)
1 attachment(s)
Re: Simplify the logic a bit (src/bin/scripts/reindexdb.c)

Em sex., 14 de fev. de 2025 às 09:13, Ranier Vilela <ranier.vf@gmail.com>
escreveu:

Hi Álvaro.

Em qui., 13 de fev. de 2025 às 18:38, Álvaro Herrera <
alvherre@alvh.no-ip.org> escreveu:

On 2025-Feb-13, Ranier Vilela wrote:

Hi.

Coverity complained about possible dereference null pointer
in *reindex_one_database* function.
That's not really true.
But the logic is unnecessarily complicated.

Hmm, this code looks quite suspect, but I wonder if instead of (what
looks more or less like) a straight revert of cc0e7ebd304a as you
propose, a better fix wouldn't be to split get_parallel_object_list in
two: get_parallel_table_list for the DATABASE and SCHEMA cases, and
get_parallel_tabidx_list (or whatever) for the INDEX case. In the first
case we just return a list of values, but in the latter case we also
meddle with the input list which becomes an output list ...

Sure, I'll try to do it.

Attached is the prototype version v1.
What do you think?

best regards,
Ranier Vilela

Attachments:

v1-simplifies-reindex-one-database-reindexdb.patchapplication/octet-stream; name=v1-simplifies-reindex-one-database-reindexdb.patchDownload
diff --git a/src/bin/scripts/reindexdb.c b/src/bin/scripts/reindexdb.c
index 665864fd22..5726a15ce5 100644
--- a/src/bin/scripts/reindexdb.c
+++ b/src/bin/scripts/reindexdb.c
@@ -33,7 +33,11 @@ typedef enum ReindexType
 } ReindexType;
 
 
-static SimpleStringList *get_parallel_object_list(PGconn *conn,
+static SimpleStringList *get_parallel_tables_list(PGconn *conn,
+												  ReindexType type,
+												  SimpleStringList *user_list,
+												  bool echo);
+static SimpleStringList *get_parallel_tabidx_list(PGconn *conn,
 												  ReindexType type,
 												  SimpleStringList *user_list,
 												  bool echo);
@@ -333,26 +337,32 @@ reindex_one_database(ConnParams *cparams, ReindexType type,
 			case REINDEX_DATABASE:
 
 				/* Build a list of relations from the database */
-				process_list = get_parallel_object_list(conn, process_type,
+				process_list = get_parallel_tables_list(conn, process_type,
 														user_list, echo);
 				process_type = REINDEX_TABLE;
 
 				/* Bail out if nothing to process */
 				if (process_list == NULL)
+				{
+					PQfinish(conn);
 					return;
+				}
 				break;
 
 			case REINDEX_SCHEMA:
 				Assert(user_list != NULL);
 
 				/* Build a list of relations from all the schemas */
-				process_list = get_parallel_object_list(conn, process_type,
+				process_list = get_parallel_tables_list(conn, process_type,
 														user_list, echo);
 				process_type = REINDEX_TABLE;
 
 				/* Bail out if nothing to process */
 				if (process_list == NULL)
+				{
+					PQfinish(conn);
 					return;
+				}
 				break;
 
 			case REINDEX_INDEX:
@@ -362,21 +372,16 @@ reindex_one_database(ConnParams *cparams, ReindexType type,
 				 * Build a list of relations from the indices.  This will
 				 * accordingly reorder the list of indices too.
 				 */
-				indices_tables_list = get_parallel_object_list(conn, process_type,
+				indices_tables_list = get_parallel_tabidx_list(conn, process_type,
 															   user_list, echo);
 
-				/*
-				 * Bail out if nothing to process.  'user_list' was modified
-				 * in-place, so check if it has at least one cell.
-				 */
-				if (user_list->head == NULL)
+				/* Bail out if nothing to process */
+				if (indices_tables_list == NULL)
+				{
+					PQfinish(conn);
 					return;
+				}
 
-				/*
-				 * Assuming 'user_list' is not empty, 'indices_tables_list'
-				 * shouldn't be empty as well.
-				 */
-				Assert(indices_tables_list != NULL);
 				indices_tables_cell = indices_tables_list->head;
 
 				break;
@@ -613,7 +618,7 @@ run_reindex_command(PGconn *conn, ReindexType type, const char *name,
 }
 
 /*
- * Prepare the list of objects to process by querying the catalogs.
+ * Prepare the list of tables to process by querying the catalogs.
  *
  * This function will return a SimpleStringList object containing the entire
  * list of tables in the given database that should be processed by a parallel
@@ -621,15 +626,13 @@ run_reindex_command(PGconn *conn, ReindexType type, const char *name,
  * table.
  */
 static SimpleStringList *
-get_parallel_object_list(PGconn *conn, ReindexType type,
+get_parallel_tables_list(PGconn *conn, ReindexType type,
 						 SimpleStringList *user_list, bool echo)
 {
 	PQExpBufferData catalog_query;
-	PQExpBufferData buf;
 	PGresult   *res;
 	SimpleStringList *tables;
-	int			ntups,
-				i;
+	int			ntups;
 
 	initPQExpBuffer(&catalog_query);
 
@@ -658,7 +661,6 @@ get_parallel_object_list(PGconn *conn, ReindexType type,
 		case REINDEX_SCHEMA:
 			{
 				SimpleStringListCell *cell;
-				bool		nsp_listed = false;
 
 				Assert(user_list != NULL);
 
@@ -680,14 +682,10 @@ get_parallel_object_list(PGconn *conn, ReindexType type,
 
 				for (cell = user_list->head; cell; cell = cell->next)
 				{
-					const char *nspname = cell->val;
-
-					if (nsp_listed)
+					if (cell != user_list->head)
 						appendPQExpBufferStr(&catalog_query, ", ");
-					else
-						nsp_listed = true;
 
-					appendStringLiteralConn(&catalog_query, nspname, conn);
+					appendStringLiteralConn(&catalog_query, cell->val, conn);
 				}
 
 				appendPQExpBufferStr(&catalog_query, ")\n"
@@ -695,6 +693,69 @@ get_parallel_object_list(PGconn *conn, ReindexType type,
 			}
 			break;
 
+		case REINDEX_INDEX:
+		case REINDEX_SYSTEM:
+		case REINDEX_TABLE:
+			Assert(false);
+			break;
+	}
+
+	res = executeQuery(conn, catalog_query.data, echo);
+	termPQExpBuffer(&catalog_query);
+
+	/*
+	 * If no rows are returned, there are no matching tables, so we are done.
+	 */
+	ntups = PQntuples(res);
+	if (ntups == 0)
+	{
+		PQclear(res);
+		return NULL;
+	}
+
+	tables = pg_malloc0(sizeof(SimpleStringList));
+
+	/* Build qualified identifiers for each table */
+	for (int i = 0; i < ntups; i++)
+	{
+		simple_string_list_append(tables,
+							 fmtQualifiedIdEnc(PQgetvalue(res, i, 1),
+											   PQgetvalue(res, i, 0),
+											   PQclientEncoding(conn)));
+	}
+	PQclear(res);
+
+	return tables;
+}
+
+/*
+ * Prepare the list of tables to process by querying the catalogs.
+ *
+ * This function will return a SimpleStringList object containing the entire
+ * list of tables in the given database that should be processed by a parallel
+ * database-wide reindex (excluding system tables), or NULL if there's no such
+ * table.
+ */
+static SimpleStringList *
+get_parallel_tabidx_list(PGconn *conn, ReindexType type,
+						 SimpleStringList *user_list, bool echo)
+{
+	PQExpBufferData catalog_query;
+	PGresult   *res;
+	SimpleStringList *tables;
+	int			ntups;
+
+	initPQExpBuffer(&catalog_query);
+
+	/*
+	 * The queries here are using a safe search_path, so there's no need to
+	 * fully qualify everything.
+	 */
+	switch (type)
+	{
+		case REINDEX_DATABASE:
+			Assert(false);
+			break;
 		case REINDEX_INDEX:
 			{
 				SimpleStringListCell *cell;
@@ -765,40 +826,28 @@ get_parallel_object_list(PGconn *conn, ReindexType type,
 	if (ntups == 0)
 	{
 		PQclear(res);
-		PQfinish(conn);
 		return NULL;
 	}
 
 	tables = pg_malloc0(sizeof(SimpleStringList));
 
-	/* Build qualified identifiers for each table */
-	initPQExpBuffer(&buf);
-	for (i = 0; i < ntups; i++)
+	/* Build qualified identifiers for each index */
+	for (int i = 0; i < ntups; i++)
 	{
-		appendPQExpBufferStr(&buf,
+		simple_string_list_append(tables,
 							 fmtQualifiedIdEnc(PQgetvalue(res, i, 1),
 											   PQgetvalue(res, i, 0),
 											   PQclientEncoding(conn)));
 
-		simple_string_list_append(tables, buf.data);
-		resetPQExpBuffer(&buf);
-
-		if (type == REINDEX_INDEX)
-		{
-			/*
-			 * For index-level REINDEX, rebuild the list of indexes to match
-			 * the order of tables list.
-			 */
-			appendPQExpBufferStr(&buf,
-								 fmtQualifiedIdEnc(PQgetvalue(res, i, 1),
-												   PQgetvalue(res, i, 2),
-												   PQclientEncoding(conn)));
-
-			simple_string_list_append(user_list, buf.data);
-			resetPQExpBuffer(&buf);
-		}
+		/*
+		 * For index-level REINDEX, rebuild the list of indices to match
+		 * the order of tables list.
+		 */
+		simple_string_list_append(user_list,
+							 fmtQualifiedIdEnc(PQgetvalue(res, i, 1),
+											   PQgetvalue(res, i, 2),
+											   PQclientEncoding(conn)));
 	}
-	termPQExpBuffer(&buf);
 	PQclear(res);
 
 	return tables;
#5Ranier Vilela
ranier.vf@gmail.com
In reply to: Ranier Vilela (#4)
Re: Simplify the logic a bit (src/bin/scripts/reindexdb.c)

Em sex., 14 de fev. de 2025 às 10:19, Ranier Vilela <ranier.vf@gmail.com>
escreveu:

Em sex., 14 de fev. de 2025 às 09:13, Ranier Vilela <ranier.vf@gmail.com>
escreveu:

Hi Álvaro.

Em qui., 13 de fev. de 2025 às 18:38, Álvaro Herrera <
alvherre@alvh.no-ip.org> escreveu:

On 2025-Feb-13, Ranier Vilela wrote:

Hi.

Coverity complained about possible dereference null pointer
in *reindex_one_database* function.
That's not really true.
But the logic is unnecessarily complicated.

Hmm, this code looks quite suspect, but I wonder if instead of (what
looks more or less like) a straight revert of cc0e7ebd304a as you
propose, a better fix wouldn't be to split get_parallel_object_list in
two: get_parallel_table_list for the DATABASE and SCHEMA cases, and
get_parallel_tabidx_list (or whatever) for the INDEX case. In the first
case we just return a list of values, but in the latter case we also
meddle with the input list which becomes an output list ...

Sure, I'll try to do it.

Attached is the prototype version v1.

Any chance to push this forward?
Is it worth creating a committfest entry?

best regards,
Ranier Vilela

#6Álvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Ranier Vilela (#5)
Re: Simplify the logic a bit (src/bin/scripts/reindexdb.c)

On 2025-Feb-21, Ranier Vilela wrote:

Any chance to push this forward?
Is it worth creating a committfest entry?

Yeah, I'll have a look.

--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"Entristecido, Wutra (canción de Las Barreras)
echa a Freyr a rodar
y a nosotros al mar"

#7Álvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Ranier Vilela (#4)
Re: Simplify the logic a bit (src/bin/scripts/reindexdb.c)

On 2025-Feb-14, Ranier Vilela wrote:

Attached is the prototype version v1.
What do you think?

I think this is still a bit confused. The new function's comment says
"prepare the list of tables to ..." but what it really receives is a
list of _indexes_ (as evidenced by the fact that they're compared to
pg_index.indexrelid). So on input the user_list is an index list, and
on output it has been changed into a list of tables, and the list of
indexes is the function's return value? That seems quite weird. I
propose to change the API of this new function thus:

static void
get_parallel_tabidx_list(PGconn *conn,
SimpleStringList *index_list,
SimpleStringList *table_list,
bool echo);

where index_list is inout and the table_list is just an output argument.

I would also remove the 'type' argument, because I don't see a need to
keep it.

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"After a quick R of TFM, all I can say is HOLY CR** THAT IS COOL! PostgreSQL was
amazing when I first started using it at 7.2, and I'm continually astounded by
learning new features and techniques made available by the continuing work of
the development team."
Berend Tober, http://archives.postgresql.org/pgsql-hackers/2007-08/msg01009.php

#8Ranier Vilela
ranier.vf@gmail.com
In reply to: Álvaro Herrera (#7)
1 attachment(s)
Re: Simplify the logic a bit (src/bin/scripts/reindexdb.c)

Hi Álvaro.

Em qui., 27 de fev. de 2025 às 16:50, Álvaro Herrera <
alvherre@alvh.no-ip.org> escreveu:

On 2025-Feb-14, Ranier Vilela wrote:

Attached is the prototype version v1.
What do you think?

I think this is still a bit confused. The new function's comment says
"prepare the list of tables to ..." but what it really receives is a
list of _indexes_ (as evidenced by the fact that they're compared to
pg_index.indexrelid). So on input the user_list is an index list, and
on output it has been changed into a list of tables, and the list of
indexes is the function's return value? That seems quite weird.

Yeah, I think that is confusing.

I

propose to change the API of this new function thus:

static void
get_parallel_tabidx_list(PGconn *conn,
SimpleStringList *index_list,
SimpleStringList *table_list,
bool echo);

where index_list is inout and the table_list is just an output argument.

Ok.

I would also remove the 'type' argument, because I don't see a need to
keep it.

Ok.

Thanks for your guidance.

v2 attached, please comment if you have any further objections.

best regards,
Ranier Vilela

Attachments:

v2-simplifies-reindex-one-database-reindexdb.patchapplication/octet-stream; name=v2-simplifies-reindex-one-database-reindexdb.patchDownload
diff --git a/src/bin/scripts/reindexdb.c b/src/bin/scripts/reindexdb.c
index 665864fd22..68ce43afc9 100644
--- a/src/bin/scripts/reindexdb.c
+++ b/src/bin/scripts/reindexdb.c
@@ -33,10 +33,14 @@ typedef enum ReindexType
 } ReindexType;
 
 
-static SimpleStringList *get_parallel_object_list(PGconn *conn,
+static SimpleStringList *get_parallel_tables_list(PGconn *conn,
 												  ReindexType type,
 												  SimpleStringList *user_list,
 												  bool echo);
+static void get_parallel_tabidx_list(PGconn *conn,
+							  SimpleStringList *index_list,
+							  SimpleStringList *table_list,
+							  bool echo);
 static void reindex_one_database(ConnParams *cparams, ReindexType type,
 								 SimpleStringList *user_list,
 								 const char *progname,
@@ -333,26 +337,32 @@ reindex_one_database(ConnParams *cparams, ReindexType type,
 			case REINDEX_DATABASE:
 
 				/* Build a list of relations from the database */
-				process_list = get_parallel_object_list(conn, process_type,
+				process_list = get_parallel_tables_list(conn, process_type,
 														user_list, echo);
 				process_type = REINDEX_TABLE;
 
 				/* Bail out if nothing to process */
 				if (process_list == NULL)
+				{
+					PQfinish(conn);
 					return;
+				}
 				break;
 
 			case REINDEX_SCHEMA:
 				Assert(user_list != NULL);
 
 				/* Build a list of relations from all the schemas */
-				process_list = get_parallel_object_list(conn, process_type,
+				process_list = get_parallel_tables_list(conn, process_type,
 														user_list, echo);
 				process_type = REINDEX_TABLE;
 
 				/* Bail out if nothing to process */
 				if (process_list == NULL)
+				{
+					PQfinish(conn);
 					return;
+				}
 				break;
 
 			case REINDEX_INDEX:
@@ -362,21 +372,16 @@ reindex_one_database(ConnParams *cparams, ReindexType type,
 				 * Build a list of relations from the indices.  This will
 				 * accordingly reorder the list of indices too.
 				 */
-				indices_tables_list = get_parallel_object_list(conn, process_type,
-															   user_list, echo);
+				get_parallel_tabidx_list(conn, user_list,
+													indices_tables_list, echo);
 
-				/*
-				 * Bail out if nothing to process.  'user_list' was modified
-				 * in-place, so check if it has at least one cell.
-				 */
-				if (user_list->head == NULL)
+				/* Bail out if nothing to process */
+				if (indices_tables_list == NULL)
+				{
+					PQfinish(conn);
 					return;
+				}
 
-				/*
-				 * Assuming 'user_list' is not empty, 'indices_tables_list'
-				 * shouldn't be empty as well.
-				 */
-				Assert(indices_tables_list != NULL);
 				indices_tables_cell = indices_tables_list->head;
 
 				break;
@@ -480,6 +485,7 @@ finish:
 
 	ParallelSlotsTerminate(sa);
 	pfree(sa);
+	PQfinish(conn);
 
 	if (failed)
 		exit(1);
@@ -613,7 +619,7 @@ run_reindex_command(PGconn *conn, ReindexType type, const char *name,
 }
 
 /*
- * Prepare the list of objects to process by querying the catalogs.
+ * Prepare the list of tables to process by querying the catalogs.
  *
  * This function will return a SimpleStringList object containing the entire
  * list of tables in the given database that should be processed by a parallel
@@ -621,15 +627,13 @@ run_reindex_command(PGconn *conn, ReindexType type, const char *name,
  * table.
  */
 static SimpleStringList *
-get_parallel_object_list(PGconn *conn, ReindexType type,
+get_parallel_tables_list(PGconn *conn, ReindexType type,
 						 SimpleStringList *user_list, bool echo)
 {
 	PQExpBufferData catalog_query;
-	PQExpBufferData buf;
 	PGresult   *res;
 	SimpleStringList *tables;
-	int			ntups,
-				i;
+	int			ntups;
 
 	initPQExpBuffer(&catalog_query);
 
@@ -658,7 +662,6 @@ get_parallel_object_list(PGconn *conn, ReindexType type,
 		case REINDEX_SCHEMA:
 			{
 				SimpleStringListCell *cell;
-				bool		nsp_listed = false;
 
 				Assert(user_list != NULL);
 
@@ -680,14 +683,10 @@ get_parallel_object_list(PGconn *conn, ReindexType type,
 
 				for (cell = user_list->head; cell; cell = cell->next)
 				{
-					const char *nspname = cell->val;
-
-					if (nsp_listed)
-						appendPQExpBufferStr(&catalog_query, ", ");
-					else
-						nsp_listed = true;
+					if (cell != user_list->head)
+						appendPQExpBufferChar(&catalog_query, ',');
 
-					appendStringLiteralConn(&catalog_query, nspname, conn);
+					appendStringLiteralConn(&catalog_query, cell->val, conn);
 				}
 
 				appendPQExpBufferStr(&catalog_query, ")\n"
@@ -696,59 +695,6 @@ get_parallel_object_list(PGconn *conn, ReindexType type,
 			break;
 
 		case REINDEX_INDEX:
-			{
-				SimpleStringListCell *cell;
-
-				Assert(user_list != NULL);
-
-				/*
-				 * Straight-forward index-level REINDEX is not supported with
-				 * multiple jobs as we cannot control the concurrent
-				 * processing of multiple indexes depending on the same
-				 * relation.  But we can extract the appropriate table name
-				 * for the index and put REINDEX INDEX commands into different
-				 * jobs, according to the parent tables.
-				 *
-				 * We will order the results to group the same tables
-				 * together. We fetch index names as well to build a new list
-				 * of them with matching order.
-				 */
-				appendPQExpBufferStr(&catalog_query,
-									 "SELECT t.relname, n.nspname, i.relname\n"
-									 "FROM pg_catalog.pg_index x\n"
-									 "JOIN pg_catalog.pg_class t ON t.oid = x.indrelid\n"
-									 "JOIN pg_catalog.pg_class i ON i.oid = x.indexrelid\n"
-									 "LEFT JOIN pg_catalog.pg_namespace n ON n.oid = t.relnamespace\n"
-									 "WHERE x.indexrelid OPERATOR(pg_catalog.=) ANY(ARRAY['");
-
-				for (cell = user_list->head; cell; cell = cell->next)
-				{
-					if (cell != user_list->head)
-						appendPQExpBufferStr(&catalog_query, "', '");
-
-					appendQualifiedRelation(&catalog_query, cell->val, conn, echo);
-				}
-
-				/*
-				 * Order tables by the size of its greatest index.  Within the
-				 * table, order indexes by their sizes.
-				 */
-				appendPQExpBufferStr(&catalog_query,
-									 "']::pg_catalog.regclass[])\n"
-									 "ORDER BY max(i.relpages) OVER \n"
-									 "    (PARTITION BY n.nspname, t.relname),\n"
-									 "  n.nspname, t.relname, i.relpages;\n");
-
-				/*
-				 * We're going to re-order the user_list to match the order of
-				 * tables.  So, empty the user_list to fill it from the query
-				 * result.
-				 */
-				simple_string_list_destroy(user_list);
-				user_list->head = user_list->tail = NULL;
-			}
-			break;
-
 		case REINDEX_SYSTEM:
 		case REINDEX_TABLE:
 			Assert(false);
@@ -765,45 +711,134 @@ get_parallel_object_list(PGconn *conn, ReindexType type,
 	if (ntups == 0)
 	{
 		PQclear(res);
-		PQfinish(conn);
 		return NULL;
 	}
 
 	tables = pg_malloc0(sizeof(SimpleStringList));
 
 	/* Build qualified identifiers for each table */
-	initPQExpBuffer(&buf);
-	for (i = 0; i < ntups; i++)
+	for (int i = 0; i < ntups; i++)
 	{
-		appendPQExpBufferStr(&buf,
+		simple_string_list_append(tables,
 							 fmtQualifiedIdEnc(PQgetvalue(res, i, 1),
 											   PQgetvalue(res, i, 0),
 											   PQclientEncoding(conn)));
-
-		simple_string_list_append(tables, buf.data);
-		resetPQExpBuffer(&buf);
-
-		if (type == REINDEX_INDEX)
-		{
-			/*
-			 * For index-level REINDEX, rebuild the list of indexes to match
-			 * the order of tables list.
-			 */
-			appendPQExpBufferStr(&buf,
-								 fmtQualifiedIdEnc(PQgetvalue(res, i, 1),
-												   PQgetvalue(res, i, 2),
-												   PQclientEncoding(conn)));
-
-			simple_string_list_append(user_list, buf.data);
-			resetPQExpBuffer(&buf);
-		}
 	}
-	termPQExpBuffer(&buf);
 	PQclear(res);
 
 	return tables;
 }
 
+/*
+ * Prepare the list of tables to process by querying the catalogs.
+ *
+ * This function will return a SimpleStringList object containing the entire
+ * list of tables in the given database that should be processed by a parallel
+ * database-wide reindex (excluding system tables), or NULL if there's no such
+ * table.
+ */
+static void
+get_parallel_tabidx_list(PGconn *conn,
+						 SimpleStringList *index_list,
+						 SimpleStringList *table_list,
+						 bool echo)
+{
+	PQExpBufferData catalog_query;
+	PGresult   *res;
+	SimpleStringListCell *cell;
+	int			ntups;
+
+	Assert(index_list != NULL);
+	Assert(table_list == NULL);
+
+	initPQExpBuffer(&catalog_query);
+
+	/*
+	 * The queries here are using a safe search_path, so there's no need to
+	 * fully qualify everything.
+	 */
+
+	/*
+	 * Straight-forward index-level REINDEX is not supported with
+	 * multiple jobs as we cannot control the concurrent
+	 * processing of multiple indexes depending on the same
+	 * relation.  But we can extract the appropriate table name
+	 * for the index and put REINDEX INDEX commands into different
+	 * jobs, according to the parent tables.
+	 *
+	 * We will order the results to group the same tables
+	 * together. We fetch index names as well to build a new list
+	 * of them with matching order.
+	 */
+	appendPQExpBufferStr(&catalog_query,
+							 "SELECT t.relname, n.nspname, i.relname\n"
+							 "FROM pg_catalog.pg_index x\n"
+							 "JOIN pg_catalog.pg_class t ON t.oid = x.indrelid\n"
+							 "JOIN pg_catalog.pg_class i ON i.oid = x.indexrelid\n"
+							 "LEFT JOIN pg_catalog.pg_namespace n ON n.oid = t.relnamespace\n"
+							 "WHERE x.indexrelid OPERATOR(pg_catalog.=) ANY(ARRAY['");
+
+	for (cell = index_list->head; cell; cell = cell->next)
+	{
+		if (cell != index_list->head)
+			appendPQExpBufferStr(&catalog_query, "', '");
+
+			appendQualifiedRelation(&catalog_query, cell->val, conn, echo);
+	}
+
+	/*
+	 * Order tables by the size of its greatest index.  Within the
+	 * table, order indexes by their sizes.
+	 */
+	appendPQExpBufferStr(&catalog_query,
+						 "']::pg_catalog.regclass[])\n"
+						 "ORDER BY max(i.relpages) OVER \n"
+						 "    (PARTITION BY n.nspname, t.relname),\n"
+						 "  n.nspname, t.relname, i.relpages;\n");
+
+	/*
+	 * We're going to re-order the index_list to match the order of
+	 * tables.  So, empty the index_list to fill it from the query
+	 * result.
+	 */
+	simple_string_list_destroy(index_list);
+	index_list->head = index_list->tail = NULL;
+
+	res = executeQuery(conn, catalog_query.data, echo);
+	termPQExpBuffer(&catalog_query);
+
+	/*
+	 * If no rows are returned, there are no matching tables, so we are done.
+	 */
+	ntups = PQntuples(res);
+	if (ntups == 0)
+	{
+		PQclear(res);
+		return;
+	}
+
+	table_list = pg_malloc0(sizeof(SimpleStringList));
+
+	/* Build qualified identifiers for each index */
+	for (int i = 0; i < ntups; i++)
+	{
+		simple_string_list_append(table_list,
+							 fmtQualifiedIdEnc(PQgetvalue(res, i, 1),
+											   PQgetvalue(res, i, 0),
+											   PQclientEncoding(conn)));
+
+		/*
+		 * For index-level REINDEX, rebuild the list of indices to match
+		 * the order of tables list.
+		 */
+		simple_string_list_append(index_list,
+							 fmtQualifiedIdEnc(PQgetvalue(res, i, 1),
+											   PQgetvalue(res, i, 2),
+											   PQclientEncoding(conn)));
+	}
+	PQclear(res);
+}
+
 static void
 reindex_all_databases(ConnParams *cparams,
 					  const char *progname, bool echo, bool quiet, bool verbose,
#9Álvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Ranier Vilela (#8)
Re: Simplify the logic a bit (src/bin/scripts/reindexdb.c)

On 2025-Feb-28, Ranier Vilela wrote:

v2 attached, please comment if you have any further objections.

I think running the tests would have been a good idea, as would checking
for compiler warnings.

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"I love the Postgres community. It's all about doing things _properly_. :-)"
(David Garamond)

#10Ranier Vilela
ranier.vf@gmail.com
In reply to: Álvaro Herrera (#9)
1 attachment(s)
Re: Simplify the logic a bit (src/bin/scripts/reindexdb.c)

Em qui., 6 de mar. de 2025 às 15:51, Álvaro Herrera <alvherre@alvh.no-ip.org>
escreveu:

On 2025-Feb-28, Ranier Vilela wrote:

v2 attached, please comment if you have any further objections.

I think running the tests would have been a good idea, as would checking
for compiler warnings.

Hi.
Rebased for current head.

Thanks for the commiter 24503fa
<http://24503fa95cd8cd57a8d695a910056a1cfcbacefc&gt;

best regards,
Ranier Vilela

Attachments:

v3-simplifies-reindex-one-database-reindexdb.patchapplication/octet-stream; name=v3-simplifies-reindex-one-database-reindexdb.patchDownload
diff --git a/src/bin/scripts/reindexdb.c b/src/bin/scripts/reindexdb.c
index b00c811286..9081646b1c 100644
--- a/src/bin/scripts/reindexdb.c
+++ b/src/bin/scripts/reindexdb.c
@@ -33,10 +33,14 @@ typedef enum ReindexType
 } ReindexType;
 
 
-static SimpleStringList *get_parallel_object_list(PGconn *conn,
+static SimpleStringList *get_parallel_tables_list(PGconn *conn,
 												  ReindexType type,
 												  SimpleStringList *user_list,
 												  bool echo);
+static void get_parallel_tabidx_list(PGconn *conn,
+						 SimpleStringList *index_list,
+						 SimpleStringList *table_list,
+						 bool echo);
 static void reindex_one_database(ConnParams *cparams, ReindexType type,
 								 SimpleStringList *user_list,
 								 const char *progname,
@@ -333,7 +337,7 @@ reindex_one_database(ConnParams *cparams, ReindexType type,
 			case REINDEX_DATABASE:
 
 				/* Build a list of relations from the database */
-				process_list = get_parallel_object_list(conn, process_type,
+				process_list = get_parallel_tables_list(conn, process_type,
 														user_list, echo);
 				process_type = REINDEX_TABLE;
 
@@ -349,7 +353,7 @@ reindex_one_database(ConnParams *cparams, ReindexType type,
 				Assert(user_list != NULL);
 
 				/* Build a list of relations from all the schemas */
-				process_list = get_parallel_object_list(conn, process_type,
+				process_list = get_parallel_tables_list(conn, process_type,
 														user_list, echo);
 				process_type = REINDEX_TABLE;
 
@@ -368,24 +372,16 @@ reindex_one_database(ConnParams *cparams, ReindexType type,
 				 * Build a list of relations from the indices.  This will
 				 * accordingly reorder the list of indices too.
 				 */
-				indices_tables_list = get_parallel_object_list(conn, process_type,
-															   user_list, echo);
+				get_parallel_tabidx_list(conn, user_list,
+													indices_tables_list, echo);
 
-				/*
-				 * Bail out if nothing to process.  'user_list' was modified
-				 * in-place, so check if it has at least one cell.
-				 */
-				if (user_list->head == NULL)
+				/* Bail out if nothing to process */
+				if (indices_tables_list == NULL)
 				{
 					PQfinish(conn);
 					return;
 				}
 
-				/*
-				 * Assuming 'user_list' is not empty, 'indices_tables_list'
-				 * shouldn't be empty as well.
-				 */
-				Assert(indices_tables_list != NULL);
 				indices_tables_cell = indices_tables_list->head;
 
 				break;
@@ -623,7 +619,7 @@ run_reindex_command(PGconn *conn, ReindexType type, const char *name,
 }
 
 /*
- * Prepare the list of objects to process by querying the catalogs.
+ * Prepare the list of tables to process by querying the catalogs.
  *
  * This function will return a SimpleStringList object containing the entire
  * list of tables in the given database that should be processed by a parallel
@@ -631,15 +627,13 @@ run_reindex_command(PGconn *conn, ReindexType type, const char *name,
  * table.
  */
 static SimpleStringList *
-get_parallel_object_list(PGconn *conn, ReindexType type,
+get_parallel_tables_list(PGconn *conn, ReindexType type,
 						 SimpleStringList *user_list, bool echo)
 {
 	PQExpBufferData catalog_query;
-	PQExpBufferData buf;
 	PGresult   *res;
 	SimpleStringList *tables;
-	int			ntups,
-				i;
+	int			ntups;
 
 	initPQExpBuffer(&catalog_query);
 
@@ -668,7 +662,6 @@ get_parallel_object_list(PGconn *conn, ReindexType type,
 		case REINDEX_SCHEMA:
 			{
 				SimpleStringListCell *cell;
-				bool		nsp_listed = false;
 
 				Assert(user_list != NULL);
 
@@ -690,14 +683,10 @@ get_parallel_object_list(PGconn *conn, ReindexType type,
 
 				for (cell = user_list->head; cell; cell = cell->next)
 				{
-					const char *nspname = cell->val;
-
-					if (nsp_listed)
-						appendPQExpBufferStr(&catalog_query, ", ");
-					else
-						nsp_listed = true;
+					if (cell != user_list->head)
+						appendPQExpBufferChar(&catalog_query, ',');
 
-					appendStringLiteralConn(&catalog_query, nspname, conn);
+					appendStringLiteralConn(&catalog_query, cell->val, conn);
 				}
 
 				appendPQExpBufferStr(&catalog_query, ")\n"
@@ -706,59 +695,6 @@ get_parallel_object_list(PGconn *conn, ReindexType type,
 			break;
 
 		case REINDEX_INDEX:
-			{
-				SimpleStringListCell *cell;
-
-				Assert(user_list != NULL);
-
-				/*
-				 * Straight-forward index-level REINDEX is not supported with
-				 * multiple jobs as we cannot control the concurrent
-				 * processing of multiple indexes depending on the same
-				 * relation.  But we can extract the appropriate table name
-				 * for the index and put REINDEX INDEX commands into different
-				 * jobs, according to the parent tables.
-				 *
-				 * We will order the results to group the same tables
-				 * together. We fetch index names as well to build a new list
-				 * of them with matching order.
-				 */
-				appendPQExpBufferStr(&catalog_query,
-									 "SELECT t.relname, n.nspname, i.relname\n"
-									 "FROM pg_catalog.pg_index x\n"
-									 "JOIN pg_catalog.pg_class t ON t.oid = x.indrelid\n"
-									 "JOIN pg_catalog.pg_class i ON i.oid = x.indexrelid\n"
-									 "LEFT JOIN pg_catalog.pg_namespace n ON n.oid = t.relnamespace\n"
-									 "WHERE x.indexrelid OPERATOR(pg_catalog.=) ANY(ARRAY['");
-
-				for (cell = user_list->head; cell; cell = cell->next)
-				{
-					if (cell != user_list->head)
-						appendPQExpBufferStr(&catalog_query, "', '");
-
-					appendQualifiedRelation(&catalog_query, cell->val, conn, echo);
-				}
-
-				/*
-				 * Order tables by the size of its greatest index.  Within the
-				 * table, order indexes by their sizes.
-				 */
-				appendPQExpBufferStr(&catalog_query,
-									 "']::pg_catalog.regclass[])\n"
-									 "ORDER BY max(i.relpages) OVER \n"
-									 "    (PARTITION BY n.nspname, t.relname),\n"
-									 "  n.nspname, t.relname, i.relpages;\n");
-
-				/*
-				 * We're going to re-order the user_list to match the order of
-				 * tables.  So, empty the user_list to fill it from the query
-				 * result.
-				 */
-				simple_string_list_destroy(user_list);
-				user_list->head = user_list->tail = NULL;
-			}
-			break;
-
 		case REINDEX_SYSTEM:
 		case REINDEX_TABLE:
 			Assert(false);
@@ -781,36 +717,125 @@ get_parallel_object_list(PGconn *conn, ReindexType type,
 	tables = pg_malloc0(sizeof(SimpleStringList));
 
 	/* Build qualified identifiers for each table */
-	initPQExpBuffer(&buf);
-	for (i = 0; i < ntups; i++)
+	for (int i = 0; i < ntups; i++)
 	{
-		appendPQExpBufferStr(&buf,
+		simple_string_list_append(tables,
+ 							 fmtQualifiedIdEnc(PQgetvalue(res, i, 1),
+ 											   PQgetvalue(res, i, 0),
+ 											   PQclientEncoding(conn)));
+	}
+	PQclear(res);
+
+	return tables;
+}
+
+/*
+ * Prepare the list of tables to process by querying the catalogs.
+ *
+ * This functions will process two SimpleStringList objects containing the
+ * list of indices and the list of tables that should be processed by a parallel
+ * database-wide reindex (excluding system tables).
+ */
+static void
+get_parallel_tabidx_list(PGconn *conn,
+						 SimpleStringList *index_list,
+						 SimpleStringList *table_list,
+						 bool echo)
+{
+	PQExpBufferData catalog_query;
+	PGresult   *res;
+	SimpleStringListCell *cell;
+	int			ntups;
+
+	Assert(index_list != NULL);
+	Assert(table_list == NULL);
+
+	initPQExpBuffer(&catalog_query);
+
+	/*
+	 * The queries here are using a safe search_path, so there's no need to
+	 * fully qualify everything.
+	 */
+
+	/*
+	 * Straight-forward index-level REINDEX is not supported with
+	 * multiple jobs as we cannot control the concurrent
+	 * processing of multiple indexes depending on the same
+	 * relation.  But we can extract the appropriate table name
+	 * for the index and put REINDEX INDEX commands into different
+	 * jobs, according to the parent tables.
+	 *
+	 * We will order the results to group the same tables
+	 * together. We fetch index names as well to build a new list
+	 * of them with matching order.
+	 */
+	appendPQExpBufferStr(&catalog_query,
+							 "SELECT t.relname, n.nspname, i.relname\n"
+							 "FROM pg_catalog.pg_index x\n"
+							 "JOIN pg_catalog.pg_class t ON t.oid = x.indrelid\n"
+							 "JOIN pg_catalog.pg_class i ON i.oid = x.indexrelid\n"
+							 "LEFT JOIN pg_catalog.pg_namespace n ON n.oid = t.relnamespace\n"
+							 "WHERE x.indexrelid OPERATOR(pg_catalog.=) ANY(ARRAY['");
+
+	for (cell = index_list->head; cell; cell = cell->next)
+	{
+		if (cell != index_list->head)
+			appendPQExpBufferStr(&catalog_query, "', '");
+
+			appendQualifiedRelation(&catalog_query, cell->val, conn, echo);
+	}
+
+	/*
+	 * Order tables by the size of its greatest index.  Within the
+	 * table, order indexes by their sizes.
+	 */
+	appendPQExpBufferStr(&catalog_query,
+						 "']::pg_catalog.regclass[])\n"
+						 "ORDER BY max(i.relpages) OVER \n"
+						 "    (PARTITION BY n.nspname, t.relname),\n"
+						 "  n.nspname, t.relname, i.relpages;\n");
+
+	/*
+	 * We're going to re-order the index_list to match the order of
+	 * tables.  So, empty the index_list to fill it from the query
+	 * result.
+	 */
+	simple_string_list_destroy(index_list);
+	index_list->head = index_list->tail = NULL;
+
+	res = executeQuery(conn, catalog_query.data, echo);
+	termPQExpBuffer(&catalog_query);
+
+	/*
+	 * If no rows are returned, there are no matching tables, so we are done.
+	 */
+	ntups = PQntuples(res);
+	if (ntups == 0)
+	{
+		PQclear(res);
+		return;
+	}
+
+	table_list = pg_malloc0(sizeof(SimpleStringList));
+
+	/* Build qualified identifiers for each index */
+	for (int i = 0; i < ntups; i++)
+	{
+		simple_string_list_append(table_list,
 							 fmtQualifiedIdEnc(PQgetvalue(res, i, 1),
 											   PQgetvalue(res, i, 0),
 											   PQclientEncoding(conn)));
 
-		simple_string_list_append(tables, buf.data);
-		resetPQExpBuffer(&buf);
-
-		if (type == REINDEX_INDEX)
-		{
-			/*
-			 * For index-level REINDEX, rebuild the list of indexes to match
-			 * the order of tables list.
-			 */
-			appendPQExpBufferStr(&buf,
-								 fmtQualifiedIdEnc(PQgetvalue(res, i, 1),
-												   PQgetvalue(res, i, 2),
-												   PQclientEncoding(conn)));
-
-			simple_string_list_append(user_list, buf.data);
-			resetPQExpBuffer(&buf);
-		}
+		/*
+		 * For index-level REINDEX, rebuild the list of indices to match
+		 * the order of tables list.
+		 */
+		simple_string_list_append(index_list,
+							 fmtQualifiedIdEnc(PQgetvalue(res, i, 1),
+											   PQgetvalue(res, i, 2),
+											   PQclientEncoding(conn)));
 	}
-	termPQExpBuffer(&buf);
 	PQclear(res);
-
-	return tables;
 }
 
 static void
#11Álvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Ranier Vilela (#10)
Re: Simplify the logic a bit (src/bin/scripts/reindexdb.c)

Hello

I find that this is still quite broken -- both the original, and your
patch. I have already complained about a fundamental bug in the
original in [1]/messages/by-id/202503071820.j25zn3lo4hvn@alvherre.pgsql. In addition to what I reported there, in the unpatched
code I noticed that we're wasting memory and CPU by comparing the
qualified table name, when it would be faster to just store the table
OID and do the comparison based on that. Because the REINDEX INDEX
query only needs to specify the *index* name, not the table name, we
don't need the table name to be stored in the indices_tables_list: we
can convert it into an OID list and store that instead. The strcmp
becomes a numeric comparison. Yay. This is of course a pretty trivial,
almost meaningless change.

[1]: /messages/by-id/202503071820.j25zn3lo4hvn@alvherre.pgsql

But your patch also makes the mistake that indices_table_list is passed
as a pointer by value, so the list that is filled by
get_parallel_tabidx_list() can never have any effect. I wonder if you
tested this patch at all. With your patch and the sample setup I posted
in [1]/messages/by-id/202503071820.j25zn3lo4hvn@alvherre.pgsql, the program doesn't do anything. It never calls REINDEX at all.
It is a dead program. It's so bogus that in fact, there's no program,
it's just a bug that takes a lot of disk space.

For this to work, we need to pass that list (the output list) as a
pointer reference, so that the caller can _receive_ the output list.

We also get this compiler warning:

/pgsql/source/master/src/bin/scripts/reindexdb.c: In function ‘get_parallel_tabidx_list’:
/pgsql/source/master/src/bin/scripts/reindexdb.c:782:17: warning: this ‘if’ clause does not guard... [-Wmisleading-indentation]
782 | if (cell != index_list->head)
| ^~
/pgsql/source/master/src/bin/scripts/reindexdb.c:785:25: note: ...this statement, but the latter is misleadingly indented as if it were guarded by the ‘if’
785 | appendQualifiedRelation(&catalog_query, cell->val, conn, echo);
| ^~~~~~~~~~~~~~~~~~~~~~~

Not to mention the 'git apply' warnings:

I schmee: master 0$ git apply /tmp/v3-simplifies-reindex-one-database-reindexdb.patch
/tmp/v3-simplifies-reindex-one-database-reindexdb.patch:191: space before tab in indent.
fmtQualifiedIdEnc(PQgetvalue(res, i, 1),
/tmp/v3-simplifies-reindex-one-database-reindexdb.patch:192: space before tab in indent.
PQgetvalue(res, i, 0),
/tmp/v3-simplifies-reindex-one-database-reindexdb.patch:193: space before tab in indent.
PQclientEncoding(conn)));
advertencia: 3 líneas agregan errores de espacios en blanco.

Anyway, my version of this is attached. It fixes the problems with your
patch, but not Orlov's fundamental bug. Without fixing that bug, this
program does not deserve this supposedly parallel mode that doesn't
actually deliver. I wonder if we shouldn't just revert 47f99a407de2
completely.

You, on the other hand, really need to be much more careful with the
patches you submit.

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"Selbst das größte Genie würde nicht weit kommen, wenn es
alles seinem eigenen Innern verdanken wollte." (Johann Wolfgang von Goethe)
Ni aún el genio más grande llegaría muy lejos si
quisiera sacarlo todo de su propio interior.

#12Ranier Vilela
ranier.vf@gmail.com
In reply to: Álvaro Herrera (#11)
Re: Simplify the logic a bit (src/bin/scripts/reindexdb.c)

Em sex., 7 de mar. de 2025 às 15:54, Álvaro Herrera <alvherre@alvh.no-ip.org>
escreveu:

Hello

I find that this is still quite broken -- both the original, and your
patch. I have already complained about a fundamental bug in the
original in [1]. In addition to what I reported there, in the unpatched
code I noticed that we're wasting memory and CPU by comparing the
qualified table name, when it would be faster to just store the table
OID and do the comparison based on that. Because the REINDEX INDEX
query only needs to specify the *index* name, not the table name, we
don't need the table name to be stored in the indices_tables_list: we
can convert it into an OID list and store that instead. The strcmp
becomes a numeric comparison. Yay. This is of course a pretty trivial,
almost meaningless change.

[1] /messages/by-id/202503071820.j25zn3lo4hvn@alvherre.pgsql

But your patch also makes the mistake that indices_table_list is passed
as a pointer by value, so the list that is filled by
get_parallel_tabidx_list() can never have any effect. I wonder if you
tested this patch at all. With your patch and the sample setup I posted
in [1], the program doesn't do anything. It never calls REINDEX at all.
It is a dead program. It's so bogus that in fact, there's no program,
it's just a bug that takes a lot of disk space.

For this to work, we need to pass that list (the output list) as a
pointer reference, so that the caller can _receive_ the output list.

We also get this compiler warning:

/pgsql/source/master/src/bin/scripts/reindexdb.c: In function
‘get_parallel_tabidx_list’:
/pgsql/source/master/src/bin/scripts/reindexdb.c:782:17: warning: this
‘if’ clause does not guard... [-Wmisleading-indentation]
782 | if (cell != index_list->head)
| ^~
/pgsql/source/master/src/bin/scripts/reindexdb.c:785:25: note: ...this
statement, but the latter is misleadingly indented as if it were guarded by
the ‘if’
785 | appendQualifiedRelation(&catalog_query,
cell->val, conn, echo);
| ^~~~~~~~~~~~~~~~~~~~~~~

Not to mention the 'git apply' warnings:

I schmee: master 0$ git apply
/tmp/v3-simplifies-reindex-one-database-reindexdb.patch
/tmp/v3-simplifies-reindex-one-database-reindexdb.patch:191: space before
tab in indent.

fmtQualifiedIdEnc(PQgetvalue(res, i, 1),
/tmp/v3-simplifies-reindex-one-database-reindexdb.patch:192: space before
tab in indent.

PQgetvalue(res, i, 0),
/tmp/v3-simplifies-reindex-one-database-reindexdb.patch:193: space before
tab in indent.

PQclientEncoding(conn)));
advertencia: 3 líneas agregan errores de espacios en blanco.

Anyway, my version of this is attached. It fixes the problems with your
patch, but not Orlov's fundamental bug. Without fixing that bug, this
program does not deserve this supposedly parallel mode that doesn't
actually deliver. I wonder if we shouldn't just revert 47f99a407de2
completely.

You, on the other hand, really need to be much more careful with the
patches you submit.

Yes of course, thank you for making the necessary corrections.

best regards,
Ranier Vilela

#13Álvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Álvaro Herrera (#11)
1 attachment(s)
Re: Simplify the logic a bit (src/bin/scripts/reindexdb.c)

On 2025-Mar-07, Álvaro Herrera wrote:

Anyway, my version of this is attached. It fixes the problems with your
patch, but not Orlov's fundamental bug.

And of course I forgot to actually attach the patch. Good grief.

--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/
"No deja de ser humillante para una persona de ingenio saber
que no hay tonto que no le pueda enseñar algo." (Jean B. Say)

Attachments:

v4-try-but-fail-to-unbreak-reindexdb.patchtext/x-diff; charset=utf-8Download
diff --git a/src/bin/scripts/reindexdb.c b/src/bin/scripts/reindexdb.c
index b00c8112869..c55742daf81 100644
--- a/src/bin/scripts/reindexdb.c
+++ b/src/bin/scripts/reindexdb.c
@@ -12,6 +12,7 @@
 #include "postgres_fe.h"
 
 #include <limits.h>
+#include <stdlib.h>
 
 #include "catalog/pg_class_d.h"
 #include "common.h"
@@ -33,10 +34,14 @@ typedef enum ReindexType
 } ReindexType;
 
 
-static SimpleStringList *get_parallel_object_list(PGconn *conn,
+static SimpleStringList *get_parallel_tables_list(PGconn *conn,
 												  ReindexType type,
 												  SimpleStringList *user_list,
 												  bool echo);
+static void get_parallel_tabidx_list(PGconn *conn,
+									 SimpleStringList *index_list,
+									 SimpleOidList **table_list,
+									 bool echo);
 static void reindex_one_database(ConnParams *cparams, ReindexType type,
 								 SimpleStringList *user_list,
 								 const char *progname,
@@ -276,15 +281,15 @@ reindex_one_database(ConnParams *cparams, ReindexType type,
 {
 	PGconn	   *conn;
 	SimpleStringListCell *cell;
-	SimpleStringListCell *indices_tables_cell = NULL;
+	SimpleOidListCell *indices_tables_cell = NULL;
 	bool		parallel = concurrentCons > 1;
-	SimpleStringList *process_list = user_list;
-	SimpleStringList *indices_tables_list = NULL;
+	SimpleStringList *process_list;
+	SimpleOidList *tableoid_list = NULL;
 	ReindexType process_type = type;
 	ParallelSlotArray *sa;
 	bool		failed = false;
-	int			items_count = 0;
-	char	   *prev_index_table_name = NULL;
+	int			items_count;
+	Oid			prev_index_table_oid = InvalidOid;
 	ParallelSlot *free_slot = NULL;
 
 	conn = connectDatabase(cparams, progname, echo, false, true);
@@ -322,6 +327,7 @@ reindex_one_database(ConnParams *cparams, ReindexType type,
 			case REINDEX_INDEX:
 			case REINDEX_SCHEMA:
 			case REINDEX_TABLE:
+				process_list = user_list;
 				Assert(user_list != NULL);
 				break;
 		}
@@ -330,26 +336,14 @@ reindex_one_database(ConnParams *cparams, ReindexType type,
 	{
 		switch (process_type)
 		{
-			case REINDEX_DATABASE:
-
-				/* Build a list of relations from the database */
-				process_list = get_parallel_object_list(conn, process_type,
-														user_list, echo);
-				process_type = REINDEX_TABLE;
-
-				/* Bail out if nothing to process */
-				if (process_list == NULL)
-				{
-					PQfinish(conn);
-					return;
-				}
-				break;
-
 			case REINDEX_SCHEMA:
 				Assert(user_list != NULL);
+				/* fall through */
 
-				/* Build a list of relations from all the schemas */
-				process_list = get_parallel_object_list(conn, process_type,
+			case REINDEX_DATABASE:
+
+				/* Build a list of relations from the given list */
+				process_list = get_parallel_tables_list(conn, process_type,
 														user_list, echo);
 				process_type = REINDEX_TABLE;
 
@@ -362,45 +356,34 @@ reindex_one_database(ConnParams *cparams, ReindexType type,
 				break;
 
 			case REINDEX_INDEX:
-				Assert(user_list != NULL);
 
 				/*
-				 * Build a list of relations from the indices.  This will
-				 * accordingly reorder the list of indices too.
+				 * Generate a list of indexes and a matching list of table
+				 * OIDs, based on the user-specified index names.
 				 */
-				indices_tables_list = get_parallel_object_list(conn, process_type,
-															   user_list, echo);
+				get_parallel_tabidx_list(conn, user_list, &tableoid_list,
+										 echo);
 
-				/*
-				 * Bail out if nothing to process.  'user_list' was modified
-				 * in-place, so check if it has at least one cell.
-				 */
-				if (user_list->head == NULL)
+				/* Bail out if nothing to process */
+				if (tableoid_list == NULL)
 				{
 					PQfinish(conn);
 					return;
 				}
 
-				/*
-				 * Assuming 'user_list' is not empty, 'indices_tables_list'
-				 * shouldn't be empty as well.
-				 */
-				Assert(indices_tables_list != NULL);
-				indices_tables_cell = indices_tables_list->head;
+				indices_tables_cell = tableoid_list->head;
+				process_list = user_list;
 
 				break;
 
 			case REINDEX_SYSTEM:
 				/* not supported */
+				process_list = NULL;
 				Assert(false);
 				break;
 
 			case REINDEX_TABLE:
-
-				/*
-				 * Fall through.  The list of items for tables is already
-				 * created.
-				 */
+				process_list = user_list;
 				break;
 		}
 	}
@@ -410,6 +393,7 @@ reindex_one_database(ConnParams *cparams, ReindexType type,
 	 * the list.  We choose the minimum between the number of concurrent
 	 * connections and the number of items in the list.
 	 */
+	items_count = 0;
 	for (cell = process_list->head; cell; cell = cell->next)
 	{
 		items_count++;
@@ -442,15 +426,15 @@ reindex_one_database(ConnParams *cparams, ReindexType type,
 		/*
 		 * For parallel index-level REINDEX, the indices of the same table are
 		 * ordered together and they are to be processed by the same job.  So,
-		 * we don't switch the job as soon as the index belongs to the same
-		 * table as the previous one.
+		 * we don't switch parallel slot until the index belongs to a
+		 * different table as the previous one.
 		 */
 		if (parallel && process_type == REINDEX_INDEX)
 		{
-			if (prev_index_table_name != NULL &&
-				strcmp(prev_index_table_name, indices_tables_cell->val) == 0)
+			if (prev_index_table_oid != InvalidOid &&
+				prev_index_table_oid == indices_tables_cell->val)
 				need_new_slot = false;
-			prev_index_table_name = indices_tables_cell->val;
+			prev_index_table_oid = indices_tables_cell->val;
 			indices_tables_cell = indices_tables_cell->next;
 		}
 
@@ -482,10 +466,10 @@ finish:
 		pg_free(process_list);
 	}
 
-	if (indices_tables_list)
+	if (tableoid_list)
 	{
-		simple_string_list_destroy(indices_tables_list);
-		pg_free(indices_tables_list);
+		simple_oid_list_destroy(tableoid_list);
+		pg_free(tableoid_list);
 	}
 
 	ParallelSlotsTerminate(sa);
@@ -623,7 +607,7 @@ run_reindex_command(PGconn *conn, ReindexType type, const char *name,
 }
 
 /*
- * Prepare the list of objects to process by querying the catalogs.
+ * Prepare the list of tables to process by querying the catalogs.
  *
  * This function will return a SimpleStringList object containing the entire
  * list of tables in the given database that should be processed by a parallel
@@ -631,15 +615,13 @@ run_reindex_command(PGconn *conn, ReindexType type, const char *name,
  * table.
  */
 static SimpleStringList *
-get_parallel_object_list(PGconn *conn, ReindexType type,
+get_parallel_tables_list(PGconn *conn, ReindexType type,
 						 SimpleStringList *user_list, bool echo)
 {
 	PQExpBufferData catalog_query;
-	PQExpBufferData buf;
 	PGresult   *res;
 	SimpleStringList *tables;
-	int			ntups,
-				i;
+	int			ntups;
 
 	initPQExpBuffer(&catalog_query);
 
@@ -668,7 +650,6 @@ get_parallel_object_list(PGconn *conn, ReindexType type,
 		case REINDEX_SCHEMA:
 			{
 				SimpleStringListCell *cell;
-				bool		nsp_listed = false;
 
 				Assert(user_list != NULL);
 
@@ -690,14 +671,10 @@ get_parallel_object_list(PGconn *conn, ReindexType type,
 
 				for (cell = user_list->head; cell; cell = cell->next)
 				{
-					const char *nspname = cell->val;
+					if (cell != user_list->head)
+						appendPQExpBufferChar(&catalog_query, ',');
 
-					if (nsp_listed)
-						appendPQExpBufferStr(&catalog_query, ", ");
-					else
-						nsp_listed = true;
-
-					appendStringLiteralConn(&catalog_query, nspname, conn);
+					appendStringLiteralConn(&catalog_query, cell->val, conn);
 				}
 
 				appendPQExpBufferStr(&catalog_query, ")\n"
@@ -706,59 +683,6 @@ get_parallel_object_list(PGconn *conn, ReindexType type,
 			break;
 
 		case REINDEX_INDEX:
-			{
-				SimpleStringListCell *cell;
-
-				Assert(user_list != NULL);
-
-				/*
-				 * Straight-forward index-level REINDEX is not supported with
-				 * multiple jobs as we cannot control the concurrent
-				 * processing of multiple indexes depending on the same
-				 * relation.  But we can extract the appropriate table name
-				 * for the index and put REINDEX INDEX commands into different
-				 * jobs, according to the parent tables.
-				 *
-				 * We will order the results to group the same tables
-				 * together. We fetch index names as well to build a new list
-				 * of them with matching order.
-				 */
-				appendPQExpBufferStr(&catalog_query,
-									 "SELECT t.relname, n.nspname, i.relname\n"
-									 "FROM pg_catalog.pg_index x\n"
-									 "JOIN pg_catalog.pg_class t ON t.oid = x.indrelid\n"
-									 "JOIN pg_catalog.pg_class i ON i.oid = x.indexrelid\n"
-									 "LEFT JOIN pg_catalog.pg_namespace n ON n.oid = t.relnamespace\n"
-									 "WHERE x.indexrelid OPERATOR(pg_catalog.=) ANY(ARRAY['");
-
-				for (cell = user_list->head; cell; cell = cell->next)
-				{
-					if (cell != user_list->head)
-						appendPQExpBufferStr(&catalog_query, "', '");
-
-					appendQualifiedRelation(&catalog_query, cell->val, conn, echo);
-				}
-
-				/*
-				 * Order tables by the size of its greatest index.  Within the
-				 * table, order indexes by their sizes.
-				 */
-				appendPQExpBufferStr(&catalog_query,
-									 "']::pg_catalog.regclass[])\n"
-									 "ORDER BY max(i.relpages) OVER \n"
-									 "    (PARTITION BY n.nspname, t.relname),\n"
-									 "  n.nspname, t.relname, i.relpages;\n");
-
-				/*
-				 * We're going to re-order the user_list to match the order of
-				 * tables.  So, empty the user_list to fill it from the query
-				 * result.
-				 */
-				simple_string_list_destroy(user_list);
-				user_list->head = user_list->tail = NULL;
-			}
-			break;
-
 		case REINDEX_SYSTEM:
 		case REINDEX_TABLE:
 			Assert(false);
@@ -781,38 +705,119 @@ get_parallel_object_list(PGconn *conn, ReindexType type,
 	tables = pg_malloc0(sizeof(SimpleStringList));
 
 	/* Build qualified identifiers for each table */
-	initPQExpBuffer(&buf);
-	for (i = 0; i < ntups; i++)
+	for (int i = 0; i < ntups; i++)
 	{
-		appendPQExpBufferStr(&buf,
-							 fmtQualifiedIdEnc(PQgetvalue(res, i, 1),
-											   PQgetvalue(res, i, 0),
-											   PQclientEncoding(conn)));
-
-		simple_string_list_append(tables, buf.data);
-		resetPQExpBuffer(&buf);
-
-		if (type == REINDEX_INDEX)
-		{
-			/*
-			 * For index-level REINDEX, rebuild the list of indexes to match
-			 * the order of tables list.
-			 */
-			appendPQExpBufferStr(&buf,
-								 fmtQualifiedIdEnc(PQgetvalue(res, i, 1),
-												   PQgetvalue(res, i, 2),
-												   PQclientEncoding(conn)));
-
-			simple_string_list_append(user_list, buf.data);
-			resetPQExpBuffer(&buf);
-		}
+		simple_string_list_append(tables,
+								  fmtQualifiedIdEnc(PQgetvalue(res, i, 1),
+													PQgetvalue(res, i, 0),
+													PQclientEncoding(conn)));
 	}
-	termPQExpBuffer(&buf);
 	PQclear(res);
 
 	return tables;
 }
 
+/*
+ * Given a user-specified list of indexes, prepare a matching list
+ * indexes to process, and also a matching list of table OIDs to which each
+ * index belongs.  The latter is needed to avoid scheduling two parallel tasks
+ * with concurrent reindexing of indexes on the same table.
+ *
+ * On input, index_list is the user-specified index list.  table_list is an
+ * output argument which is filled with a list of the tables to process; on
+ * output, index_list is a matching reordered list of indexes.  Caller is
+ * supposed to walk both lists in unison.  Both pointers will be NULL if
+ * there's nothing to process.
+ */
+static void
+get_parallel_tabidx_list(PGconn *conn,
+						 SimpleStringList *index_list,
+						 SimpleOidList **table_list,
+						 bool echo)
+{
+	PQExpBufferData catalog_query;
+	PGresult   *res;
+	SimpleStringListCell *cell;
+	int			ntups;
+
+	Assert(index_list != NULL);
+
+	initPQExpBuffer(&catalog_query);
+
+	/*
+	 * The queries here are using a safe search_path, so there's no need to
+	 * fully qualify everything.
+	 */
+
+	/*
+	 * We cannot use REINDEX in parallel in a straightforward way, because
+	 * we'd be unable to control concurrent processing of multiple indexes on
+	 * the same table.  But we can extract the table OID together with each
+	 * indexes, to allow scheduling each REINDEX INDEX command on a separate
+	 * job according to the owning table.
+	 *
+	 * We order the results to group all indexes of each table together.
+	 */
+	appendPQExpBufferStr(&catalog_query,
+						 "SELECT x.indrelid, n.nspname, i.relname\n"
+						 "FROM pg_catalog.pg_index x\n"
+						 "JOIN pg_catalog.pg_class i ON i.oid = x.indexrelid\n"
+						 "LEFT JOIN pg_catalog.pg_namespace n ON n.oid = i.relnamespace\n"
+						 "WHERE x.indexrelid = ANY(ARRAY['");
+
+	for (cell = index_list->head; cell; cell = cell->next)
+	{
+		if (cell != index_list->head)
+			appendPQExpBufferStr(&catalog_query, "', '");
+
+		appendQualifiedRelation(&catalog_query, cell->val, conn, echo);
+	}
+
+	/*
+	 * Order tables by the size of its greatest index.  Within each table,
+	 * order indexes by their sizes.
+	 */
+	appendPQExpBufferStr(&catalog_query,
+						 "']::pg_catalog.regclass[])\n"
+						 "ORDER BY max(i.relpages) OVER \n"
+						 "    (PARTITION BY x.indrelid),\n"
+						 "  x.indrelid, i.relpages;\n");
+
+	/* Empty the original index_list to fill it from the query result. */
+	simple_string_list_destroy(index_list);
+	index_list->head = index_list->tail = NULL;
+
+	res = executeQuery(conn, catalog_query.data, echo);
+	termPQExpBuffer(&catalog_query);
+
+	/*
+	 * If no rows are returned, there are no matching tables, so we are done.
+	 */
+	ntups = PQntuples(res);
+	if (ntups == 0)
+	{
+		PQclear(res);
+		return;
+	}
+
+	*table_list = pg_malloc0(sizeof(SimpleOidList));
+
+	/*
+	 * Build two lists, one with table OIDs and the other with fully-qualified
+	 * index names.
+	 */
+	for (int i = 0; i < ntups; i++)
+	{
+		simple_oid_list_append(*table_list, atooid(PQgetvalue(res, i, 0)));
+		simple_string_list_append(index_list,
+								  fmtQualifiedIdEnc(PQgetvalue(res, i, 1),
+													PQgetvalue(res, i, 2),
+													PQclientEncoding(conn)));
+	}
+
+	PQclear(res);
+}
+
 static void
 reindex_all_databases(ConnParams *cparams,
 					  const char *progname, bool echo, bool quiet, bool verbose,
#14Ranier Vilela
ranier.vf@gmail.com
In reply to: Álvaro Herrera (#13)
Re: Simplify the logic a bit (src/bin/scripts/reindexdb.c)

Em sex., 7 de mar. de 2025 às 16:01, Álvaro Herrera <alvherre@alvh.no-ip.org>
escreveu:

On 2025-Mar-07, Álvaro Herrera wrote:

Anyway, my version of this is attached. It fixes the problems with your
patch, but not Orlov's fundamental bug.

And of course I forgot to actually attach the patch. Good grief.

Test with your v4 patch, on Windows 64 bits.

results:
reindexdb -U postgres -d postgres -j4 --echo -i foo1 -i foo2 -i bar1 -i
bar2 -i baz1 -i baz2 -i baz3 -i baz4

Cut to show only REINDEX (order) cmds:

RESET search_path;
SELECT c.relname, ns.nspname
FROM pg_catalog.pg_class c, pg_catalog.pg_namespace ns
WHERE c.relnamespace OPERATOR(pg_catalog.=) ns.oid
AND c.oid OPERATOR(pg_catalog.=) 'public.baz4'::pg_catalog.regclass;
SELECT pg_catalog.set_config('search_path', '', false);
REINDEX INDEX public.baz4;
RESET search_path;
SELECT c.relname, ns.nspname
FROM pg_catalog.pg_class c, pg_catalog.pg_namespace ns
WHERE c.relnamespace OPERATOR(pg_catalog.=) ns.oid
AND c.oid OPERATOR(pg_catalog.=) 'public.baz1'::pg_catalog.regclass;
SELECT pg_catalog.set_config('search_path', '', false);
REINDEX INDEX public.baz1;
RESET search_path;
SELECT c.relname, ns.nspname
FROM pg_catalog.pg_class c, pg_catalog.pg_namespace ns
WHERE c.relnamespace OPERATOR(pg_catalog.=) ns.oid
AND c.oid OPERATOR(pg_catalog.=) 'public.baz2'::pg_catalog.regclass;
SELECT pg_catalog.set_config('search_path', '', false);
REINDEX INDEX public.baz2;
RESET search_path;
SELECT c.relname, ns.nspname
FROM pg_catalog.pg_class c, pg_catalog.pg_namespace ns
WHERE c.relnamespace OPERATOR(pg_catalog.=) ns.oid
AND c.oid OPERATOR(pg_catalog.=) 'public.baz3'::pg_catalog.regclass;
SELECT pg_catalog.set_config('search_path', '', false);
REINDEX INDEX public.baz3;
SELECT pg_catalog.set_config('search_path', '', false);
RESET search_path;
SELECT c.relname, ns.nspname
FROM pg_catalog.pg_class c, pg_catalog.pg_namespace ns
WHERE c.relnamespace OPERATOR(pg_catalog.=) ns.oid
AND c.oid OPERATOR(pg_catalog.=) 'public.foo1'::pg_catalog.regclass;
SELECT pg_catalog.set_config('search_path', '', false);
REINDEX INDEX public.foo1;
RESET search_path;
SELECT c.relname, ns.nspname
FROM pg_catalog.pg_class c, pg_catalog.pg_namespace ns
WHERE c.relnamespace OPERATOR(pg_catalog.=) ns.oid
AND c.oid OPERATOR(pg_catalog.=) 'public.foo2'::pg_catalog.regclass;
SELECT pg_catalog.set_config('search_path', '', false);
REINDEX INDEX public.foo2;
SELECT pg_catalog.set_config('search_path', '', false);
RESET search_path;
SELECT c.relname, ns.nspname
FROM pg_catalog.pg_class c, pg_catalog.pg_namespace ns
WHERE c.relnamespace OPERATOR(pg_catalog.=) ns.oid
AND c.oid OPERATOR(pg_catalog.=) 'public.bar1'::pg_catalog.regclass;
SELECT pg_catalog.set_config('search_path', '', false);
REINDEX INDEX public.bar1;
RESET search_path;
SELECT c.relname, ns.nspname
FROM pg_catalog.pg_class c, pg_catalog.pg_namespace ns
WHERE c.relnamespace OPERATOR(pg_catalog.=) ns.oid
AND c.oid OPERATOR(pg_catalog.=) 'public.bar2'::pg_catalog.regclass;
SELECT pg_catalog.set_config('search_path', '', false);
REINDEX INDEX public.bar2;

best regards,
Ranier Vilela

#15Ranier Vilela
ranier.vf@gmail.com
In reply to: Ranier Vilela (#14)
1 attachment(s)
Re: Simplify the logic a bit (src/bin/scripts/reindexdb.c)

Em sex., 7 de mar. de 2025 às 16:14, Ranier Vilela <ranier.vf@gmail.com>
escreveu:

Em sex., 7 de mar. de 2025 às 16:01, Álvaro Herrera <
alvherre@alvh.no-ip.org> escreveu:

On 2025-Mar-07, Álvaro Herrera wrote:

Anyway, my version of this is attached. It fixes the problems with your
patch, but not Orlov's fundamental bug.

And of course I forgot to actually attach the patch. Good grief.

Test with your v4 patch, on Windows 64 bits.

Rebased against 682c5be
<https://github.com/postgres/postgres/commit/682c5be25c28192c56e9d5e2a9ca14673a2fcf4b&gt;

Tests:
reindexdb -U postgres -d postgres -j4 --echo -i foo1 -i foo2 -i bar1 -i
bar2 -i baz1 -i baz2 -i baz3 -i baz4 | grep REINDEX
File STDIN:
REINDEX INDEX public.baz4;
REINDEX INDEX public.baz1;
REINDEX INDEX public.baz2;
REINDEX INDEX public.baz3;
REINDEX INDEX public.foo1;
REINDEX INDEX public.foo2;
REINDEX INDEX public.bar1;
REINDEX INDEX public.bar2;

best regards,
Ranier Vilela

Attachments:

v5-try-but-fail-to-unbreak-reindexdb.patchapplication/octet-stream; name=v5-try-but-fail-to-unbreak-reindexdb.patchDownload
diff --git a/src/bin/scripts/reindexdb.c b/src/bin/scripts/reindexdb.c
index 9fe03ab8fa..4eb230b790 100644
--- a/src/bin/scripts/reindexdb.c
+++ b/src/bin/scripts/reindexdb.c
@@ -12,6 +12,7 @@
 #include "postgres_fe.h"
 
 #include <limits.h>
+#include <stdlib.h>
 
 #include "catalog/pg_class_d.h"
 #include "common.h"
@@ -33,10 +34,14 @@ typedef enum ReindexType
 } ReindexType;
 
 
-static SimpleStringList *get_parallel_object_list(PGconn *conn,
+static SimpleStringList *get_parallel_tables_list(PGconn *conn,
 												  ReindexType type,
 												  SimpleStringList *user_list,
 												  bool echo);
+static void get_parallel_tabidx_list(PGconn *conn,
+									 SimpleStringList *index_list,
+									 SimpleOidList **table_list,
+									 bool echo);
 static void reindex_one_database(ConnParams *cparams, ReindexType type,
 								 SimpleStringList *user_list,
 								 const char *progname,
@@ -279,13 +284,14 @@ reindex_one_database(ConnParams *cparams, ReindexType type,
 {
 	PGconn	   *conn;
 	SimpleStringListCell *cell;
-	SimpleStringListCell *indices_tables_cell = NULL;
+	SimpleOidListCell *indices_tables_cell = NULL;
 	bool		parallel = concurrentCons > 1;
-	SimpleStringList *process_list = user_list;
-	SimpleStringList *indices_tables_list = NULL;
+	SimpleStringList *process_list;
+	SimpleOidList *tableoid_list = NULL;
 	ReindexType process_type = type;
 	ParallelSlotArray *sa;
 	bool		failed = false;
+	Oid			prev_index_table_oid = InvalidOid;
 	int			items_count = 0;
 	ParallelSlot *free_slot = NULL;
 
@@ -324,6 +330,7 @@ reindex_one_database(ConnParams *cparams, ReindexType type,
 			case REINDEX_INDEX:
 			case REINDEX_SCHEMA:
 			case REINDEX_TABLE:
+				process_list = user_list;			
 				Assert(user_list != NULL);
 				break;
 		}
@@ -332,26 +339,14 @@ reindex_one_database(ConnParams *cparams, ReindexType type,
 	{
 		switch (process_type)
 		{
-			case REINDEX_DATABASE:
-
-				/* Build a list of relations from the database */
-				process_list = get_parallel_object_list(conn, process_type,
-														user_list, echo);
-				process_type = REINDEX_TABLE;
-
-				/* Bail out if nothing to process */
-				if (process_list == NULL)
-				{
-					PQfinish(conn);
-					return;
-				}
-				break;
-
 			case REINDEX_SCHEMA:
 				Assert(user_list != NULL);
+				/* fall through */
 
-				/* Build a list of relations from all the schemas */
-				process_list = get_parallel_object_list(conn, process_type,
+			case REINDEX_DATABASE:
+
+				/* Build a list of relations from the database */
+				process_list = get_parallel_tables_list(conn, process_type,
 														user_list, echo);
 				process_type = REINDEX_TABLE;
 
@@ -367,42 +362,31 @@ reindex_one_database(ConnParams *cparams, ReindexType type,
 				Assert(user_list != NULL);
 
 				/*
-				 * Build a list of relations from the indices.  This will
-				 * accordingly reorder the list of indices too.
+				 * Generate a list of indexes and a matching list of table
+				 * OIDs, based on the user-specified index names.
 				 */
-				indices_tables_list = get_parallel_object_list(conn, process_type,
-															   user_list, echo);
+				get_parallel_tabidx_list(conn, user_list, &tableoid_list,
+										 echo);
 
-				/*
-				 * Bail out if nothing to process.  'user_list' was modified
-				 * in-place, so check if it has at least one cell.
-				 */
-				if (user_list->head == NULL)
+				/* Bail out if nothing to process */
+				if (tableoid_list == NULL)
 				{
 					PQfinish(conn);
 					return;
 				}
 
-				/*
-				 * Assuming 'user_list' is not empty, 'indices_tables_list'
-				 * shouldn't be empty as well.
-				 */
-				Assert(indices_tables_list != NULL);
-				indices_tables_cell = indices_tables_list->head;
-
+				indices_tables_cell = tableoid_list->head;
+				process_list = user_list;
 				break;
 
 			case REINDEX_SYSTEM:
 				/* not supported */
+				process_list = NULL;
 				Assert(false);
 				break;
 
 			case REINDEX_TABLE:
-
-				/*
-				 * Fall through.  The list of items for tables is already
-				 * created.
-				 */
+				process_list = user_list;
 				break;
 		}
 	}
@@ -412,6 +396,7 @@ reindex_one_database(ConnParams *cparams, ReindexType type,
 	 * the list.  We choose the minimum between the number of concurrent
 	 * connections and the number of items in the list.
 	 */
+	items_count = 0;
 	for (cell = process_list->head; cell; cell = cell->next)
 	{
 		items_count++;
@@ -461,7 +446,7 @@ reindex_one_database(ConnParams *cparams, ReindexType type,
 			gen_reindex_command(free_slot->connection, process_type, objname,
 								echo, verbose, concurrently, tablespace, &sql);
 			while (indices_tables_cell->next &&
-				   strcmp(indices_tables_cell->val, indices_tables_cell->next->val) == 0)
+				   indices_tables_cell->val == indices_tables_cell->next->val)
 			{
 				indices_tables_cell = indices_tables_cell->next;
 				cell = cell->next;
@@ -494,10 +479,10 @@ finish:
 		pg_free(process_list);
 	}
 
-	if (indices_tables_list)
+	if (tableoid_list)
 	{
-		simple_string_list_destroy(indices_tables_list);
-		pg_free(indices_tables_list);
+		simple_oid_list_destroy(tableoid_list);
+		pg_free(tableoid_list);
 	}
 
 	ParallelSlotsTerminate(sa);
@@ -634,7 +619,7 @@ run_reindex_command(PGconn *conn, ReindexType type, const char *name,
 }
 
 /*
- * Prepare the list of objects to process by querying the catalogs.
+ * Prepare the list of tables to process by querying the catalogs.
  *
  * This function will return a SimpleStringList object containing the entire
  * list of tables in the given database that should be processed by a parallel
@@ -642,15 +627,13 @@ run_reindex_command(PGconn *conn, ReindexType type, const char *name,
  * table.
  */
 static SimpleStringList *
-get_parallel_object_list(PGconn *conn, ReindexType type,
+get_parallel_tables_list(PGconn *conn, ReindexType type,
 						 SimpleStringList *user_list, bool echo)
 {
 	PQExpBufferData catalog_query;
-	PQExpBufferData buf;
 	PGresult   *res;
 	SimpleStringList *tables;
-	int			ntups,
-				i;
+	int			ntups;
 
 	initPQExpBuffer(&catalog_query);
 
@@ -679,7 +662,6 @@ get_parallel_object_list(PGconn *conn, ReindexType type,
 		case REINDEX_SCHEMA:
 			{
 				SimpleStringListCell *cell;
-				bool		nsp_listed = false;
 
 				Assert(user_list != NULL);
 
@@ -701,14 +683,10 @@ get_parallel_object_list(PGconn *conn, ReindexType type,
 
 				for (cell = user_list->head; cell; cell = cell->next)
 				{
-					const char *nspname = cell->val;
-
-					if (nsp_listed)
-						appendPQExpBufferStr(&catalog_query, ", ");
-					else
-						nsp_listed = true;
+					if (cell != user_list->head)
+						appendPQExpBufferChar(&catalog_query, ',');
 
-					appendStringLiteralConn(&catalog_query, nspname, conn);
+					appendStringLiteralConn(&catalog_query, cell->val, conn);
 				}
 
 				appendPQExpBufferStr(&catalog_query, ")\n"
@@ -717,59 +695,6 @@ get_parallel_object_list(PGconn *conn, ReindexType type,
 			break;
 
 		case REINDEX_INDEX:
-			{
-				SimpleStringListCell *cell;
-
-				Assert(user_list != NULL);
-
-				/*
-				 * Straight-forward index-level REINDEX is not supported with
-				 * multiple jobs as we cannot control the concurrent
-				 * processing of multiple indexes depending on the same
-				 * relation.  But we can extract the appropriate table name
-				 * for the index and put REINDEX INDEX commands into different
-				 * jobs, according to the parent tables.
-				 *
-				 * We will order the results to group the same tables
-				 * together. We fetch index names as well to build a new list
-				 * of them with matching order.
-				 */
-				appendPQExpBufferStr(&catalog_query,
-									 "SELECT t.relname, n.nspname, i.relname\n"
-									 "FROM pg_catalog.pg_index x\n"
-									 "JOIN pg_catalog.pg_class t ON t.oid = x.indrelid\n"
-									 "JOIN pg_catalog.pg_class i ON i.oid = x.indexrelid\n"
-									 "LEFT JOIN pg_catalog.pg_namespace n ON n.oid = t.relnamespace\n"
-									 "WHERE x.indexrelid OPERATOR(pg_catalog.=) ANY(ARRAY['");
-
-				for (cell = user_list->head; cell; cell = cell->next)
-				{
-					if (cell != user_list->head)
-						appendPQExpBufferStr(&catalog_query, "', '");
-
-					appendQualifiedRelation(&catalog_query, cell->val, conn, echo);
-				}
-
-				/*
-				 * Order tables by the size of its greatest index.  Within the
-				 * table, order indexes by their sizes.
-				 */
-				appendPQExpBufferStr(&catalog_query,
-									 "']::pg_catalog.regclass[])\n"
-									 "ORDER BY max(i.relpages) OVER \n"
-									 "    (PARTITION BY n.nspname, t.relname),\n"
-									 "  n.nspname, t.relname, i.relpages;\n");
-
-				/*
-				 * We're going to re-order the user_list to match the order of
-				 * tables.  So, empty the user_list to fill it from the query
-				 * result.
-				 */
-				simple_string_list_destroy(user_list);
-				user_list->head = user_list->tail = NULL;
-			}
-			break;
-
 		case REINDEX_SYSTEM:
 		case REINDEX_TABLE:
 			Assert(false);
@@ -792,36 +717,117 @@ get_parallel_object_list(PGconn *conn, ReindexType type,
 	tables = pg_malloc0(sizeof(SimpleStringList));
 
 	/* Build qualified identifiers for each table */
-	initPQExpBuffer(&buf);
-	for (i = 0; i < ntups; i++)
+	for (int i = 0; i < ntups; i++)
 	{
-		appendPQExpBufferStr(&buf,
-							 fmtQualifiedIdEnc(PQgetvalue(res, i, 1),
-											   PQgetvalue(res, i, 0),
-											   PQclientEncoding(conn)));
+		simple_string_list_append(tables,
+								  fmtQualifiedIdEnc(PQgetvalue(res, i, 1),
+													PQgetvalue(res, i, 0),
+													PQclientEncoding(conn)));
+	}
+	PQclear(res);
 
-		simple_string_list_append(tables, buf.data);
-		resetPQExpBuffer(&buf);
+	return tables;
+}
 
-		if (type == REINDEX_INDEX)
-		{
-			/*
-			 * For index-level REINDEX, rebuild the list of indexes to match
-			 * the order of tables list.
-			 */
-			appendPQExpBufferStr(&buf,
-								 fmtQualifiedIdEnc(PQgetvalue(res, i, 1),
-												   PQgetvalue(res, i, 2),
-												   PQclientEncoding(conn)));
+/*
+ * Given a user-specified list of indexes, prepare a matching list
+ * indexes to process, and also a matching list of table OIDs to which each
+ * index belongs.  The latter is needed to avoid scheduling two parallel tasks
+ * with concurrent reindexing of indexes on the same table.
+ *
+ * On input, index_list is the user-specified index list.  table_list is an
+ * output argument which is filled with a list of the tables to process; on
+ * output, index_list is a matching reordered list of indexes.  Caller is
+ * supposed to walk both lists in unison.  Both pointers will be NULL if
+ * there's nothing to process.
+ */
+static void
+get_parallel_tabidx_list(PGconn *conn,
+						 SimpleStringList *index_list,
+						 SimpleOidList **table_list,
+						 bool echo)
+{
+	PQExpBufferData catalog_query;
+	PGresult   *res;
+	SimpleStringListCell *cell;
+	int			ntups;
 
-			simple_string_list_append(user_list, buf.data);
-			resetPQExpBuffer(&buf);
-		}
+	Assert(index_list != NULL);
+
+	initPQExpBuffer(&catalog_query);
+
+	/*
+	 * The queries here are using a safe search_path, so there's no need to
+	 * fully qualify everything.
+	 */
+
+	/*
+	 * We cannot use REINDEX in parallel in a straightforward way, because
+	 * we'd be unable to control concurrent processing of multiple indexes on
+	 * the same table.  But we can extract the table OID together with each
+	 * indexes, to allow scheduling each REINDEX INDEX command on a separate
+	 * job according to the owning table.
+	 *
+	 * We order the results to group all indexes of each table together.
+	 */
+	appendPQExpBufferStr(&catalog_query,
+						 "SELECT x.indrelid, n.nspname, i.relname\n"
+						 "FROM pg_catalog.pg_index x\n"
+						 "JOIN pg_catalog.pg_class i ON i.oid = x.indexrelid\n"
+						 "LEFT JOIN pg_catalog.pg_namespace n ON n.oid = i.relnamespace\n"
+						 "WHERE x.indexrelid = ANY(ARRAY['");
+
+	for (cell = index_list->head; cell; cell = cell->next)
+	{
+		if (cell != index_list->head)
+			appendPQExpBufferStr(&catalog_query, "', '");
+
+		appendQualifiedRelation(&catalog_query, cell->val, conn, echo);
 	}
-	termPQExpBuffer(&buf);
-	PQclear(res);
 
-	return tables;
+	/*
+	 * Order tables by the size of its greatest index.  Within each table,
+	 * order indexes by their sizes.
+	 */
+	appendPQExpBufferStr(&catalog_query,
+						 "']::pg_catalog.regclass[])\n"
+						 "ORDER BY max(i.relpages) OVER \n"
+						 "    (PARTITION BY x.indrelid),\n"
+						 "  x.indrelid, i.relpages;\n");
+
+	/* Empty the original index_list to fill it from the query result. */
+	simple_string_list_destroy(index_list);
+	index_list->head = index_list->tail = NULL;
+
+	res = executeQuery(conn, catalog_query.data, echo);
+	termPQExpBuffer(&catalog_query);
+
+	/*
+	 * If no rows are returned, there are no matching tables, so we are done.
+	 */
+	ntups = PQntuples(res);
+	if (ntups == 0)
+	{
+		PQclear(res);
+		return;
+	}
+
+	*table_list = pg_malloc0(sizeof(SimpleOidList));
+
+	/*
+	 * Build two lists, one with table OIDs and the other with fully-qualified
+	 * index names.
+	 */
+	for (int i = 0; i < ntups; i++)
+	{
+		simple_oid_list_append(*table_list, atooid(PQgetvalue(res, i, 0)));
+		simple_string_list_append(index_list,
+								  fmtQualifiedIdEnc(PQgetvalue(res, i, 1),
+													PQgetvalue(res, i, 2),
+													PQclientEncoding(conn)));
+	}
+
+	PQclear(res);
 }
 
 static void
diff --git a/src/bin/scripts/t/090_reindexdb.pl b/src/bin/scripts/t/090_reindexdb.pl
index 378f8ad7a5..2c4b641c9b 100644
--- a/src/bin/scripts/t/090_reindexdb.pl
+++ b/src/bin/scripts/t/090_reindexdb.pl
@@ -263,11 +263,9 @@ $node->safe_psql(
 	CREATE SCHEMA s1;
 	CREATE TABLE s1.t1(id integer);
 	CREATE INDEX ON s1.t1(id);
-	CREATE INDEX i1 ON s1.t1(id);
 	CREATE SCHEMA s2;
 	CREATE TABLE s2.t2(id integer);
 	CREATE INDEX ON s2.t2(id);
-	CREATE INDEX i2 ON s2.t2(id);
 	-- empty schema
 	CREATE SCHEMA s3;
 |);
@@ -279,11 +277,11 @@ $node->command_ok(
 	[
 		'reindexdb',
 		'--jobs' => '2',
-		'--index' => 's1.i1',
-		'--index' => 's2.i2',
+		'--index' => 's1.t1_id_idx',
+		'--index' => 's2.t2_id_idx',
 		'postgres',
 	],
-	'parallel reindexdb for indices');
+	'failover for parallel reindexdb for indices');
 # Note that the ordering of the commands is not stable, so the second
 # command for s2.t2 is not checked after.
 $node->issues_sql_like(
#16Ranier Vilela
ranier.vf@gmail.com
In reply to: Ranier Vilela (#15)
Re: Simplify the logic a bit (src/bin/scripts/reindexdb.c)

Em seg., 17 de mar. de 2025 às 16:17, Ranier Vilela <ranier.vf@gmail.com>
escreveu:

Em sex., 7 de mar. de 2025 às 16:14, Ranier Vilela <ranier.vf@gmail.com>
escreveu:

Em sex., 7 de mar. de 2025 às 16:01, Álvaro Herrera <
alvherre@alvh.no-ip.org> escreveu:

On 2025-Mar-07, Álvaro Herrera wrote:

Anyway, my version of this is attached. It fixes the problems with

your

patch, but not Orlov's fundamental bug.

And of course I forgot to actually attach the patch. Good grief.

Test with your v4 patch, on Windows 64 bits.

Rebased against 682c5be
<https://github.com/postgres/postgres/commit/682c5be25c28192c56e9d5e2a9ca14673a2fcf4b&gt;

Thanks Álvaro, for the commit.

best regards,
Ranier Vilela

#17Álvaro Herrera
alvherre@alvh.no-ip.org
In reply to: Ranier Vilela (#16)
Re: Simplify the logic a bit (src/bin/scripts/reindexdb.c)

On 2025-Mar-18, Ranier Vilela wrote:

Thanks Álvaro, for the commit.

Thank you for the impetus.

--
Álvaro Herrera Breisgau, Deutschland — https://www.EnterpriseDB.com/

#18Ranier Vilela
ranier.vf@gmail.com
In reply to: Ranier Vilela (#15)
1 attachment(s)
Re: Simplify the logic a bit (src/bin/scripts/reindexdb.c)

Em seg., 17 de mar. de 2025 às 16:17, Ranier Vilela <ranier.vf@gmail.com>
escreveu:

Em sex., 7 de mar. de 2025 às 16:14, Ranier Vilela <ranier.vf@gmail.com>
escreveu:

Em sex., 7 de mar. de 2025 às 16:01, Álvaro Herrera <
alvherre@alvh.no-ip.org> escreveu:

On 2025-Mar-07, Álvaro Herrera wrote:

Anyway, my version of this is attached. It fixes the problems with

your

patch, but not Orlov's fundamental bug.

And of course I forgot to actually attach the patch. Good grief.

Test with your v4 patch, on Windows 64 bits.

Rebased against 682c5be
<https://github.com/postgres/postgres/commit/682c5be25c28192c56e9d5e2a9ca14673a2fcf4b&gt;

Coverity reported two new warnings regarding reindexdb.

CID 1593911: (#1 of 1): Dereference after null check (FORWARD_NULL)
49. var_deref_op: Dereferencing null pointer indices_tables_cell.

CID 1593910: (#1 of 1): Explicit null dereferenced (FORWARD_NULL)
9. var_deref_op: Dereferencing null pointer process_list.

Attached is an attempt to fix.

best regards,
Ranier Vilela

Attachments:

fix-possible-dereference-reindex_db.patchapplication/octet-stream; name=fix-possible-dereference-reindex_db.patchDownload
diff --git a/src/bin/scripts/reindexdb.c b/src/bin/scripts/reindexdb.c
index 860a0fcb46..d944876eea 100644
--- a/src/bin/scripts/reindexdb.c
+++ b/src/bin/scripts/reindexdb.c
@@ -348,13 +348,6 @@ reindex_one_database(ConnParams *cparams, ReindexType type,
 				process_list = get_parallel_tables_list(conn, process_type,
 														user_list, echo);
 				process_type = REINDEX_TABLE;
-
-				/* Bail out if nothing to process */
-				if (process_list == NULL)
-				{
-					PQfinish(conn);
-					return;
-				}
 				break;
 
 			case REINDEX_INDEX:
@@ -380,7 +373,6 @@ reindex_one_database(ConnParams *cparams, ReindexType type,
 
 			case REINDEX_SYSTEM:
 				/* not supported */
-				process_list = NULL;
 				Assert(false);
 				break;
 
@@ -390,6 +382,13 @@ reindex_one_database(ConnParams *cparams, ReindexType type,
 		}
 	}
 
+	/* Bail out if nothing to process */
+	if (process_list == NULL)
+	{
+		PQfinish(conn);
+		return;
+	}
+
 	/*
 	 * Adjust the number of concurrent connections depending on the items in
 	 * the list.  We choose the minimum between the number of concurrent
@@ -434,7 +433,7 @@ reindex_one_database(ConnParams *cparams, ReindexType type,
 
 		ParallelSlotSetHandler(free_slot, TableCommandResultHandler, NULL);
 		initPQExpBuffer(&sql);
-		if (parallel && process_type == REINDEX_INDEX)
+		if (parallel && process_type == REINDEX_INDEX && indices_tables_cell)
 		{
 			/*
 			 * For parallel index-level REINDEX, the indices of the same table
@@ -444,17 +443,16 @@ reindex_one_database(ConnParams *cparams, ReindexType type,
 			 */
 			gen_reindex_command(free_slot->connection, process_type, objname,
 								echo, verbose, concurrently, tablespace, &sql);
-			while (indices_tables_cell->next &&
+			while (indices_tables_cell && indices_tables_cell->next &&
 				   indices_tables_cell->val == indices_tables_cell->next->val)
 			{
-				indices_tables_cell = indices_tables_cell->next;
 				cell = cell->next;
 				objname = cell->val;
 				appendPQExpBufferChar(&sql, '\n');
 				gen_reindex_command(free_slot->connection, process_type, objname,
 									echo, verbose, concurrently, tablespace, &sql);
+				indices_tables_cell = indices_tables_cell->next;
 			}
-			indices_tables_cell = indices_tables_cell->next;
 		}
 		else
 		{