bumping HASH_VERSION to 3

Started by Robert Haasalmost 9 years ago18 messages

robertmhaas@gmail.com

almost 9 years ago

Starting a new thread about this to get more visibility.

Despite the extensive work that has been done on hash indexes this
release, we have thus far not made any change to the on-disk format
that is not nominally backward-compatible. Commit
293e24e507838733aba4748b514536af2d39d7f2 did make a change for new
hash indexes, but included backward-compatibility code so that old
indexes would continue to work. However, I'd like to also commit
Mithun Cy's patch to expand hash indexes more gradually -- latest
version in /messages/by-id/CAD__OujD-iBxm91ZcqziaYftWqJxnFqgMv361V9zke83s6ifBg@mail.gmail.com
-- and that's not backward-compatible.

It would be possible to write code to convert the old metapage format
to the new metapage format introduced by that patch, and it wouldn't
be very hard, but I think it would be better to NOT do that, and
instead force everybody upgrading to v10 to rebuild all of their hash
indexes. If we don't do that, then we'll never know whether
instances of hash index corruption reported against v10 or higher are
caused by defects in the new code, because there's always the chance
that the hash index could have been built on a pre-v10 version, got
corrupted because of the lack of WAL-logging, and then been brought up
to v10+ via pg_upgrade. Forcing a reindex in v10 kills three birds
with one stone:

- No old, not logged, possibly corrupt hash indexes floating around
after an upgrade to v10.
- Can remove the backward-compatibility code added by
293e24e507838733aba4748b514536af2d39d7f2 instead of keeping it around
forever.
- No need to worry about doing an in-place upgrade of the metapage for
the above-mentioned patch.

Thoughts?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Magnus Hagander

magnus@hagander.net

almost 9 years ago

In reply to: Robert Haas (#1)

Re: bumping HASH_VERSION to 3

On Fri, Mar 31, 2017 at 8:17 PM, Robert Haas <robertmhaas@gmail.com> wrote:

Starting a new thread about this to get more visibility.

Despite the extensive work that has been done on hash indexes this
release, we have thus far not made any change to the on-disk format
that is not nominally backward-compatible. Commit
293e24e507838733aba4748b514536af2d39d7f2 did make a change for new
hash indexes, but included backward-compatibility code so that old
indexes would continue to work. However, I'd like to also commit
Mithun Cy's patch to expand hash indexes more gradually -- latest
version in /messages/by-id/CAD__OujD-iBxm91ZcqziaYftWqJxnFqgMv361V9
zke83s6ifBg@mail.gmail.com
-- and that's not backward-compatible.

It would be possible to write code to convert the old metapage format
to the new metapage format introduced by that patch, and it wouldn't
be very hard, but I think it would be better to NOT do that, and
instead force everybody upgrading to v10 to rebuild all of their hash
indexes. If we don't do that, then we'll never know whether
instances of hash index corruption reported against v10 or higher are
caused by defects in the new code, because there's always the chance
that the hash index could have been built on a pre-v10 version, got
corrupted because of the lack of WAL-logging, and then been brought up
to v10+ via pg_upgrade. Forcing a reindex in v10 kills three birds
with one stone:

- No old, not logged, possibly corrupt hash indexes floating around
after an upgrade to v10.
- Can remove the backward-compatibility code added by
293e24e507838733aba4748b514536af2d39d7f2 instead of keeping it around
forever.
- No need to worry about doing an in-place upgrade of the metapage for
the above-mentioned patch.

Thoughts?

Given the state of hash indexes in <= 9.6, I think this is a reasonable
tradeoff. Most people won't be using them at all today. Those that do will
have to "pay" with a REINDEX on upgrade. I think the benefits definitely
outweigh the cost.

So +1 for doing it.

--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/

Jesper Pedersen

jesper.pedersen@redhat.com

almost 9 years ago

In reply to: Robert Haas (#1)

Re: bumping HASH_VERSION to 3

On 03/31/2017 02:17 PM, Robert Haas wrote:

Starting a new thread about this to get more visibility.

Despite the extensive work that has been done on hash indexes this
release, we have thus far not made any change to the on-disk format
that is not nominally backward-compatible. Commit
293e24e507838733aba4748b514536af2d39d7f2 did make a change for new
hash indexes, but included backward-compatibility code so that old
indexes would continue to work. However, I'd like to also commit
Mithun Cy's patch to expand hash indexes more gradually -- latest
version in /messages/by-id/CAD__OujD-iBxm91ZcqziaYftWqJxnFqgMv361V9zke83s6ifBg@mail.gmail.com
-- and that's not backward-compatible.

It would be possible to write code to convert the old metapage format
to the new metapage format introduced by that patch, and it wouldn't
be very hard, but I think it would be better to NOT do that, and
instead force everybody upgrading to v10 to rebuild all of their hash
indexes. If we don't do that, then we'll never know whether
instances of hash index corruption reported against v10 or higher are
caused by defects in the new code, because there's always the chance
that the hash index could have been built on a pre-v10 version, got
corrupted because of the lack of WAL-logging, and then been brought up
to v10+ via pg_upgrade. Forcing a reindex in v10 kills three birds
with one stone:

- No old, not logged, possibly corrupt hash indexes floating around
after an upgrade to v10.
- Can remove the backward-compatibility code added by
293e24e507838733aba4748b514536af2d39d7f2 instead of keeping it around
forever.
- No need to worry about doing an in-place upgrade of the metapage for
the above-mentioned patch.

Thoughts?

Best regards,
Jesper

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Tom Lane

tgl@sss.pgh.pa.us

almost 9 years ago

In reply to: Robert Haas (#1)

Re: bumping HASH_VERSION to 3

Robert Haas <robertmhaas@gmail.com> writes:

Forcing a reindex in v10 kills three birds
with one stone:

- No old, not logged, possibly corrupt hash indexes floating around
after an upgrade to v10.
- Can remove the backward-compatibility code added by
293e24e507838733aba4748b514536af2d39d7f2 instead of keeping it around
forever.
- No need to worry about doing an in-place upgrade of the metapage for
the above-mentioned patch.

Thoughts?

+1, as long as we're clear on what will happen when pg_upgrade'ing
an installation containing hash indexes. I think a minimum requirement is
that it succeed and be able to start up, and allow the user to manually
REINDEX such indexes afterwards. Bonus points for:

1. teaching pg_upgrade to create a script containing the required REINDEX
commands. (I think it's produced scripts for similar requirements in the
past.)

2. marking the index invalid so that the system would silently ignore it
until it's been reindexed. I think there might be adequate infrastructure
for that already thanks to REINDEX CONCURRENTLY, and it'd just be a matter
of getting pg_upgrade to hack the indexes' catalog state. (If not, it's
probably not worth the trouble.)

A variant on that might just be to not transfer over hash indexes,
leaving the user to CREATE them rather than REINDEX them.

regards, tom lane

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Joe Conway

mail@joeconway.com

almost 9 years ago

In reply to: Magnus Hagander (#2)

Re: bumping HASH_VERSION to 3

On 03/31/2017 11:19 AM, Magnus Hagander wrote:

On Fri, Mar 31, 2017 at 8:17 PM, Robert Haas <robertmhaas@gmail.com
<mailto:robertmhaas@gmail.com>> wrote:

Starting a new thread about this to get more visibility.

Despite the extensive work that has been done on hash indexes this
release, we have thus far not made any change to the on-disk format
that is not nominally backward-compatible. Commit
293e24e507838733aba4748b514536af2d39d7f2 did make a change for new
hash indexes, but included backward-compatibility code so that old
indexes would continue to work. However, I'd like to also commit
Mithun Cy's patch to expand hash indexes more gradually -- latest
version in
/messages/by-id/CAD__OujD-iBxm91ZcqziaYftWqJxnFqgMv361V9zke83s6ifBg@mail.gmail.com
</messages/by-id/CAD__OujD-iBxm91ZcqziaYftWqJxnFqgMv361V9zke83s6ifBg@mail.gmail.com>
-- and that's not backward-compatible.

It would be possible to write code to convert the old metapage format
to the new metapage format introduced by that patch, and it wouldn't
be very hard, but I think it would be better to NOT do that, and
instead force everybody upgrading to v10 to rebuild all of their hash
indexes. If we don't do that, then we'll never know whether
instances of hash index corruption reported against v10 or higher are
caused by defects in the new code, because there's always the chance
that the hash index could have been built on a pre-v10 version, got
corrupted because of the lack of WAL-logging, and then been brought up
to v10+ via pg_upgrade. Forcing a reindex in v10 kills three birds
with one stone:

- No old, not logged, possibly corrupt hash indexes floating around
after an upgrade to v10.
- Can remove the backward-compatibility code added by
293e24e507838733aba4748b514536af2d39d7f2 instead of keeping it around
forever.
- No need to worry about doing an in-place upgrade of the metapage for
the above-mentioned patch.

Thoughts?

Given the state of hash indexes in <= 9.6, I think this is a reasonable
tradeoff. Most people won't be using them at all today. Those that do
will have to "pay" with a REINDEX on upgrade. I think the benefits
definitely outweigh the cost.

So +1 for doing it.

--
Crunchy Data - http://crunchydata.com
PostgreSQL Support for Secure Enterprises
Consulting, Training, & Open Source Development

Bruce Momjian

bruce@momjian.us

almost 9 years ago

In reply to: Tom Lane (#4)

Re: bumping HASH_VERSION to 3

On Fri, Mar 31, 2017 at 02:48:05PM -0400, Tom Lane wrote:

+1, as long as we're clear on what will happen when pg_upgrade'ing
an installation containing hash indexes. I think a minimum requirement is
that it succeed and be able to start up, and allow the user to manually
REINDEX such indexes afterwards. Bonus points for:

1. teaching pg_upgrade to create a script containing the required REINDEX
commands. (I think it's produced scripts for similar requirements in the
past.)

2. marking the index invalid so that the system would silently ignore it
until it's been reindexed. I think there might be adequate infrastructure
for that already thanks to REINDEX CONCURRENTLY, and it'd just be a matter
of getting pg_upgrade to hack the indexes' catalog state. (If not, it's
probably not worth the trouble.)

We already have code to do all of that, but it was removed from
pg_upgrade in 9.5. You can still see it in 9.4:

contrib/pg_upgrade/version_old_8_3.c::old_8_3_invalidate_hash_gin_indexes()

I would be happy to restore that code and make it work for PG 10.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Robert Haas

robertmhaas@gmail.com

almost 9 years ago

In reply to: Bruce Momjian (#6)

Re: bumping HASH_VERSION to 3

On Tue, Apr 11, 2017 at 2:23 PM, Bruce Momjian <bruce@momjian.us> wrote:

On Fri, Mar 31, 2017 at 02:48:05PM -0400, Tom Lane wrote:

+1, as long as we're clear on what will happen when pg_upgrade'ing
an installation containing hash indexes. I think a minimum requirement is
that it succeed and be able to start up, and allow the user to manually
REINDEX such indexes afterwards. Bonus points for:

1. teaching pg_upgrade to create a script containing the required REINDEX
commands. (I think it's produced scripts for similar requirements in the
past.)

2. marking the index invalid so that the system would silently ignore it
until it's been reindexed. I think there might be adequate infrastructure
for that already thanks to REINDEX CONCURRENTLY, and it'd just be a matter
of getting pg_upgrade to hack the indexes' catalog state. (If not, it's
probably not worth the trouble.)

We already have code to do all of that, but it was removed from
pg_upgrade in 9.5. You can still see it in 9.4:

contrib/pg_upgrade/version_old_8_3.c::old_8_3_invalidate_hash_gin_indexes()

I would be happy to restore that code and make it work for PG 10.

Cool!

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Amit Kapila

amit.kapila16@gmail.com

over 8 years ago

In reply to: Bruce Momjian (#6)

1 attachment(s)

Re: bumping HASH_VERSION to 3

On Tue, Apr 11, 2017 at 11:53 PM, Bruce Momjian <bruce@momjian.us> wrote:

On Fri, Mar 31, 2017 at 02:48:05PM -0400, Tom Lane wrote:

+1, as long as we're clear on what will happen when pg_upgrade'ing
an installation containing hash indexes. I think a minimum requirement is
that it succeed and be able to start up, and allow the user to manually
REINDEX such indexes afterwards. Bonus points for:

1. teaching pg_upgrade to create a script containing the required REINDEX
commands. (I think it's produced scripts for similar requirements in the
past.)

2. marking the index invalid so that the system would silently ignore it
until it's been reindexed. I think there might be adequate infrastructure
for that already thanks to REINDEX CONCURRENTLY, and it'd just be a matter
of getting pg_upgrade to hack the indexes' catalog state. (If not, it's
probably not worth the trouble.)

We already have code to do all of that, but it was removed from
pg_upgrade in 9.5. You can still see it in 9.4:

contrib/pg_upgrade/version_old_8_3.c::old_8_3_invalidate_hash_gin_indexes()

Thanks for the pointer.

I would be happy to restore that code and make it work for PG 10.

Attached patch implements the two points suggested by Tom. I will add
this to PG-10 open issues list so that we don't forget about this
work, let me know if you think otherwise.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

pg_upgrade_invalidate_old_hash_index_v1.patchapplication/octet-stream; name=pg_upgrade_invalidate_old_hash_index_v1.patchDownload

diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index d52c81e..9358b30 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -98,9 +98,16 @@ check_and_dump_old_cluster(bool live_check)
 	check_for_reg_data_type_usage(&old_cluster);
 	check_for_isn_and_int8_passing_mismatch(&old_cluster);
 
-	/* Pre-PG 10 allowed tables with 'unknown' type columns */
+	/*
+	 * Pre-PG 10 allowed tables with 'unknown' type columns and non WAL logged
+	 * hash indexes
+	 */
 	if (GET_MAJOR_VERSION(old_cluster.major_version) <= 906)
+	{
 		old_9_6_check_for_unknown_data_type_usage(&old_cluster);
+		if (user_opts.check)
+			old_9_6_invalidate_hash_indexes(&old_cluster, true);
+	}
 
 	/* 9.5 and below should not have roles starting with pg_ */
 	if (GET_MAJOR_VERSION(old_cluster.major_version) <= 905)
@@ -176,6 +183,14 @@ issue_warnings(void)
 		new_9_0_populate_pg_largeobject_metadata(&new_cluster, false);
 		stop_postmaster(false);
 	}
+
+	/* Reindex hash indexes for old < 10.0 */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) <= 906)
+	{
+		start_postmaster(&new_cluster, true);
+		old_9_6_invalidate_hash_indexes(&old_cluster, false);
+		stop_postmaster(false);
+	}
 }
 
 
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index b78b877..8fbf8ac 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -441,6 +441,8 @@ void new_9_0_populate_pg_largeobject_metadata(ClusterInfo *cluster,
 										 bool check_mode);
 void		old_9_3_check_for_line_data_type_usage(ClusterInfo *cluster);
 void		old_9_6_check_for_unknown_data_type_usage(ClusterInfo *cluster);
+void old_9_6_invalidate_hash_indexes(ClusterInfo *cluster,
+								bool check_mode);
 
 /* parallel.c */
 void parallel_exec_prog(const char *log_file, const char *opt_log_file,
diff --git a/src/bin/pg_upgrade/version.c b/src/bin/pg_upgrade/version.c
index a3651aa..1af0a48 100644
--- a/src/bin/pg_upgrade/version.c
+++ b/src/bin/pg_upgrade/version.c
@@ -287,3 +287,115 @@ old_9_6_check_for_unknown_data_type_usage(ClusterInfo *cluster)
 	else
 		check_ok();
 }
+
+/*
+ * old_9_6_invalidate_hash_indexes()
+ *	9.6 -> 10
+ *	Hash index binary format has changed from 9.6->10.0
+ */
+void
+old_9_6_invalidate_hash_indexes(ClusterInfo *cluster, bool check_mode)
+{
+	int			dbnum;
+	FILE	   *script = NULL;
+	bool		found = false;
+	char		output_path[MAXPGPATH];
+
+	prep_status("Checking for hash indexes");
+
+	snprintf(output_path, sizeof(output_path), "reindex_hash.sql");
+
+	for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res;
+		bool		db_used = false;
+		int			ntups;
+		int			rowno;
+		int			i_nspname,
+					i_relname;
+		DbInfo	   *active_db = &cluster->dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(cluster, active_db->db_name);
+
+		/* find hash and gin indexes */
+		res = executeQueryOrDie(conn,
+								"SELECT n.nspname, c.relname "
+								"FROM	pg_catalog.pg_class c, "
+								"		pg_catalog.pg_index i, "
+								"		pg_catalog.pg_am a, "
+								"		pg_catalog.pg_namespace n "
+								"WHERE	i.indexrelid = c.oid AND "
+								"		c.relam = a.oid AND "
+								"		c.relnamespace = n.oid AND "
+								"		a.amname = 'hash'"
+			);
+
+		ntups = PQntuples(res);
+		i_nspname = PQfnumber(res, "nspname");
+		i_relname = PQfnumber(res, "relname");
+		for (rowno = 0; rowno < ntups; rowno++)
+		{
+			found = true;
+			if (!check_mode)
+			{
+				if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+					pg_fatal("could not open file \"%s\": %s\n", output_path,
+							 strerror(errno));
+				if (!db_used)
+				{
+					PQExpBufferData connectbuf;
+
+					initPQExpBuffer(&connectbuf);
+					appendPsqlMetaConnect(&connectbuf, active_db->db_name);
+					fputs(connectbuf.data, script);
+					termPQExpBuffer(&connectbuf);
+					db_used = true;
+				}
+				fprintf(script, "REINDEX INDEX %s.%s;\n",
+						quote_identifier(PQgetvalue(res, rowno, i_nspname)),
+						quote_identifier(PQgetvalue(res, rowno, i_relname)));
+			}
+		}
+
+		PQclear(res);
+
+		if (!check_mode && db_used)
+			/* mark hash indexes as invalid */
+			PQclear(executeQueryOrDie(conn,
+									  "UPDATE pg_catalog.pg_index i "
+									  "SET	indisvalid = false "
+									  "FROM	pg_catalog.pg_class c, "
+									  "		pg_catalog.pg_am a, "
+									  "		pg_catalog.pg_namespace n "
+									  "WHERE	i.indexrelid = c.oid AND "
+									  "		c.relam = a.oid AND "
+									  "		c.relnamespace = n.oid AND "
+									  "		a.amname = 'hash'"));
+
+		PQfinish(conn);
+	}
+
+	if (script)
+		fclose(script);
+
+	if (found)
+	{
+		report_status(PG_WARNING, "warning");
+		if (check_mode)
+			pg_log(PG_WARNING, "\n"
+				   "Your installation contains hash indexes.  These indexes have different\n"
+				   "internal formats between your old and new clusters, so they must be\n"
+				   "reindexed with the REINDEX command.  After upgrading, you will be given\n"
+				   "REINDEX instructions.\n\n");
+		else
+			pg_log(PG_WARNING, "\n"
+				   "Your installation contains hash indexes.  These indexes have different\n"
+				   "internal formats between your old and new clusters, so they must be\n"
+				   "reindexed with the REINDEX command.  The file:\n"
+				   "    %s\n"
+				   "when executed by psql by the database superuser will recreate all invalid\n"
+			  "indexes; until then, none of these indexes will be used.\n\n",
+				   output_path);
+	}
+	else
+		check_ok();
+}

Noah Misch

noah@leadboat.com

over 8 years ago

In reply to: Amit Kapila (#8)

Re: bumping HASH_VERSION to 3

On Wed, May 10, 2017 at 11:25:22AM +0530, Amit Kapila wrote:

On Tue, Apr 11, 2017 at 11:53 PM, Bruce Momjian <bruce@momjian.us> wrote:

On Fri, Mar 31, 2017 at 02:48:05PM -0400, Tom Lane wrote:

+1, as long as we're clear on what will happen when pg_upgrade'ing
an installation containing hash indexes. I think a minimum requirement is
that it succeed and be able to start up, and allow the user to manually
REINDEX such indexes afterwards. Bonus points for:

1. teaching pg_upgrade to create a script containing the required REINDEX
commands. (I think it's produced scripts for similar requirements in the
past.)

2. marking the index invalid so that the system would silently ignore it
until it's been reindexed. I think there might be adequate infrastructure
for that already thanks to REINDEX CONCURRENTLY, and it'd just be a matter
of getting pg_upgrade to hack the indexes' catalog state. (If not, it's
probably not worth the trouble.)

We already have code to do all of that, but it was removed from
pg_upgrade in 9.5. You can still see it in 9.4:

contrib/pg_upgrade/version_old_8_3.c::old_8_3_invalidate_hash_gin_indexes()

Thanks for the pointer.

I would be happy to restore that code and make it work for PG 10.

Attached patch implements the two points suggested by Tom. I will add
this to PG-10 open issues list so that we don't forget about this
work, let me know if you think otherwise.

[Action required within three days. This is a generic notification.]

The above-described topic is currently a PostgreSQL 10 open item. Robert,
since you committed the patch believed to have created it, you own this open
item. If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know. Otherwise, please observe the policy on
open item ownership[1]/messages/by-id/20170404140717.GA2675809@tornado.leadboat.com and send a status update within three calendar days of
this message. Include a date for your subsequent status update. Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10. Consequently, I will appreciate your efforts
toward speedy resolution. Thanks.

[1]: /messages/by-id/20170404140717.GA2675809@tornado.leadboat.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10

Robert Haas

robertmhaas@gmail.com

over 8 years ago

In reply to: Noah Misch (#9)

Re: bumping HASH_VERSION to 3

On Mon, May 15, 2017 at 12:08 AM, Noah Misch <noah@leadboat.com> wrote:

The above-described topic is currently a PostgreSQL 10 open item. Robert,
since you committed the patch believed to have created it, you own this open
item. If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know. Otherwise, please observe the policy on
open item ownership[1] and send a status update within three calendar days of
this message. Include a date for your subsequent status update. Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10. Consequently, I will appreciate your efforts
toward speedy resolution. Thanks.

I don't believe this patch can be committed to beta1 which wraps in
just a few hours; it seems to need a bit of work. I'll update again
by Friday based on how review unfolds this week. I anticipate
committing something on Wednesday or Thursday unless Bruce picks this
one up.

+        /* find hash and gin indexes */
+        res = executeQueryOrDie(conn,
+                                "SELECT n.nspname, c.relname "
+                                "FROM   pg_catalog.pg_class c, "
+                                "       pg_catalog.pg_index i, "
+                                "       pg_catalog.pg_am a, "
+                                "       pg_catalog.pg_namespace n "
+                                "WHERE  i.indexrelid = c.oid AND "
+                                "       c.relam = a.oid AND "
+                                "       c.relnamespace = n.oid AND "
+                                "       a.amname = 'hash'"

Comment doesn't match code.

+        if (!check_mode && db_used)
+            /* mark hash indexes as invalid */
+            PQclear(executeQueryOrDie(conn,

Given that we have a comment here, I'd put curly braces around this block.

+ snprintf(output_path, sizeof(output_path), "reindex_hash.sql");

This looks suspiciously pointless. The contents of output_path will
always be precisely "reindex_hash.sql"; you could just char
*output_path = "reindex_hash.sql" and get the same effect. Compare
this to the logic in create_script_for_cluster_analyze, which prepends
SCRIPT_PREFIX. (I am not sure why it's necessary to prepend "./" on
Windows, but if it's needed in that case, maybe it's needed in this
case, too.)

+        start_postmaster(&new_cluster, true);
+        old_9_6_invalidate_hash_indexes(&old_cluster, false);
+        stop_postmaster(false);

Won't this cause the hash indexes to be invalided in the old cluster
rather than the new one?

This might need a visit from pgindent in one or two places, too.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#11

Amit Kapila

amit.kapila16@gmail.com

over 8 years ago

In reply to: Robert Haas (#10)

Re: bumping HASH_VERSION to 3

On Mon, May 15, 2017 at 8:57 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, May 15, 2017 at 12:08 AM, Noah Misch <noah@leadboat.com> wrote:

The above-described topic is currently a PostgreSQL 10 open item. Robert,
since you committed the patch believed to have created it, you own this open
item. If some other commit is more relevant or if this does not belong as a
v10 open item, please let us know. Otherwise, please observe the policy on
open item ownership[1] and send a status update within three calendar days of
this message. Include a date for your subsequent status update. Testers may
discover new open items at any time, and I want to plan to get them all fixed
well in advance of shipping v10. Consequently, I will appreciate your efforts
toward speedy resolution. Thanks.

I don't believe this patch can be committed to beta1 which wraps in
just a few hours; it seems to need a bit of work. I'll update again
by Friday based on how review unfolds this week. I anticipate
committing something on Wednesday or Thursday unless Bruce picks this
one up.
+        /* find hash and gin indexes */
+        res = executeQueryOrDie(conn,
+                                "SELECT n.nspname, c.relname "
+                                "FROM   pg_catalog.pg_class c, "
+                                "       pg_catalog.pg_index i, "
+                                "       pg_catalog.pg_am a, "
+                                "       pg_catalog.pg_namespace n "
+                                "WHERE  i.indexrelid = c.oid AND "
+                                "       c.relam = a.oid AND "
+                                "       c.relnamespace = n.oid AND "
+                                "       a.amname = 'hash'"
Comment doesn't match code.
+        if (!check_mode && db_used)
+            /* mark hash indexes as invalid */
+            PQclear(executeQueryOrDie(conn,
Given that we have a comment here, I'd put curly braces around this block.

Okay, will change.

+ snprintf(output_path, sizeof(output_path), "reindex_hash.sql");

This looks suspiciously pointless. The contents of output_path will
always be precisely "reindex_hash.sql"; you could just char
*output_path = "reindex_hash.sql" and get the same effect.

Sure, but the same code pattern is used in all other similar functions
in version.c, refer new_9_0_populate_pg_largeobject_metadata. Both
the ways will serve the purpose, do you think it makes sense to write
this differently?

Compare
this to the logic in create_script_for_cluster_analyze, which prepends
SCRIPT_PREFIX.

That is required for .sh or .bat files not for .sql files. I think we
need to compare it with logic in
new_9_0_populate_pg_largeobject_metadata.

(I am not sure why it's necessary to prepend "./" on
Windows, but if it's needed in that case, maybe it's needed in this
case, too.)

Hmm, "./" is required for non-windows, but as mentioned above, this is
not required for our case.

+        start_postmaster(&new_cluster, true);
+        old_9_6_invalidate_hash_indexes(&old_cluster, false);
+        stop_postmaster(false);
Won't this cause the hash indexes to be invalided in the old cluster
rather than the new one?

oops. copy-paste. It passed in my testing because I have not used any
different options (like port number) for old or new server.

This might need a visit from pgindent in one or two places, too.

I have run pgindent before sending the previous version, but will
again verify the same.

I will send an updated patch once we agree on above points.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12

Robert Haas

robertmhaas@gmail.com

over 8 years ago

In reply to: Amit Kapila (#11)

Re: bumping HASH_VERSION to 3

On Tue, May 16, 2017 at 7:31 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

+ snprintf(output_path, sizeof(output_path), "reindex_hash.sql");

This looks suspiciously pointless. The contents of output_path will
always be precisely "reindex_hash.sql"; you could just char
*output_path = "reindex_hash.sql" and get the same effect.

Sure, but the same code pattern is used in all other similar functions
in version.c, refer new_9_0_populate_pg_largeobject_metadata. Both
the ways will serve the purpose, do you think it makes sense to write
this differently?

Yes. It's silly to copy a constant string into a new buffer just for
the heck of it. Perhaps the old instances should also be cleaned up
at some point, but even if we don't bother, copying absolutely
pointless coding into new places isn't getting us anywhere.

Hmm, "./" is required for non-windows, but as mentioned above, this is
not required for our case.

OK.

I will send an updated patch once we agree on above points.

Sounds good.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13

Amit Kapila

amit.kapila16@gmail.com

over 8 years ago

In reply to: Robert Haas (#12)

1 attachment(s)

Re: bumping HASH_VERSION to 3

On Tue, May 16, 2017 at 5:14 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, May 16, 2017 at 7:31 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

I will send an updated patch once we agree on above points.

Sounds good.

Attached patch addresses all the comments as discussed.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

Attachments:

pg_upgrade_invalidate_old_hash_index_v2.patchapplication/octet-stream; name=pg_upgrade_invalidate_old_hash_index_v2.patchDownload

diff --git a/src/bin/pg_upgrade/check.c b/src/bin/pg_upgrade/check.c
index d52c81e..8b9e81e 100644
--- a/src/bin/pg_upgrade/check.c
+++ b/src/bin/pg_upgrade/check.c
@@ -98,9 +98,16 @@ check_and_dump_old_cluster(bool live_check)
 	check_for_reg_data_type_usage(&old_cluster);
 	check_for_isn_and_int8_passing_mismatch(&old_cluster);
 
-	/* Pre-PG 10 allowed tables with 'unknown' type columns */
+	/*
+	 * Pre-PG 10 allowed tables with 'unknown' type columns and non WAL logged
+	 * hash indexes
+	 */
 	if (GET_MAJOR_VERSION(old_cluster.major_version) <= 906)
+	{
 		old_9_6_check_for_unknown_data_type_usage(&old_cluster);
+		if (user_opts.check)
+			old_9_6_invalidate_hash_indexes(&old_cluster, true);
+	}
 
 	/* 9.5 and below should not have roles starting with pg_ */
 	if (GET_MAJOR_VERSION(old_cluster.major_version) <= 905)
@@ -176,6 +183,14 @@ issue_warnings(void)
 		new_9_0_populate_pg_largeobject_metadata(&new_cluster, false);
 		stop_postmaster(false);
 	}
+
+	/* Reindex hash indexes for old < 10.0 */
+	if (GET_MAJOR_VERSION(old_cluster.major_version) <= 906)
+	{
+		start_postmaster(&new_cluster, true);
+		old_9_6_invalidate_hash_indexes(&new_cluster, false);
+		stop_postmaster(false);
+	}
 }
 
 
diff --git a/src/bin/pg_upgrade/pg_upgrade.h b/src/bin/pg_upgrade/pg_upgrade.h
index b78b877..8fbf8ac 100644
--- a/src/bin/pg_upgrade/pg_upgrade.h
+++ b/src/bin/pg_upgrade/pg_upgrade.h
@@ -441,6 +441,8 @@ void new_9_0_populate_pg_largeobject_metadata(ClusterInfo *cluster,
 										 bool check_mode);
 void		old_9_3_check_for_line_data_type_usage(ClusterInfo *cluster);
 void		old_9_6_check_for_unknown_data_type_usage(ClusterInfo *cluster);
+void old_9_6_invalidate_hash_indexes(ClusterInfo *cluster,
+								bool check_mode);
 
 /* parallel.c */
 void parallel_exec_prog(const char *log_file, const char *opt_log_file,
diff --git a/src/bin/pg_upgrade/version.c b/src/bin/pg_upgrade/version.c
index a3651aa..814eaa5 100644
--- a/src/bin/pg_upgrade/version.c
+++ b/src/bin/pg_upgrade/version.c
@@ -287,3 +287,115 @@ old_9_6_check_for_unknown_data_type_usage(ClusterInfo *cluster)
 	else
 		check_ok();
 }
+
+/*
+ * old_9_6_invalidate_hash_indexes()
+ *	9.6 -> 10
+ *	Hash index binary format has changed from 9.6->10.0
+ */
+void
+old_9_6_invalidate_hash_indexes(ClusterInfo *cluster, bool check_mode)
+{
+	int			dbnum;
+	FILE	   *script = NULL;
+	bool		found = false;
+	char	   *output_path = "reindex_hash.sql";
+
+	prep_status("Checking for hash indexes");
+
+	for (dbnum = 0; dbnum < cluster->dbarr.ndbs; dbnum++)
+	{
+		PGresult   *res;
+		bool		db_used = false;
+		int			ntups;
+		int			rowno;
+		int			i_nspname,
+					i_relname;
+		DbInfo	   *active_db = &cluster->dbarr.dbs[dbnum];
+		PGconn	   *conn = connectToServer(cluster, active_db->db_name);
+
+		/* find hash indexes */
+		res = executeQueryOrDie(conn,
+								"SELECT n.nspname, c.relname "
+								"FROM	pg_catalog.pg_class c, "
+								"		pg_catalog.pg_index i, "
+								"		pg_catalog.pg_am a, "
+								"		pg_catalog.pg_namespace n "
+								"WHERE	i.indexrelid = c.oid AND "
+								"		c.relam = a.oid AND "
+								"		c.relnamespace = n.oid AND "
+								"		a.amname = 'hash'"
+			);
+
+		ntups = PQntuples(res);
+		i_nspname = PQfnumber(res, "nspname");
+		i_relname = PQfnumber(res, "relname");
+		for (rowno = 0; rowno < ntups; rowno++)
+		{
+			found = true;
+			if (!check_mode)
+			{
+				if (script == NULL && (script = fopen_priv(output_path, "w")) == NULL)
+					pg_fatal("could not open file \"%s\": %s\n", output_path,
+							 strerror(errno));
+				if (!db_used)
+				{
+					PQExpBufferData connectbuf;
+
+					initPQExpBuffer(&connectbuf);
+					appendPsqlMetaConnect(&connectbuf, active_db->db_name);
+					fputs(connectbuf.data, script);
+					termPQExpBuffer(&connectbuf);
+					db_used = true;
+				}
+				fprintf(script, "REINDEX INDEX %s.%s;\n",
+						quote_identifier(PQgetvalue(res, rowno, i_nspname)),
+						quote_identifier(PQgetvalue(res, rowno, i_relname)));
+			}
+		}
+
+		PQclear(res);
+
+		if (!check_mode && db_used)
+		{
+			/* mark hash indexes as invalid */
+			PQclear(executeQueryOrDie(conn,
+									  "UPDATE pg_catalog.pg_index i "
+									  "SET	indisvalid = false "
+									  "FROM	pg_catalog.pg_class c, "
+									  "		pg_catalog.pg_am a, "
+									  "		pg_catalog.pg_namespace n "
+									  "WHERE	i.indexrelid = c.oid AND "
+									  "		c.relam = a.oid AND "
+									  "		c.relnamespace = n.oid AND "
+									  "		a.amname = 'hash'"));
+		}
+
+		PQfinish(conn);
+	}
+
+	if (script)
+		fclose(script);
+
+	if (found)
+	{
+		report_status(PG_WARNING, "warning");
+		if (check_mode)
+			pg_log(PG_WARNING, "\n"
+				   "Your installation contains hash indexes.  These indexes have different\n"
+				   "internal formats between your old and new clusters, so they must be\n"
+				   "reindexed with the REINDEX command.  After upgrading, you will be given\n"
+				   "REINDEX instructions.\n\n");
+		else
+			pg_log(PG_WARNING, "\n"
+				   "Your installation contains hash indexes.  These indexes have different\n"
+				   "internal formats between your old and new clusters, so they must be\n"
+				   "reindexed with the REINDEX command.  The file:\n"
+				   "    %s\n"
+				   "when executed by psql by the database superuser will recreate all invalid\n"
+			  "indexes; until then, none of these indexes will be used.\n\n",
+				   output_path);
+	}
+	else
+		check_ok();
+}

#14

Robert Haas

robertmhaas@gmail.com

over 8 years ago

In reply to: Amit Kapila (#13)

Re: bumping HASH_VERSION to 3

On Tue, May 16, 2017 at 9:25 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, May 16, 2017 at 5:14 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, May 16, 2017 at 7:31 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

I will send an updated patch once we agree on above points.

Sounds good.

Attached patch addresses all the comments as discussed.

Committed.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15

Amit Kapila

amit.kapila16@gmail.com

over 8 years ago

In reply to: Robert Haas (#14)

Re: bumping HASH_VERSION to 3

On Sat, May 20, 2017 at 2:24 AM, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, May 16, 2017 at 9:25 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, May 16, 2017 at 5:14 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, May 16, 2017 at 7:31 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

I will send an updated patch once we agree on above points.

Sounds good.

Attached patch addresses all the comments as discussed.

Committed.

Thanks.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16

Bruce Momjian

bruce@momjian.us

over 8 years ago

In reply to: Robert Haas (#14)

Re: bumping HASH_VERSION to 3

On Fri, May 19, 2017 at 04:54:27PM -0400, Robert Haas wrote:

On Tue, May 16, 2017 at 9:25 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, May 16, 2017 at 5:14 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, May 16, 2017 at 7:31 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

I will send an updated patch once we agree on above points.

Sounds good.

Attached patch addresses all the comments as discussed.

Committed.

Just checking, but you reused some of my code from the 8.3-8.4 migration
in pg_upgrade that was removed a while back, rather than writing it from
scratch, right?

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17

Amit Kapila

amit.kapila16@gmail.com

over 8 years ago

In reply to: Bruce Momjian (#16)

Re: bumping HASH_VERSION to 3

On Sun, May 21, 2017 at 5:26 AM, Bruce Momjian <bruce@momjian.us> wrote:

On Fri, May 19, 2017 at 04:54:27PM -0400, Robert Haas wrote:

On Tue, May 16, 2017 at 9:25 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, May 16, 2017 at 5:14 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, May 16, 2017 at 7:31 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

I will send an updated patch once we agree on above points.

Sounds good.

Attached patch addresses all the comments as discussed.

Committed.

Just checking, but you reused some of my code from the 8.3-8.4 migration
in pg_upgrade that was removed a while back, rather than writing it from
scratch, right?

Yes, I have adapted it to what is required for hash indexes, adjusted
the code for v10 and there are some other minor changes in code.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18

Bruce Momjian

bruce@momjian.us

over 8 years ago

In reply to: Amit Kapila (#17)

Re: bumping HASH_VERSION to 3

On Sun, May 21, 2017 at 09:11:29AM +0530, Amit Kapila wrote:

On Sun, May 21, 2017 at 5:26 AM, Bruce Momjian <bruce@momjian.us> wrote:

On Fri, May 19, 2017 at 04:54:27PM -0400, Robert Haas wrote:

On Tue, May 16, 2017 at 9:25 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Tue, May 16, 2017 at 5:14 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Tue, May 16, 2017 at 7:31 AM, Amit Kapila <amit.kapila16@gmail.com> wrote:

I will send an updated patch once we agree on above points.

Sounds good.

Attached patch addresses all the comments as discussed.

Committed.

Just checking, but you reused some of my code from the 8.3-8.4 migration
in pg_upgrade that was removed a while back, rather than writing it from
scratch, right?

Yes, I have adapted it to what is required for hash indexes, adjusted
the code for v10 and there are some other minor changes in code.

Great, thanks. It was hard to do originally so I am glad you could
reuse it.

--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com

+ As you are, so once was I.  As I am, so you will be. +
+                      Ancient Roman grave inscription +

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers