pg_upgrade problem with invalid indexes
I got a report today on the IRC channel about a pg_upgrade problem with
upgrading clusters with indexes that exist but are invalid.
For example, if you use CREATE INDEX CONCURRENTLY, then shut down the
server while it is running, the index will be left as INVALID; from our
CREATE INDEX docs:
If a problem arises while scanning the table, such as a uniqueness
violation in a unique index, the CREATE INDEX command will fail but
leave behind an 'invalid' index. This index will be ignored
for querying purposes because it might be incomplete; however
it will still consume update overhead. The psql \d command will
report such an index as INVALID:
postgres=# \d tab
      Table "public.tab"
 Column |  Type   | Modifiers
--------+---------+-----------
 col    | integer |
Indexes:
    "idx" btree (col) INVALID
The recommended recovery method in such cases is to drop the
index and try again to perform CREATE INDEX CONCURRENTLY. (Another
possibility is to rebuild the index with REINDEX. However, since
REINDEX does not support concurrent builds, this option is unlikely
to seem attractive.)
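As a concrete sketch of the recovery the docs describe: invalid indexes can be found via the `indisvalid` flag in `pg_index`, then dropped and rebuilt. The index and table names below (`idx`, `tab`, `col`) are just the ones from the example above.

```sql
-- List all indexes currently marked invalid
SELECT n.nspname, c.relname AS index_name
FROM pg_index i
JOIN pg_class c ON c.oid = i.indexrelid
JOIN pg_namespace n ON n.oid = c.relnamespace
WHERE NOT i.indisvalid;

-- Recommended recovery: drop the leftover index and retry concurrently
DROP INDEX idx;
CREATE INDEX CONCURRENTLY idx ON tab (col);
```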
The problem is that this invalid state is not dumped by pg_dump, meaning
pg_upgrade will restore the index as valid.
There are a few possible fixes. The first would be to have pg_upgrade
throw an error on any invalid index in the old cluster. Another option
would be to preserve the invalid state in pg_dump --binary-upgrade.
I also need help deciding how to communicate this to users, since our
next minor release is still some time away.
--
Bruce Momjian <bruce@momjian.us> http://momjian.us
EnterpriseDB http://enterprisedb.com
+ It's impossible for everything to be true. +
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
There are a few possible fixes. The first would be to have pg_upgrade
throw an error on any invalid index in the old cluster. Another option
would be to preserve the invalid state in pg_dump --binary-upgrade.
Or to not dump invalid indexes at all in --binary-upgrade mode.
I also need help deciding how to communicate this to users, since our
next minor release is still some time away.
Blog post?
--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com
Bruce Momjian wrote:
There are a few possible fixes. The first would be to have pg_upgrade
throw an error on any invalid index in the old cluster. Another option
would be to preserve the invalid state in pg_dump --binary-upgrade.
Yet another option would be for pg_dump --binary-upgrade to ignore
invalid indexes altogether (and probably "not ready" indexes, too, not
sure).
--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Thu, Dec 6, 2012 at 09:35:19PM -0300, Alvaro Herrera wrote:
Bruce Momjian wrote:
There are a few possible fixes. The first would be to have pg_upgrade
throw an error on any invalid index in the old cluster. Another option
would be to preserve the invalid state in pg_dump --binary-upgrade.
Yet another option would be for pg_dump --binary-upgrade to ignore
invalid indexes altogether (and probably "not ready" indexes, too, not
sure).
Yes, I thought of not dumping it. The problem is that we don't delete
the index when it fails, so I assumed we didn't want to lose the index
creation information. I need to understand why we did that. Why do we
have pg_dump dump the index then?
--
Bruce Momjian
Bruce Momjian <bruce@momjian.us> writes:
Yes, I thought of not dumping it. The problem is that we don't delete
the index when it fails, so I assumed we didn't want to lose the index
creation information. I need to understand why we did that.
Because CREATE INDEX CONCURRENTLY can't drop the index if it's already
failed. It's not because we want to do that, it's an implementation
restriction of the horrid kluge that is CREATE/DROP INDEX CONCURRENTLY.
regards, tom lane
On Thu, 2012-12-06 at 16:31 -0800, Josh Berkus wrote:
There are a few possible fixes. The first would be to have pg_upgrade
throw an error on any invalid index in the old cluster. Another option
would be to preserve the invalid state in pg_dump --binary-upgrade.
Or to not dump invalid indexes at all in --binary-upgrade mode.
+1
Jeff Davis
On 12/06/2012 07:58 PM, Jeff Davis wrote:
On Thu, 2012-12-06 at 16:31 -0800, Josh Berkus wrote:
There are a few possible fixes. The first would be to have pg_upgrade
throw an error on any invalid index in the old cluster. Another option
would be to preserve the invalid state in pg_dump --binary-upgrade.
Or to not dump invalid indexes at all in --binary-upgrade mode.
+1
I think I prefer the first suggestion. If they are trying to upgrade
when there's an invalid index presumably they aren't aware of the
invalidity (or they would have done something about it). It would be
better to fail and make them fix or remove the index, ISTM.
cheers
andrew
On Thu, Dec 6, 2012 at 07:53:57PM -0500, Tom Lane wrote:
Bruce Momjian <bruce@momjian.us> writes:
Yes, I thought of not dumping it. The problem is that we don't delete
the index when it fails, so I assumed we didn't want to lose the index
creation information. I need to understand why we did that.
Because CREATE INDEX CONCURRENTLY can't drop the index if it's already
failed. It's not because we want to do that, it's an implementation
restriction of the horrid kluge that is CREATE/DROP INDEX CONCURRENTLY.
Well, what is the logic that pg_dump dumps it then, even in
non-binary-upgrade mode?
--
Bruce Momjian
On Thu, 2012-12-06 at 20:05 -0500, Andrew Dunstan wrote:
I think I prefer the first suggestion. If they are trying to upgrade
when there's an invalid index presumably they aren't aware of the
invalidity (or they would have done something about it). It would be
better to fail and make them fix or remove the index, ISTM.
I'm a little concerned about introducing extra causes of failure into
upgrade when we don't have to. They could have gone on with that invalid
index forever, and I don't see it as the job of upgrade to alert someone
to that problem.
That being said, it's a reasonable position, and I am fine with either
approach.
Regards,
Jeff Davis
Bruce Momjian <bruce@momjian.us> writes:
On Thu, Dec 6, 2012 at 07:53:57PM -0500, Tom Lane wrote:
Because CREATE INDEX CONCURRENTLY can't drop the index if it's already
failed. It's not because we want to do that, it's an implementation
restriction of the horrid kluge that is CREATE/DROP INDEX CONCURRENTLY.
Well, what is the logic that pg_dump dumps it then, even in
non-binary-upgrade mode?
Actually, I was thinking about proposing exactly that. Ideally the
system should totally ignore an invalid index (we just fixed some bugs
in that line already). So it would be perfectly consistent for pg_dump
to ignore it too, with or without --binary-upgrade.
One possible spanner in the works for pg_upgrade is that this would mean
there can be relation files in the database directories that it should
ignore (not transfer over). Dunno if that takes any logic changes.
regards, tom lane
On Thu, Dec 6, 2012 at 09:10:21PM -0500, Tom Lane wrote:
Bruce Momjian <bruce@momjian.us> writes:
On Thu, Dec 6, 2012 at 07:53:57PM -0500, Tom Lane wrote:
Because CREATE INDEX CONCURRENTLY can't drop the index if it's already
failed. It's not because we want to do that, it's an implementation
restriction of the horrid kluge that is CREATE/DROP INDEX CONCURRENTLY.
Well, what is the logic that pg_dump dumps it then, even in
non-binary-upgrade mode?
Actually, I was thinking about proposing exactly that. Ideally the
system should totally ignore an invalid index (we just fixed some bugs
in that line already). So it would be perfectly consistent for pg_dump
to ignore it too, with or without --binary-upgrade.
One possible spanner in the works for pg_upgrade is that this would mean
there can be relation files in the database directories that it should
ignore (not transfer over). Dunno if that takes any logic changes.
As soon as pg_dump stopped dumping the CREATE INDEX, pg_upgrade would
stop creating it in the new cluster, and not transfer the index
files.
--
Bruce Momjian
On 12/06/2012 09:23 PM, Bruce Momjian wrote:
On Thu, Dec 6, 2012 at 09:10:21PM -0500, Tom Lane wrote:
Bruce Momjian <bruce@momjian.us> writes:
On Thu, Dec 6, 2012 at 07:53:57PM -0500, Tom Lane wrote:
Because CREATE INDEX CONCURRENTLY can't drop the index if it's already
failed. It's not because we want to do that, it's an implementation
restriction of the horrid kluge that is CREATE/DROP INDEX CONCURRENTLY.
Well, what is the logic that pg_dump dumps it then, even in
non-binary-upgrade mode?
Actually, I was thinking about proposing exactly that. Ideally the
system should totally ignore an invalid index (we just fixed some bugs
in that line already). So it would be perfectly consistent for pg_dump
to ignore it too, with or without --binary-upgrade.
One possible spanner in the works for pg_upgrade is that this would mean
there can be relation files in the database directories that it should
ignore (not transfer over). Dunno if that takes any logic changes.
As soon as pg_dump stopped dumping the CREATE INDEX, pg_upgrade would
stop creating it in the new cluster, and not transfer the index
files.
So we'll lose the index definition and leave some files behind? This
sounds a bit messy to say the least.
Making the user fix it seems much more sensible to me. Otherwise I
suspect we'll find users who get strangely surprised when they can no
longer find any trace of an expected index in their upgraded database.
cheers
andrew
* Andrew Dunstan (andrew@dunslane.net) wrote:
So we'll lose the index definition and leave some files behind? This
sounds a bit messy to say the least.
Agreed.
Making the user fix it seems much more sensible to me. Otherwise I
suspect we'll find users who get strangely surprised when they can
no longer find any trace of an expected index in their upgraded
database.
Or preserve it as-is. I don't really like the 'make them fix it'
option, as a user could run into that in the middle of a planned upgrade
that had been tested and never had that come up.
Thanks,
Stephen
Stephen Frost <sfrost@snowman.net> writes:
* Andrew Dunstan (andrew@dunslane.net) wrote:
Making the user fix it seems much more sensible to me. Otherwise I
suspect we'll find users who get strangely surprised when they can
no longer find any trace of an expected index in their upgraded
database.
Or preserve it as-is.
To do that, we would have to add an option to CREATE INDEX to create it
in an invalid state. Which is stupid...
regards, tom lane
On Thu, Dec 6, 2012 at 09:41:00PM -0500, Andrew Dunstan wrote:
On 12/06/2012 09:23 PM, Bruce Momjian wrote:
On Thu, Dec 6, 2012 at 09:10:21PM -0500, Tom Lane wrote:
Bruce Momjian <bruce@momjian.us> writes:
On Thu, Dec 6, 2012 at 07:53:57PM -0500, Tom Lane wrote:
Because CREATE INDEX CONCURRENTLY can't drop the index if it's already
failed. It's not because we want to do that, it's an implementation
restriction of the horrid kluge that is CREATE/DROP INDEX CONCURRENTLY.
Well, what is the logic that pg_dump dumps it then, even in
non-binary-upgrade mode?
Actually, I was thinking about proposing exactly that. Ideally the
system should totally ignore an invalid index (we just fixed some bugs
in that line already). So it would be perfectly consistent for pg_dump
to ignore it too, with or without --binary-upgrade.
One possible spanner in the works for pg_upgrade is that this would mean
there can be relation files in the database directories that it should
ignore (not transfer over). Dunno if that takes any logic changes.
As soon as pg_dump stopped dumping the CREATE INDEX, pg_upgrade would
stop creating it in the new cluster, and not transfer the index
files.
So we'll lose the index definition and leave some files behind? This
sounds a bit messy to say the least.
You will lose the index definition, but the new cluster will not have
the invalid index files from the old cluster.
Making the user fix it seems much more sensible to me. Otherwise I
suspect we'll find users who get strangely surprised when they can
no longer find any trace of an expected index in their upgraded
database.
Well, the indexes weren't being used.
--
Bruce Momjian
On Thu, Dec 6, 2012 at 09:45:11PM -0500, Stephen Frost wrote:
* Andrew Dunstan (andrew@dunslane.net) wrote:
So we'll lose the index definition and leave some files behind? This
sounds a bit messy to say the least.
Agreed.
Making the user fix it seems much more sensible to me. Otherwise I
suspect we'll find users who get strangely surprised when they can
no longer find any trace of an expected index in their upgraded
database.
Or preserve it as-is. I don't really like the 'make them fix it'
option, as a user could run into that in the middle of a planned upgrade
that had been tested and never had that come up.
They would get the warning during pg_upgrade --check, of course.
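For context, the compatibility check Bruce refers to is run against the old cluster before any migration begins. A hypothetical invocation (the binary and data directory paths below are placeholders, not from the thread):

```shell
# Run only the compatibility checks; no data is migrated.
pg_upgrade --check \
  -b /usr/lib/postgresql/9.1/bin -B /usr/lib/postgresql/9.2/bin \
  -d /var/lib/postgresql/9.1/main -D /var/lib/postgresql/9.2/main
```

Stephen's point below is that --check only reflects the cluster's state at the moment it is run.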
--
Bruce Momjian
On Thu, Dec 6, 2012 at 10:06:13PM -0500, Tom Lane wrote:
Stephen Frost <sfrost@snowman.net> writes:
* Andrew Dunstan (andrew@dunslane.net) wrote:
Making the user fix it seems much more sensible to me. Otherwise I
suspect we'll find users who get strangely surprised when they can
no longer find any trace of an expected index in their upgraded
database.
Or preserve it as-is.
To do that, we would have to add an option to CREATE INDEX to create it
in an invalid state. Which is stupid...
I think we would have pg_dump --binary-upgrade issue an UPDATE to
the system catalogs to mark it as invalid.
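A minimal sketch of the kind of statement such a --binary-upgrade dump might emit after recreating the index; the index name `public.idx` is hypothetical, and this is one possible form of the proposal, not an implemented feature:

```sql
-- Hypothetical statement emitted by pg_dump --binary-upgrade after
-- CREATE INDEX, restoring the index's invalid state in the new cluster:
UPDATE pg_catalog.pg_index
SET indisvalid = false
WHERE indexrelid = 'public.idx'::pg_catalog.regclass;
```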
--
Bruce Momjian
* Bruce Momjian (bruce@momjian.us) wrote:
On Thu, Dec 6, 2012 at 09:45:11PM -0500, Stephen Frost wrote:
Or preserve it as-is. I don't really like the 'make them fix it'
option, as a user could run into that in the middle of a planned upgrade
that had been tested and never had that come up.
They would get the warning during pg_upgrade --check, of course.
Sure, if they happened to have a concurrent index creation going when
they ran the check... But what if they didn't and it only happened to
happen during the actual pg_upgrade? I'm still not thrilled with this
idea of making the user have to abort in the middle to address something
that, really, isn't a big deal to just preserve and deal with later...
Thanks,
Stephen
* Tom Lane (tgl@sss.pgh.pa.us) wrote:
Stephen Frost <sfrost@snowman.net> writes:
Or preserve it as-is.
To do that, we would have to add an option to CREATE INDEX to create it
in an invalid state. Which is stupid...
Only in a binary-upgrade mode.
Thanks,
Stephen
* Bruce Momjian (bruce@momjian.us) wrote:
I think we would have pg_dump --binary-upgrade issue an UPDATE to
the system catalogs to mark it as invalid.
That'd work for me too- I'm not particular on if it's done as a direct
catalog update or some undocumented feature of CREATE INDEX.
Thanks,
Stephen