pg_upgrading to 10.1 corrupts (varchar,varchar) UNIQUE indices

Started by rihadover 8 years ago6 messagesgeneral
Jump to latest
#1rihad
rihad@mail.ru

Hi there, this is a reproducible error. We recently pg_upgraded our
production database to 10.1 from 9.6.6. The upgrade runs fine with the
suggestion to analyze all data.

[rihad@postgres-10-test]$ cat analyze_new_cluster.sh
#!/bin/sh

echo 'This script will generate minimal optimizer statistics rapidly'
echo 'so your system is usable, and then gather statistics twice more'
echo 'with increasing accuracy.  When it is done, your system will'
echo 'have the default level of optimizer statistics.'
echo

echo 'If you have used ALTER TABLE to modify the statistics target for'
echo 'any tables, you might want to remove them and restore them after'
echo 'running this script because they will delay fast statistics
generation.'
echo

echo 'If you would like default statistics as quickly as possible, cancel'
echo 'this script and run:'
echo '    "/10.1/bin/vacuumdb" --all --analyze-only'
echo

"/10.1/bin/vacuumdb" --all --analyze-in-stages
echo

echo 'Done'

which we run after the upgrade. It doesn't matter if we do the analyze
so in a test environment with no activity or run it concurrently with
the already started production queries.

The problem:

Some of our tables have indices of the form:

    "index_translations_on_locale_and_key" UNIQUE, btree (locale, key)

 locale     | character varying(255)      |           | |
 key        | character varying(255)      |           |          |

or

    "index_users_on_email_and_type" UNIQUE, btree (email, type)

 email           | character varying(255)      |           | not null |
''::character varying
 type            | character varying           |           | not null |

(these are different tables)

Trying to find data using the specified indices fails to find matching rows:

foo=# select * from translations where locale='de' and
key='extranet.options.places.age_brackets_hints.a';
 id | locale | key | value | created_at | updated_at | resolved
----+--------+-----+-------+------------+------------+----------
(0 rows)

foo=# explain select * from translations where locale='de' and
key='extranet.options.places.age_brackets_hints.a';
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Index Scan using index_translations_on_locale_and_key on translations 
(cost=0.41..2.63 rows=1 width=234)
   Index Cond: (((locale)::text = 'de'::text) AND ((key)::text =
'extranet.options.places.age_brackets_hints.a'::text))
(2 rows)

reindexing the table fixes the issue:

foo=# reindex index index_translations_on_locale_and_key ;
REINDEX
foo=# select * from translations where locale='de' and
key='extranet.options.places.age_brackets_hints.a';
   id   | locale |                     key |                       
value                        | created_at         |        
updated_at         | resolved
--------+--------+----------------------------------------------+-----------------------------------------------------+----------------------------+----------------------------+----------
 136373 | de     | extranet.options.places.age_brackets_hints.a | Alter
für einen vollen Gast-Tarif, z.B ab 12 Jahre  | 2017-08-22
11:27:27.774259 | 2017-09-02 09:05:45.244927 | f
(1 row)

foo=# explain select * from translations where locale='de' and
key='extranet.options.places.age_brackets_hints.a';
                                                       QUERY PLAN
------------------------------------------------------------------------------------------------------------------------
 Index Scan using index_translations_on_locale_and_key on translations 
(cost=0.41..2.63 rows=1 width=234)
   Index Cond: (((locale)::text = 'de'::text) AND ((key)::text =
'extranet.options.places.age_brackets_hints.a'::text))
(2 rows)

The upgrade guide states that only hash indices should be rebuilt after
the upgrade, not btree ones.

#2Magnus Hagander
magnus@hagander.net
In reply to: rihad (#1)
Re: pg_upgrading to 10.1 corrupts (varchar,varchar) UNIQUE indices

On Fri, Dec 22, 2017 at 12:53 PM, rihad <rihad@mail.ru> wrote:

Hi there, this is a reproducible error. We recently pg_upgraded our
production database to 10.1 from 9.6.6. The upgrade runs fine with the
suggestion to analyze all data.

[rihad@postgres-10-test]$ cat analyze_new_cluster.sh
#!/bin/sh

echo 'This script will generate minimal optimizer statistics rapidly'
echo 'so your system is usable, and then gather statistics twice more'
echo 'with increasing accuracy. When it is done, your system will'
echo 'have the default level of optimizer statistics.'
echo

echo 'If you have used ALTER TABLE to modify the statistics target for'
echo 'any tables, you might want to remove them and restore them after'
echo 'running this script because they will delay fast statistics
generation.'
echo

echo 'If you would like default statistics as quickly as possible, cancel'
echo 'this script and run:'
echo ' "/10.1/bin/vacuumdb" --all --analyze-only'
echo

"/10.1/bin/vacuumdb" --all --analyze-in-stages
echo

echo 'Done'

which we run after the upgrade. It doesn't matter if we do the analyze so
in a test environment with no activity or run it concurrently with the
already started production queries.

The problem:

Some of our tables have indices of the form:

"index_translations_on_locale_and_key" UNIQUE, btree (locale, key)

locale | character varying(255) | | |
key | character varying(255) | | |

or

"index_users_on_email_and_type" UNIQUE, btree (email, type)

email | character varying(255) | | not null |
''::character varying
type | character varying | | not null |

(these are different tables)

Trying to find data using the specified indices fails to find matching
rows:

foo=# select * from translations where locale='de' and
key='extranet.options.places.age_brackets_hints.a';
id | locale | key | value | created_at | updated_at | resolved
----+--------+-----+-------+------------+------------+----------
(0 rows)

foo=# explain select * from translations where locale='de' and
key='extranet.options.places.age_brackets_hints.a';
QUERY PLAN
------------------------------------------------------------
------------------------------------------------------------
Index Scan using index_translations_on_locale_and_key on translations
(cost=0.41..2.63 rows=1 width=234)
Index Cond: (((locale)::text = 'de'::text) AND ((key)::text =
'extranet.options.places.age_brackets_hints.a'::text))
(2 rows)

reindexing the table fixes the issue:

foo=# reindex index index_translations_on_locale_and_key ;
REINDEX
foo=# select * from translations where locale='de' and
key='extranet.options.places.age_brackets_hints.a';
id | locale | key |
value | created_at |
updated_at | resolved
--------+--------+------------------------------------------
----+-----------------------------------------------------+-
---------------------------+----------------------------+----------
136373 | de | extranet.options.places.age_brackets_hints.a | Alter
für einen vollen Gast-Tarif, z.B ab 12 Jahre | 2017-08-22 11:27:27.774259
| 2017-09-02 09:05:45.244927 | f
(1 row)

foo=# explain select * from translations where locale='de' and
key='extranet.options.places.age_brackets_hints.a';
QUERY PLAN
------------------------------------------------------------
------------------------------------------------------------
Index Scan using index_translations_on_locale_and_key on translations
(cost=0.41..2.63 rows=1 width=234)
Index Cond: (((locale)::text = 'de'::text) AND ((key)::text =
'extranet.options.places.age_brackets_hints.a'::text))
(2 rows)

The upgrade guide states that only hash indices should be rebuilt after
the upgrade, not btree ones.

What platform are you on, how was PostgreSQL installed, and exactly how was
pg_upgrade executed? (or is, since it's reproducible)

--
Magnus Hagander
Me: https://www.hagander.net/ <http://www.hagander.net/&gt;
Work: https://www.redpill-linpro.com/ <http://www.redpill-linpro.com/&gt;

#3Laurenz Albe
laurenz.albe@cybertec.at
In reply to: rihad (#1)
Re: pg_upgrading to 10.1 corrupts (varchar,varchar) UNIQUE indices

rihad wrote:

Hi there, this is a reproducible error. We recently pg_upgraded our
production database to 10.1 from 9.6.6. The upgrade runs fine with the
suggestion to analyze all data.

[...]

Trying to find data using the specified indices fails to find matching rows:

[...]

reindexing the table fixes the issue:

Did you also upgrade the operating system or at least libc?

Then the problem could come from a modified collation.
Compare /messages/by-id/CB4D1C6BAA80CF146CB0D4F2@eje.credativ.lan

Yours,
Laurenz Albe

#4rihad
rihad@mail.ru
In reply to: Magnus Hagander (#2)
Re: pg_upgrading to 10.1 corrupts (varchar,varchar) UNIQUE indices

On 12/22/2017 05:09 PM, Magnus Hagander wrote:

What platform are you on

FreeBSD 10.3-RELEASE-p26 amd64

, how was PostgreSQL installed

pkg install.
Additionally, 10.1 needed for pg_upgrade to run was installed using
portmaster -m PREFIX=/10.1, because FreeBSD doesn't allow more than one
major version to be installed simultaneously.
After the upgrade took place, 9.6.6 was pkg deleted and 10.1 was pkg
installed.

and exactly how was pg_upgrade executed? (or is, since it's reproducible)

sudo -u postgres /10.1/bin/pg_upgrade -j2 --link -r -b /usr/local/bin -B
/10.1/bin/ -d /var/db/postgres/data96/ -D /var/db/postgres/data10/

#5rihad
rihad@mail.ru
In reply to: Laurenz Albe (#3)
Re: pg_upgrading to 10.1 corrupts (varchar,varchar) UNIQUE indices

On 12/22/2017 05:09 PM, Laurenz Albe wrote:

rihad wrote:

Hi there, this is a reproducible error. We recently pg_upgraded our
production database to 10.1 from 9.6.6. The upgrade runs fine with the
suggestion to analyze all data.

[...]

Trying to find data using the specified indices fails to find matching rows:

[...]

reindexing the table fixes the issue:

Did you also upgrade the operating system or at least libc?

It looks like it.

$ ll /lib/libc.so.7
-r--r--r--  1 root  wheel  1647104 Jun 19  2017 /lib/libc.so.7

this never was a major/minor OS upgrade, just whatever FreeBSD patch
releases included.

Then the problem could come from a modified collation.
Compare /messages/by-id/CB4D1C6BAA80CF146CB0D4F2@eje.credativ.lan

de (German) was just an example, the inability to find matching rows
affected other languages like fr or even en, too.
It does look like a collation issue, though.

#6Laurenz Albe
laurenz.albe@cybertec.at
In reply to: rihad (#5)
Re: pg_upgrading to 10.1 corrupts (varchar,varchar) UNIQUE indices

rihad wrote:

On 12/22/2017 05:09 PM, Laurenz Albe wrote:

rihad wrote:

Hi there, this is a reproducible error. We recently pg_upgraded our
production database to 10.1 from 9.6.6. The upgrade runs fine with the
suggestion to analyze all data.

[...]

Trying to find data using the specified indices fails to find matching rows:

[...]

reindexing the table fixes the issue:

Did you also upgrade the operating system or at least libc?

It looks like it.

$ ll /lib/libc.so.7
-r--r--r-- 1 root wheel 1647104 Jun 19 2017 /lib/libc.so.7

this never was a major/minor OS upgrade, just whatever FreeBSD patch
releases included.

Then the problem could come from a modified collation.
Compare /messages/by-id/CB4D1C6BAA80CF146CB0D4F2@eje.credativ.lan

de (German) was just an example, the inability to find matching rows
affected other languages like fr or even en, too.
It does look like a collation issue, though.

That would be the best thing, because it would mean that there is no
PostgreSQL bug or hardware problem, and REINDEX will fix the problem.

Yours,
Laurenz Albe