pg_upgrade and materialized views

Started by Claudio Freireabout 8 years ago31 messagesbugs
Jump to latest
#1Claudio Freire
klaussfreire@gmail.com

I'm not 100% sure this is a pg_upgrade bug or a pg_dump
--binary-upgrade one, or some other thing, but at this point I'm
fairly certain there's something wrong in one of them.

I just tried to pg_upgrade a database from 9.5 to 10.2. I took a
snapshot off a replica, promoted it, and then did the pg_upgrade there
(to avoid breaking our production server).

It all went very well, except that a database-wide vacuum is
complaining about materialized views, not all of them, specifically
the ones in which we regularly use "REFRESH MATERIALIZED VIEW
CONCURRENTLY" on.

In our production master, those views contain rather old relfrozenxid:

mat=# select relname, relfrozenxid from pg_class where relname like
'%_mv' or relname = 'user_agents_canonical_user_agent_os';
relname | relfrozenxid
-------------------------------------+--------------
os_ranking_mv | 272288261
site_ranking_mv | 272260588
carrier_ranking_mv | 272273002
brand_ranking_mv | 226575108
device_specs_ranking_mv | 182006046
user_agents_canonical_user_agent_os | 129807014
(6 rows)

Of those, the last 3 get concurrent refreshes, the first 3 don't.

In the upgraded server, vacuum complained with:

INFO: vacuuming "public.user_agents_canonical_user_agent_os"
vacuumdb: vacuuming of database "mat" failed: ERROR: found xmin
244738497 from before relfrozenxid 245830003

Now, 245830003 looks a lot like the current xid during pg_upgrade, so
I believe pg_dump is somehow failing to restore relfrozenxid on those
matviews. In fact, trying pg_dump --binary-upgrade on any matview
shows that it's not setting relfrozenxid, probably because in a normal
dump, matviews are refreshed, but not when --binary-upgrade is used
(since it's usually used with --schema-only as well).

I haven't yet managed to build a minimal case to reproduce this, I'll
post it when I succeed, but I wanted to report the issue now since it
looks like a genuine bug.

#2Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Claudio Freire (#1)
Re: pg_upgrade and materialized views

Claudio Freire wrote:

In the upgraded server, vacuum complained with:

INFO: vacuuming "public.user_agents_canonical_user_agent_os"
vacuumdb: vacuuming of database "mat" failed: ERROR: found xmin
244738497 from before relfrozenxid 245830003

Wow, these checks were just added shortly before the minors released a
couple of weeks ago, and they've already found one genuine bug! Nice.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: Claudio Freire (#1)
Re: pg_upgrade and materialized views

Claudio Freire <klaussfreire@gmail.com> writes:

I'm not 100% sure this is a pg_upgrade bug or a pg_dump
--binary-upgrade one, or some other thing, but at this point I'm
fairly certain there's something wrong in one of them.

The symptoms you're mentioning suggest two bugs: (1) it's not clear
why binary upgrade should treat a matview's relfrozenxid differently
from a regular table's; (2) independently of that, it sounds like REFRESH
MATERIALIZED VIEW CONCURRENTLY is somehow preventing advancement of the
matview's relfrozenxid in the source DB.

I just tried to pg_upgrade a database from 9.5 to 10.2. I took a
snapshot off a replica, promoted it, and then did the pg_upgrade there
(to avoid breaking our production server).

And that brings replication behavior into the mix, too :-(. I'd
suggest seeing if you can duplicate these problems without any
replication involved.

regards, tom lane

#4Andres Freund
andres@anarazel.de
In reply to: Alvaro Herrera (#2)
Re: pg_upgrade and materialized views

On 2018-02-20 18:26:59 -0300, Alvaro Herrera wrote:

Claudio Freire wrote:

In the upgraded server, vacuum complained with:

INFO: vacuuming "public.user_agents_canonical_user_agent_os"
vacuumdb: vacuuming of database "mat" failed: ERROR: found xmin
244738497 from before relfrozenxid 245830003

Wow, these checks were just added shortly before the minors released a
couple of weeks ago, and they've already found one genuine bug! Nice.

My paranoia is vindicated once more!

#5Claudio Freire
klaussfreire@gmail.com
In reply to: Tom Lane (#3)
Re: pg_upgrade and materialized views

On Tue, Feb 20, 2018 at 6:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Claudio Freire <klaussfreire@gmail.com> writes:

I'm not 100% sure this is a pg_upgrade bug or a pg_dump
--binary-upgrade one, or some other thing, but at this point I'm
fairly certain there's something wrong in one of them.

...

(2) independently of that, it sounds like REFRESH
MATERIALIZED VIEW CONCURRENTLY is somehow preventing advancement of the
matview's relfrozenxid in the source DB.

Not necessarily. I have vacuum_table_freeze_max_age set to 350M, so
it's not yet due for freezing.

I just tried to pg_upgrade a database from 9.5 to 10.2. I took a
snapshot off a replica, promoted it, and then did the pg_upgrade there
(to avoid breaking our production server).

And that brings replication behavior into the mix, too :-(. I'd
suggest seeing if you can duplicate these problems without any
replication involved.

Indeed, I'll try.

#6Claudio Freire
klaussfreire@gmail.com
In reply to: Tom Lane (#3)
Re: pg_upgrade and materialized views

On Tue, Feb 20, 2018 at 6:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

(2) independently of that, it sounds like REFRESH
MATERIALIZED VIEW CONCURRENTLY is somehow preventing advancement of the
matview's relfrozenxid in the source DB.

About that, I did check the view's relfrozenxid and its rows' xmin to
see if there was some obvious breakage.

The following query:

select * from device_specs_ranking_mv where age(xmin) > age((select
relfrozenxid from pg_class where relname = 'device_specs_ranking_mv'))
limit 1;

Yields empty results both in the master and the replica, which, unless
I did something wrong in that query, would rule out replication
issues.

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Claudio Freire (#5)
Re: pg_upgrade and materialized views

Claudio Freire <klaussfreire@gmail.com> writes:

On Tue, Feb 20, 2018 at 6:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

(2) independently of that, it sounds like REFRESH
MATERIALIZED VIEW CONCURRENTLY is somehow preventing advancement of the
matview's relfrozenxid in the source DB.

Not necessarily. I have vacuum_table_freeze_max_age set to 350M, so
it's not yet due for freezing.

Perhaps, but it seems pretty suggestive that all of the non-concurrently
refreshed matviews have relfrozenxid significantly newer than all of
the concurrently refreshed ones. Maybe that's just coincidence, or a
predictable outcome of your usage pattern, but I think it needs
explaining.

regards, tom lane

#8Andres Freund
andres@anarazel.de
In reply to: Claudio Freire (#1)
Re: pg_upgrade and materialized views

On 2018-02-20 18:13:26 -0300, Claudio Freire wrote:

I'm not 100% sure this is a pg_upgrade bug or a pg_dump
--binary-upgrade one, or some other thing, but at this point I'm
fairly certain there's something wrong in one of them.

I just tried to pg_upgrade a database from 9.5 to 10.2. I took a
snapshot off a replica, promoted it, and then did the pg_upgrade there
(to avoid breaking our production server).

It all went very well, except that a database-wide vacuum is
complaining about materialized views, not all of them, specifically
the ones in which we regularly use "REFRESH MATERIALIZED VIEW
CONCURRENTLY" on.

In our production master, those views contain rather old relfrozenxid:

mat=# select relname, relfrozenxid from pg_class where relname like
'%_mv' or relname = 'user_agents_canonical_user_agent_os';
relname | relfrozenxid
-------------------------------------+--------------
os_ranking_mv | 272288261
site_ranking_mv | 272260588
carrier_ranking_mv | 272273002
brand_ranking_mv | 226575108
device_specs_ranking_mv | 182006046
user_agents_canonical_user_agent_os | 129807014
(6 rows)

Of those, the last 3 get concurrent refreshes, the first 3 don't.

In the upgraded server, vacuum complained with:

INFO: vacuuming "public.user_agents_canonical_user_agent_os"
vacuumdb: vacuuming of database "mat" failed: ERROR: found xmin
244738497 from before relfrozenxid 245830003

Now, 245830003 looks a lot like the current xid during pg_upgrade, so
I believe pg_dump is somehow failing to restore relfrozenxid on those
matviews. In fact, trying pg_dump --binary-upgrade on any matview
shows that it's not setting relfrozenxid, probably because in a normal
dump, matviews are refreshed, but not when --binary-upgrade is used
(since it's usually used with --schema-only as well).

The important part then happens in pg_dump. Note

/*
* To create binary-compatible heap files, we have to ensure the same
* physical column order, including dropped columns, as in the
* original. Therefore, we create dropped columns above and drop them
* here, also updating their attlen/attalign values so that the
* dropped column can be skipped properly. (We do not bother with
* restoring the original attbyval setting.) Also, inheritance
* relationships are set up by doing ALTER TABLE INHERIT rather than
* using an INHERITS clause --- the latter would possibly mess up the
* column order. That also means we have to take care about setting
* attislocal correctly, plus fix up any inherited CHECK constraints.
* Analogously, we set up typed tables using ALTER TABLE / OF here.
*/
if (dopt->binary_upgrade &&
(tbinfo->relkind == RELKIND_RELATION ||
tbinfo->relkind == RELKIND_FOREIGN_TABLE ||
tbinfo->relkind == RELKIND_PARTITIONED_TABLE))
{
...
appendPQExpBufferStr(q, "\n-- For binary upgrade, set heap's relfrozenxid and relminmxid\n");
appendPQExpBuffer(q, "UPDATE pg_catalog.pg_class\n"
"SET relfrozenxid = '%u', relminmxid = '%u'\n"
"WHERE oid = ",
tbinfo->frozenxid, tbinfo->minmxid);
appendStringLiteralAH(q, fmtId(tbinfo->dobj.name), fout);
appendPQExpBufferStr(q, "::pg_catalog.regclass;\n");

note that the above if clause doesn't include materialized tables. Which
sems to explain this bug? Could you check that just updating the above
if to include matviews fixes the bug for you?

Looking into this I also saw:

/*
* set_frozenxids()
*
* We have frozen all xids, so set datfrozenxid, relfrozenxid, and
* relminmxid to be the old cluster's xid counter, which we just set
* in the new cluster. User-table frozenxid and minmxid values will
* be set by pg_dump --binary-upgrade, but objects not set by the pg_dump
* must have proper frozen counters.
*/
static
void
set_frozenxids(bool minmxid_only)
...
/* set pg_class.relfrozenxid */
PQclear(executeQueryOrDie(conn,
"UPDATE pg_catalog.pg_class "
"SET relfrozenxid = '%u' "
/* only heap, materialized view, and TOAST are vacuumed */
"WHERE relkind IN ("
CppAsString2(RELKIND_RELATION) ", "
CppAsString2(RELKIND_MATVIEW) ", "
CppAsString2(RELKIND_TOASTVALUE) ")",
old_cluster.controldata.chkpnt_nxtxid));

which makes a bit uncomfortable, but I can't quite put my finger on
why.

Greetings,

Andres Freund

#9Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#8)
Re: pg_upgrade and materialized views

Hi,

On 2018-02-20 13:54:20 -0800, Andres Freund wrote:

note that the above if clause doesn't include materialized tables. Which
sems to explain this bug? Could you check that just updating the above
if to include matviews fixes the bug for you?

I suspect that a proper fix would be to move the relevant block one
level up, to have it's own if block. The current location seems to be
wrong, this is unrelated work to the header:
* To create binary-compatible heap files, we have to ensure the same
* physical column order, including dropped columns, as in the
* original. Therefore, we create dropped columns above and drop them
* here, also updating their attlen/attalign values so that the
* dropped column can be skipped properly. (We do not bother with
* restoring the original attbyval setting.) Also, inheritance
* relationships are set up by doing ALTER TABLE INHERIT rather than
* using an INHERITS clause --- the latter would possibly mess up the
* column order. That also means we have to take care about setting
* attislocal correctly, plus fix up any inherited CHECK constraints.
* Analogously, we set up typed tables using ALTER TABLE / OF here.
*/
of that block.

A later if block just concerned to restoring frozenxid for all relevant
types of tables seems more appropriate. I also wonder if we shouldn't
just always restore relfrozenxid / minmxid if set on the source system,
rather than having relkind specific dispatch.

Also, dear god, this code could use some refactoring :(

Greetings,

Andres Freund

#10Claudio Freire
klaussfreire@gmail.com
In reply to: Tom Lane (#7)
Re: pg_upgrade and materialized views

On Tue, Feb 20, 2018 at 6:46 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Claudio Freire <klaussfreire@gmail.com> writes:

On Tue, Feb 20, 2018 at 6:27 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

(2) independently of that, it sounds like REFRESH
MATERIALIZED VIEW CONCURRENTLY is somehow preventing advancement of the
matview's relfrozenxid in the source DB.

Not necessarily. I have vacuum_table_freeze_max_age set to 350M, so
it's not yet due for freezing.

Perhaps, but it seems pretty suggestive that all of the non-concurrently
refreshed matviews have relfrozenxid significantly newer than all of
the concurrently refreshed ones. Maybe that's just coincidence, or a
predictable outcome of your usage pattern, but I think it needs
explaining.

I think it's quite expectable.

The ones that use concurrently are expected to change only slightly
between refreshes. All those matviews get refreshed periodically in
mostly the same schedule, but some are either slow to refresh
concurrently, and thus we prefer a full refresh, or are expected to
change too much for a concurrent refresh to be useful. As such,
concurrently refreshed views are expected to have rows that remain
valid for a long while (old data that has stabilized, the views
themselves represent a few month's worth of data). Fully refreshed
views will always have recent xids because they are recreated often.
I'd say it makes sense.

On Tue, Feb 20, 2018 at 6:54 PM, Andres Freund <andres@anarazel.de> wrote:

The important part then happens in pg_dump. Note

/*
* To create binary-compatible heap files, we have to ensure the same
* physical column order, including dropped columns, as in the
* original. Therefore, we create dropped columns above and drop them
* here, also updating their attlen/attalign values so that the
* dropped column can be skipped properly. (We do not bother with
* restoring the original attbyval setting.) Also, inheritance
* relationships are set up by doing ALTER TABLE INHERIT rather than
* using an INHERITS clause --- the latter would possibly mess up the
* column order. That also means we have to take care about setting
* attislocal correctly, plus fix up any inherited CHECK constraints.
* Analogously, we set up typed tables using ALTER TABLE / OF here.
*/
if (dopt->binary_upgrade &&
(tbinfo->relkind == RELKIND_RELATION ||
tbinfo->relkind == RELKIND_FOREIGN_TABLE ||
tbinfo->relkind == RELKIND_PARTITIONED_TABLE))
{
...
appendPQExpBufferStr(q, "\n-- For binary upgrade, set heap's relfrozenxid and relminmxid\n");
appendPQExpBuffer(q, "UPDATE pg_catalog.pg_class\n"
"SET relfrozenxid = '%u', relminmxid = '%u'\n"
"WHERE oid = ",
tbinfo->frozenxid, tbinfo->minmxid);
appendStringLiteralAH(q, fmtId(tbinfo->dobj.name), fout);
appendPQExpBufferStr(q, "::pg_catalog.regclass;\n");

note that the above if clause doesn't include materialized tables. Which
sems to explain this bug? Could you check that just updating the above
if to include matviews fixes the bug for you?

The pg_dump --binary-only test does produce the necessary SQL to set
relfrozenxid after that change, so it looks like it would fix it.

To be able to fully confirm it, though, I'll have to build a mimal
case to reproduce the issue, because the snapshot I used it not usable
again (can't re-upgrade), and launching another snapshot takes a lot
of time. Basically, I'll get back to you with a confirmation, but it
does look good.

Looking into this I also saw:

/*
* set_frozenxids()
*
* We have frozen all xids, so set datfrozenxid, relfrozenxid, and
* relminmxid to be the old cluster's xid counter, which we just set
* in the new cluster. User-table frozenxid and minmxid values will
* be set by pg_dump --binary-upgrade, but objects not set by the pg_dump
* must have proper frozen counters.
*/
static
void
set_frozenxids(bool minmxid_only)
...
/* set pg_class.relfrozenxid */
PQclear(executeQueryOrDie(conn,
"UPDATE pg_catalog.pg_class "
"SET relfrozenxid = '%u' "
/* only heap, materialized view, and TOAST are vacuumed */
"WHERE relkind IN ("
CppAsString2(RELKIND_RELATION) ", "
CppAsString2(RELKIND_MATVIEW) ", "
CppAsString2(RELKIND_TOASTVALUE) ")",
old_cluster.controldata.chkpnt_nxtxid));

which makes a bit uncomfortable, but I can't quite put my finger on
why.

I looked into that one, it's not relevant to this case, since it's
working on template1 (check the conn used there).

#11Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#8)
Re: pg_upgrade and materialized views

Andres Freund <andres@anarazel.de> writes:

The important part then happens in pg_dump. Note

if (dopt->binary_upgrade &&
(tbinfo->relkind == RELKIND_RELATION ||
tbinfo->relkind == RELKIND_FOREIGN_TABLE ||
tbinfo->relkind == RELKIND_PARTITIONED_TABLE))

note that the above if clause doesn't include materialized tables. Which
sems to explain this bug? Could you check that just updating the above
if to include matviews fixes the bug for you?

I'm also wondering why it *does* include foreign tables. Surely
relfrozenxid is meaningless for a FT?

Looking into this I also saw:
set_frozenxids(bool minmxid_only)
which makes a bit uncomfortable, but I can't quite put my finger on
why.

The fact that it's inconsistent with the other list is surely a red flag,
eg seems like we should include RELKIND_PARTITIONED_TABLE there too.

regards, tom lane

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Claudio Freire (#10)
Re: pg_upgrade and materialized views

Claudio Freire <klaussfreire@gmail.com> writes:

Looking into this I also saw:
* set_frozenxids()
which makes a bit uncomfortable, but I can't quite put my finger on
why.

I looked into that one, it's not relevant to this case, since it's
working on template1 (check the conn used there).

Hm? Looks to me like it iterates through all the DBs in the cluster.

regards, tom lane

#13Claudio Freire
klaussfreire@gmail.com
In reply to: Tom Lane (#12)
Re: pg_upgrade and materialized views

On Tue, Feb 20, 2018 at 7:08 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Claudio Freire <klaussfreire@gmail.com> writes:

Looking into this I also saw:
* set_frozenxids()
which makes a bit uncomfortable, but I can't quite put my finger on
why.

I looked into that one, it's not relevant to this case, since it's
working on template1 (check the conn used there).

Hm? Looks to me like it iterates through all the DBs in the cluster.

regards, tom lane

You're right. But IIUC it's called before restoring globals, so only
postgres, template0 and template1 exist at that point.

#14Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#11)
Re: pg_upgrade and materialized views

On 2018-02-20 17:05:29 -0500, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

The important part then happens in pg_dump. Note

if (dopt->binary_upgrade &&
(tbinfo->relkind == RELKIND_RELATION ||
tbinfo->relkind == RELKIND_FOREIGN_TABLE ||
tbinfo->relkind == RELKIND_PARTITIONED_TABLE))

note that the above if clause doesn't include materialized tables. Which
sems to explain this bug? Could you check that just updating the above
if to include matviews fixes the bug for you?

I'm also wondering why it *does* include foreign tables. Surely
relfrozenxid is meaningless for a FT?

I think it's because that if block originally was concerned about
reconstructing table order, rather than relfrozenxid management. For
that the conditions make sense - matviews can't have columns
added/dropped but foreign tables can. That's why I'm suggesting to move
the relfrozenxid handling out of there, it seems very likely to cause
similar problems down the road.

Looking into this I also saw:
set_frozenxids(bool minmxid_only)
which makes a bit uncomfortable, but I can't quite put my finger on
why.

The fact that it's inconsistent with the other list is surely a red flag,
eg seems like we should include RELKIND_PARTITIONED_TABLE there too.

If I understand correctly the purpose of set_frozenxid() is to update
the horizons of system tables, *before* restoring the schema dump. So I
think excluding RELKIND_PARTITIONED_TABLE is harmless for now (i.e. we
can just change it in master).

I wonder if there's a scenario in which a schema restore uses enough
xids to get close to anti-wraparound territory?

Greetings,

Andres Freund

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Claudio Freire (#13)
Re: pg_upgrade and materialized views

Claudio Freire <klaussfreire@gmail.com> writes:

On Tue, Feb 20, 2018 at 7:08 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Hm? Looks to me like it iterates through all the DBs in the cluster.

You're right. But IIUC it's called before restoring globals, so only
postgres, template0 and template1 exist at that point.

Hmm ... there's another call later, but that one only happens for
pre-9.3 source DBs, which won't contain matviews nor partitioned tables.
So maybe it's OK but I'm not quite sure, and even if it is OK it's only
accidentally so.

Be nice if the documentation for this code weren't so relentlessly bad ---
the header comment for this function looks like it's telling you something
useful, but it isn't even close to accurate. The comment right above the
call in prepare_new_globals() seems unrelated to anyone's version of
reality either.

regards, tom lane

#16Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#14)
Re: pg_upgrade and materialized views

Andres Freund <andres@anarazel.de> writes:

I wonder if there's a scenario in which a schema restore uses enough
xids to get close to anti-wraparound territory?

Interesting question ... you'd need one heck of a lot of objects in
the cluster, but we've certainly heard of people with lots of objects.

We could stave that problem off by running the restore steps in
--single-transaction mode, if it weren't that pg_restore will reject the
combination of --create and --single-transaction. I wonder if we should
allow that, specifying that it means that the single transaction starts
after we reconnect to the new DB.

regards, tom lane

#17Claudio Freire
klaussfreire@gmail.com
In reply to: Claudio Freire (#10)
Re: pg_upgrade and materialized views

On Tue, Feb 20, 2018 at 7:05 PM, Claudio Freire <klaussfreire@gmail.com> wrote:

On Tue, Feb 20, 2018 at 6:54 PM, Andres Freund <andres@anarazel.de> wrote:

The important part then happens in pg_dump. Note

/*
* To create binary-compatible heap files, we have to ensure the same
* physical column order, including dropped columns, as in the
* original. Therefore, we create dropped columns above and drop them
* here, also updating their attlen/attalign values so that the
* dropped column can be skipped properly. (We do not bother with
* restoring the original attbyval setting.) Also, inheritance
* relationships are set up by doing ALTER TABLE INHERIT rather than
* using an INHERITS clause --- the latter would possibly mess up the
* column order. That also means we have to take care about setting
* attislocal correctly, plus fix up any inherited CHECK constraints.
* Analogously, we set up typed tables using ALTER TABLE / OF here.
*/
if (dopt->binary_upgrade &&
(tbinfo->relkind == RELKIND_RELATION ||
tbinfo->relkind == RELKIND_FOREIGN_TABLE ||
tbinfo->relkind == RELKIND_PARTITIONED_TABLE))
{
...
appendPQExpBufferStr(q, "\n-- For binary upgrade, set heap's relfrozenxid and relminmxid\n");
appendPQExpBuffer(q, "UPDATE pg_catalog.pg_class\n"
"SET relfrozenxid = '%u', relminmxid = '%u'\n"
"WHERE oid = ",
tbinfo->frozenxid, tbinfo->minmxid);
appendStringLiteralAH(q, fmtId(tbinfo->dobj.name), fout);
appendPQExpBufferStr(q, "::pg_catalog.regclass;\n");

note that the above if clause doesn't include materialized tables. Which
sems to explain this bug? Could you check that just updating the above
if to include matviews fixes the bug for you?

The pg_dump --binary-only test does produce the necessary SQL to set
relfrozenxid after that change, so it looks like it would fix it.

To be able to fully confirm it, though, I'll have to build a mimal
case to reproduce the issue, because the snapshot I used it not usable
again (can't re-upgrade), and launching another snapshot takes a lot
of time. Basically, I'll get back to you with a confirmation, but it
does look good.

Well, the attached script reproduces the issue.

Make a scratch dir somewhere, cd into it, and run it. It will download
9.6.7 and 10.2, build them, create a test db in 9.6.7, modify it a bit
here and there to get to the desired state, pg_upgrade it to 10.2 and
vacuum.

Might even be a good idea to create a regression test with that, but
AFAIK there's nothing remotely like that in the current tests.

The patch you propose makes the test succeed. It may be true that
there might be other similar issues lurking in that code, but it looks
like it fixes this particular issue.

Attachments:

pg_bug_reproduce.shapplication/x-sh; name=pg_bug_reproduce.shDownload
#18Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#16)
Re: pg_upgrade and materialized views

On 2018-02-20 17:29:01 -0500, Tom Lane wrote:

Andres Freund <andres@anarazel.de> writes:

I wonder if there's a scenario in which a schema restore uses enough
xids to get close to anti-wraparound territory?

Interesting question ... you'd need one heck of a lot of objects in
the cluster, but we've certainly heard of people with lots of objects.

We could stave that problem off by running the restore steps in
--single-transaction mode, if it weren't that pg_restore will reject the
combination of --create and --single-transaction. I wonder if we should
allow that, specifying that it means that the single transaction starts
after we reconnect to the new DB.

One problem is that we need to restore some types of objects that can be
quite voluminous (e.g. users) during the restore of global objects. But
we IIRC can't make all of that use a single transaction as some commands
like CREATE DATABASE / TABLESPACE etc don't like that.

I suspect one of the more realistic ways to use up huge amounts of xids
would be to have lots of large objects with individual permissions?

A quick and dirt option would be to add a few hand-scheduled transaction
commands at the right places in a pg_upgrade mode :/

Greetings,

Andres Freund

#19Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#15)
Re: pg_upgrade and materialized views

On 2018-02-20 17:24:03 -0500, Tom Lane wrote:

Claudio Freire <klaussfreire@gmail.com> writes:

On Tue, Feb 20, 2018 at 7:08 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Hm? Looks to me like it iterates through all the DBs in the cluster.

You're right. But IIUC it's called before restoring globals, so only
postgres, template0 and template1 exist at that point.

Hmm ... there's another call later, but that one only happens for
pre-9.3 source DBs, which won't contain matviews nor partitioned tables.
So maybe it's OK but I'm not quite sure, and even if it is OK it's only
accidentally so.

Be nice if the documentation for this code weren't so relentlessly bad ---
the header comment for this function looks like it's telling you something
useful, but it isn't even close to accurate. The comment right above the
call in prepare_new_globals() seems unrelated to anyone's version of
reality either.

Yea, this code, not just the comments, is bad. Combining the codepaths
for setting minmxid on all objects when upgrading from an old server
with the task of setting an appropriate horizon for objects created by
initdb, but nothing else, without so much as a comment about it?

Greetings,

Andres Freund

#20Claudio Freire
klaussfreire@gmail.com
In reply to: Claudio Freire (#17)
Re: pg_upgrade and materialized views

On Tue, Feb 20, 2018 at 7:56 PM, Claudio Freire <klaussfreire@gmail.com> wrote:

On Tue, Feb 20, 2018 at 7:05 PM, Claudio Freire <klaussfreire@gmail.com> wrote:

On Tue, Feb 20, 2018 at 6:54 PM, Andres Freund <andres@anarazel.de> wrote:

The important part then happens in pg_dump. Note

/*
* To create binary-compatible heap files, we have to ensure the same
* physical column order, including dropped columns, as in the
* original. Therefore, we create dropped columns above and drop them
* here, also updating their attlen/attalign values so that the
* dropped column can be skipped properly. (We do not bother with
* restoring the original attbyval setting.) Also, inheritance
* relationships are set up by doing ALTER TABLE INHERIT rather than
* using an INHERITS clause --- the latter would possibly mess up the
* column order. That also means we have to take care about setting
* attislocal correctly, plus fix up any inherited CHECK constraints.
* Analogously, we set up typed tables using ALTER TABLE / OF here.
*/
if (dopt->binary_upgrade &&
(tbinfo->relkind == RELKIND_RELATION ||
tbinfo->relkind == RELKIND_FOREIGN_TABLE ||
tbinfo->relkind == RELKIND_PARTITIONED_TABLE))
{
...
appendPQExpBufferStr(q, "\n-- For binary upgrade, set heap's relfrozenxid and relminmxid\n");
appendPQExpBuffer(q, "UPDATE pg_catalog.pg_class\n"
"SET relfrozenxid = '%u', relminmxid = '%u'\n"
"WHERE oid = ",
tbinfo->frozenxid, tbinfo->minmxid);
appendStringLiteralAH(q, fmtId(tbinfo->dobj.name), fout);
appendPQExpBufferStr(q, "::pg_catalog.regclass;\n");

note that the above if clause doesn't include materialized tables. Which
sems to explain this bug? Could you check that just updating the above
if to include matviews fixes the bug for you?

The pg_dump --binary-only test does produce the necessary SQL to set
relfrozenxid after that change, so it looks like it would fix it.

To be able to fully confirm it, though, I'll have to build a mimal
case to reproduce the issue, because the snapshot I used it not usable
again (can't re-upgrade), and launching another snapshot takes a lot
of time. Basically, I'll get back to you with a confirmation, but it
does look good.

Well, the attached script reproduces the issue.

Make a scratch dir somewhere, cd into it, and run it. It will download
9.6.7 and 10.2, build them, create a test db in 9.6.7, modify it a bit
here and there to get to the desired state, pg_upgrade it to 10.2 and
vacuum.

Might even be a good idea to create a regression test with that, but
AFAIK there's nothing remotely like that in the current tests.

The patch you propose makes the test succeed. It may be true that
there might be other similar issues lurking in that code, but it looks
like it fixes this particular issue.

Oops, the previous script has timing issues, the attached one really
works - just added some sleeps ;-)

Attachments:

pg_bug_reproduce.shapplication/x-sh; name=pg_bug_reproduce.shDownload
#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Claudio Freire (#20)
#22Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#21)
#23Claudio Freire
klaussfreire@gmail.com
In reply to: Tom Lane (#21)
#24Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#22)
#25Victor Yegorov
vyegorov@gmail.com
In reply to: Claudio Freire (#17)
#26Tom Lane
tgl@sss.pgh.pa.us
In reply to: Victor Yegorov (#25)
#27Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#26)
#28Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#27)
#29Andres Freund
andres@anarazel.de
In reply to: Tom Lane (#28)
#30Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andres Freund (#29)
#31Claudio Freire
klaussfreire@gmail.com
In reply to: Tom Lane (#28)