Error with index on unlogged table
Hi,
I was attempting to set up a data set to test pg_rewind, when I encountered
an error. I created a primary and standby, then:
# create table test (id serial primary key, thing text);
CREATE TABLE
# create unlogged table utest (id serial primary key, thing text);
CREATE TABLE
# insert into test (thing) values ('jifj');
INSERT 0 1
# insert into utest (thing) values ('jifj');
INSERT 0 1
$ pg_ctl -D standby1 promote
$ psql -p 5531 postgres # standby
# insert into test (thing) values ('moomoo');
INSERT 0 1
# insert into utest (thing) values ('moomoo');
ERROR: index "utest_pkey" contains unexpected zero page at block 0
HINT: Please REINDEX it.
This is built on commit e5f455f59fed0632371cddacddd79895b148dc07.
--
Thom
On Tue, Mar 24, 2015 at 5:53 PM, Thom Brown <thom@linux.com> wrote:
I was attempting to set up a data set to test pg_rewind, when I encountered
an error. I created a primary and standby, then:[...]
# insert into utest (thing) values ('moomoo');
ERROR: index "utest_pkey" contains unexpected zero page at block 0
HINT: Please REINDEX it.This is built on commit e5f455f59fed0632371cddacddd79895b148dc07.
Unlogged tables are not in WAL, and cannot be accessed while in
recovery, so having an empty index relation is expected on a promoted
standby IMO. Now perhaps we could have a more friendly error message
in _bt_checkpage(), _hash_checkpage() and gistcheckpage() with an
additional HINT to mention unlogged tables, but I am not sure that
this is much worth it. Mentioning this behavior in the docs would be
good instead.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On March 24, 2015 12:35:28 PM GMT+01:00, Michael Paquier <michael.paquier@gmail.com> wrote:
On Tue, Mar 24, 2015 at 5:53 PM, Thom Brown <thom@linux.com> wrote:
I was attempting to set up a data set to test pg_rewind, when I
encountered
an error. I created a primary and standby, then:
[...]
# insert into utest (thing) values ('moomoo');
ERROR: index "utest_pkey" contains unexpected zero page at block 0
HINT: Please REINDEX it.This is built on commit e5f455f59fed0632371cddacddd79895b148dc07.
Unlogged tables are not in WAL, and cannot be accessed while in
recovery, so having an empty index relation is expected on a promoted
standby IMO. Now perhaps we could have a more friendly error message
in _bt_checkpage(), _hash_checkpage() and gistcheckpage() with an
additional HINT to mention unlogged tables, but I am not sure that
this is much worth it. Mentioning this behavior in the docs would be
good instead.
I think Thom's point is that he promoted the node...
Thom, are you sure this want transient?
Andres
---
Please excuse brevity and formatting - I am writing this on my mobile phone.
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 24 March 2015 at 11:37, Andres Freund <andres@anarazel.de> wrote:
On March 24, 2015 12:35:28 PM GMT+01:00, Michael Paquier <
michael.paquier@gmail.com> wrote:On Tue, Mar 24, 2015 at 5:53 PM, Thom Brown <thom@linux.com> wrote:
I was attempting to set up a data set to test pg_rewind, when I
encountered
an error. I created a primary and standby, then:
[...]
# insert into utest (thing) values ('moomoo');
ERROR: index "utest_pkey" contains unexpected zero page at block 0
HINT: Please REINDEX it.This is built on commit e5f455f59fed0632371cddacddd79895b148dc07.
Unlogged tables are not in WAL, and cannot be accessed while in
recovery, so having an empty index relation is expected on a promoted
standby IMO. Now perhaps we could have a more friendly error message
in _bt_checkpage(), _hash_checkpage() and gistcheckpage() with an
additional HINT to mention unlogged tables, but I am not sure that
this is much worth it. Mentioning this behavior in the docs would be
good instead.I think Thom's point is that he promoted the node...
Thom, are you sure this want transient?
The index is unlogged until reindexing...
# select oid, relname, relpersistence from pg_class where relname in
('test','test_pkey','utest','utest_pkey');
oid | relname | relpersistence
-------+------------+----------------
16387 | test | p
16394 | test_pkey | p
16398 | utest | u
16405 | utest_pkey | u
(4 rows)
# reindex index utest_pkey;
REINDEX
# select oid, relname, relpersistence from pg_class where relname in
('test','test_pkey','utest','utest_pkey');
oid | relname | relpersistence
-------+------------+----------------
16387 | test | p
16394 | test_pkey | p
16398 | utest | u
16405 | utest_pkey | p
(4 rows)
Which is think also raises the question, why are unlogged indexes made
persistent by a reindex?
--
Thom
On 24 March 2015 at 11:46, Thom Brown <thom@linux.com> wrote:
On 24 March 2015 at 11:37, Andres Freund <andres@anarazel.de> wrote:
On March 24, 2015 12:35:28 PM GMT+01:00, Michael Paquier <
michael.paquier@gmail.com> wrote:On Tue, Mar 24, 2015 at 5:53 PM, Thom Brown <thom@linux.com> wrote:
I was attempting to set up a data set to test pg_rewind, when I
encountered
an error. I created a primary and standby, then:
[...]
# insert into utest (thing) values ('moomoo');
ERROR: index "utest_pkey" contains unexpected zero page at block 0
HINT: Please REINDEX it.This is built on commit e5f455f59fed0632371cddacddd79895b148dc07.
Unlogged tables are not in WAL, and cannot be accessed while in
recovery, so having an empty index relation is expected on a promoted
standby IMO. Now perhaps we could have a more friendly error message
in _bt_checkpage(), _hash_checkpage() and gistcheckpage() with an
additional HINT to mention unlogged tables, but I am not sure that
this is much worth it. Mentioning this behavior in the docs would be
good instead.I think Thom's point is that he promoted the node...
Thom, are you sure this want transient?
The index is unlogged until reindexing...
# select oid, relname, relpersistence from pg_class where relname in
('test','test_pkey','utest','utest_pkey');
oid | relname | relpersistence
-------+------------+----------------
16387 | test | p
16394 | test_pkey | p
16398 | utest | u
16405 | utest_pkey | u
(4 rows)# reindex index utest_pkey;
REINDEX# select oid, relname, relpersistence from pg_class where relname in
('test','test_pkey','utest','utest_pkey');
oid | relname | relpersistence
-------+------------+----------------
16387 | test | p
16394 | test_pkey | p
16398 | utest | u
16405 | utest_pkey | p
(4 rows)Which is think also raises the question, why are unlogged indexes made
persistent by a reindex?
I should also mention that it becomes unlogged again when running VACUUM
FULL or CLUSTER on the table.
--
Thom
On Tue, Mar 24, 2015 at 8:37 PM, Andres Freund <andres@anarazel.de> wrote:
On March 24, 2015 12:35:28 PM GMT+01:00, Michael Paquier wrote:
I think Thom's point is that he promoted the node...Thom, are you sure this want transient?
Well, I got his point :)
I was just thinking that this error message is legit,
ResetUnloggedRelationsInDbspaceDir reinitializing unlogged relation
and its indexes at the end of recovery.
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Tue, Mar 24, 2015 at 8:46 PM, Thom Brown wrote:
The index is unlogged until reindexing...
[...]
Which is think also raises the question, why are unlogged indexes made
persistent by a reindex?
That's a bug of HEAD, ~9.4 keeping the index as unlogged even after
REINDEX INDEX. What happens is that ReindexIndex relies on
relpersistence provided by makeRangeVar at parse time, which is just
incorrect as it uses RELPERSISTENCE_PERMANENT all the time. The patch
attached fixes that...
--
Michael
Attachments:
20150325_reindex_index_fix.patchapplication/x-patch; name=20150325_reindex_index_fix.patchDownload+7-1
On Wednesday, March 25, 2015, Michael Paquier <michael.paquier@gmail.com> wrote:
On Tue, Mar 24, 2015 at 8:46 PM, Thom Brown wrote:
The index is unlogged until reindexing...
[...]
Which is think also raises the question, why are unlogged indexes made
persistent by a reindex?That's a bug of HEAD, ~9.4 keeping the index as unlogged even after
REINDEX INDEX. What happens is that ReindexIndex relies on
relpersistence provided by makeRangeVar at parse time, which is just
incorrect as it uses RELPERSISTENCE_PERMANENT all the time. The patch
attached fixes that...
How about VACUUM FULL and CLUSTER as the problem seems to have been
reported to be there too?
Amit
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 25 March 2015 at 12:22, Amit Langote <amitlangote09@gmail.com> wrote:
On Wednesday, March 25, 2015, Michael Paquier <michael.paquier@gmail.com>
wrote:On Tue, Mar 24, 2015 at 8:46 PM, Thom Brown wrote:
The index is unlogged until reindexing...
[...]
Which is think also raises the question, why are unlogged indexes made
persistent by a reindex?That's a bug of HEAD, ~9.4 keeping the index as unlogged even after
REINDEX INDEX. What happens is that ReindexIndex relies on
relpersistence provided by makeRangeVar at parse time, which is just
incorrect as it uses RELPERSISTENCE_PERMANENT all the time. The patch
attached fixes that...How about VACUUM FULL and CLUSTER as the problem seems to have been
reported to be there too?
No, those are okay. They actually revert the index back to the same
persistence level as the table they're attached to.
--
Thom
On Wednesday, March 25, 2015, Thom Brown <thom@linux.com> wrote:
On 25 March 2015 at 12:22, Amit Langote <amitlangote09@gmail.com
<javascript:_e(%7B%7D,'cvml','amitlangote09@gmail.com');>> wrote:On Wednesday, March 25, 2015, Michael Paquier <michael.paquier@gmail.com
<javascript:_e(%7B%7D,'cvml','michael.paquier@gmail.com');>> wrote:On Tue, Mar 24, 2015 at 8:46 PM, Thom Brown wrote:
The index is unlogged until reindexing...
[...]
Which is think also raises the question, why are unlogged indexes made
persistent by a reindex?That's a bug of HEAD, ~9.4 keeping the index as unlogged even after
REINDEX INDEX. What happens is that ReindexIndex relies on
relpersistence provided by makeRangeVar at parse time, which is just
incorrect as it uses RELPERSISTENCE_PERMANENT all the time. The patch
attached fixes that...How about VACUUM FULL and CLUSTER as the problem seems to have been
reported to be there too?No, those are okay. They actually revert the index back to the same
persistence level as the table they're attached to.
Ah, I misread then; sorry about the noise.
Amit
Hi,
On 2015-03-25 11:38:30 +0900, Michael Paquier wrote:
On Tue, Mar 24, 2015 at 8:46 PM, Thom Brown wrote:
The index is unlogged until reindexing...
[...]
Which is think also raises the question, why are unlogged indexes made
persistent by a reindex?That's a bug of HEAD, ~9.4 keeping the index as unlogged even after
REINDEX INDEX. What happens is that ReindexIndex relies on
relpersistence provided by makeRangeVar at parse time, which is just
incorrect as it uses RELPERSISTENCE_PERMANENT all the time. The patch
attached fixes that...
What the hell? That's somewhat nasty. Nice that it got caught before 9.5
was released.
Did you check whether a similar bug was made in other places of
85b506bb? Could you additionally add a regression test to this end?
Seems like something worth testing.
Greetings,
Andres Freund
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Wed, Mar 25, 2015 at 10:53 AM, Andres Freund <andres@2ndquadrant.com>
wrote:
Hi,
On 2015-03-25 11:38:30 +0900, Michael Paquier wrote:
On Tue, Mar 24, 2015 at 8:46 PM, Thom Brown wrote:
The index is unlogged until reindexing...
[...]
Which is think also raises the question, why are unlogged indexes made
persistent by a reindex?That's a bug of HEAD, ~9.4 keeping the index as unlogged even after
REINDEX INDEX. What happens is that ReindexIndex relies on
relpersistence provided by makeRangeVar at parse time, which is just
incorrect as it uses RELPERSISTENCE_PERMANENT all the time. The patch
attached fixes that...What the hell? That's somewhat nasty. Nice that it got caught before 9.5
was released.
Unfortunately this is very nasty. Sorry!
Did you check whether a similar bug was made in other places of
85b506bb? Could you additionally add a regression test to this end?
Seems like something worth testing.
I'm checking it and adding some regression tests.
Regards,
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
Show quoted text
Timbira: http://www.timbira.com.br
Blog: http://fabriziomello.github.io
Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello
Github: http://github.com/fabriziomello
On Wed, Mar 25, 2015 at 12:46 PM, Fabrízio de Royes Mello <
fabriziomello@gmail.com> wrote:
On Wed, Mar 25, 2015 at 10:53 AM, Andres Freund <andres@2ndquadrant.com>
wrote:
Did you check whether a similar bug was made in other places of
85b506bb? Could you additionally add a regression test to this end?
Seems like something worth testing.I'm checking it and adding some regression tests.
I didn't found any other similar bug introduced by 85b506bb.
Attached the original patch provided by Michael with some regression tests.
Regards,
--
Fabrízio de Royes Mello
Consultoria/Coaching PostgreSQL
Show quoted text
Timbira: http://www.timbira.com.br
Blog: http://fabriziomello.github.io
Linkedin: http://br.linkedin.com/in/fabriziomello
Twitter: http://twitter.com/fabriziomello
Github: http://github.com/fabriziomello
Attachments:
20150325_reindex_index_fix_v2.patchtext/x-diff; charset=US-ASCII; name=20150325_reindex_index_fix_v2.patchDownload+25-1
On Wed, Mar 25, 2015 at 10:53 PM, Andres Freund <andres@2ndquadrant.com> wrote:
Hi,
On 2015-03-25 11:38:30 +0900, Michael Paquier wrote:
On Tue, Mar 24, 2015 at 8:46 PM, Thom Brown wrote:
The index is unlogged until reindexing...
[...]
Which is think also raises the question, why are unlogged indexes made
persistent by a reindex?That's a bug of HEAD, ~9.4 keeping the index as unlogged even after
REINDEX INDEX. What happens is that ReindexIndex relies on
relpersistence provided by makeRangeVar at parse time, which is just
incorrect as it uses RELPERSISTENCE_PERMANENT all the time. The patch
attached fixes that...What the hell? That's somewhat nasty. Nice that it got caught before 9.5
was released.Did you check whether a similar bug was made in other places of
85b506bb?
Yeah I got a look at the other code paths, particularly cluster and
matviews, and the relpersistence used is taken directly from a
Relation.
Could you additionally add a regression test to this end?
Seems like something worth testing.
Definitely. And I guess that Fabrizio already did that...
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On Thu, Mar 26, 2015 at 1:02 AM, Fabrízio de Royes Mello
<fabriziomello@gmail.com> wrote:
On Wed, Mar 25, 2015 at 12:46 PM, Fabrízio de Royes Mello
<fabriziomello@gmail.com> wrote:On Wed, Mar 25, 2015 at 10:53 AM, Andres Freund <andres@2ndquadrant.com>
wrote:Did you check whether a similar bug was made in other places of
85b506bb? Could you additionally add a regression test to this end?
Seems like something worth testing.I'm checking it and adding some regression tests.
I didn't found any other similar bug introduced by 85b506bb.
Attached the original patch provided by Michael with some regression tests.
Thanks for adding a test, this looks fine to me (did some sanity
checks and tutti-quanti for people wondering). On temporary tables
this was failing with an error in md.c...
--
Michael
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 26 March 2015 at 00:55, Michael Paquier <michael.paquier@gmail.com>
wrote:
On Thu, Mar 26, 2015 at 1:02 AM, Fabrízio de Royes Mello
<fabriziomello@gmail.com> wrote:On Wed, Mar 25, 2015 at 12:46 PM, Fabrízio de Royes Mello
<fabriziomello@gmail.com> wrote:On Wed, Mar 25, 2015 at 10:53 AM, Andres Freund <andres@2ndquadrant.com
wrote:
Did you check whether a similar bug was made in other places of
85b506bb? Could you additionally add a regression test to this end?
Seems like something worth testing.I'm checking it and adding some regression tests.
I didn't found any other similar bug introduced by 85b506bb.
Attached the original patch provided by Michael with some regression
tests.
Thanks for adding a test, this looks fine to me (did some sanity
checks and tutti-quanti for people wondering). On temporary tables
this was failing with an error in md.c...
Thanks to both of you for fixing this.
I still, however, have a problem with the separate and original issue of:
# insert into utest (thing) values ('moomoo');
ERROR: index "utest_pkey" contains unexpected zero page at block 0
HINT: Please REINDEX it.
I don't see why the user should need to go re-indexing all unlogged tables
each time a standby is promoted. The index should just be empty and ready
to use.
--
Thom
On 2015-03-26 13:55:22 +0000, Thom Brown wrote:
I still, however, have a problem with the separate and original issue of:
# insert into utest (thing) values ('moomoo');
ERROR: index "utest_pkey" contains unexpected zero page at block 0
HINT: Please REINDEX it.I don't see why the user should need to go re-indexing all unlogged tables
each time a standby is promoted. The index should just be empty and ready
to use.
There's definitely something rather broken here. Investigating.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
On 2015-03-26 15:13:41 +0100, Andres Freund wrote:
On 2015-03-26 13:55:22 +0000, Thom Brown wrote:
I still, however, have a problem with the separate and original issue of:
# insert into utest (thing) values ('moomoo');
ERROR: index "utest_pkey" contains unexpected zero page at block 0
HINT: Please REINDEX it.I don't see why the user should need to go re-indexing all unlogged tables
each time a standby is promoted. The index should just be empty and ready
to use.There's definitely something rather broken here. Investigating.
As far as I can see this has been broken at least since the introduction
of fast promotion. WAL replay will update the init fork in shared
memory, but it'll not be guaranteed to be flushed to disk when the reset
happens. d3586fc8a et al. then also made it possible to hit the issue
without fast promotion.
To hit the issue there may not be a restartpoint (requiring a checkpoint
on the primary) since the creation of the unlogged table.
I think the problem here is that the *primary* makes no such
assumptions. Init forks are logged via stuff like
smgrwrite(index->rd_smgr, INIT_FORKNUM, BTREE_METAPAGE,
(char *) metapage, true);
if (XLogIsNeeded())
log_newpage(&index->rd_smgr->smgr_rnode.node, INIT_FORKNUM,
BTREE_METAPAGE, metapage, false);
/*
* An immediate sync is required even if we xlog'd the page, because the
* write did not go through shared_buffers and therefore a concurrent
* checkpoint may have moved the redo pointer past our xlog record.
*/
smgrimmedsync(index->rd_smgr, INIT_FORKNUM);
i.e. the data is written out directly to disk, circumventing
shared_buffers. It's pretty bad that we don't do the same on the
standby. For master I think we should just add a bit to the XLOG_FPI
record saying the data should be forced out to disk. I'm less sure
what's to be done in the back branches. Flushing every HEAP_NEWPAGE
record isn't really an option.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Hello,
At Thu, 26 Mar 2015 18:50:24 +0100, Andres Freund <andres@2ndquadrant.com> wrote in <20150326175024.GJ451@alap3.anarazel.de>
I think the problem here is that the *primary* makes no such
assumptions. Init forks are logged via stuff like
smgrwrite(index->rd_smgr, INIT_FORKNUM, BTREE_METAPAGE,
..
i.e. the data is written out directly to disk, circumventing
shared_buffers. It's pretty bad that we don't do the same on the
standby. For master I think we should just add a bit to the XLOG_FPI
record saying the data should be forced out to disk. I'm less sure
what's to be done in the back branches. Flushing every HEAP_NEWPAGE
record isn't really an option.
The problem exists only for INIT_FORKNUM. So I suppose it is
enough to check forknum to decide whether to sync immediately.
Specifically for this instance, syncing buffers of INIT_FORKNUM
at the end of XLOG_FPI block in xlog_redo fixed the problem.
The another (ugly!) solution sould be syncing only buffers for
INIT_FORKNUM and is BM_DIRTY in ResetUnlogggedRelations(op =
UNLOGGED_RELATION_INIT). This is catching-all-at-once solution
though it is a kind of reversion of fast promotion. But buffers
to be synced here should be pretty few.
regards,
--
Kyotaro Horiguchi
NTT Open Source Software Center
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
Michael Paquier wrote:
On Thu, Mar 26, 2015 at 1:02 AM, Fabr�zio de Royes Mello
<fabriziomello@gmail.com> wrote:
I didn't found any other similar bug introduced by 85b506bb.
Attached the original patch provided by Michael with some regression tests.
Thanks for adding a test, this looks fine to me (did some sanity
checks and tutti-quanti for people wondering). On temporary tables
this was failing with an error in md.c...
Yeah, I extended the test a bit to use a temp table too, and pushed.
Thanks everybody.
--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers