WAL consistency check facility

Started by Kuntal Ghoshover 9 years ago124 messageshackers
Jump to latest
#1Kuntal Ghosh
kuntalghosh.2007@gmail.com

Hi,

I've attached a patch to check if the current page is equal with the
FPW after applying WAL on it. This is how the patch works:

1. When a WAL record is inserted, a FPW is done for that operation.
But, a flag is kept to indicate whether that page needs to be
restored.
2. During recovery, when a redo operation is done, we do a comparison
with the FPW contained in the WAL record with the current page in the
buffer. For this purpose, I've used Michael's patch with minor changes
to check whether two pages are actually equal or not.
3. I've also added a guc variable (wal_consistency_mask) to indicate
the operations (HEAP,BTREE,HASH,GIN etc) for which this feature
(always FPW and consistency check) is to be enabled.

How to use the patch:
1. Apply the patch.
2. In postgresql.conf file, set wal_consistency_mask variable
accordingly. For debug messages, set log_min_messages = debug1.

Michael's patch:
/messages/by-id/CAB7nPqR4vxdKijP+Du82vOcOnGMvutq-gfqiU2dsH4bsM77hYg@mail.gmail.com

Reference thread:
/messages/by-id/CAB7nPqR4vxdKijP+Du82vOcOnGMvutq-gfqiU2dsH4bsM77hYg@mail.gmail.com

Please let me know your thoughts on this.
--
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com

Attachments:

walconsistency_v3.patchtext/x-patch; charset=US-ASCII; name=walconsistency_v3.patchDownload+655-26
#2Michael Paquier
michael@paquier.xyz
In reply to: Kuntal Ghosh (#1)
Re: WAL consistency check facility

On Mon, Aug 22, 2016 at 9:44 PM, Kuntal Ghosh
<kuntalghosh.2007@gmail.com> wrote:

Please let me know your thoughts on this.

Since custom AMs have been introduced, I have kept that in a corner of
my mind and thought about it a bit. And while the goal of this patch
is clearly worth it, I don't think that the page masking interface is
clear at all. For example, your patch completely ignores
contrib/bloom, and we surely want to do something about it. The idea
would be to add a page masking routine in IndexAmRoutine and heap to
be able to perform page masking operations directly with that. This
would allow as well one to be able to perform page masking for bloom
or any custom access method, and this will allow this sanity check to
be more generic as well.

Another pin-point is: given a certain page, how do we identify of
which type it is? One possibility would be again to extend the AM
handler with some kind of is_self function with a prototype like that:
bool handler->is_self(Page);
If the page is of the type of the handler, this returns true, and
false otherwise. Still here performance would suck.

At the end, what we want is a clean interface, and more thoughts into it.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#3Robert Haas
robertmhaas@gmail.com
In reply to: Michael Paquier (#2)
Re: WAL consistency check facility

On Mon, Aug 22, 2016 at 9:25 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:

Another pin-point is: given a certain page, how do we identify of
which type it is? One possibility would be again to extend the AM
handler with some kind of is_self function with a prototype like that:
bool handler->is_self(Page);
If the page is of the type of the handler, this returns true, and
false otherwise. Still here performance would suck.

At the end, what we want is a clean interface, and more thoughts into it.

I think that it makes sense to filter based on the resource manager
ID, but I can't see how we could reasonably filter based on the AM
name.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#4Simon Riggs
simon@2ndQuadrant.com
In reply to: Kuntal Ghosh (#1)
Re: WAL consistency check facility

On 22 August 2016 at 13:44, Kuntal Ghosh <kuntalghosh.2007@gmail.com> wrote:

Please let me know your thoughts on this.

Do the regression tests pass with this option enabled?

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#5Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#3)
Re: WAL consistency check facility

On Mon, Aug 22, 2016 at 9:16 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Aug 22, 2016 at 9:25 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:

Another pin-point is: given a certain page, how do we identify of
which type it is? One possibility would be again to extend the AM
handler with some kind of is_self function with a prototype like that:
bool handler->is_self(Page);
If the page is of the type of the handler, this returns true, and
false otherwise. Still here performance would suck.

At the end, what we want is a clean interface, and more thoughts into it.

I think that it makes sense to filter based on the resource manager
ID

+1.

I think the patch currently addresses only a subset of resource
manager id's (mainly Heap and Index resource manager id's). Do we
want to handle the complete resource manager list as defined in
rmgrlist.h?

Another thing that needs some thoughts is the UI of this patch,
currently it is using integer mask which might not be best way, but
again as it is intended to be mainly used for tests, it might be okay.

Do we want to enable some tests in the regression suite by using this option?

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#6Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Simon Riggs (#4)
Re: WAL consistency check facility

Yes, I've verified the outputs and log contents after running gmake
installcheck and gmake installcheck-world. The status of the test was
marked as pass for all the testcases.

On Mon, Aug 22, 2016 at 9:26 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

On 22 August 2016 at 13:44, Kuntal Ghosh <kuntalghosh.2007@gmail.com> wrote:

Please let me know your thoughts on this.

Do the regression tests pass with this option enabled?

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Thanks & Regards,
Kuntal Ghosh

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#7Michael Paquier
michael@paquier.xyz
In reply to: Amit Kapila (#5)
Re: WAL consistency check facility

On Tue, Aug 23, 2016 at 1:32 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Mon, Aug 22, 2016 at 9:16 PM, Robert Haas <robertmhaas@gmail.com> wrote:

On Mon, Aug 22, 2016 at 9:25 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:

Another pin-point is: given a certain page, how do we identify of
which type it is? One possibility would be again to extend the AM
handler with some kind of is_self function with a prototype like that:
bool handler->is_self(Page);
If the page is of the type of the handler, this returns true, and
false otherwise. Still here performance would suck.

At the end, what we want is a clean interface, and more thoughts into it.

I think that it makes sense to filter based on the resource manager
ID

+1.

Yes actually that's better. That's simple enough and removes any need
to looking at pd_special.

I think the patch currently addresses only a subset of resource
manager id's (mainly Heap and Index resource manager id's). Do we
want to handle the complete resource manager list as defined in
rmgrlist.h?

Not all of them generate FPWs. I don't think it matters much.

Another thing that needs some thoughts is the UI of this patch,
currently it is using integer mask which might not be best way, but
again as it is intended to be mainly used for tests, it might be okay.

What we'd want to have is a way to visualize easily differences of
pages. Any other ideas than MASK_MARKER would be welcome of course.

Do we want to enable some tests in the regression suite by using this option?

We could get out a recovery test that sets up a standby/master and
runs the tests of src/test/regress with pg_regress with this parameter
enabled.

+ * bufmask.c
+ *      Routines for buffer masking, used to ensure that buffers used for
+ *      comparison across nodes are in a consistent state.
+ *
+ * Portions Copyright (c) 1996-2014, PostgreSQL Global Development Group
+ * Portions Copyright (c) 1994, Regents of the University of California
Copyright notices need to be updated. (It's already been 2 years!!)

Also, what's the use case of allowing only a certain set of rmgrs to
be checked. Wouldn't a simple on/off switch be simpler? As presented,
wal_consistency_mask is also going to be very quite confusing for
users. You should not need to apply some maths to set up this
parameter, a list of rmgr names may be more adapted if this level of
tuning is needed, still it seems to me that we don't need this much.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#8Amit Kapila
amit.kapila16@gmail.com
In reply to: Michael Paquier (#7)
Re: WAL consistency check facility

On Tue, Aug 23, 2016 at 10:57 AM, Michael Paquier
<michael.paquier@gmail.com> wrote:

Also, what's the use case of allowing only a certain set of rmgrs to
be checked. Wouldn't a simple on/off switch be simpler?

I think there should be a way to test WAL for one particular resource
manager. For example, if someone develops a new index or some other
heap storage, only that particular module can be verified. Generating
WAL for all the resource managers together can also serve the purpose,
but it will be slightly difficult to verify it.

As presented,
wal_consistency_mask is also going to be very quite confusing for
users. You should not need to apply some maths to set up this
parameter, a list of rmgr names may be more adapted if this level of
tuning is needed,

Yeah, that can be better.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#9Simon Riggs
simon@2ndQuadrant.com
In reply to: Simon Riggs (#4)
Re: WAL consistency check facility

On 22 August 2016 at 16:56, Simon Riggs <simon@2ndquadrant.com> wrote:

On 22 August 2016 at 13:44, Kuntal Ghosh <kuntalghosh.2007@gmail.com> wrote:

Please let me know your thoughts on this.

Do the regression tests pass with this option enabled?

Hi,

I'd like to be a reviewer on this. Please can you add this onto the CF
app so we can track the review?

Please supply details of the testing and test coverage.

Thanks

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#10Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Simon Riggs (#9)
Re: WAL consistency check facility

Hi,

I've added the feature in CP app. Following are the testing details:

1. In master, I've enabled following configurations:

* wal_level = replica
* max_wal_senders = 3
* wal_keep_segments = 4000
* hot_standby = on
* wal_consistency_mask = 511 /* Enable consistency check mask bit*/

2. In slave, I've enabled following configurations:

* standby_mode = on
* wal_consistency_mask = 511 /* Enable consistency check mask bit*/

3. Then, I performed gmake installcheck in master. I didn't get any
warning regarding WAL inconsistency in slave.

I've made following changes to the attached patch:

1. For BRIN pages, I've masked the unused space, PD_PAGE_FULL and
PD_HAS_FREE_LINES flags.
2. For Btree pages, I've masked BTP_HALF_DEAD, BTP_SPLIT_END,
BTP_HAS_GARBAGE and BTP_INCOMPLETE_SPLIT flags.
3. For GIN_DELETED page, I've masked the entire page since the page is
always initialized during recovery.
4. For Speculative Heap tuple insert operation, there was
inconsistency in t_ctid value. So, I've modified the t_ctid value (in
backup page) to current block number and offset number. Need
suggestions!!

What needs to be done:
1. Add support for other Resource Managers.
2. Modify masking techniques for existing Resource Managers (if required).
3. Modify the GUC parameter which will accept a list of rmgr names.
4. Modify the technique for identifying rmgr names for which the
feature should be enabled.
5. Generalize the page type identification technique.

On Wed, Aug 24, 2016 at 2:14 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

On 22 August 2016 at 16:56, Simon Riggs <simon@2ndquadrant.com> wrote:

On 22 August 2016 at 13:44, Kuntal Ghosh <kuntalghosh.2007@gmail.com> wrote:

Please let me know your thoughts on this.

Do the regression tests pass with this option enabled?

Hi,

I'd like to be a reviewer on this. Please can you add this onto the CF
app so we can track the review?

Please supply details of the testing and test coverage.

Thanks

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Thanks & Regards,
Kuntal Ghosh

Attachments:

walconsistency_v4.patchtext/x-patch; charset=US-ASCII; name=walconsistency_v4.patchDownload+699-26
#11Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Kuntal Ghosh (#10)
Re: WAL consistency check facility

Kuntal Ghosh wrote:

4. For Speculative Heap tuple insert operation, there was
inconsistency in t_ctid value. So, I've modified the t_ctid value (in
backup page) to current block number and offset number. Need
suggestions!!

In speculative insertions, t_ctid is used to store the speculative
token. I think you should just mask that field out in that case (which
you can recognize because ip_posid is set to magic value 0xfffe).

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#12Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Alvaro Herrera (#11)
Re: WAL consistency check facility

Thanks a lot.

I just want to mention the situation where I was getting the
speculative token related inconsistency.

ItemPointer in backup page from master:
LOG: ItemPointer BlockNumber: 1 OffsetNumber:65534 Speculative: true
CONTEXT: xlog redo at 0/127F4A48 for Heap/INSERT+INIT: off 1

ItemPointer in current page from slave after redo:
LOG: ItemPointer BlockNumber: 0 OffsetNumber:1 Speculative: false
CONTEXT: xlog redo at 0/127F4A48 for Heap/INSERT+INIT: off 1

As the block numbers are different, I was getting the following warning:
WARNING: Inconsistent page (at byte 8166) found for record
0/127F4A48, rel 1663/16384/16946, forknum 0, blkno 0, Backup Page
Header : (pd_lower: 28 pd_upper: 8152 pd_special: 8192) Current Page
Header: (pd_lower: 28 pd_upper: 8152 pd_special: 8192)
CONTEXT: xlog redo at 0/127F4A48 for Heap/INSERT+INIT: off 1

In heap_xlog_insert, t_ctid is always set to blkno and xlrec->offnum.
I think this is why I was getting the above warning.

On Thu, Aug 25, 2016 at 10:33 PM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

Kuntal Ghosh wrote:

4. For Speculative Heap tuple insert operation, there was
inconsistency in t_ctid value. So, I've modified the t_ctid value (in
backup page) to current block number and offset number. Need
suggestions!!

In speculative insertions, t_ctid is used to store the speculative
token. I think you should just mask that field out in that case (which
you can recognize because ip_posid is set to magic value 0xfffe).

--
Álvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Thanks & Regards,
Kuntal Ghosh

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#13Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Kuntal Ghosh (#12)
Re: WAL consistency check facility

Kuntal Ghosh wrote:

Thanks a lot.

I just want to mention the situation where I was getting the
speculative token related inconsistency.

ItemPointer in backup page from master:
LOG: ItemPointer BlockNumber: 1 OffsetNumber:65534 Speculative: true
CONTEXT: xlog redo at 0/127F4A48 for Heap/INSERT+INIT: off 1

ItemPointer in current page from slave after redo:
LOG: ItemPointer BlockNumber: 0 OffsetNumber:1 Speculative: false
CONTEXT: xlog redo at 0/127F4A48 for Heap/INSERT+INIT: off 1

As the block numbers are different, I was getting the following warning:
WARNING: Inconsistent page (at byte 8166) found for record
0/127F4A48, rel 1663/16384/16946, forknum 0, blkno 0, Backup Page
Header : (pd_lower: 28 pd_upper: 8152 pd_special: 8192) Current Page
Header: (pd_lower: 28 pd_upper: 8152 pd_special: 8192)
CONTEXT: xlog redo at 0/127F4A48 for Heap/INSERT+INIT: off 1

In heap_xlog_insert, t_ctid is always set to blkno and xlrec->offnum.
I think this is why I was getting the above warning.

Umm, really? Then perhaps this *is* a bug. Peter?

--
�lvaro Herrera http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#14Simon Riggs
simon@2ndQuadrant.com
In reply to: Kuntal Ghosh (#10)
Re: WAL consistency check facility

Hi Kuntal,

Thanks for the patch.

Current patch has no docs, no tests and no explanatory comments, so
makes review quite hard.

The good news is you might discover a few bugs with it, so its worth
pursuing actively in this CF, though its not near to being
committable.

I think you should add this as part of the default testing for both
check and installcheck. I can't imagine why we'd have it and not use
it during testing.

On 25 August 2016 at 18:41, Kuntal Ghosh <kuntalghosh.2007@gmail.com> wrote:

* wal_consistency_mask = 511 /* Enable consistency check mask bit*/

What does this mean? (No docs)

What needs to be done:
1. Add support for other Resource Managers.

We probably need to have a discussion as to why you think this should
be Rmgr dependent?
Code comments would help there.

If it does, then you should probably do this by extending RmgrTable
with an rm_check, so you can call it like this...

RmgrTable[record->xl_rmid].rm_check

I'm interested in how we handle the new generic WAL format for blocks.
Surely if we can handle that then we won't need an Rmgr dependency?
I'm sure you have reasons, they just need to be explained long hand -
don't assume anything.

5. Generalize the page type identification technique.

Why not do this first?

There are some coding guideline stuff to check as well.

Thanks

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#15Amit Kapila
amit.kapila16@gmail.com
In reply to: Simon Riggs (#14)
Re: WAL consistency check facility

On Fri, Aug 26, 2016 at 9:26 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

I think you should add this as part of the default testing for both
check and installcheck. I can't imagine why we'd have it and not use
it during testing.

The actual consistency checks are done during redo (replay), so not
sure whats in you mind for enabling it with check or installcheck. I
think we can run few recovery/replay tests with this framework. Also
running the tests under this framework could be time-consuming as it
logs the entire page for each WAL record we write and then during
replay reads the same.

--
With Regards,
Amit Kapila.
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#16Simon Riggs
simon@2ndQuadrant.com
In reply to: Amit Kapila (#15)
Re: WAL consistency check facility

On 27 August 2016 at 07:36, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Aug 26, 2016 at 9:26 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

I think you should add this as part of the default testing for both
check and installcheck. I can't imagine why we'd have it and not use
it during testing.

The actual consistency checks are done during redo (replay), so not
sure whats in you mind for enabling it with check or installcheck. I
think we can run few recovery/replay tests with this framework. Also
running the tests under this framework could be time-consuming as it
logs the entire page for each WAL record we write and then during
replay reads the same.

I'd like to see an automated test added so we can be certain we don't
add things that break recovery. Don't mind much where or how.

The main use is to maintain that certainty while in production.

--
Simon Riggs http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#17Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Simon Riggs (#14)
Re: WAL consistency check facility

Hello Simon,

I'm really sorry for the inconveniences. Next time, I'll attach the
patch with proper documentation, test and comments.

I think you should add this as part of the default testing for both
check and installcheck. I can't imagine why we'd have it and not use
it during testing.

Since, this is redo(replay) feature, we can surely add this in
installcheck. But, as Amit mentioned, it could be time-consuming.

* wal_consistency_mask = 511 /* Enable consistency check mask bit*/

What does this mean? (No docs)

I was using this parameter as a masking integer to indicate the
operations(rmgr list) for which we need this feature to be enabled.
Since, this could be confusing, I've changed it accordingly so that it
accepts a list of rmgrIDs. (suggested by Michael, Amit and Robert)

1. Add support for other Resource Managers.

We probably need to have a discussion as to why you think this should
be Rmgr dependent?
Code comments would help there.

If it does, then you should probably do this by extending RmgrTable
with an rm_check, so you can call it like this...

RmgrTable[record->xl_rmid].rm_check

+1.
I'm modifying it accordingly. I'm calling this function after
RmgrTable[record->xl_rmid].rm_redo.

5. Generalize the page type identification technique.

Why not do this first?

At present, I'm using special page size and page ID to identify page
type. But, I've noticed some cases where the entire page is
initialized to zero (Ex: hash_xlog_squeeze_page). RmgrID and info bit
can help us to identify those pages.

--
Thanks & Regards,
Kuntal Ghosh
EnterpriseDB: http://www.enterprisedb.com

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#18Michael Paquier
michael@paquier.xyz
In reply to: Simon Riggs (#16)
Re: WAL consistency check facility

On Sat, Aug 27, 2016 at 6:16 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

On 27 August 2016 at 07:36, Amit Kapila <amit.kapila16@gmail.com> wrote:

On Fri, Aug 26, 2016 at 9:26 PM, Simon Riggs <simon@2ndquadrant.com> wrote:

I think you should add this as part of the default testing for both
check and installcheck. I can't imagine why we'd have it and not use
it during testing.

The actual consistency checks are done during redo (replay), so not
sure whats in you mind for enabling it with check or installcheck. I
think we can run few recovery/replay tests with this framework. Also
running the tests under this framework could be time-consuming as it
logs the entire page for each WAL record we write and then during
replay reads the same.

I'd like to see an automated test added so we can be certain we don't
add things that break recovery. Don't mind much where or how.

The main use is to maintain that certainty while in production.

For developers, having extra checks with the new routines in WAL_DEBUG
could also be useful for a code path producing WAL. Let's not forget
that as well.
--
Michael

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

In reply to: Alvaro Herrera (#13)
Re: WAL consistency check facility

On Fri, Aug 26, 2016 at 7:24 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:

As the block numbers are different, I was getting the following warning:
WARNING: Inconsistent page (at byte 8166) found for record
0/127F4A48, rel 1663/16384/16946, forknum 0, blkno 0, Backup Page
Header : (pd_lower: 28 pd_upper: 8152 pd_special: 8192) Current Page
Header: (pd_lower: 28 pd_upper: 8152 pd_special: 8192)
CONTEXT: xlog redo at 0/127F4A48 for Heap/INSERT+INIT: off 1

In heap_xlog_insert, t_ctid is always set to blkno and xlrec->offnum.
I think this is why I was getting the above warning.

Umm, really? Then perhaps this *is* a bug. Peter?

It's a matter of perspective, but probably not. The facts are:

* heap_insert() treats speculative insertions differently. In
particular, it does not set ctid in the caller-passed heap tuple
itself, leaving the ctid field to contain a speculative token value --
a per-backend monotonically increasing identifier. This identifier
represents some particular speculative insertion attempt within a
backend.

* On the redo side, heap_xlog_insert() was only changed mechanically
when upsert went in. So, it doesn't actually care about the stuff that
heap_insert() was made to care about to support speculative insertion.

* Furthermore, heap_insert() does *not* WAL log ctid under any
circumstances (that didn't change, either). Traditionally, the ctid
field was completely redundant anyway (since, of course, we're only
dealing with newly inserted tuples in heap_insert()). With speculative
insertions, there is a token within ctid, whose value represents
actual information that cannot be reconstructed after the fact (the
speculative token value). Even still, that isn't WAL-logged (see
comments above xl_heap_header struct). That's okay, because the
speculative insertion token value is only needed due to obscure issues
with "unprincipled deadlocks". The speculative token value itself is
only of interest to other, conflicting inserters, and only for the
instant in time immediately following physical insertion. The token
doesn't matter in the slightest to crash recovery, nor to Hot Standby
replicas.

While this design had some really nice properties (ask me if you are
unclear on this), it does break tools like the proposed WAL-checker
tool. I would compare this speculative token situation to the
situation with hint bits (when checksums are disabled, and
wal_log_hints = off).

I actually have a lot of sympathy for the idea that, in general, cases
like this should be avoided. But, would it really be worth it to
create a useless special case for speculative insertion (to WAL-log
the virtually useless speculative insertion token value)? I'm certain
that the answer must be "no": This tool ought to deal with speculative
insertion as a special case, and not vice-versa.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

In reply to: Kuntal Ghosh (#10)
Re: WAL consistency check facility

On Thu, Aug 25, 2016 at 9:41 AM, Kuntal Ghosh
<kuntalghosh.2007@gmail.com> wrote:

2. For Btree pages, I've masked BTP_HALF_DEAD, BTP_SPLIT_END,
BTP_HAS_GARBAGE and BTP_INCOMPLETE_SPLIT flags.

Why? I think that you should only perform this kind of masking where
it's clearly strictly necessary.

It is true that nbtree can allow cases where LP_DEAD is set with only
a share lock (by read-only queries), so I can see why BTP_HAS_GARBAGE
might need to be excluded; this is comparable to the heapam's use of
hint bits. However, it is unclear why you need to mask the remaining
btpo_flags that you list, because the other flags have clear-cut roles
in various atomic operations that we WAL-log.

--
Peter Geoghegan

--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

#21Amit Kapila
amit.kapila16@gmail.com
In reply to: Peter Geoghegan (#20)
In reply to: Amit Kapila (#21)
#23Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Peter Geoghegan (#22)
#24Simon Riggs
simon@2ndQuadrant.com
In reply to: Kuntal Ghosh (#17)
#25Michael Paquier
michael@paquier.xyz
In reply to: Simon Riggs (#24)
#26Amit Kapila
amit.kapila16@gmail.com
In reply to: Simon Riggs (#24)
#27Michael Paquier
michael@paquier.xyz
In reply to: Amit Kapila (#26)
#28Amit Kapila
amit.kapila16@gmail.com
In reply to: Michael Paquier (#27)
#29Amit Kapila
amit.kapila16@gmail.com
In reply to: Amit Kapila (#26)
In reply to: Amit Kapila (#28)
#31Amit Kapila
amit.kapila16@gmail.com
In reply to: Peter Geoghegan (#30)
#32Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#24)
#33Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#32)
#34Robert Haas
robertmhaas@gmail.com
In reply to: Simon Riggs (#33)
#35Simon Riggs
simon@2ndQuadrant.com
In reply to: Robert Haas (#34)
In reply to: Robert Haas (#34)
#37Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Peter Geoghegan (#36)
#38Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Kuntal Ghosh (#37)
#39Amit Kapila
amit.kapila16@gmail.com
In reply to: Kuntal Ghosh (#37)
#40Michael Paquier
michael@paquier.xyz
In reply to: Kuntal Ghosh (#37)
#41Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Michael Paquier (#40)
#42Michael Paquier
michael@paquier.xyz
In reply to: Kuntal Ghosh (#41)
#43Robert Haas
robertmhaas@gmail.com
In reply to: Michael Paquier (#42)
#44Amit Kapila
amit.kapila16@gmail.com
In reply to: Michael Paquier (#42)
#45Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#43)
#46Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Amit Kapila (#45)
#47Michael Paquier
michael@paquier.xyz
In reply to: Kuntal Ghosh (#46)
#48Michael Paquier
michael@paquier.xyz
In reply to: Robert Haas (#43)
#49Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Michael Paquier (#47)
#50Michael Paquier
michael@paquier.xyz
In reply to: Kuntal Ghosh (#49)
#51Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Michael Paquier (#50)
#52Michael Paquier
michael@paquier.xyz
In reply to: Kuntal Ghosh (#51)
#53Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Michael Paquier (#52)
#54Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Kuntal Ghosh (#53)
#55Robert Haas
robertmhaas@gmail.com
In reply to: Michael Paquier (#50)
#56Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Robert Haas (#55)
#57Michael Paquier
michael@paquier.xyz
In reply to: Kuntal Ghosh (#56)
#58Robert Haas
robertmhaas@gmail.com
In reply to: Michael Paquier (#57)
#59Michael Paquier
michael@paquier.xyz
In reply to: Robert Haas (#58)
#60Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#59)
#61Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#60)
#62Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#61)
#63Robert Haas
robertmhaas@gmail.com
In reply to: Michael Paquier (#62)
#64Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Michael Paquier (#62)
#65Michael Paquier
michael@paquier.xyz
In reply to: Robert Haas (#63)
#66Michael Paquier
michael@paquier.xyz
In reply to: Robert Haas (#63)
#67Robert Haas
robertmhaas@gmail.com
In reply to: Michael Paquier (#65)
In reply to: Robert Haas (#63)
#69Michael Paquier
michael@paquier.xyz
In reply to: Robert Haas (#67)
#70Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Michael Paquier (#69)
#71Michael Paquier
michael@paquier.xyz
In reply to: Kuntal Ghosh (#70)
#72Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Michael Paquier (#71)
#73Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Kuntal Ghosh (#72)
#74Michael Paquier
michael@paquier.xyz
In reply to: Kuntal Ghosh (#73)
#75Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Michael Paquier (#74)
#76Michael Paquier
michael@paquier.xyz
In reply to: Kuntal Ghosh (#75)
#77Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Michael Paquier (#76)
#78Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#76)
#79Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Michael Paquier (#78)
#80Michael Paquier
michael@paquier.xyz
In reply to: Kuntal Ghosh (#77)
#81Michael Paquier
michael@paquier.xyz
In reply to: Kuntal Ghosh (#79)
#82Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Michael Paquier (#81)
#83Michael Paquier
michael@paquier.xyz
In reply to: Kuntal Ghosh (#82)
#84Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Michael Paquier (#83)
#85Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Kuntal Ghosh (#84)
#86Robert Haas
robertmhaas@gmail.com
In reply to: Kuntal Ghosh (#73)
#87Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Robert Haas (#86)
#88Michael Paquier
michael@paquier.xyz
In reply to: Kuntal Ghosh (#87)
#89Michael Paquier
michael@paquier.xyz
In reply to: Kuntal Ghosh (#84)
#90Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#89)
#91Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#90)
#92Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Michael Paquier (#91)
#93Michael Paquier
michael@paquier.xyz
In reply to: Kuntal Ghosh (#92)
#94Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Michael Paquier (#93)
#95Robert Haas
robertmhaas@gmail.com
In reply to: Kuntal Ghosh (#94)
#96Michael Paquier
michael@paquier.xyz
In reply to: Robert Haas (#95)
#97Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Michael Paquier (#96)
#98Peter Eisentraut
peter_e@gmx.net
In reply to: Michael Paquier (#93)
#99Michael Paquier
michael@paquier.xyz
In reply to: Peter Eisentraut (#98)
#100Robert Haas
robertmhaas@gmail.com
In reply to: Peter Eisentraut (#98)
#101Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Robert Haas (#100)
#102Michael Paquier
michael@paquier.xyz
In reply to: Kuntal Ghosh (#101)
#103Michael Paquier
michael@paquier.xyz
In reply to: Michael Paquier (#102)
#104Robert Haas
robertmhaas@gmail.com
In reply to: Michael Paquier (#103)
#105Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Robert Haas (#104)
#106Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Robert Haas (#104)
#107Michael Paquier
michael@paquier.xyz
In reply to: Kuntal Ghosh (#106)
#108Robert Haas
robertmhaas@gmail.com
In reply to: Kuntal Ghosh (#106)
#109Michael Paquier
michael@paquier.xyz
In reply to: Robert Haas (#108)
#110Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Robert Haas (#108)
#111Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Michael Paquier (#109)
#112Robert Haas
robertmhaas@gmail.com
In reply to: Michael Paquier (#109)
#113Amit Kapila
amit.kapila16@gmail.com
In reply to: Robert Haas (#108)
#114Robert Haas
robertmhaas@gmail.com
In reply to: Amit Kapila (#113)
#115Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Robert Haas (#108)
#116Robert Haas
robertmhaas@gmail.com
In reply to: Kuntal Ghosh (#115)
#117Kuntal Ghosh
kuntalghosh.2007@gmail.com
In reply to: Robert Haas (#116)
#118Michael Paquier
michael@paquier.xyz
In reply to: Robert Haas (#116)
#119Robert Haas
robertmhaas@gmail.com
In reply to: Michael Paquier (#118)
#120Michael Paquier
michael@paquier.xyz
In reply to: Robert Haas (#119)
#121Robert Haas
robertmhaas@gmail.com
In reply to: Michael Paquier (#120)
#122Tom Lane
tgl@sss.pgh.pa.us
In reply to: Robert Haas (#121)
#123Robert Haas
robertmhaas@gmail.com
In reply to: Tom Lane (#122)
#124Michael Paquier
michael@paquier.xyz
In reply to: Robert Haas (#123)