Assert in pageinspect with NULL pages
Hi, hackers!
If we trying to call pageinspect functions for pages which are filled
with nulls, we will get core dump. It happens with null pages for all
indexes in pageinspect and for page_checksum. This problem was founded
firstly by Anastasia Lubennikova, and now I continue this task.
For example, next script leads to fail.
CREATE TABLE test1 (
x bigserial,
y bigint DEFAULT 0
);
INSERT INTO test1(y) SELECT 0 FROM generate_series(1,1E6) AS x;
SELECT page_checksum(repeat(E'\\000', 8192)::bytea, 1);
server closed the connection unexpectedly
This probably means the server terminated abnormally
before or while processing the request.
fatal: connection to server was lost
LOG: server process (PID 16465) was terminated by signal 6
DETAIL: Failed process was running: select
page_checksum(repeat(E'\\000', 8192)::bytea, 1);
LOG: terminating any other active server processes
LOG: all server processes terminated; reinitializing
LOG: database system was interrupted; last known up at 2022-02-16
14:03:16 +05
LOG: database system was not properly shut down; automatic recovery in
progress
LOG: redo starts at 0/14F1B20
LOG: invalid record length at 0/5D40CD8: wanted 24, got 0
LOG: redo done at 0/5D40BC0 system usage: CPU: user: 0.98 s, system:
0.02 s, elapsed: 1.01 s
LOG: checkpoint starting: end-of-recovery immediate wait
LOG: checkpoint complete: wrote 5500 buffers (33.6%); 0 WAL file(s)
added, 0 removed, 4 recycled; write=0.064 s, sync=0.007 s, total=0.080
s; sync files=45, longest=0.004 s, average=0.001 s; distance=74044 kB,
estimate=74044 kB
LOG: database system is ready to accept connections
Also it is possible to use
select brin_metapage_info(repeat(E'\\000', 8192)::bytea);
or
select bt_page_items(repeat(E'\\000', 8192)::bytea);
for getting fail.
I tried to make some additional checks for null pages into pageinspect'
functions. The attached patch also contains tests for this case.
What do you think?
--
Daria Lepikhova
Postgres Professional
Attachments:
0001-Add-checks-for-null-pages-to-pageinspect-functions.patchtext/x-patch; charset=UTF-8; name=0001-Add-checks-for-null-pages-to-pageinspect-functions.patchDownload
From 92d71d9b5c583f5f2733646a42c36e7e437ed0a1 Mon Sep 17 00:00:00 2001
From: "d.lepikhova" <d.lepikhova@postgrespro.ru>
Date: Tue, 15 Feb 2022 14:13:05 +0500
Subject: [PATCH] Add checks for null pages to pageinspect functions
---
contrib/pageinspect/brinfuncs.c | 12 ++++++++++++
contrib/pageinspect/btreefuncs.c | 12 ++++++++++++
contrib/pageinspect/expected/brin.out | 9 +++++++++
contrib/pageinspect/expected/btree.out | 3 +++
contrib/pageinspect/expected/checksum.out | 3 +++
contrib/pageinspect/expected/gin.out | 7 +++++++
contrib/pageinspect/expected/hash.out | 9 +++++++++
contrib/pageinspect/rawpage.c | 12 ++++++++++++
contrib/pageinspect/sql/brin.sql | 6 ++++++
contrib/pageinspect/sql/btree.sql | 3 +++
contrib/pageinspect/sql/checksum.sql | 4 ++++
contrib/pageinspect/sql/gin.sql | 5 +++++
contrib/pageinspect/sql/hash.sql | 6 ++++++
13 files changed, 91 insertions(+)
diff --git a/contrib/pageinspect/brinfuncs.c b/contrib/pageinspect/brinfuncs.c
index f1e64a39ef2..56129a40abb 100644
--- a/contrib/pageinspect/brinfuncs.c
+++ b/contrib/pageinspect/brinfuncs.c
@@ -63,6 +63,12 @@ brin_page_type(PG_FUNCTION_ARGS)
errdetail("Expected size %d, got %d",
BLCKSZ, raw_page_size)));
+ /* check that the page is initialized before accessing any fields */
+ if (PageIsNew(page))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("input page is empty")));
+
switch (BrinPageType(page))
{
case BRIN_PAGETYPE_META:
@@ -103,6 +109,12 @@ verify_brin_page(bytea *raw_page, uint16 type, const char *strtype)
page = VARDATA(raw_page);
+ /* check that the page is initialized before accessing any fields */
+ if (PageIsNew(page))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("input page is empty")));
+
/* verify the special space says this page is what we want */
if (BrinPageType(page) != type)
ereport(ERROR,
diff --git a/contrib/pageinspect/btreefuncs.c b/contrib/pageinspect/btreefuncs.c
index 03debe336ba..7af35da19f3 100644
--- a/contrib/pageinspect/btreefuncs.c
+++ b/contrib/pageinspect/btreefuncs.c
@@ -519,6 +519,12 @@ bt_page_items_internal(PG_FUNCTION_ARGS, enum pageinspect_version ext_version)
UnlockReleaseBuffer(buffer);
relation_close(rel, AccessShareLock);
+ /* check that the page is initialized before accessing any fields */
+ if (PageIsNew(uargs->page))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("input page is empty")));
+
uargs->offset = FirstOffsetNumber;
opaque = (BTPageOpaque) PageGetSpecialPointer(uargs->page);
@@ -615,6 +621,12 @@ bt_page_items_bytea(PG_FUNCTION_ARGS)
uargs->page = VARDATA(raw_page);
+ /* check that the page is initialized before accessing any fields */
+ if (PageIsNew(uargs->page))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("input page is empty")));
+
uargs->offset = FirstOffsetNumber;
opaque = (BTPageOpaque) PageGetSpecialPointer(uargs->page);
diff --git a/contrib/pageinspect/expected/brin.out b/contrib/pageinspect/expected/brin.out
index 71eb190380c..8e309a2a5d1 100644
--- a/contrib/pageinspect/expected/brin.out
+++ b/contrib/pageinspect/expected/brin.out
@@ -48,4 +48,13 @@ SELECT * FROM brin_page_items(get_raw_page('test1_a_idx', 2), 'test1_a_idx')
1 | 0 | 1 | f | f | f | {1 .. 1}
(1 row)
+-- Check that all functions fail gracefully with an empty page imput
+select brin_page_type(repeat(E'\\000', 8192)::bytea);
+ERROR: input page is empty
+select brin_metapage_info(repeat(E'\\000', 8192)::bytea);
+ERROR: input page is empty
+select brin_revmap_data(repeat(E'\\000', 8192)::bytea);
+ERROR: input page is empty
+select brin_page_items(repeat(E'\\000', 8192)::bytea, 'test1_a_idx');
+ERROR: input page is empty
DROP TABLE test1;
diff --git a/contrib/pageinspect/expected/btree.out b/contrib/pageinspect/expected/btree.out
index c60bc88560c..cc6c4181c85 100644
--- a/contrib/pageinspect/expected/btree.out
+++ b/contrib/pageinspect/expected/btree.out
@@ -70,4 +70,7 @@ tids |
SELECT * FROM bt_page_items(get_raw_page('test1_a_idx', 2));
ERROR: block number 2 is out of range for relation "test1_a_idx"
+-- Check that bt_page_items() fails gracefully with an empty page input
+select bt_page_items(repeat(E'\\000', 8192)::bytea);
+ERROR: input page is empty
DROP TABLE test1;
diff --git a/contrib/pageinspect/expected/checksum.out b/contrib/pageinspect/expected/checksum.out
index a85388e158e..ff07a879216 100644
--- a/contrib/pageinspect/expected/checksum.out
+++ b/contrib/pageinspect/expected/checksum.out
@@ -38,3 +38,6 @@ SELECT blkno,
100 | 1139 | 28438 | 3648 | -30881 | -16305 | -27349
(3 rows)
+ --Check that page_checksum() with an empty page fails gracefully
+ select page_checksum(repeat(E'\\000', 8192)::bytea, 1);
+ERROR: input page is empty
diff --git a/contrib/pageinspect/expected/gin.out b/contrib/pageinspect/expected/gin.out
index ef7570b9723..ef0fdb59565 100644
--- a/contrib/pageinspect/expected/gin.out
+++ b/contrib/pageinspect/expected/gin.out
@@ -35,4 +35,11 @@ FROM gin_leafpage_items(get_raw_page('test1_y_idx',
-[ RECORD 1 ]
?column? | t
+-- Check that all functions fail gracefully with an empty page imput
+select gin_metapage_info(repeat(E'\\000', 8192)::bytea);
+ERROR: input page is empty
+select gin_page_opaque_info(repeat(E'\\000', 8192)::bytea);
+ERROR: input page is empty
+select gin_leafpage_items(repeat(E'\\000', 8192)::bytea);
+ERROR: input page is empty
DROP TABLE test1;
diff --git a/contrib/pageinspect/expected/hash.out b/contrib/pageinspect/expected/hash.out
index bd0628d0136..271f33bb481 100644
--- a/contrib/pageinspect/expected/hash.out
+++ b/contrib/pageinspect/expected/hash.out
@@ -163,4 +163,13 @@ SELECT * FROM hash_page_items(get_raw_page('test_hash_a_idx', 4));
SELECT * FROM hash_page_items(get_raw_page('test_hash_a_idx', 5));
ERROR: page is not a hash bucket or overflow page
+-- Check that all functions fail gracefully with an empty page imput
+select hash_page_type(repeat(E'\\000', 8192)::bytea);
+ERROR: input page is empty
+select hash_metapage_info(repeat(E'\\000', 8192)::bytea);
+ERROR: input page is empty
+select hash_page_stats(repeat(E'\\000', 8192)::bytea);
+ERROR: input page is empty
+select hash_page_items(repeat(E'\\000', 8192)::bytea);
+ERROR: input page is empty
DROP TABLE test_hash;
diff --git a/contrib/pageinspect/rawpage.c b/contrib/pageinspect/rawpage.c
index 7e41af045f3..8a7bfccfcc3 100644
--- a/contrib/pageinspect/rawpage.c
+++ b/contrib/pageinspect/rawpage.c
@@ -230,6 +230,12 @@ get_page_from_raw(bytea *raw_page)
memcpy(page, VARDATA_ANY(raw_page), raw_page_size);
+ /* check that the page is initialized */
+ if (PageIsNew((Page) page))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("input page is empty")));
+
return page;
}
@@ -375,6 +381,12 @@ page_checksum_internal(PG_FUNCTION_ARGS, enum pageinspect_version ext_version)
page = (PageHeader) VARDATA(raw_page);
+ /* check that the page is initialized */
+ if (PageIsNew((Page) page))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("input page is empty")));
+
PG_RETURN_INT16(pg_checksum_page((char *) page, blkno));
}
diff --git a/contrib/pageinspect/sql/brin.sql b/contrib/pageinspect/sql/brin.sql
index 735bc3b6733..da810adf62b 100644
--- a/contrib/pageinspect/sql/brin.sql
+++ b/contrib/pageinspect/sql/brin.sql
@@ -15,4 +15,10 @@ SELECT * FROM brin_revmap_data(get_raw_page('test1_a_idx', 1)) LIMIT 5;
SELECT * FROM brin_page_items(get_raw_page('test1_a_idx', 2), 'test1_a_idx')
ORDER BY blknum, attnum LIMIT 5;
+-- Check that all functions fail gracefully with an empty page imput
+select brin_page_type(repeat(E'\\000', 8192)::bytea);
+select brin_metapage_info(repeat(E'\\000', 8192)::bytea);
+select brin_revmap_data(repeat(E'\\000', 8192)::bytea);
+select brin_page_items(repeat(E'\\000', 8192)::bytea, 'test1_a_idx');
+
DROP TABLE test1;
diff --git a/contrib/pageinspect/sql/btree.sql b/contrib/pageinspect/sql/btree.sql
index 96359179597..d2d99196077 100644
--- a/contrib/pageinspect/sql/btree.sql
+++ b/contrib/pageinspect/sql/btree.sql
@@ -21,4 +21,7 @@ SELECT * FROM bt_page_items(get_raw_page('test1_a_idx', 0));
SELECT * FROM bt_page_items(get_raw_page('test1_a_idx', 1));
SELECT * FROM bt_page_items(get_raw_page('test1_a_idx', 2));
+-- Check that bt_page_items() fails gracefully with an empty page input
+select bt_page_items(repeat(E'\\000', 8192)::bytea);
+
DROP TABLE test1;
diff --git a/contrib/pageinspect/sql/checksum.sql b/contrib/pageinspect/sql/checksum.sql
index b877db0611f..56c4f6b52ed 100644
--- a/contrib/pageinspect/sql/checksum.sql
+++ b/contrib/pageinspect/sql/checksum.sql
@@ -29,3 +29,7 @@ SELECT blkno,
page_checksum(decode(repeat('e6d6', :block_size / 2), 'hex'), blkno) AS checksum_e6d6,
page_checksum(decode(repeat('4a5e', :block_size / 2), 'hex'), blkno) AS checksum_4a5e
FROM generate_series(0, 100, 50) AS a (blkno);
+
+ --Check that page_checksum() with an empty page fails gracefully
+ select page_checksum(repeat(E'\\000', 8192)::bytea, 1);
+
diff --git a/contrib/pageinspect/sql/gin.sql b/contrib/pageinspect/sql/gin.sql
index 423f5c57499..559c570a644 100644
--- a/contrib/pageinspect/sql/gin.sql
+++ b/contrib/pageinspect/sql/gin.sql
@@ -18,4 +18,9 @@ FROM gin_leafpage_items(get_raw_page('test1_y_idx',
(pg_relation_size('test1_y_idx') /
current_setting('block_size')::bigint)::int - 1));
+-- Check that all functions fail gracefully with an empty page imput
+select gin_metapage_info(repeat(E'\\000', 8192)::bytea);
+select gin_page_opaque_info(repeat(E'\\000', 8192)::bytea);
+select gin_leafpage_items(repeat(E'\\000', 8192)::bytea);
+
DROP TABLE test1;
diff --git a/contrib/pageinspect/sql/hash.sql b/contrib/pageinspect/sql/hash.sql
index 64f33f1d52f..4ef23cb94fe 100644
--- a/contrib/pageinspect/sql/hash.sql
+++ b/contrib/pageinspect/sql/hash.sql
@@ -78,5 +78,11 @@ SELECT * FROM hash_page_items(get_raw_page('test_hash_a_idx', 3));
SELECT * FROM hash_page_items(get_raw_page('test_hash_a_idx', 4));
SELECT * FROM hash_page_items(get_raw_page('test_hash_a_idx', 5));
+-- Check that all functions fail gracefully with an empty page imput
+select hash_page_type(repeat(E'\\000', 8192)::bytea);
+select hash_metapage_info(repeat(E'\\000', 8192)::bytea);
+select hash_page_stats(repeat(E'\\000', 8192)::bytea);
+select hash_page_items(repeat(E'\\000', 8192)::bytea);
+
DROP TABLE test_hash;
--
2.25.1
On Thu, Feb 17, 2022 at 01:46:40PM +0500, Daria Lepikhova wrote:
INSERT INTO test1(y) SELECT 0 FROM generate_series(1,1E6) AS x;
SELECT page_checksum(repeat(E'\\000', 8192)::bytea, 1);
Indeed. Good catch, and that seems pretty old at quick glance for the
checksum part. I'll try to look at all that tomorrow.
--
Michael
On Thu, Feb 17, 2022 at 05:57:49PM +0900, Michael Paquier wrote:
On Thu, Feb 17, 2022 at 01:46:40PM +0500, Daria Lepikhova wrote:
INSERT INTO test1(y) SELECT 0 FROM generate_series(1,1E6) AS x;
SELECT page_checksum(repeat(E'\\000', 8192)::bytea, 1);Indeed. Good catch, and that seems pretty old at quick glance for the
checksum part. I'll try to look at all that tomorrow.
Indeed, the problem in page_checksum() has been there since the beginning as
far as I can see.
About the patch, it's incorrectly using a hardcoded 8192 block-size rather than
using the computed :block_size (or computing one when it's not already the
case).
I'm also wondering if it wouldn't be better to return NULL rather than throwing
an error for uninitialized pages.
While at it, it could also be worthwhile to add tests for invalid blkno and
block size in page_checksum().
BRIN can also crash if passed a non-brin index.
I've been sitting on this one for awhile. Feel free to include it in your
patchset.
commit 08010a6037fc4e24a9ba05e5386e766f4310d35e
Author: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Tue Jan 19 00:25:15 2021 -0600
pageinspect: brin_page_items(): check that given relation is not only an index, but a brin one
diff --git a/contrib/pageinspect/brinfuncs.c b/contrib/pageinspect/brinfuncs.c
index f1e64a39ef2..3de6dd943ef 100644
--- a/contrib/pageinspect/brinfuncs.c
+++ b/contrib/pageinspect/brinfuncs.c
@@ -16,6 +16,7 @@
#include "access/brin_tuple.h"
#include "access/htup_details.h"
#include "catalog/index.h"
+#include "catalog/pg_am.h"
#include "catalog/pg_type.h"
#include "funcapi.h"
#include "lib/stringinfo.h"
@@ -59,7 +60,7 @@ brin_page_type(PG_FUNCTION_ARGS)
if (raw_page_size != BLCKSZ)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("input page too small"),
+ errmsg("input page wrong size"),
errdetail("Expected size %d, got %d",
BLCKSZ, raw_page_size)));
@@ -97,7 +98,7 @@ verify_brin_page(bytea *raw_page, uint16 type, const char *strtype)
if (raw_page_size != BLCKSZ)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("input page too small"),
+ errmsg("input page wrong size"),
errdetail("Expected size %d, got %d",
BLCKSZ, raw_page_size)));
@@ -169,7 +170,14 @@ brin_page_items(PG_FUNCTION_ARGS)
MemoryContextSwitchTo(oldcontext);
indexRel = index_open(indexRelid, AccessShareLock);
- bdesc = brin_build_desc(indexRel);
+
+ /* Must be a BRIN index */
+ if (indexRel->rd_rel->relkind != RELKIND_INDEX ||
+ indexRel->rd_rel->relam != BRIN_AM_OID)
+ ereport(ERROR,
+ (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ errmsg("\"%s\" is not a BRIN index",
+ RelationGetRelationName(indexRel))));
/* minimally verify the page we got */
page = verify_brin_page(raw_page, BRIN_PAGETYPE_REGULAR, "regular");
@@ -178,6 +186,7 @@ brin_page_items(PG_FUNCTION_ARGS)
* Initialize output functions for all indexed datatypes; simplifies
* calling them later.
*/
+ bdesc = brin_build_desc(indexRel);
columns = palloc(sizeof(brin_column_state *) * RelationGetDescr(indexRel)->natts);
for (attno = 1; attno <= bdesc->bd_tupdesc->natts; attno++)
{
On Thu, Feb 17, 2022 at 05:40:41PM +0800, Julien Rouhaud wrote:
About the patch, it's incorrectly using a hardcoded 8192 block-size rather than
using the computed :block_size (or computing one when it's not already the
case).
Well, the tests of pageinspect fail would already fail when using a
different page size than 8k, like the checksum ones :)
Anywa, I agree with your point that if this is not a reason to not
make the tests more portable if we can do easily.
I'm also wondering if it wouldn't be better to return NULL rather than throwing
an error for uninitialized pages.
Agreed that this is a sensible choice. NULL would be helpful for the
case where one compiles all the checksums of a relation with a full
scan based on the relation size, for example. All these behaviors
ought to be documented properly, as well. For SRFs, this should just
return no rows rather than generating an error.
While at it, it could also be worthwhile to add tests for invalid blkno and
block size in page_checksum().
If we are on that, we could also have tests for the various "input
page too small" errors. Now, these are just improvements, so I'd
rather treat this part separately of the issue reported, and add the
extra tests only on HEAD.
I can't help but notice that we should have a similar protection on
PageIsNew() in heap_page_items()? Now the code happens to work for an
empty page thanks to PageGetMaxOffsetNumber(), but this extra
protection would make the code more solid in the long-term IMO for the
full-zero case? It is worth noting that, in pageinspect, two of the
heap functions are the only ones to not be strict..
Something similar could be said about page_header() in rawpage.c,
though it is not wrong to return only zeros for a page full of zeros.
Shouldn't we similarly care about bt_metap() and
bt_page_stats_internal(), both callers of ReadBuffer(). The root
point is that the RBM_NORMAL mode of ReadBufferExtended() considers a
page full of zeros as valid, as per its top comments.
For the same reason, get_raw_page_internal() would work just fine and
return a page full of zeros.
Also, instead of an error in get_page_from_raw(), we should make sure
that all its callers are able to map with the case of a new page,
where we return 8kB worth of zeros from this inner routine.
--
Michael
On Thu, Feb 17, 2022 at 09:00:20PM -0600, Justin Pryzby wrote:
BRIN can also crash if passed a non-brin index.
I've been sitting on this one for awhile. Feel free to include it in your
patchset.
Ugh. Thanks! I am keeping a note about this one.
--
Michael
Hello Michael,
18.02.2022 06:07, Michael Paquier wrpte:
On Thu, Feb 17, 2022 at 09:00:20PM -0600, Justin Pryzby wrote:
BRIN can also crash if passed a non-brin index.
I've been sitting on this one for awhile. Feel free to include it in your
patchset.Ugh. Thanks! I am keeping a note about this one.
Could you please confirm before committing the patchset that it fixes
the bug #16527 [1]/messages/by-id/16527-ef7606186f0610a1@postgresql.org? Or maybe I could check it?
(Original patch proposed by Daria doesn't cover that case, but if the
patch going to be improved, probably it's worth fixing that bug too.)
[1]: /messages/by-id/16527-ef7606186f0610a1@postgresql.org
/messages/by-id/16527-ef7606186f0610a1@postgresql.org
Best regards,
Alexander
On Mon, Feb 21, 2022 at 10:00:00AM +0300, Alexander Lakhin wrote:
Could you please confirm before committing the patchset that it fixes
the bug #16527 [1]? Or maybe I could check it?
(Original patch proposed by Daria doesn't cover that case, but if the
patch going to be improved, probably it's worth fixing that bug too.)
I have not directly checked, but that looks like the same issue to
me. By the way, I was waiting for an updated patch set, based on the
review provided upthread:
/messages/by-id/Yg8MPrV49/9Usqs1@paquier.xyz
And it seems better to group everything in a single commit as the same
areas are touched.
--
Michael
First of all, thanks for the review and remarks!
18.02.2022 08:02, Michael Paquier пишет:
On Thu, Feb 17, 2022 at 05:40:41PM +0800, Julien Rouhaud wrote:
About the patch, it's incorrectly using a hardcoded 8192 block-size rather than
using the computed :block_size (or computing one when it's not already the
case).Well, the tests of pageinspect fail would already fail when using a
different page size than 8k, like the checksum ones :)Anywa, I agree with your point that if this is not a reason to not
make the tests more portable if we can do easily.
Fixed.
I'm also wondering if it wouldn't be better to return NULL rather than throwing
an error for uninitialized pages.Agreed that this is a sensible choice. NULL would be helpful for the
case where one compiles all the checksums of a relation with a full
scan based on the relation size, for example. All these behaviors
ought to be documented properly, as well. For SRFs, this should just
return no rows rather than generating an error.
So, I tried to implement this remark. However, further getting rid of
throwing an error and replacing it with a NULL return led to reproduce
the problem that Alexander Lakhin mentioned And here I need little more
time to figure it out.
And one more addition. In the previous version of the patch, I forgot to
add tests for the gist index, but the described problem is also relevant
for it.
--
Daria Lepikhova
Postgres Professional
Attachments:
0001-Add-checks-for-null-pages-into-pageinspect-v2.patchtext/x-patch; charset=UTF-8; name=0001-Add-checks-for-null-pages-into-pageinspect-v2.patchDownload
From 7dbfa7f01bf485d2812b24af012721b4a1e1e785 Mon Sep 17 00:00:00 2001
From: "d.lepikhova" <d.lepikhova@postgrespro.ru>
Date: Tue, 22 Feb 2022 16:25:31 +0500
Subject: [PATCH] Add checks for null pages into pageinspect
---
contrib/pageinspect/brinfuncs.c | 10 ++++++++++
contrib/pageinspect/btreefuncs.c | 10 ++++++++++
contrib/pageinspect/expected/brin.out | 13 +++++++++++++
contrib/pageinspect/expected/btree.out | 5 +++++
contrib/pageinspect/expected/checksum.out | 7 +++++++
contrib/pageinspect/expected/gin.out | 7 +++++++
contrib/pageinspect/expected/gist.out | 5 +++++
contrib/pageinspect/expected/hash.out | 9 +++++++++
contrib/pageinspect/rawpage.c | 10 ++++++++++
contrib/pageinspect/sql/brin.sql | 6 ++++++
contrib/pageinspect/sql/btree.sql | 3 +++
contrib/pageinspect/sql/checksum.sql | 3 +++
contrib/pageinspect/sql/gin.sql | 5 +++++
contrib/pageinspect/sql/gist.sql | 5 +++++
contrib/pageinspect/sql/hash.sql | 6 ++++++
15 files changed, 104 insertions(+)
diff --git a/contrib/pageinspect/brinfuncs.c b/contrib/pageinspect/brinfuncs.c
index 50892b5cc20..a2fee13e2d1 100644
--- a/contrib/pageinspect/brinfuncs.c
+++ b/contrib/pageinspect/brinfuncs.c
@@ -63,6 +63,10 @@ brin_page_type(PG_FUNCTION_ARGS)
errdetail("Expected size %d, got %d",
BLCKSZ, raw_page_size)));
+ /* check that the page is initialized before accessing any fields */
+ if (PageIsNew(page))
+ PG_RETURN_NULL();
+
switch (BrinPageType(page))
{
case BRIN_PAGETYPE_META:
@@ -103,6 +107,12 @@ verify_brin_page(bytea *raw_page, uint16 type, const char *strtype)
page = VARDATA(raw_page);
+ /* check that the page is initialized before accessing any fields */
+ if (PageIsNew(page))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("input page is empty")));
+
/* verify the special space says this page is what we want */
if (BrinPageType(page) != type)
ereport(ERROR,
diff --git a/contrib/pageinspect/btreefuncs.c b/contrib/pageinspect/btreefuncs.c
index 03debe336ba..5ea7af5b998 100644
--- a/contrib/pageinspect/btreefuncs.c
+++ b/contrib/pageinspect/btreefuncs.c
@@ -519,6 +519,12 @@ bt_page_items_internal(PG_FUNCTION_ARGS, enum pageinspect_version ext_version)
UnlockReleaseBuffer(buffer);
relation_close(rel, AccessShareLock);
+ /* check that the page is initialized before accessing any fields */
+ if (PageIsNew(uargs->page))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("input page is empty")));
+
uargs->offset = FirstOffsetNumber;
opaque = (BTPageOpaque) PageGetSpecialPointer(uargs->page);
@@ -615,6 +621,10 @@ bt_page_items_bytea(PG_FUNCTION_ARGS)
uargs->page = VARDATA(raw_page);
+ /* check that the page is initialized before accessing any fields */
+ if (PageIsNew(uargs->page))
+ PG_RETURN_NULL();
+
uargs->offset = FirstOffsetNumber;
opaque = (BTPageOpaque) PageGetSpecialPointer(uargs->page);
diff --git a/contrib/pageinspect/expected/brin.out b/contrib/pageinspect/expected/brin.out
index 71eb190380c..2839ee9435b 100644
--- a/contrib/pageinspect/expected/brin.out
+++ b/contrib/pageinspect/expected/brin.out
@@ -48,4 +48,17 @@ SELECT * FROM brin_page_items(get_raw_page('test1_a_idx', 2), 'test1_a_idx')
1 | 0 | 1 | f | f | f | {1 .. 1}
(1 row)
+-- Check that all functions fail gracefully with an empty page imput
+select brin_page_type(repeat(E'\\000', current_setting('block_size')::int)::bytea);
+ brin_page_type
+----------------
+
+(1 row)
+
+select brin_metapage_info(repeat(E'\\000', current_setting('block_size')::int)::bytea);
+ERROR: input page is empty
+select brin_revmap_data(repeat(E'\\000', current_setting('block_size')::int)::bytea);
+ERROR: input page is empty
+select brin_page_items(repeat(E'\\000', current_setting('block_size')::int)::bytea, 'test1_a_idx');
+ERROR: input page is empty
DROP TABLE test1;
diff --git a/contrib/pageinspect/expected/btree.out b/contrib/pageinspect/expected/btree.out
index c60bc88560c..7c503877826 100644
--- a/contrib/pageinspect/expected/btree.out
+++ b/contrib/pageinspect/expected/btree.out
@@ -70,4 +70,9 @@ tids |
SELECT * FROM bt_page_items(get_raw_page('test1_a_idx', 2));
ERROR: block number 2 is out of range for relation "test1_a_idx"
+-- Check that bt_page_items() fails gracefully with an empty page input
+select bt_page_items(repeat(E'\\000', current_setting('block_size')::int)::bytea);
+-[ RECORD 1 ]-+-
+bt_page_items |
+
DROP TABLE test1;
diff --git a/contrib/pageinspect/expected/checksum.out b/contrib/pageinspect/expected/checksum.out
index a85388e158e..033eb2ca2bc 100644
--- a/contrib/pageinspect/expected/checksum.out
+++ b/contrib/pageinspect/expected/checksum.out
@@ -38,3 +38,10 @@ SELECT blkno,
100 | 1139 | 28438 | 3648 | -30881 | -16305 | -27349
(3 rows)
+ --Check that page_checksum() with an empty page fails gracefully
+ select page_checksum(repeat(E'\\000', current_setting('block_size')::int)::bytea, 1);
+ page_checksum
+---------------
+
+(1 row)
+
diff --git a/contrib/pageinspect/expected/gin.out b/contrib/pageinspect/expected/gin.out
index ef7570b9723..a37a3749b80 100644
--- a/contrib/pageinspect/expected/gin.out
+++ b/contrib/pageinspect/expected/gin.out
@@ -35,4 +35,11 @@ FROM gin_leafpage_items(get_raw_page('test1_y_idx',
-[ RECORD 1 ]
?column? | t
+-- Check that all functions fail gracefully with an empty page imput
+select gin_metapage_info(repeat(E'\\000', current_setting('block_size')::int)::bytea);
+ERROR: input page is empty
+select gin_page_opaque_info(repeat(E'\\000', current_setting('block_size')::int)::bytea);
+ERROR: input page is empty
+select gin_leafpage_items(repeat(E'\\000', current_setting('block_size')::int)::bytea);
+ERROR: input page is empty
DROP TABLE test1;
diff --git a/contrib/pageinspect/expected/gist.out b/contrib/pageinspect/expected/gist.out
index b26b56ba15d..d69d653abf5 100644
--- a/contrib/pageinspect/expected/gist.out
+++ b/contrib/pageinspect/expected/gist.out
@@ -64,4 +64,9 @@ SELECT itemoffset, ctid, itemlen FROM gist_page_items_bytea(get_raw_page('test_g
6 | (6,65535) | 40
(6 rows)
+-- Check that all functions fail gracefully with an empty page imput
+SELECT * FROM gist_page_opaque_info(repeat(E'\\000', current_setting('block_size')::int)::bytea);
+ERROR: input page is empty
+SELECT * FROM gist_page_items(repeat(E'\\000', current_setting('block_size')::int)::bytea, 'test_gist_idx');
+ERROR: input page is empty
DROP TABLE test_gist;
diff --git a/contrib/pageinspect/expected/hash.out b/contrib/pageinspect/expected/hash.out
index bd0628d0136..f8820f07c42 100644
--- a/contrib/pageinspect/expected/hash.out
+++ b/contrib/pageinspect/expected/hash.out
@@ -163,4 +163,13 @@ SELECT * FROM hash_page_items(get_raw_page('test_hash_a_idx', 4));
SELECT * FROM hash_page_items(get_raw_page('test_hash_a_idx', 5));
ERROR: page is not a hash bucket or overflow page
+-- Check that all functions fail gracefully with an empty page imput
+select hash_page_type(repeat(E'\\000', current_setting('block_size')::int)::bytea);
+ERROR: input page is empty
+select hash_metapage_info(repeat(E'\\000', current_setting('block_size')::int)::bytea);
+ERROR: input page is empty
+select hash_page_stats(repeat(E'\\000', current_setting('block_size')::int)::bytea);
+ERROR: input page is empty
+select hash_page_items(repeat(E'\\000', current_setting('block_size')::int)::bytea);
+ERROR: input page is empty
DROP TABLE test_hash;
diff --git a/contrib/pageinspect/rawpage.c b/contrib/pageinspect/rawpage.c
index 7e41af045f3..e1eb80ad3d4 100644
--- a/contrib/pageinspect/rawpage.c
+++ b/contrib/pageinspect/rawpage.c
@@ -230,6 +230,12 @@ get_page_from_raw(bytea *raw_page)
memcpy(page, VARDATA_ANY(raw_page), raw_page_size);
+ /* check that the page is initialized */
+ if (PageIsNew((Page) page))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("input page is empty")));
+
return page;
}
@@ -375,6 +381,10 @@ page_checksum_internal(PG_FUNCTION_ARGS, enum pageinspect_version ext_version)
page = (PageHeader) VARDATA(raw_page);
+ /* check that the page is initialized */
+ if (PageIsNew((Page) page))
+ PG_RETURN_NULL();
+
PG_RETURN_INT16(pg_checksum_page((char *) page, blkno));
}
diff --git a/contrib/pageinspect/sql/brin.sql b/contrib/pageinspect/sql/brin.sql
index 735bc3b6733..65da9403995 100644
--- a/contrib/pageinspect/sql/brin.sql
+++ b/contrib/pageinspect/sql/brin.sql
@@ -15,4 +15,10 @@ SELECT * FROM brin_revmap_data(get_raw_page('test1_a_idx', 1)) LIMIT 5;
SELECT * FROM brin_page_items(get_raw_page('test1_a_idx', 2), 'test1_a_idx')
ORDER BY blknum, attnum LIMIT 5;
+-- Check that all functions fail gracefully with an empty page imput
+select brin_page_type(repeat(E'\\000', current_setting('block_size')::int)::bytea);
+select brin_metapage_info(repeat(E'\\000', current_setting('block_size')::int)::bytea);
+select brin_revmap_data(repeat(E'\\000', current_setting('block_size')::int)::bytea);
+select brin_page_items(repeat(E'\\000', current_setting('block_size')::int)::bytea, 'test1_a_idx');
+
DROP TABLE test1;
diff --git a/contrib/pageinspect/sql/btree.sql b/contrib/pageinspect/sql/btree.sql
index 96359179597..e35e894c531 100644
--- a/contrib/pageinspect/sql/btree.sql
+++ b/contrib/pageinspect/sql/btree.sql
@@ -21,4 +21,7 @@ SELECT * FROM bt_page_items(get_raw_page('test1_a_idx', 0));
SELECT * FROM bt_page_items(get_raw_page('test1_a_idx', 1));
SELECT * FROM bt_page_items(get_raw_page('test1_a_idx', 2));
+-- Check that bt_page_items() fails gracefully with an empty page input
+select bt_page_items(repeat(E'\\000', current_setting('block_size')::int)::bytea);
+
DROP TABLE test1;
diff --git a/contrib/pageinspect/sql/checksum.sql b/contrib/pageinspect/sql/checksum.sql
index b877db0611f..4c1922dc09a 100644
--- a/contrib/pageinspect/sql/checksum.sql
+++ b/contrib/pageinspect/sql/checksum.sql
@@ -29,3 +29,6 @@ SELECT blkno,
page_checksum(decode(repeat('e6d6', :block_size / 2), 'hex'), blkno) AS checksum_e6d6,
page_checksum(decode(repeat('4a5e', :block_size / 2), 'hex'), blkno) AS checksum_4a5e
FROM generate_series(0, 100, 50) AS a (blkno);
+
+ --Check that page_checksum() with an empty page fails gracefully
+ select page_checksum(repeat(E'\\000', current_setting('block_size')::int)::bytea, 1);
diff --git a/contrib/pageinspect/sql/gin.sql b/contrib/pageinspect/sql/gin.sql
index 423f5c57499..02826023d3e 100644
--- a/contrib/pageinspect/sql/gin.sql
+++ b/contrib/pageinspect/sql/gin.sql
@@ -18,4 +18,9 @@ FROM gin_leafpage_items(get_raw_page('test1_y_idx',
(pg_relation_size('test1_y_idx') /
current_setting('block_size')::bigint)::int - 1));
+-- Check that all functions fail gracefully with an empty page imput
+select gin_metapage_info(repeat(E'\\000', current_setting('block_size')::int)::bytea);
+select gin_page_opaque_info(repeat(E'\\000', current_setting('block_size')::int)::bytea);
+select gin_leafpage_items(repeat(E'\\000', current_setting('block_size')::int)::bytea);
+
DROP TABLE test1;
diff --git a/contrib/pageinspect/sql/gist.sql b/contrib/pageinspect/sql/gist.sql
index 1560d1e15c3..d3277ba71d5 100644
--- a/contrib/pageinspect/sql/gist.sql
+++ b/contrib/pageinspect/sql/gist.sql
@@ -26,4 +26,9 @@ SELECT * FROM gist_page_items(get_raw_page('test_gist_idx', 1), 'test_gist_idx')
-- platform-dependent (endianess), so omit the actual key data from the output.
SELECT itemoffset, ctid, itemlen FROM gist_page_items_bytea(get_raw_page('test_gist_idx', 0));
+-- Check that all functions fail gracefully with an empty page imput
+SELECT * FROM gist_page_opaque_info(repeat(E'\\000', current_setting('block_size')::int)::bytea);
+SELECT * FROM gist_page_items(repeat(E'\\000', current_setting('block_size')::int)::bytea, 'test_gist_idx');
+
+
DROP TABLE test_gist;
diff --git a/contrib/pageinspect/sql/hash.sql b/contrib/pageinspect/sql/hash.sql
index 64f33f1d52f..dba5b5f9c33 100644
--- a/contrib/pageinspect/sql/hash.sql
+++ b/contrib/pageinspect/sql/hash.sql
@@ -78,5 +78,11 @@ SELECT * FROM hash_page_items(get_raw_page('test_hash_a_idx', 3));
SELECT * FROM hash_page_items(get_raw_page('test_hash_a_idx', 4));
SELECT * FROM hash_page_items(get_raw_page('test_hash_a_idx', 5));
+-- Check that all functions fail gracefully with an empty page imput
+select hash_page_type(repeat(E'\\000', current_setting('block_size')::int)::bytea);
+select hash_metapage_info(repeat(E'\\000', current_setting('block_size')::int)::bytea);
+select hash_page_stats(repeat(E'\\000', current_setting('block_size')::int)::bytea);
+select hash_page_items(repeat(E'\\000', current_setting('block_size')::int)::bytea);
+
DROP TABLE test_hash;
--
2.25.1
18.02.2022 08:00, Justin Pryzby пишет:
BRIN can also crash if passed a non-brin index.
I've been sitting on this one for awhile. Feel free to include it in your
patchset.commit 08010a6037fc4e24a9ba05e5386e766f4310d35e
Author: Justin Pryzby <pryzbyj@telsasoft.com>
Date: Tue Jan 19 00:25:15 2021 -0600pageinspect: brin_page_items(): check that given relation is not only an index, but a brin one
diff --git a/contrib/pageinspect/brinfuncs.c b/contrib/pageinspect/brinfuncs.c index f1e64a39ef2..3de6dd943ef 100644 --- a/contrib/pageinspect/brinfuncs.c +++ b/contrib/pageinspect/brinfuncs.c @@ -16,6 +16,7 @@ #include "access/brin_tuple.h" #include "access/htup_details.h" #include "catalog/index.h" +#include "catalog/pg_am.h" #include "catalog/pg_type.h" #include "funcapi.h" #include "lib/stringinfo.h" @@ -59,7 +60,7 @@ brin_page_type(PG_FUNCTION_ARGS) if (raw_page_size != BLCKSZ) ereport(ERROR, (errcode(ERRCODE_INVALID_PARAMETER_VALUE), - errmsg("input page too small"), + errmsg("input page wrong size"), errdetail("Expected size %d, got %d", BLCKSZ, raw_page_size)));@@ -97,7 +98,7 @@ verify_brin_page(bytea *raw_page, uint16 type, const char *strtype)
if (raw_page_size != BLCKSZ)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("input page too small"),
+ errmsg("input page wrong size"),
errdetail("Expected size %d, got %d",
BLCKSZ, raw_page_size)));@@ -169,7 +170,14 @@ brin_page_items(PG_FUNCTION_ARGS)
MemoryContextSwitchTo(oldcontext);indexRel = index_open(indexRelid, AccessShareLock); - bdesc = brin_build_desc(indexRel); + + /* Must be a BRIN index */ + if (indexRel->rd_rel->relkind != RELKIND_INDEX || + indexRel->rd_rel->relam != BRIN_AM_OID) + ereport(ERROR, + (errcode(ERRCODE_WRONG_OBJECT_TYPE), + errmsg("\"%s\" is not a BRIN index", + RelationGetRelationName(indexRel))));/* minimally verify the page we got */
page = verify_brin_page(raw_page, BRIN_PAGETYPE_REGULAR, "regular");
@@ -178,6 +186,7 @@ brin_page_items(PG_FUNCTION_ARGS)
* Initialize output functions for all indexed datatypes; simplifies
* calling them later.
*/
+ bdesc = brin_build_desc(indexRel);
columns = palloc(sizeof(brin_column_state *) * RelationGetDescr(indexRel)->natts);
for (attno = 1; attno <= bdesc->bd_tupdesc->natts; attno++)
{
Thanks! This is a very valuable note. I think I will definitely add it
to the next version of the patch, as soon as I deal with the error
exception.
--
Daria Lepikhova
Postgres Professional
On Thu, Feb 17, 2022 at 09:00:20PM -0600, Justin Pryzby wrote:
BRIN can also crash if passed a non-brin index.
I've been sitting on this one for awhile. Feel free to include it in your
patchset.
So, I have begun a close lookup of this thread, and the problem you
are reporting here is worth fixing on its own, before we do something
for new pages as reported originally. There is more that caught my
attention than the brin AM not being check properly, because a couple
of code paths are fuzzy with the checks on the page sizes. My
impression of the whole thing is that we'd better generalize the use
of get_page_from_raw(), improving the code on many sides when the user
can provide a raw page in input (also I'd like to think that it is a
better practice anyway as any of those functions may finish to look
8-byte fields, but the current coding would silently break in
alignment-picky environments):
- Some code paths (hash, btree) would trigger elog(ERROR) but we
cannot use that for errors that can be triggered by the user.
- bt_page_items_bytea(), page_header() and fsm_page_contents() were
fuzzy on the page size check, so it was rather easy to generate
garbage with random data.
- page_checksum_internal() used a PageHeader, where I would have
expected a Page.
- More consistency in the error strings, meaning less contents to
translate in the long-term.
This first batch leads me to the attached, with tests to stress all
that for all the functions taking raw pages in input.
--
Michael
Attachments:
0001-Fixes-for-pageinspect-with-page-sizes.patchtext/x-diff; charset=us-asciiDownload
From 588ffddf2bd2c0d1e6168a2e7093c2488caec94b Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Tue, 15 Mar 2022 17:59:04 +0900
Subject: [PATCH] Fixes for pageinspect with page sizes
---
contrib/pageinspect/brinfuncs.c | 36 ++++++++++----------------
contrib/pageinspect/btreefuncs.c | 28 ++++++++++----------
contrib/pageinspect/expected/brin.out | 4 +++
contrib/pageinspect/expected/btree.out | 15 +++++++++++
contrib/pageinspect/expected/gin.out | 11 ++++++++
contrib/pageinspect/expected/gist.out | 15 +++++++++++
contrib/pageinspect/expected/hash.out | 17 ++++++++++++
contrib/pageinspect/expected/page.out | 11 ++++++++
contrib/pageinspect/fsmfuncs.c | 4 ++-
contrib/pageinspect/gistfuncs.c | 9 +++++++
contrib/pageinspect/hashfuncs.c | 6 +++--
contrib/pageinspect/rawpage.c | 29 +++------------------
contrib/pageinspect/sql/brin.sql | 4 +++
contrib/pageinspect/sql/btree.sql | 13 ++++++++++
contrib/pageinspect/sql/gin.sql | 9 +++++++
contrib/pageinspect/sql/gist.sql | 13 ++++++++++
contrib/pageinspect/sql/hash.sql | 13 ++++++++++
contrib/pageinspect/sql/page.sql | 9 +++++++
18 files changed, 179 insertions(+), 67 deletions(-)
diff --git a/contrib/pageinspect/brinfuncs.c b/contrib/pageinspect/brinfuncs.c
index b7c8365218..bd0ea8b18c 100644
--- a/contrib/pageinspect/brinfuncs.c
+++ b/contrib/pageinspect/brinfuncs.c
@@ -16,6 +16,7 @@
#include "access/brin_tuple.h"
#include "access/htup_details.h"
#include "catalog/index.h"
+#include "catalog/pg_am_d.h"
#include "catalog/pg_type.h"
#include "funcapi.h"
#include "lib/stringinfo.h"
@@ -31,6 +32,8 @@ PG_FUNCTION_INFO_V1(brin_page_items);
PG_FUNCTION_INFO_V1(brin_metapage_info);
PG_FUNCTION_INFO_V1(brin_revmap_data);
+#define IS_BRIN(r) ((r)->rd_rel->relam == BRIN_AM_OID)
+
typedef struct brin_column_state
{
int nstored;
@@ -45,8 +48,7 @@ Datum
brin_page_type(PG_FUNCTION_ARGS)
{
bytea *raw_page = PG_GETARG_BYTEA_P(0);
- Page page = VARDATA(raw_page);
- int raw_page_size;
+ Page page;
char *type;
if (!superuser())
@@ -54,14 +56,7 @@ brin_page_type(PG_FUNCTION_ARGS)
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
errmsg("must be superuser to use raw page functions")));
- raw_page_size = VARSIZE(raw_page) - VARHDRSZ;
-
- if (raw_page_size != BLCKSZ)
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("input page too small"),
- errdetail("Expected size %d, got %d",
- BLCKSZ, raw_page_size)));
+ page = get_page_from_raw(raw_page);
switch (BrinPageType(page))
{
@@ -89,19 +84,7 @@ brin_page_type(PG_FUNCTION_ARGS)
static Page
verify_brin_page(bytea *raw_page, uint16 type, const char *strtype)
{
- Page page;
- int raw_page_size;
-
- raw_page_size = VARSIZE(raw_page) - VARHDRSZ;
-
- if (raw_page_size != BLCKSZ)
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("input page too small"),
- errdetail("Expected size %d, got %d",
- BLCKSZ, raw_page_size)));
-
- page = VARDATA(raw_page);
+ Page page = get_page_from_raw(raw_page);
/* verify the special space says this page is what we want */
if (BrinPageType(page) != type)
@@ -143,6 +126,13 @@ brin_page_items(PG_FUNCTION_ARGS)
SetSingleFuncCall(fcinfo, 0);
indexRel = index_open(indexRelid, AccessShareLock);
+
+ if (!IS_BRIN(indexRel))
+ ereport(ERROR,
+ (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ errmsg("\"%s\" is not a %s index",
+ RelationGetRelationName(indexRel), "BRIN")));
+
bdesc = brin_build_desc(indexRel);
/* minimally verify the page we got */
diff --git a/contrib/pageinspect/btreefuncs.c b/contrib/pageinspect/btreefuncs.c
index 03debe336b..d9628dd664 100644
--- a/contrib/pageinspect/btreefuncs.c
+++ b/contrib/pageinspect/btreefuncs.c
@@ -206,8 +206,10 @@ bt_page_stats_internal(PG_FUNCTION_ARGS, enum pageinspect_version ext_version)
rel = relation_openrv(relrv, AccessShareLock);
if (!IS_INDEX(rel) || !IS_BTREE(rel))
- elog(ERROR, "relation \"%s\" is not a btree index",
- RelationGetRelationName(rel));
+ ereport(ERROR,
+ (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ errmsg("\"%s\" is not a %s index",
+ RelationGetRelationName(rel), "btree")));
/*
* Reject attempts to read non-local temporary relations; we would be
@@ -476,8 +478,10 @@ bt_page_items_internal(PG_FUNCTION_ARGS, enum pageinspect_version ext_version)
rel = relation_openrv(relrv, AccessShareLock);
if (!IS_INDEX(rel) || !IS_BTREE(rel))
- elog(ERROR, "relation \"%s\" is not a btree index",
- RelationGetRelationName(rel));
+ ereport(ERROR,
+ (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ errmsg("\"%s\" is not a %s index",
+ RelationGetRelationName(rel), "btree")));
/*
* Reject attempts to read non-local temporary relations; we would be
@@ -588,7 +592,6 @@ bt_page_items_bytea(PG_FUNCTION_ARGS)
Datum result;
FuncCallContext *fctx;
struct user_args *uargs;
- int raw_page_size;
if (!superuser())
ereport(ERROR,
@@ -601,19 +604,12 @@ bt_page_items_bytea(PG_FUNCTION_ARGS)
MemoryContext mctx;
TupleDesc tupleDesc;
- raw_page_size = VARSIZE(raw_page) - VARHDRSZ;
-
- if (raw_page_size < SizeOfPageHeaderData)
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("input page too small (%d bytes)", raw_page_size)));
-
fctx = SRF_FIRSTCALL_INIT();
mctx = MemoryContextSwitchTo(fctx->multi_call_memory_ctx);
uargs = palloc(sizeof(struct user_args));
- uargs->page = VARDATA(raw_page);
+ uargs->page = get_page_from_raw(raw_page);
uargs->offset = FirstOffsetNumber;
@@ -698,8 +694,10 @@ bt_metap(PG_FUNCTION_ARGS)
rel = relation_openrv(relrv, AccessShareLock);
if (!IS_INDEX(rel) || !IS_BTREE(rel))
- elog(ERROR, "relation \"%s\" is not a btree index",
- RelationGetRelationName(rel));
+ ereport(ERROR,
+ (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ errmsg("\"%s\" is not a %s index",
+ RelationGetRelationName(rel), "btree")));
/*
* Reject attempts to read non-local temporary relations; we would be
diff --git a/contrib/pageinspect/expected/brin.out b/contrib/pageinspect/expected/brin.out
index 71eb190380..10cd36c177 100644
--- a/contrib/pageinspect/expected/brin.out
+++ b/contrib/pageinspect/expected/brin.out
@@ -48,4 +48,8 @@ SELECT * FROM brin_page_items(get_raw_page('test1_a_idx', 2), 'test1_a_idx')
1 | 0 | 1 | f | f | f | {1 .. 1}
(1 row)
+-- Failure for non-BRIN index.
+CREATE INDEX test1_a_btree ON test1 (a);
+SELECT brin_page_items(get_raw_page('test1_a_btree', 0), 'test1_a_btree');
+ERROR: "test1_a_btree" is not a BRIN index
DROP TABLE test1;
diff --git a/contrib/pageinspect/expected/btree.out b/contrib/pageinspect/expected/btree.out
index c60bc88560..48a9e7a49e 100644
--- a/contrib/pageinspect/expected/btree.out
+++ b/contrib/pageinspect/expected/btree.out
@@ -70,4 +70,19 @@ tids |
SELECT * FROM bt_page_items(get_raw_page('test1_a_idx', 2));
ERROR: block number 2 is out of range for relation "test1_a_idx"
+-- Failure when using a non-btree index.
+CREATE INDEX test1_a_hash ON test1 USING hash(a);
+SELECT bt_metap('test1_a_hash');
+ERROR: "test1_a_hash" is not a btree index
+SELECT bt_page_stats('test1_a_hash', 0);
+ERROR: "test1_a_hash" is not a btree index
+SELECT bt_page_items('test1_a_hash', 0);
+ERROR: "test1_a_hash" is not a btree index
+-- Failure with incorrect page size
+-- Suppress the DETAIL message, to allow the tests to work across various
+-- default page sizes.
+\set VERBOSITY terse
+SELECT bt_page_items('aaa'::bytea);
+ERROR: invalid page size
+\set VERBOSITY default
DROP TABLE test1;
diff --git a/contrib/pageinspect/expected/gin.out b/contrib/pageinspect/expected/gin.out
index ef7570b972..1e8a22f240 100644
--- a/contrib/pageinspect/expected/gin.out
+++ b/contrib/pageinspect/expected/gin.out
@@ -36,3 +36,14 @@ FROM gin_leafpage_items(get_raw_page('test1_y_idx',
?column? | t
DROP TABLE test1;
+-- Failure with incorrect page size
+-- Suppress the DETAIL message, to allow the tests to work across various
+-- default page sizes.
+\set VERBOSITY terse
+SELECT gin_leafpage_items('aaa'::bytea);
+ERROR: invalid page size
+SELECT gin_metapage_info('bbb'::bytea);
+ERROR: invalid page size
+SELECT gin_page_opaque_info('ccc'::bytea);
+ERROR: invalid page size
+\set VERBOSITY default
diff --git a/contrib/pageinspect/expected/gist.out b/contrib/pageinspect/expected/gist.out
index 93abfcdd5e..7cfcb9ed8c 100644
--- a/contrib/pageinspect/expected/gist.out
+++ b/contrib/pageinspect/expected/gist.out
@@ -64,4 +64,19 @@ SELECT itemoffset, ctid, itemlen FROM gist_page_items_bytea(get_raw_page('test_g
6 | (6,65535) | 40
(6 rows)
+-- Failure with non-GiST index.
+CREATE INDEX test_gist_btree on test_gist(t);
+SELECT gist_page_items(get_raw_page('test_gist_btree', 0), 'test_gist_btree');
+ERROR: "test_gist_btree" is not a GiST index
+-- Failure with incorrect page size
+-- Suppress the DETAIL message, to allow the tests to work across various
+-- default page sizes.
+\set VERBOSITY terse
+SELECT gist_page_items_bytea('aaa'::bytea);
+ERROR: invalid page size
+SELECT gist_page_items('aaa'::bytea, 'test_gist_idx'::regclass);
+ERROR: invalid page size
+SELECT gist_page_opaque_info('aaa'::bytea);
+ERROR: invalid page size
+\set VERBOSITY default
DROP TABLE test_gist;
diff --git a/contrib/pageinspect/expected/hash.out b/contrib/pageinspect/expected/hash.out
index bd0628d013..e45034ad05 100644
--- a/contrib/pageinspect/expected/hash.out
+++ b/contrib/pageinspect/expected/hash.out
@@ -163,4 +163,21 @@ SELECT * FROM hash_page_items(get_raw_page('test_hash_a_idx', 4));
SELECT * FROM hash_page_items(get_raw_page('test_hash_a_idx', 5));
ERROR: page is not a hash bucket or overflow page
+-- Failure with non-hash index
+CREATE INDEX test_hash_a_btree ON test_hash USING btree (a);
+SELECT hash_bitmap_info('test_hash_a_btree', 0);
+ERROR: "test_hash_a_btree" is not a hash index
+-- Failure with incorrect page size
+-- Suppress the DETAIL message, to allow the tests to work across various
+-- default page sizes.
+\set VERBOSITY terse
+SELECT hash_metapage_info('aaa'::bytea);
+ERROR: invalid page size
+SELECT hash_page_items('bbb'::bytea);
+ERROR: invalid page size
+SELECT hash_page_stats('ccc'::bytea);
+ERROR: invalid page size
+SELECT hash_page_type('ddd'::bytea);
+ERROR: invalid page size
+\set VERBOSITY default
DROP TABLE test_hash;
diff --git a/contrib/pageinspect/expected/page.out b/contrib/pageinspect/expected/page.out
index 4e325ae56d..0aedf44dda 100644
--- a/contrib/pageinspect/expected/page.out
+++ b/contrib/pageinspect/expected/page.out
@@ -207,3 +207,14 @@ select tuple_data_split('test8'::regclass, t_data, t_infomask, t_infomask2, t_bi
(1 row)
drop table test8;
+-- Failure with incorrect page size
+-- Suppress the DETAIL message, to allow the tests to work across various
+-- default page sizes.
+\set VERBOSITY terse
+SELECT fsm_page_contents('aaa'::bytea);
+ERROR: invalid page size
+SELECT page_checksum('bbb'::bytea, 0);
+ERROR: invalid page size
+SELECT page_header('ccc'::bytea);
+ERROR: invalid page size
+\set VERBOSITY default
diff --git a/contrib/pageinspect/fsmfuncs.c b/contrib/pageinspect/fsmfuncs.c
index dadd62aa20..b914da1d4a 100644
--- a/contrib/pageinspect/fsmfuncs.c
+++ b/contrib/pageinspect/fsmfuncs.c
@@ -36,6 +36,7 @@ fsm_page_contents(PG_FUNCTION_ARGS)
{
bytea *raw_page = PG_GETARG_BYTEA_P(0);
StringInfoData sinfo;
+ Page page;
FSMPage fsmpage;
int i;
@@ -44,7 +45,8 @@ fsm_page_contents(PG_FUNCTION_ARGS)
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
errmsg("must be superuser to use raw page functions")));
- fsmpage = (FSMPage) PageGetContents(VARDATA(raw_page));
+ page = get_page_from_raw(raw_page);
+ fsmpage = (FSMPage) PageGetContents(page);
initStringInfo(&sinfo);
diff --git a/contrib/pageinspect/gistfuncs.c b/contrib/pageinspect/gistfuncs.c
index 10d6dd44d4..a31cff47fe 100644
--- a/contrib/pageinspect/gistfuncs.c
+++ b/contrib/pageinspect/gistfuncs.c
@@ -14,6 +14,7 @@
#include "access/htup.h"
#include "access/relation.h"
#include "catalog/namespace.h"
+#include "catalog/pg_am_d.h"
#include "funcapi.h"
#include "miscadmin.h"
#include "pageinspect.h"
@@ -28,6 +29,8 @@ PG_FUNCTION_INFO_V1(gist_page_opaque_info);
PG_FUNCTION_INFO_V1(gist_page_items);
PG_FUNCTION_INFO_V1(gist_page_items_bytea);
+#define IS_GIST(r) ((r)->rd_rel->relam == GIST_AM_OID)
+
#define ItemPointerGetDatum(X) PointerGetDatum(X)
@@ -174,6 +177,12 @@ gist_page_items(PG_FUNCTION_ARGS)
/* Open the relation */
indexRel = index_open(indexRelid, AccessShareLock);
+ if (!IS_GIST(indexRel))
+ ereport(ERROR,
+ (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ errmsg("\"%s\" is not a %s index",
+ RelationGetRelationName(indexRel), "GiST")));
+
page = get_page_from_raw(raw_page);
/* Avoid bogus PageGetMaxOffsetNumber() call with deleted pages */
diff --git a/contrib/pageinspect/hashfuncs.c b/contrib/pageinspect/hashfuncs.c
index 090eba4a93..ff73c363fc 100644
--- a/contrib/pageinspect/hashfuncs.c
+++ b/contrib/pageinspect/hashfuncs.c
@@ -417,8 +417,10 @@ hash_bitmap_info(PG_FUNCTION_ARGS)
indexRel = index_open(indexRelid, AccessShareLock);
if (!IS_HASH(indexRel))
- elog(ERROR, "relation \"%s\" is not a hash index",
- RelationGetRelationName(indexRel));
+ ereport(ERROR,
+ (errcode(ERRCODE_WRONG_OBJECT_TYPE),
+ errmsg("\"%s\" is not a %s index",
+ RelationGetRelationName(indexRel), "hash")));
if (RELATION_IS_OTHER_TEMP(indexRel))
ereport(ERROR,
diff --git a/contrib/pageinspect/rawpage.c b/contrib/pageinspect/rawpage.c
index 7e41af045f..92ffb2d930 100644
--- a/contrib/pageinspect/rawpage.c
+++ b/contrib/pageinspect/rawpage.c
@@ -246,7 +246,6 @@ Datum
page_header(PG_FUNCTION_ARGS)
{
bytea *raw_page = PG_GETARG_BYTEA_P(0);
- int raw_page_size;
TupleDesc tupdesc;
@@ -263,18 +262,7 @@ page_header(PG_FUNCTION_ARGS)
(errcode(ERRCODE_INSUFFICIENT_PRIVILEGE),
errmsg("must be superuser to use raw page functions")));
- raw_page_size = VARSIZE(raw_page) - VARHDRSZ;
-
- /*
- * Check that enough data was supplied, so that we don't try to access
- * fields outside the supplied buffer.
- */
- if (raw_page_size < SizeOfPageHeaderData)
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("input page too small (%d bytes)", raw_page_size)));
-
- page = (PageHeader) VARDATA(raw_page);
+ page = (PageHeader) get_page_from_raw(raw_page);
/* Build a tuple descriptor for our result type */
if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
@@ -350,8 +338,7 @@ page_checksum_internal(PG_FUNCTION_ARGS, enum pageinspect_version ext_version)
{
bytea *raw_page = PG_GETARG_BYTEA_P(0);
int64 blkno = (ext_version == PAGEINSPECT_V1_8 ? PG_GETARG_UINT32(1) : PG_GETARG_INT64(1));
- int raw_page_size;
- PageHeader page;
+ Page page;
if (!superuser())
ereport(ERROR,
@@ -363,17 +350,7 @@ page_checksum_internal(PG_FUNCTION_ARGS, enum pageinspect_version ext_version)
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("invalid block number")));
- raw_page_size = VARSIZE(raw_page) - VARHDRSZ;
-
- /*
- * Check that the supplied page is of the right size.
- */
- if (raw_page_size != BLCKSZ)
- ereport(ERROR,
- (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("incorrect size of input page (%d bytes)", raw_page_size)));
-
- page = (PageHeader) VARDATA(raw_page);
+ page = get_page_from_raw(raw_page);
PG_RETURN_INT16(pg_checksum_page((char *) page, blkno));
}
diff --git a/contrib/pageinspect/sql/brin.sql b/contrib/pageinspect/sql/brin.sql
index 735bc3b673..8717229c5d 100644
--- a/contrib/pageinspect/sql/brin.sql
+++ b/contrib/pageinspect/sql/brin.sql
@@ -15,4 +15,8 @@ SELECT * FROM brin_revmap_data(get_raw_page('test1_a_idx', 1)) LIMIT 5;
SELECT * FROM brin_page_items(get_raw_page('test1_a_idx', 2), 'test1_a_idx')
ORDER BY blknum, attnum LIMIT 5;
+-- Failure for non-BRIN index.
+CREATE INDEX test1_a_btree ON test1 (a);
+SELECT brin_page_items(get_raw_page('test1_a_btree', 0), 'test1_a_btree');
+
DROP TABLE test1;
diff --git a/contrib/pageinspect/sql/btree.sql b/contrib/pageinspect/sql/btree.sql
index 9635917959..a0b8ba1d0f 100644
--- a/contrib/pageinspect/sql/btree.sql
+++ b/contrib/pageinspect/sql/btree.sql
@@ -21,4 +21,17 @@ SELECT * FROM bt_page_items(get_raw_page('test1_a_idx', 0));
SELECT * FROM bt_page_items(get_raw_page('test1_a_idx', 1));
SELECT * FROM bt_page_items(get_raw_page('test1_a_idx', 2));
+-- Failure when using a non-btree index.
+CREATE INDEX test1_a_hash ON test1 USING hash(a);
+SELECT bt_metap('test1_a_hash');
+SELECT bt_page_stats('test1_a_hash', 0);
+SELECT bt_page_items('test1_a_hash', 0);
+
+-- Failure with incorrect page size
+-- Suppress the DETAIL message, to allow the tests to work across various
+-- default page sizes.
+\set VERBOSITY terse
+SELECT bt_page_items('aaa'::bytea);
+\set VERBOSITY default
+
DROP TABLE test1;
diff --git a/contrib/pageinspect/sql/gin.sql b/contrib/pageinspect/sql/gin.sql
index 423f5c5749..5b74a04bc6 100644
--- a/contrib/pageinspect/sql/gin.sql
+++ b/contrib/pageinspect/sql/gin.sql
@@ -19,3 +19,12 @@ FROM gin_leafpage_items(get_raw_page('test1_y_idx',
current_setting('block_size')::bigint)::int - 1));
DROP TABLE test1;
+
+-- Failure with incorrect page size
+-- Suppress the DETAIL message, to allow the tests to work across various
+-- default page sizes.
+\set VERBOSITY terse
+SELECT gin_leafpage_items('aaa'::bytea);
+SELECT gin_metapage_info('bbb'::bytea);
+SELECT gin_page_opaque_info('ccc'::bytea);
+\set VERBOSITY default
diff --git a/contrib/pageinspect/sql/gist.sql b/contrib/pageinspect/sql/gist.sql
index 3e83239a9d..1ea45851f4 100644
--- a/contrib/pageinspect/sql/gist.sql
+++ b/contrib/pageinspect/sql/gist.sql
@@ -26,4 +26,17 @@ SELECT * FROM gist_page_items(get_raw_page('test_gist_idx', 1), 'test_gist_idx')
-- platform-dependent (endianness), so omit the actual key data from the output.
SELECT itemoffset, ctid, itemlen FROM gist_page_items_bytea(get_raw_page('test_gist_idx', 0));
+-- Failure with non-GiST index.
+CREATE INDEX test_gist_btree on test_gist(t);
+SELECT gist_page_items(get_raw_page('test_gist_btree', 0), 'test_gist_btree');
+
+-- Failure with incorrect page size
+-- Suppress the DETAIL message, to allow the tests to work across various
+-- default page sizes.
+\set VERBOSITY terse
+SELECT gist_page_items_bytea('aaa'::bytea);
+SELECT gist_page_items('aaa'::bytea, 'test_gist_idx'::regclass);
+SELECT gist_page_opaque_info('aaa'::bytea);
+\set VERBOSITY default
+
DROP TABLE test_gist;
diff --git a/contrib/pageinspect/sql/hash.sql b/contrib/pageinspect/sql/hash.sql
index 64f33f1d52..537d68ce2e 100644
--- a/contrib/pageinspect/sql/hash.sql
+++ b/contrib/pageinspect/sql/hash.sql
@@ -78,5 +78,18 @@ SELECT * FROM hash_page_items(get_raw_page('test_hash_a_idx', 3));
SELECT * FROM hash_page_items(get_raw_page('test_hash_a_idx', 4));
SELECT * FROM hash_page_items(get_raw_page('test_hash_a_idx', 5));
+-- Failure with non-hash index
+CREATE INDEX test_hash_a_btree ON test_hash USING btree (a);
+SELECT hash_bitmap_info('test_hash_a_btree', 0);
+
+-- Failure with incorrect page size
+-- Suppress the DETAIL message, to allow the tests to work across various
+-- default page sizes.
+\set VERBOSITY terse
+SELECT hash_metapage_info('aaa'::bytea);
+SELECT hash_page_items('bbb'::bytea);
+SELECT hash_page_stats('ccc'::bytea);
+SELECT hash_page_type('ddd'::bytea);
+\set VERBOSITY default
DROP TABLE test_hash;
diff --git a/contrib/pageinspect/sql/page.sql b/contrib/pageinspect/sql/page.sql
index d333b763d7..15a90455e5 100644
--- a/contrib/pageinspect/sql/page.sql
+++ b/contrib/pageinspect/sql/page.sql
@@ -82,3 +82,12 @@ select t_bits, t_data from heap_page_items(get_raw_page('test8', 0));
select tuple_data_split('test8'::regclass, t_data, t_infomask, t_infomask2, t_bits)
from heap_page_items(get_raw_page('test8', 0));
drop table test8;
+
+-- Failure with incorrect page size
+-- Suppress the DETAIL message, to allow the tests to work across various
+-- default page sizes.
+\set VERBOSITY terse
+SELECT fsm_page_contents('aaa'::bytea);
+SELECT page_checksum('bbb'::bytea, 0);
+SELECT page_header('ccc'::bytea);
+\set VERBOSITY default
--
2.35.1
On Tue, Mar 15, 2022 at 06:32:44PM +0900, Michael Paquier wrote:
+ if (!IS_BRIN(indexRel)) + ereport(ERROR, + (errcode(ERRCODE_WRONG_OBJECT_TYPE), + errmsg("\"%s\" is not a %s index", + RelationGetRelationName(indexRel), "BRIN")));
If it were me, I'd write this without the extra parens around (errcode()).
+-- Suppress the DETAIL message, to allow the tests to work across various +-- default page sizes.
I think you mean "various non-default page sizes" or "various page sizes".
--
Justin
On Tue, Mar 15, 2022 at 06:56:46AM -0500, Justin Pryzby wrote:
On Tue, Mar 15, 2022 at 06:32:44PM +0900, Michael Paquier wrote:
+-- Suppress the DETAIL message, to allow the tests to work across various +-- default page sizes.I think you mean "various non-default page sizes" or "various page sizes".
Thanks. Indeed, this sounded a bit weird. I have been able to look
at all that this morning and backpatched this first part.
--
Michael
On Mon, Feb 21, 2022 at 10:00:00AM +0300, Alexander Lakhin wrote:
Could you please confirm before committing the patchset that it fixes
the bug #16527 [1]? Or maybe I could check it?
(Original patch proposed by Daria doesn't cover that case, but if the
patch going to be improved, probably it's worth fixing that bug too.)
FWIW, I think that this one has been fixed by 076f4d9, where we make
sure that the page is correctly aligned. Are you still seeing this
problem?
--
Michael
On Wed, Feb 23, 2022 at 12:09:02PM +0500, Daria Lepikhova wrote:
And one more addition. In the previous version of the patch, I forgot to add
tests for the gist index, but the described problem is also relevant for it.
So, I have looked at this second part of the thread, and concluded
that we should not fail for empty pages. First, we fetch pages from
the buffer pool in normal mode, where empty pages are valid. There is
also a second point in favor of doing so: code paths dedicated to hash
indexes already do that, marking such pages as simply "unused". The
proposal from Julien upthread sounds cleaner to me though in the long
run, as NULL gives the user the possibility to do a full-table scan
with simple clauses to filter out anything found as NULL.
Painting more PageIsNew() across the place requires a bit more work
than a simple ereport(ERROR) in get_page_from_raw(), of course, but
the gain is the portability of the functions.
(One can have a lot of fun playing with random inputs and breaking
most code paths, but that's not new.)
--
Michael
Attachments:
pageinspect-zeros-v2.patchtext/x-diff; charset=us-asciiDownload
diff --git a/contrib/pageinspect/brinfuncs.c b/contrib/pageinspect/brinfuncs.c
index bf12901ac3..87e75f663c 100644
--- a/contrib/pageinspect/brinfuncs.c
+++ b/contrib/pageinspect/brinfuncs.c
@@ -58,6 +58,9 @@ brin_page_type(PG_FUNCTION_ARGS)
page = get_page_from_raw(raw_page);
+ if (PageIsNew(page))
+ PG_RETURN_NULL();
+
switch (BrinPageType(page))
{
case BRIN_PAGETYPE_META:
@@ -86,6 +89,9 @@ verify_brin_page(bytea *raw_page, uint16 type, const char *strtype)
{
Page page = get_page_from_raw(raw_page);
+ if (PageIsNew(page))
+ return page;
+
/* verify the special space says this page is what we want */
if (BrinPageType(page) != type)
ereport(ERROR,
@@ -138,6 +144,13 @@ brin_page_items(PG_FUNCTION_ARGS)
/* minimally verify the page we got */
page = verify_brin_page(raw_page, BRIN_PAGETYPE_REGULAR, "regular");
+ if (PageIsNew(page))
+ {
+ brin_free_desc(bdesc);
+ index_close(indexRel, AccessShareLock);
+ PG_RETURN_NULL();
+ }
+
/*
* Initialize output functions for all indexed datatypes; simplifies
* calling them later.
@@ -313,6 +326,9 @@ brin_metapage_info(PG_FUNCTION_ARGS)
page = verify_brin_page(raw_page, BRIN_PAGETYPE_META, "metapage");
+ if (PageIsNew(page))
+ PG_RETURN_NULL();
+
/* Build a tuple descriptor for our result type */
if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
elog(ERROR, "return type must be a row type");
@@ -364,6 +380,12 @@ brin_revmap_data(PG_FUNCTION_ARGS)
/* minimally verify the page we got */
page = verify_brin_page(raw_page, BRIN_PAGETYPE_REVMAP, "revmap");
+ if (PageIsNew(page))
+ {
+ MemoryContextSwitchTo(mctx);
+ PG_RETURN_NULL();
+ }
+
state = palloc(sizeof(*state));
state->tids = ((RevmapContents *) PageGetContents(page))->rm_tids;
state->idx = 0;
diff --git a/contrib/pageinspect/btreefuncs.c b/contrib/pageinspect/btreefuncs.c
index d9628dd664..dde640fd33 100644
--- a/contrib/pageinspect/btreefuncs.c
+++ b/contrib/pageinspect/btreefuncs.c
@@ -611,6 +611,12 @@ bt_page_items_bytea(PG_FUNCTION_ARGS)
uargs->page = get_page_from_raw(raw_page);
+ if (PageIsNew(uargs->page))
+ {
+ MemoryContextSwitchTo(mctx);
+ PG_RETURN_NULL();
+ }
+
uargs->offset = FirstOffsetNumber;
opaque = (BTPageOpaque) PageGetSpecialPointer(uargs->page);
diff --git a/contrib/pageinspect/expected/brin.out b/contrib/pageinspect/expected/brin.out
index 10cd36c177..15786a0c0d 100644
--- a/contrib/pageinspect/expected/brin.out
+++ b/contrib/pageinspect/expected/brin.out
@@ -52,4 +52,29 @@ SELECT * FROM brin_page_items(get_raw_page('test1_a_idx', 2), 'test1_a_idx')
CREATE INDEX test1_a_btree ON test1 (a);
SELECT brin_page_items(get_raw_page('test1_a_btree', 0), 'test1_a_btree');
ERROR: "test1_a_btree" is not a BRIN index
+-- Test for empty pages with raw input.
+SHOW block_size \gset
+SELECT brin_page_type(decode(repeat('00', :block_size), 'hex'));
+ brin_page_type
+----------------
+
+(1 row)
+
+SELECT brin_page_items(decode(repeat('00', :block_size), 'hex'), 'test1_a_idx');
+ brin_page_items
+-----------------
+(0 rows)
+
+SELECT brin_metapage_info(decode(repeat('00', :block_size), 'hex'));
+ brin_metapage_info
+--------------------
+
+(1 row)
+
+SELECT brin_revmap_data(decode(repeat('00', :block_size), 'hex'));
+ brin_revmap_data
+------------------
+
+(1 row)
+
DROP TABLE test1;
diff --git a/contrib/pageinspect/expected/btree.out b/contrib/pageinspect/expected/btree.out
index 80b3dfe861..8815b56b15 100644
--- a/contrib/pageinspect/expected/btree.out
+++ b/contrib/pageinspect/expected/btree.out
@@ -85,4 +85,10 @@ ERROR: "test1_a_hash" is not a btree index
SELECT bt_page_items('aaa'::bytea);
ERROR: invalid page size
\set VERBOSITY default
+-- Test for empty pages with raw input.
+SHOW block_size \gset
+SELECT bt_page_items(decode(repeat('00', :block_size), 'hex'));
+-[ RECORD 1 ]-+-
+bt_page_items |
+
DROP TABLE test1;
diff --git a/contrib/pageinspect/expected/gin.out b/contrib/pageinspect/expected/gin.out
index 802f48284b..ce36175366 100644
--- a/contrib/pageinspect/expected/gin.out
+++ b/contrib/pageinspect/expected/gin.out
@@ -47,3 +47,17 @@ ERROR: invalid page size
SELECT gin_page_opaque_info('ccc'::bytea);
ERROR: invalid page size
\set VERBOSITY default
+-- Test for empty pages with raw input.
+SHOW block_size \gset
+SELECT gin_leafpage_items(decode(repeat('00', :block_size), 'hex'));
+-[ RECORD 1 ]------+-
+gin_leafpage_items |
+
+SELECT gin_metapage_info(decode(repeat('00', :block_size), 'hex'));
+-[ RECORD 1 ]-----+-
+gin_metapage_info |
+
+SELECT gin_page_opaque_info(decode(repeat('00', :block_size), 'hex'));
+-[ RECORD 1 ]--------+-
+gin_page_opaque_info |
+
diff --git a/contrib/pageinspect/expected/gist.out b/contrib/pageinspect/expected/gist.out
index 3f33e04066..648f740500 100644
--- a/contrib/pageinspect/expected/gist.out
+++ b/contrib/pageinspect/expected/gist.out
@@ -79,4 +79,22 @@ ERROR: invalid page size
SELECT gist_page_opaque_info('aaa'::bytea);
ERROR: invalid page size
\set VERBOSITY default
+-- Test for empty pages with raw input.
+SHOW block_size \gset
+SELECT gist_page_items_bytea(decode(repeat('00', :block_size), 'hex'));
+ gist_page_items_bytea
+-----------------------
+(0 rows)
+
+SELECT gist_page_items(decode(repeat('00', :block_size), 'hex'), 'test_gist_idx'::regclass);
+ gist_page_items
+-----------------
+(0 rows)
+
+SELECT gist_page_opaque_info(decode(repeat('00', :block_size), 'hex'));
+ gist_page_opaque_info
+-----------------------
+
+(1 row)
+
DROP TABLE test_gist;
diff --git a/contrib/pageinspect/expected/hash.out b/contrib/pageinspect/expected/hash.out
index 6c606630dd..4727b2ffe9 100644
--- a/contrib/pageinspect/expected/hash.out
+++ b/contrib/pageinspect/expected/hash.out
@@ -180,4 +180,16 @@ ERROR: invalid page size
SELECT hash_page_type('ddd'::bytea);
ERROR: invalid page size
\set VERBOSITY default
+-- Test for empty pages with raw input.
+SHOW block_size \gset
+SELECT hash_metapage_info(decode(repeat('00', :block_size), 'hex'));
+ERROR: page is not a hash meta page
+SELECT hash_page_items(decode(repeat('00', :block_size), 'hex'));
+ERROR: page is not a hash bucket or overflow page
+SELECT hash_page_stats(decode(repeat('00', :block_size), 'hex'));
+ERROR: page is not a hash bucket or overflow page
+SELECT hash_page_type(decode(repeat('00', :block_size), 'hex'));
+-[ RECORD 1 ]--+-------
+hash_page_type | unused
+
DROP TABLE test_hash;
diff --git a/contrib/pageinspect/expected/page.out b/contrib/pageinspect/expected/page.out
index 9bbdda7f18..7a21fc9ffc 100644
--- a/contrib/pageinspect/expected/page.out
+++ b/contrib/pageinspect/expected/page.out
@@ -218,3 +218,23 @@ ERROR: invalid page size
SELECT page_header('ccc'::bytea);
ERROR: invalid page size
\set VERBOSITY default
+-- Test for empty pages with raw input.
+SHOW block_size \gset
+SELECT fsm_page_contents(decode(repeat('00', :block_size), 'hex'));
+ fsm_page_contents
+-------------------
+
+(1 row)
+
+SELECT page_header(decode(repeat('00', :block_size), 'hex'));
+ page_header
+-----------------------
+ (0/0,0,0,0,0,0,0,0,0)
+(1 row)
+
+SELECT page_checksum(decode(repeat('00', :block_size), 'hex'), 1);
+ page_checksum
+---------------
+
+(1 row)
+
diff --git a/contrib/pageinspect/fsmfuncs.c b/contrib/pageinspect/fsmfuncs.c
index b914da1d4a..2b9679e454 100644
--- a/contrib/pageinspect/fsmfuncs.c
+++ b/contrib/pageinspect/fsmfuncs.c
@@ -46,6 +46,10 @@ fsm_page_contents(PG_FUNCTION_ARGS)
errmsg("must be superuser to use raw page functions")));
page = get_page_from_raw(raw_page);
+
+ if (PageIsNew(page))
+ PG_RETURN_NULL();
+
fsmpage = (FSMPage) PageGetContents(page);
initStringInfo(&sinfo);
diff --git a/contrib/pageinspect/ginfuncs.c b/contrib/pageinspect/ginfuncs.c
index f55128857e..d8825bccb2 100644
--- a/contrib/pageinspect/ginfuncs.c
+++ b/contrib/pageinspect/ginfuncs.c
@@ -49,6 +49,9 @@ gin_metapage_info(PG_FUNCTION_ARGS)
page = get_page_from_raw(raw_page);
+ if (PageIsNew(page))
+ PG_RETURN_NULL();
+
opaq = (GinPageOpaque) PageGetSpecialPointer(page);
if (opaq->flags != GIN_META)
ereport(ERROR,
@@ -107,6 +110,9 @@ gin_page_opaque_info(PG_FUNCTION_ARGS)
page = get_page_from_raw(raw_page);
+ if (PageIsNew(page))
+ PG_RETURN_NULL();
+
opaq = (GinPageOpaque) PageGetSpecialPointer(page);
/* Build a tuple descriptor for our result type */
@@ -184,6 +190,12 @@ gin_leafpage_items(PG_FUNCTION_ARGS)
page = get_page_from_raw(raw_page);
+ if (PageIsNew(page))
+ {
+ MemoryContextSwitchTo(mctx);
+ PG_RETURN_NULL();
+ }
+
if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
diff --git a/contrib/pageinspect/gistfuncs.c b/contrib/pageinspect/gistfuncs.c
index a31cff47fe..4a93b38d7f 100644
--- a/contrib/pageinspect/gistfuncs.c
+++ b/contrib/pageinspect/gistfuncs.c
@@ -55,6 +55,9 @@ gist_page_opaque_info(PG_FUNCTION_ARGS)
page = get_page_from_raw(raw_page);
+ if (PageIsNew(page))
+ PG_RETURN_NULL();
+
opaq = (GISTPageOpaque) PageGetSpecialPointer(page);
/* Build a tuple descriptor for our result type */
@@ -113,6 +116,9 @@ gist_page_items_bytea(PG_FUNCTION_ARGS)
page = get_page_from_raw(raw_page);
+ if (PageIsNew(page))
+ PG_RETURN_NULL();
+
/* Avoid bogus PageGetMaxOffsetNumber() call with deleted pages */
if (GistPageIsDeleted(page))
elog(NOTICE, "page is deleted");
@@ -185,6 +191,12 @@ gist_page_items(PG_FUNCTION_ARGS)
page = get_page_from_raw(raw_page);
+ if (PageIsNew(page))
+ {
+ index_close(indexRel, AccessShareLock);
+ PG_RETURN_NULL();
+ }
+
/* Avoid bogus PageGetMaxOffsetNumber() call with deleted pages */
if (GistPageIsDeleted(page))
elog(NOTICE, "page is deleted");
diff --git a/contrib/pageinspect/rawpage.c b/contrib/pageinspect/rawpage.c
index 92ffb2d930..730a46b1d8 100644
--- a/contrib/pageinspect/rawpage.c
+++ b/contrib/pageinspect/rawpage.c
@@ -352,6 +352,9 @@ page_checksum_internal(PG_FUNCTION_ARGS, enum pageinspect_version ext_version)
page = get_page_from_raw(raw_page);
+ if (PageIsNew(page))
+ PG_RETURN_NULL();
+
PG_RETURN_INT16(pg_checksum_page((char *) page, blkno));
}
diff --git a/contrib/pageinspect/sql/brin.sql b/contrib/pageinspect/sql/brin.sql
index 8717229c5d..4e967a1fc5 100644
--- a/contrib/pageinspect/sql/brin.sql
+++ b/contrib/pageinspect/sql/brin.sql
@@ -19,4 +19,11 @@ SELECT * FROM brin_page_items(get_raw_page('test1_a_idx', 2), 'test1_a_idx')
CREATE INDEX test1_a_btree ON test1 (a);
SELECT brin_page_items(get_raw_page('test1_a_btree', 0), 'test1_a_btree');
+-- Test for empty pages with raw input.
+SHOW block_size \gset
+SELECT brin_page_type(decode(repeat('00', :block_size), 'hex'));
+SELECT brin_page_items(decode(repeat('00', :block_size), 'hex'), 'test1_a_idx');
+SELECT brin_metapage_info(decode(repeat('00', :block_size), 'hex'));
+SELECT brin_revmap_data(decode(repeat('00', :block_size), 'hex'));
+
DROP TABLE test1;
diff --git a/contrib/pageinspect/sql/btree.sql b/contrib/pageinspect/sql/btree.sql
index fdda777b9e..ce6e7c9982 100644
--- a/contrib/pageinspect/sql/btree.sql
+++ b/contrib/pageinspect/sql/btree.sql
@@ -34,4 +34,8 @@ SELECT bt_page_items('test1_a_hash', 0);
SELECT bt_page_items('aaa'::bytea);
\set VERBOSITY default
+-- Test for empty pages with raw input.
+SHOW block_size \gset
+SELECT bt_page_items(decode(repeat('00', :block_size), 'hex'));
+
DROP TABLE test1;
diff --git a/contrib/pageinspect/sql/gin.sql b/contrib/pageinspect/sql/gin.sql
index aadb07856d..6b128f3848 100644
--- a/contrib/pageinspect/sql/gin.sql
+++ b/contrib/pageinspect/sql/gin.sql
@@ -28,3 +28,9 @@ SELECT gin_leafpage_items('aaa'::bytea);
SELECT gin_metapage_info('bbb'::bytea);
SELECT gin_page_opaque_info('ccc'::bytea);
\set VERBOSITY default
+
+-- Test for empty pages with raw input.
+SHOW block_size \gset
+SELECT gin_leafpage_items(decode(repeat('00', :block_size), 'hex'));
+SELECT gin_metapage_info(decode(repeat('00', :block_size), 'hex'));
+SELECT gin_page_opaque_info(decode(repeat('00', :block_size), 'hex'));
diff --git a/contrib/pageinspect/sql/gist.sql b/contrib/pageinspect/sql/gist.sql
index 8abeb19140..6a4274b4e3 100644
--- a/contrib/pageinspect/sql/gist.sql
+++ b/contrib/pageinspect/sql/gist.sql
@@ -39,4 +39,10 @@ SELECT gist_page_items('aaa'::bytea, 'test_gist_idx'::regclass);
SELECT gist_page_opaque_info('aaa'::bytea);
\set VERBOSITY default
+-- Test for empty pages with raw input.
+SHOW block_size \gset
+SELECT gist_page_items_bytea(decode(repeat('00', :block_size), 'hex'));
+SELECT gist_page_items(decode(repeat('00', :block_size), 'hex'), 'test_gist_idx'::regclass);
+SELECT gist_page_opaque_info(decode(repeat('00', :block_size), 'hex'));
+
DROP TABLE test_gist;
diff --git a/contrib/pageinspect/sql/hash.sql b/contrib/pageinspect/sql/hash.sql
index fcddd706ae..3cb7940513 100644
--- a/contrib/pageinspect/sql/hash.sql
+++ b/contrib/pageinspect/sql/hash.sql
@@ -92,4 +92,11 @@ SELECT hash_page_stats('ccc'::bytea);
SELECT hash_page_type('ddd'::bytea);
\set VERBOSITY default
+-- Test for empty pages with raw input.
+SHOW block_size \gset
+SELECT hash_metapage_info(decode(repeat('00', :block_size), 'hex'));
+SELECT hash_page_items(decode(repeat('00', :block_size), 'hex'));
+SELECT hash_page_stats(decode(repeat('00', :block_size), 'hex'));
+SELECT hash_page_type(decode(repeat('00', :block_size), 'hex'));
+
DROP TABLE test_hash;
diff --git a/contrib/pageinspect/sql/page.sql b/contrib/pageinspect/sql/page.sql
index 38b1681541..2dda6895bf 100644
--- a/contrib/pageinspect/sql/page.sql
+++ b/contrib/pageinspect/sql/page.sql
@@ -91,3 +91,9 @@ SELECT fsm_page_contents('aaa'::bytea);
SELECT page_checksum('bbb'::bytea, 0);
SELECT page_header('ccc'::bytea);
\set VERBOSITY default
+
+-- Test for empty pages with raw input.
+SHOW block_size \gset
+SELECT fsm_page_contents(decode(repeat('00', :block_size), 'hex'));
+SELECT page_header(decode(repeat('00', :block_size), 'hex'));
+SELECT page_checksum(decode(repeat('00', :block_size), 'hex'), 1);
Hello Michael,
16.03.2022 10:39, Michael Paquier wrote:
On Mon, Feb 21, 2022 at 10:00:00AM +0300, Alexander Lakhin wrote:
Could you please confirm before committing the patchset that it fixes
the bug #16527 [1]? Or maybe I could check it?
(Original patch proposed by Daria doesn't cover that case, but if the
patch going to be improved, probably it's worth fixing that bug too.)FWIW, I think that this one has been fixed by 076f4d9, where we make
sure that the page is correctly aligned. Are you still seeing this
problem?
Yes, I've reproduced that issue on master (46d9bfb0).
pageinspect-zeros-v2.patch doesn't help too.
Best regards.
Alexander
On Wed, Mar 16, 2022 at 08:35:59PM +0300, Alexander Lakhin wrote:
Yes, I've reproduced that issue on master (46d9bfb0).
Okay. I'll try to look more closely at that, then. It feels like we
are just missing a MAXALIGN() in those code paths. Are you using any
specific CFLAGS or something environment-sensitive (macos, 32b, etc.)?
pageinspect-zeros-v2.patch doesn't help too.
Well, the scope is different, so that's not a surprise.
--
Michael
Hello Michael,
18.03.2022 05:06, Michael Paquier wrote:
Okay. I'll try to look more closely at that, then. It feels like we
are just missing a MAXALIGN() in those code paths. Are you using any
specific CFLAGS or something environment-sensitive (macos, 32b, etc.)?
No, just x86_64, Ubuntu 20.04, gcc 11, valgrind 3.15. I just put that
query in page.sql and see the server abort.
Best regards,
Alexander
On Fri, Mar 18, 2022 at 06:43:39AM +0300, Alexander Lakhin wrote:
Hello Michael,
No, just x86_64, Ubuntu 20.04, gcc 11, valgrind 3.15. I just put that query
in page.sql and see the server abort.
Bah, of course, I have missed the valgrind part of the problem.
The problem is that we attempt to verify a heap page here but its
pd_special is BLCKSZ. This causes BrinPageType() to grab back a
position of the area at the exact end of the page, via
PageGetSpecialPointer(), hence the failure in reading two bytes
outside the page. The result here is that the set of defenses in
verify_brin_page() is poor: we should at least make sure that the
special area is available for a read. As far as I can see, this is
also possible in bt_page_items_bytea(), gist_page_opaque_info(),
gin_metapage_info() and gin_page_opaque_info(). All those code paths
should be protected with some checks on PageGetSpecialSize(), I
guess, before attempting to read the special area of the page. Hash
indexes are protected by checking the expected size of the special
area, and one code path of btree relies on the opened relation to be a
btree index.
--
Michael
Hi,
On Wed, Mar 23, 2022 at 05:16:41PM +0900, Michael Paquier wrote:
On Fri, Mar 18, 2022 at 06:43:39AM +0300, Alexander Lakhin wrote:
Hello Michael,
No, just x86_64, Ubuntu 20.04, gcc 11, valgrind 3.15. I just put that query
in page.sql and see the server abort.Bah, of course, I have missed the valgrind part of the problem.
The problem is that we attempt to verify a heap page here but its
pd_special is BLCKSZ. This causes BrinPageType() to grab back a
position of the area at the exact end of the page, via
PageGetSpecialPointer(), hence the failure in reading two bytes
outside the page. The result here is that the set of defenses in
verify_brin_page() is poor: we should at least make sure that the
special area is available for a read. As far as I can see, this is
also possible in bt_page_items_bytea(), gist_page_opaque_info(),
gin_metapage_info() and gin_page_opaque_info(). All those code paths
should be protected with some checks on PageGetSpecialSize(), I
guess, before attempting to read the special area of the page. Hash
indexes are protected by checking the expected size of the special
area, and one code path of btree relies on the opened relation to be a
btree index.
I worked on a patch to fix the problem. The functions you mention were indeed
missing some check, but after digging a bit more I found that other problem
existed. For instance, feeding a btree page to a gist_page_items_bytea() (both
pages have the same special size) can be quite problematic too, as you can end
up accessing bogus data when retrieving the items. I also added additional
sanity checks with what is available (gist_page_id for gist, zero-level for
btree leaf pages and so on), and tried to cover everything with tests.
I'm a bit worried about the btree tests stability. I avoid emitting the level
found to help with that, but it still depends on what other AM will put in
their special page.
Attachments:
v1-0001-pageinspect-add-more-sanity-checks-when-accessing.patchtext/plain; charset=us-asciiDownload
From fa2841b83a57daf8de95e8b154afc2bf5dd60dda Mon Sep 17 00:00:00 2001
From: Julien Rouhaud <julien.rouhaud@free.fr>
Date: Thu, 24 Mar 2022 15:51:21 +0800
Subject: [PATCH v1] pageinspect: add more sanity checks when accessing pages.
As seen in bug #16527, feeding an incorrect page can lead to ouf-of-bound
reads when trying to access the page special area. To fix it, check that the
special area has the expected size before trying to access in in the functions
that weren't already doing so.
While at it, add other checks when possible to validate that the given page is
from the expected format to avoid other possible invalid access.
Author: Julien Rouhaud
Reported-By: Alexander Lakhin
Discussion: https://postgr.es/m/16527-ef7606186f0610a1%40postgresql.org
Discussion: https://postgr.es/m/150535d6-65ac-16ee-b28a-851965fc485b%40gmail.com
---
contrib/pageinspect/brinfuncs.c | 18 +++++++++++++
contrib/pageinspect/btreefuncs.c | 16 ++++++++++++
contrib/pageinspect/expected/brin.out | 12 +++++++++
contrib/pageinspect/expected/btree.out | 23 +++++++++++++++--
contrib/pageinspect/expected/gin.out | 9 +++++++
contrib/pageinspect/expected/gist.out | 11 ++++++++
contrib/pageinspect/expected/hash.out | 14 +++++++++++
contrib/pageinspect/ginfuncs.c | 16 ++++++++++++
contrib/pageinspect/gistfuncs.c | 35 ++++++++++++++++++++++++++
contrib/pageinspect/sql/brin.sql | 5 ++++
contrib/pageinspect/sql/btree.sql | 13 ++++++++--
contrib/pageinspect/sql/gin.sql | 3 +++
contrib/pageinspect/sql/gist.sql | 4 +++
contrib/pageinspect/sql/hash.sql | 11 ++++++++
14 files changed, 186 insertions(+), 4 deletions(-)
diff --git a/contrib/pageinspect/brinfuncs.c b/contrib/pageinspect/brinfuncs.c
index bf12901ac3..d4de80eee5 100644
--- a/contrib/pageinspect/brinfuncs.c
+++ b/contrib/pageinspect/brinfuncs.c
@@ -58,6 +58,15 @@ brin_page_type(PG_FUNCTION_ARGS)
page = get_page_from_raw(raw_page);
+ /* verify the special space has the expected size */
+ if (PageGetSpecialSize(page) != MAXALIGN(sizeof(BrinSpecialSpace)))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("input page is not a valid brin page"),
+ errdetail("Special size %d, expected %d",
+ (int) PageGetSpecialSize(page),
+ (int) MAXALIGN(sizeof(BrinSpecialSpace)))));
+
switch (BrinPageType(page))
{
case BRIN_PAGETYPE_META:
@@ -86,6 +95,15 @@ verify_brin_page(bytea *raw_page, uint16 type, const char *strtype)
{
Page page = get_page_from_raw(raw_page);
+ /* verify the special space has the expected size */
+ if (PageGetSpecialSize(page) != sizeof(BrinSpecialSpace))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("input page is not a valid brin page"),
+ errdetail("Special size %d, expected %d",
+ (int) PageGetSpecialSize(page),
+ (int) MAXALIGN(sizeof(BrinSpecialSpace)))));
+
/* verify the special space says this page is what we want */
if (BrinPageType(page) != type)
ereport(ERROR,
diff --git a/contrib/pageinspect/btreefuncs.c b/contrib/pageinspect/btreefuncs.c
index d9628dd664..9b618abe47 100644
--- a/contrib/pageinspect/btreefuncs.c
+++ b/contrib/pageinspect/btreefuncs.c
@@ -613,6 +613,15 @@ bt_page_items_bytea(PG_FUNCTION_ARGS)
uargs->offset = FirstOffsetNumber;
+ /* verify the special space has the expected size */
+ if (PageGetSpecialSize(uargs->page) != MAXALIGN(sizeof(BTPageOpaqueData)))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("input page is not a valid btree page"),
+ errdetail("Special size %d, expected %d",
+ (int) PageGetSpecialSize(uargs->page),
+ (int) MAXALIGN(sizeof(BTPageOpaqueData)))));
+
opaque = (BTPageOpaque) PageGetSpecialPointer(uargs->page);
if (P_ISMETA(opaque))
@@ -620,6 +629,12 @@ bt_page_items_bytea(PG_FUNCTION_ARGS)
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("block is a meta page")));
+ if (P_ISLEAF(opaque) && opaque->btpo_level != 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INDEX_CORRUPTED),
+ errmsg("index table contains corrupted page"),
+ errdetail("Page is leaf but found non-zero level")));
+
if (P_ISDELETED(opaque))
elog(NOTICE, "page is deleted");
@@ -631,6 +646,7 @@ bt_page_items_bytea(PG_FUNCTION_ARGS)
elog(NOTICE, "page from block is deleted");
fctx->max_calls = 0;
}
+
uargs->leafpage = P_ISLEAF(opaque);
uargs->rightmost = P_RIGHTMOST(opaque);
diff --git a/contrib/pageinspect/expected/brin.out b/contrib/pageinspect/expected/brin.out
index 10cd36c177..07a18feeb4 100644
--- a/contrib/pageinspect/expected/brin.out
+++ b/contrib/pageinspect/expected/brin.out
@@ -1,6 +1,9 @@
CREATE TABLE test1 (a int, b text);
INSERT INTO test1 VALUES (1, 'one');
CREATE INDEX test1_a_idx ON test1 USING brin (a);
+SELECT brin_page_type(get_raw_page('test1', 0));
+ERROR: input page is not a valid brin page
+DETAIL: Special size 0, expected 8
SELECT brin_page_type(get_raw_page('test1_a_idx', 0));
brin_page_type
----------------
@@ -19,6 +22,9 @@ SELECT brin_page_type(get_raw_page('test1_a_idx', 2));
regular
(1 row)
+SELECT * FROM brin_metapage_info(get_raw_page('test1', 0));
+ERROR: input page is not a valid brin page
+DETAIL: Special size 0, expected 8
SELECT * FROM brin_metapage_info(get_raw_page('test1_a_idx', 0));
magic | version | pagesperrange | lastrevmappage
------------+---------+---------------+----------------
@@ -28,6 +34,9 @@ SELECT * FROM brin_metapage_info(get_raw_page('test1_a_idx', 0));
SELECT * FROM brin_metapage_info(get_raw_page('test1_a_idx', 1));
ERROR: page is not a BRIN page of type "metapage"
DETAIL: Expected special type 0000f091, got 0000f092.
+SELECT * FROM brin_revmap_data(get_raw_page('test1', 0));
+ERROR: input page is not a valid brin page
+DETAIL: Special size 0, expected 8
SELECT * FROM brin_revmap_data(get_raw_page('test1_a_idx', 0)) LIMIT 5;
ERROR: page is not a BRIN page of type "revmap"
DETAIL: Expected special type 0000f092, got 0000f091.
@@ -41,6 +50,9 @@ SELECT * FROM brin_revmap_data(get_raw_page('test1_a_idx', 1)) LIMIT 5;
(0,0)
(5 rows)
+SELECT * FROM brin_page_items(get_raw_page('test1', 0), 'test1')
+ ORDER BY blknum, attnum LIMIT 5;
+ERROR: "test1" is not an index
SELECT * FROM brin_page_items(get_raw_page('test1_a_idx', 2), 'test1_a_idx')
ORDER BY blknum, attnum LIMIT 5;
itemoffset | blknum | attnum | allnulls | hasnulls | placeholder | value
diff --git a/contrib/pageinspect/expected/btree.out b/contrib/pageinspect/expected/btree.out
index 80b3dfe861..b9705aaae8 100644
--- a/contrib/pageinspect/expected/btree.out
+++ b/contrib/pageinspect/expected/btree.out
@@ -1,7 +1,9 @@
-CREATE TABLE test1 (a int8, b text);
-INSERT INTO test1 VALUES (72057594037927937, 'text');
+CREATE TABLE test1 (a int8, b int4range);
+INSERT INTO test1 VALUES (72057594037927937, '[0,1)');
CREATE INDEX test1_a_idx ON test1 USING btree (a);
\x
+SELECT * FROM bt_metap('test1');
+ERROR: "test1" is not a btree index
SELECT * FROM bt_metap('test1_a_idx');
-[ RECORD 1 ]-------------+-------
magic | 340322
@@ -14,6 +16,8 @@ last_cleanup_num_delpages | 0
last_cleanup_num_tuples | -1
allequalimage | t
+SELECT * FROM bt_page_stats('test1', 0);
+ERROR: "test1" is not a btree index
SELECT * FROM bt_page_stats('test1_a_idx', -1);
ERROR: invalid block number
SELECT * FROM bt_page_stats('test1_a_idx', 0);
@@ -34,6 +38,8 @@ btpo_flags | 3
SELECT * FROM bt_page_stats('test1_a_idx', 2);
ERROR: block number out of range
+SELECT * FROM bt_page_items('test1', 0);
+ERROR: "test1" is not a btree index
SELECT * FROM bt_page_items('test1_a_idx', -1);
ERROR: invalid block number
SELECT * FROM bt_page_items('test1_a_idx', 0);
@@ -52,6 +58,9 @@ tids |
SELECT * FROM bt_page_items('test1_a_idx', 2);
ERROR: block number out of range
+SELECT * FROM bt_page_items(get_raw_page('test1', 0));
+ERROR: input page is not a valid btree page
+DETAIL: Special size 0, expected 16
SELECT * FROM bt_page_items(get_raw_page('test1_a_idx', -1));
ERROR: invalid block number
SELECT * FROM bt_page_items(get_raw_page('test1_a_idx', 0));
@@ -78,6 +87,16 @@ SELECT bt_page_stats('test1_a_hash', 0);
ERROR: "test1_a_hash" is not a btree index
SELECT bt_page_items('test1_a_hash', 0);
ERROR: "test1_a_hash" is not a btree index
+SELECT bt_page_items(get_raw_page('test1_a_hash', 0));
+ERROR: block is a meta page
+CREATE INDEX test1_b_gist ON test1 USING gist(b);
+SELECT bt_page_items(get_raw_page('test1_b_gist', 0));
+ERROR: index table contains corrupted page
+DETAIL: Page is leaf but found non-zero level
+CREATE INDEX test1_a_brin ON test1 USING brin(a);
+SELECT bt_page_items(get_raw_page('test1_a_brin', 0));
+ERROR: input page is not a valid btree page
+DETAIL: Special size 8, expected 16
-- Failure with incorrect page size
-- Suppress the DETAIL message, to allow the tests to work across various
-- page sizes.
diff --git a/contrib/pageinspect/expected/gin.out b/contrib/pageinspect/expected/gin.out
index 802f48284b..7f67d979bc 100644
--- a/contrib/pageinspect/expected/gin.out
+++ b/contrib/pageinspect/expected/gin.out
@@ -2,6 +2,9 @@ CREATE TABLE test1 (x int, y int[]);
INSERT INTO test1 VALUES (1, ARRAY[11, 111]);
CREATE INDEX test1_y_idx ON test1 USING gin (y) WITH (fastupdate = off);
\x
+SELECT * FROM gin_metapage_info(get_raw_page('test1', 0));
+ERROR: input page is not a valid GIN metapage
+DETAIL: Special size 0, expected 8
SELECT * FROM gin_metapage_info(get_raw_page('test1_y_idx', 0));
-[ RECORD 1 ]----+-----------
pending_head | 4294967295
@@ -18,12 +21,18 @@ version | 2
SELECT * FROM gin_metapage_info(get_raw_page('test1_y_idx', 1));
ERROR: input page is not a GIN metapage
DETAIL: Flags 0002, expected 0008
+SELECT * FROM gin_page_opaque_info(get_raw_page('test1', 0));
+ERROR: input page is not a valid GIN data leaf page
+DETAIL: Special size 0, expected 8
SELECT * FROM gin_page_opaque_info(get_raw_page('test1_y_idx', 1));
-[ RECORD 1 ]---------
rightlink | 4294967295
maxoff | 0
flags | {leaf}
+SELECT * FROM gin_leafpage_items(get_raw_page('test1', 0));
+ERROR: input page is not a valid GIN data leaf page
+DETAIL: Special size 0, expected 8
SELECT * FROM gin_leafpage_items(get_raw_page('test1_y_idx', 1));
ERROR: input page is not a compressed GIN data leaf page
DETAIL: Flags 0002, expected 0083
diff --git a/contrib/pageinspect/expected/gist.out b/contrib/pageinspect/expected/gist.out
index 3f33e04066..d5fb5abb72 100644
--- a/contrib/pageinspect/expected/gist.out
+++ b/contrib/pageinspect/expected/gist.out
@@ -30,6 +30,11 @@ SELECT * FROM gist_page_opaque_info(get_raw_page('test_gist_idx', 2));
(1 row)
COMMIT;
+SELECT * FROM gist_page_opaque_info(get_raw_page('test_gist', 0));
+ERROR: input page is not a valid gist page
+DETAIL: Special size 0, expected 16
+SELECT * FROM gist_page_items(get_raw_page('test_gist', 0), 'test_gist');
+ERROR: "test_gist" is not an index
SELECT * FROM gist_page_items(get_raw_page('test_gist_idx', 0), 'test_gist_idx');
itemoffset | ctid | itemlen | dead | keys
------------+-----------+---------+------+-------------------
@@ -53,6 +58,9 @@ SELECT * FROM gist_page_items(get_raw_page('test_gist_idx', 1), 'test_gist_idx')
-- gist_page_items_bytea prints the raw key data as a bytea. The output of that is
-- platform-dependent (endianness), so omit the actual key data from the output.
+SELECT itemoffset, ctid, itemlen FROM gist_page_items_bytea(get_raw_page('test_gist', 0));
+ERROR: input page is not a valid gist page
+DETAIL: Special size 0, expected 16
SELECT itemoffset, ctid, itemlen FROM gist_page_items_bytea(get_raw_page('test_gist_idx', 0));
itemoffset | ctid | itemlen
------------+-----------+---------
@@ -68,6 +76,9 @@ SELECT itemoffset, ctid, itemlen FROM gist_page_items_bytea(get_raw_page('test_g
CREATE INDEX test_gist_btree on test_gist(t);
SELECT gist_page_items(get_raw_page('test_gist_btree', 0), 'test_gist_btree');
ERROR: "test_gist_btree" is not a GiST index
+SELECT gist_page_items_bytea(get_raw_page('test_gist_btree', 0));
+ERROR: input page is not a valid gist page
+DETAIL: page id 0, expected 65409
-- Failure with incorrect page size
-- Suppress the DETAIL message, to allow the tests to work across various
-- page sizes.
diff --git a/contrib/pageinspect/expected/hash.out b/contrib/pageinspect/expected/hash.out
index 6c606630dd..566a5e27e5 100644
--- a/contrib/pageinspect/expected/hash.out
+++ b/contrib/pageinspect/expected/hash.out
@@ -2,6 +2,8 @@ CREATE TABLE test_hash (a int, b text);
INSERT INTO test_hash VALUES (1, 'one');
CREATE INDEX test_hash_a_idx ON test_hash USING hash (a);
\x
+SELECT hash_page_type(get_raw_page('test_hash', 0));
+ERROR: index table contains corrupted page
SELECT hash_page_type(get_raw_page('test_hash_a_idx', 0));
-[ RECORD 1 ]--+---------
hash_page_type | metapage
@@ -28,6 +30,8 @@ hash_page_type | bitmap
SELECT hash_page_type(get_raw_page('test_hash_a_idx', 6));
ERROR: block number 6 is out of range for relation "test_hash_a_idx"
+SELECT * FROM hash_bitmap_info('test_hash', 0);
+ERROR: "test_hash" is not an index
SELECT * FROM hash_bitmap_info('test_hash_a_idx', -1);
ERROR: invalid block number
SELECT * FROM hash_bitmap_info('test_hash_a_idx', 0);
@@ -46,6 +50,10 @@ SELECT * FROM hash_bitmap_info('test_hash_a_idx', 6);
ERROR: block number 6 is out of range for relation "test_hash_a_idx"
SELECT magic, version, ntuples, bsize, bmsize, bmshift, maxbucket, highmask,
lowmask, ovflpoint, firstfree, nmaps, procid, spares, mapp FROM
+hash_metapage_info(get_raw_page('test_hash', 0));
+ERROR: index table contains corrupted page
+SELECT magic, version, ntuples, bsize, bmsize, bmshift, maxbucket, highmask,
+lowmask, ovflpoint, firstfree, nmaps, procid, spares, mapp FROM
hash_metapage_info(get_raw_page('test_hash_a_idx', 0));
-[ RECORD 1 ]--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
magic | 105121344
@@ -86,6 +94,10 @@ hash_metapage_info(get_raw_page('test_hash_a_idx', 5));
ERROR: page is not a hash meta page
SELECT live_items, dead_items, page_size, hasho_prevblkno, hasho_nextblkno,
hasho_bucket, hasho_flag, hasho_page_id FROM
+hash_page_stats(get_raw_page('test_hash', 0));
+ERROR: index table contains corrupted page
+SELECT live_items, dead_items, page_size, hasho_prevblkno, hasho_nextblkno,
+hasho_bucket, hasho_flag, hasho_page_id FROM
hash_page_stats(get_raw_page('test_hash_a_idx', 0));
ERROR: page is not a hash bucket or overflow page
SELECT live_items, dead_items, page_size, hasho_prevblkno, hasho_nextblkno,
@@ -144,6 +156,8 @@ SELECT live_items, dead_items, page_size, hasho_prevblkno, hasho_nextblkno,
hasho_bucket, hasho_flag, hasho_page_id FROM
hash_page_stats(get_raw_page('test_hash_a_idx', 5));
ERROR: page is not a hash bucket or overflow page
+SELECT * FROM hash_page_items(get_raw_page('test_hash', 0));
+ERROR: index table contains corrupted page
SELECT * FROM hash_page_items(get_raw_page('test_hash_a_idx', 0));
ERROR: page is not a hash bucket or overflow page
SELECT * FROM hash_page_items(get_raw_page('test_hash_a_idx', 1));
diff --git a/contrib/pageinspect/ginfuncs.c b/contrib/pageinspect/ginfuncs.c
index f55128857e..6a3a485571 100644
--- a/contrib/pageinspect/ginfuncs.c
+++ b/contrib/pageinspect/ginfuncs.c
@@ -49,6 +49,14 @@ gin_metapage_info(PG_FUNCTION_ARGS)
page = get_page_from_raw(raw_page);
+ if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("input page is not a valid GIN metapage"),
+ errdetail("Special size %d, expected %d",
+ (int) PageGetSpecialSize(page),
+ (int) MAXALIGN(sizeof(GinPageOpaqueData)))));
+
opaq = (GinPageOpaque) PageGetSpecialPointer(page);
if (opaq->flags != GIN_META)
ereport(ERROR,
@@ -107,6 +115,14 @@ gin_page_opaque_info(PG_FUNCTION_ARGS)
page = get_page_from_raw(raw_page);
+ if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("input page is not a valid GIN data leaf page"),
+ errdetail("Special size %d, expected %d",
+ (int) PageGetSpecialSize(page),
+ (int) MAXALIGN(sizeof(GinPageOpaqueData)))));
+
opaq = (GinPageOpaque) PageGetSpecialPointer(page);
/* Build a tuple descriptor for our result type */
diff --git a/contrib/pageinspect/gistfuncs.c b/contrib/pageinspect/gistfuncs.c
index a31cff47fe..05e58a2ab2 100644
--- a/contrib/pageinspect/gistfuncs.c
+++ b/contrib/pageinspect/gistfuncs.c
@@ -55,7 +55,23 @@ gist_page_opaque_info(PG_FUNCTION_ARGS)
page = get_page_from_raw(raw_page);
+ /* verify the special space has the expected size */
+ if (PageGetSpecialSize(page) != sizeof(GISTPageOpaqueData))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("input page is not a valid gist page"),
+ errdetail("Special size %d, expected %d",
+ (int) PageGetSpecialSize(page),
+ (int) MAXALIGN(sizeof(GISTPageOpaqueData)))));
+
opaq = (GISTPageOpaque) PageGetSpecialPointer(page);
+ if (opaq->gist_page_id != GIST_PAGE_ID)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("input page is not a valid gist page"),
+ errdetail("page id %d, expected %d",
+ opaq->gist_page_id,
+ GIST_PAGE_ID)));
/* Build a tuple descriptor for our result type */
if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
@@ -101,6 +117,7 @@ gist_page_items_bytea(PG_FUNCTION_ARGS)
bytea *raw_page = PG_GETARG_BYTEA_P(0);
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
Page page;
+ GISTPageOpaque opaq;
OffsetNumber offset;
OffsetNumber maxoff = InvalidOffsetNumber;
@@ -113,6 +130,24 @@ gist_page_items_bytea(PG_FUNCTION_ARGS)
page = get_page_from_raw(raw_page);
+ /* verify the special space has the expected size */
+ if (PageGetSpecialSize(page) != sizeof(GISTPageOpaqueData))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("input page is not a valid gist page"),
+ errdetail("Special size %d, expected %d",
+ (int) PageGetSpecialSize(page),
+ (int) MAXALIGN(sizeof(GISTPageOpaqueData)))));
+
+ opaq = (GISTPageOpaque) PageGetSpecialPointer(page);
+ if (opaq->gist_page_id != GIST_PAGE_ID)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("input page is not a valid gist page"),
+ errdetail("page id %d, expected %d",
+ opaq->gist_page_id,
+ GIST_PAGE_ID)));
+
/* Avoid bogus PageGetMaxOffsetNumber() call with deleted pages */
if (GistPageIsDeleted(page))
elog(NOTICE, "page is deleted");
diff --git a/contrib/pageinspect/sql/brin.sql b/contrib/pageinspect/sql/brin.sql
index 8717229c5d..9b9641ff61 100644
--- a/contrib/pageinspect/sql/brin.sql
+++ b/contrib/pageinspect/sql/brin.sql
@@ -2,16 +2,21 @@ CREATE TABLE test1 (a int, b text);
INSERT INTO test1 VALUES (1, 'one');
CREATE INDEX test1_a_idx ON test1 USING brin (a);
+SELECT brin_page_type(get_raw_page('test1', 0));
SELECT brin_page_type(get_raw_page('test1_a_idx', 0));
SELECT brin_page_type(get_raw_page('test1_a_idx', 1));
SELECT brin_page_type(get_raw_page('test1_a_idx', 2));
+SELECT * FROM brin_metapage_info(get_raw_page('test1', 0));
SELECT * FROM brin_metapage_info(get_raw_page('test1_a_idx', 0));
SELECT * FROM brin_metapage_info(get_raw_page('test1_a_idx', 1));
+SELECT * FROM brin_revmap_data(get_raw_page('test1', 0));
SELECT * FROM brin_revmap_data(get_raw_page('test1_a_idx', 0)) LIMIT 5;
SELECT * FROM brin_revmap_data(get_raw_page('test1_a_idx', 1)) LIMIT 5;
+SELECT * FROM brin_page_items(get_raw_page('test1', 0), 'test1')
+ ORDER BY blknum, attnum LIMIT 5;
SELECT * FROM brin_page_items(get_raw_page('test1_a_idx', 2), 'test1_a_idx')
ORDER BY blknum, attnum LIMIT 5;
diff --git a/contrib/pageinspect/sql/btree.sql b/contrib/pageinspect/sql/btree.sql
index fdda777b9e..de7a59e3dc 100644
--- a/contrib/pageinspect/sql/btree.sql
+++ b/contrib/pageinspect/sql/btree.sql
@@ -1,21 +1,25 @@
-CREATE TABLE test1 (a int8, b text);
-INSERT INTO test1 VALUES (72057594037927937, 'text');
+CREATE TABLE test1 (a int8, b int4range);
+INSERT INTO test1 VALUES (72057594037927937, '[0,1)');
CREATE INDEX test1_a_idx ON test1 USING btree (a);
\x
+SELECT * FROM bt_metap('test1');
SELECT * FROM bt_metap('test1_a_idx');
+SELECT * FROM bt_page_stats('test1', 0);
SELECT * FROM bt_page_stats('test1_a_idx', -1);
SELECT * FROM bt_page_stats('test1_a_idx', 0);
SELECT * FROM bt_page_stats('test1_a_idx', 1);
SELECT * FROM bt_page_stats('test1_a_idx', 2);
+SELECT * FROM bt_page_items('test1', 0);
SELECT * FROM bt_page_items('test1_a_idx', -1);
SELECT * FROM bt_page_items('test1_a_idx', 0);
SELECT * FROM bt_page_items('test1_a_idx', 1);
SELECT * FROM bt_page_items('test1_a_idx', 2);
+SELECT * FROM bt_page_items(get_raw_page('test1', 0));
SELECT * FROM bt_page_items(get_raw_page('test1_a_idx', -1));
SELECT * FROM bt_page_items(get_raw_page('test1_a_idx', 0));
SELECT * FROM bt_page_items(get_raw_page('test1_a_idx', 1));
@@ -26,6 +30,11 @@ CREATE INDEX test1_a_hash ON test1 USING hash(a);
SELECT bt_metap('test1_a_hash');
SELECT bt_page_stats('test1_a_hash', 0);
SELECT bt_page_items('test1_a_hash', 0);
+SELECT bt_page_items(get_raw_page('test1_a_hash', 0));
+CREATE INDEX test1_b_gist ON test1 USING gist(b);
+SELECT bt_page_items(get_raw_page('test1_b_gist', 0));
+CREATE INDEX test1_a_brin ON test1 USING brin(a);
+SELECT bt_page_items(get_raw_page('test1_a_brin', 0));
-- Failure with incorrect page size
-- Suppress the DETAIL message, to allow the tests to work across various
diff --git a/contrib/pageinspect/sql/gin.sql b/contrib/pageinspect/sql/gin.sql
index aadb07856d..fb27c70f3e 100644
--- a/contrib/pageinspect/sql/gin.sql
+++ b/contrib/pageinspect/sql/gin.sql
@@ -4,11 +4,14 @@ CREATE INDEX test1_y_idx ON test1 USING gin (y) WITH (fastupdate = off);
\x
+SELECT * FROM gin_metapage_info(get_raw_page('test1', 0));
SELECT * FROM gin_metapage_info(get_raw_page('test1_y_idx', 0));
SELECT * FROM gin_metapage_info(get_raw_page('test1_y_idx', 1));
+SELECT * FROM gin_page_opaque_info(get_raw_page('test1', 0));
SELECT * FROM gin_page_opaque_info(get_raw_page('test1_y_idx', 1));
+SELECT * FROM gin_leafpage_items(get_raw_page('test1', 0));
SELECT * FROM gin_leafpage_items(get_raw_page('test1_y_idx', 1));
INSERT INTO test1 SELECT x, ARRAY[1,10] FROM generate_series(2,10000) x;
diff --git a/contrib/pageinspect/sql/gist.sql b/contrib/pageinspect/sql/gist.sql
index 8abeb19140..ea7d0e0796 100644
--- a/contrib/pageinspect/sql/gist.sql
+++ b/contrib/pageinspect/sql/gist.sql
@@ -18,17 +18,21 @@ SELECT * FROM gist_page_opaque_info(get_raw_page('test_gist_idx', 1));
SELECT * FROM gist_page_opaque_info(get_raw_page('test_gist_idx', 2));
COMMIT;
+SELECT * FROM gist_page_opaque_info(get_raw_page('test_gist', 0));
+SELECT * FROM gist_page_items(get_raw_page('test_gist', 0), 'test_gist');
SELECT * FROM gist_page_items(get_raw_page('test_gist_idx', 0), 'test_gist_idx');
SELECT * FROM gist_page_items(get_raw_page('test_gist_idx', 1), 'test_gist_idx') LIMIT 5;
-- gist_page_items_bytea prints the raw key data as a bytea. The output of that is
-- platform-dependent (endianness), so omit the actual key data from the output.
+SELECT itemoffset, ctid, itemlen FROM gist_page_items_bytea(get_raw_page('test_gist', 0));
SELECT itemoffset, ctid, itemlen FROM gist_page_items_bytea(get_raw_page('test_gist_idx', 0));
-- Failure with non-GiST index.
CREATE INDEX test_gist_btree on test_gist(t);
SELECT gist_page_items(get_raw_page('test_gist_btree', 0), 'test_gist_btree');
+SELECT gist_page_items_bytea(get_raw_page('test_gist_btree', 0));
-- Failure with incorrect page size
-- Suppress the DETAIL message, to allow the tests to work across various
diff --git a/contrib/pageinspect/sql/hash.sql b/contrib/pageinspect/sql/hash.sql
index fcddd706ae..8a27bf77bf 100644
--- a/contrib/pageinspect/sql/hash.sql
+++ b/contrib/pageinspect/sql/hash.sql
@@ -4,6 +4,7 @@ CREATE INDEX test_hash_a_idx ON test_hash USING hash (a);
\x
+SELECT hash_page_type(get_raw_page('test_hash', 0));
SELECT hash_page_type(get_raw_page('test_hash_a_idx', 0));
SELECT hash_page_type(get_raw_page('test_hash_a_idx', 1));
SELECT hash_page_type(get_raw_page('test_hash_a_idx', 2));
@@ -13,6 +14,7 @@ SELECT hash_page_type(get_raw_page('test_hash_a_idx', 5));
SELECT hash_page_type(get_raw_page('test_hash_a_idx', 6));
+SELECT * FROM hash_bitmap_info('test_hash', 0);
SELECT * FROM hash_bitmap_info('test_hash_a_idx', -1);
SELECT * FROM hash_bitmap_info('test_hash_a_idx', 0);
SELECT * FROM hash_bitmap_info('test_hash_a_idx', 1);
@@ -23,6 +25,10 @@ SELECT * FROM hash_bitmap_info('test_hash_a_idx', 5);
SELECT * FROM hash_bitmap_info('test_hash_a_idx', 6);
+SELECT magic, version, ntuples, bsize, bmsize, bmshift, maxbucket, highmask,
+lowmask, ovflpoint, firstfree, nmaps, procid, spares, mapp FROM
+hash_metapage_info(get_raw_page('test_hash', 0));
+
SELECT magic, version, ntuples, bsize, bmsize, bmshift, maxbucket, highmask,
lowmask, ovflpoint, firstfree, nmaps, procid, spares, mapp FROM
hash_metapage_info(get_raw_page('test_hash_a_idx', 0));
@@ -47,6 +53,10 @@ SELECT magic, version, ntuples, bsize, bmsize, bmshift, maxbucket, highmask,
lowmask, ovflpoint, firstfree, nmaps, procid, spares, mapp FROM
hash_metapage_info(get_raw_page('test_hash_a_idx', 5));
+SELECT live_items, dead_items, page_size, hasho_prevblkno, hasho_nextblkno,
+hasho_bucket, hasho_flag, hasho_page_id FROM
+hash_page_stats(get_raw_page('test_hash', 0));
+
SELECT live_items, dead_items, page_size, hasho_prevblkno, hasho_nextblkno,
hasho_bucket, hasho_flag, hasho_page_id FROM
hash_page_stats(get_raw_page('test_hash_a_idx', 0));
@@ -71,6 +81,7 @@ SELECT live_items, dead_items, page_size, hasho_prevblkno, hasho_nextblkno,
hasho_bucket, hasho_flag, hasho_page_id FROM
hash_page_stats(get_raw_page('test_hash_a_idx', 5));
+SELECT * FROM hash_page_items(get_raw_page('test_hash', 0));
SELECT * FROM hash_page_items(get_raw_page('test_hash_a_idx', 0));
SELECT * FROM hash_page_items(get_raw_page('test_hash_a_idx', 1));
SELECT * FROM hash_page_items(get_raw_page('test_hash_a_idx', 2));
--
2.35.0
On Thu, Mar 24, 2022 at 04:27:40PM +0800, Julien Rouhaud wrote:
I worked on a patch to fix the problem. The functions you mention were indeed
missing some check, but after digging a bit more I found that other problem
existed. For instance, feeding a btree page to a gist_page_items_bytea() (both
pages have the same special size) can be quite problematic too, as you can end
up accessing bogus data when retrieving the items. I also added additional
sanity checks with what is available (gist_page_id for gist, zero-level for
btree leaf pages and so on), and tried to cover everything with tests.
Thanks for the patch!
I have reviewed what you have sent, bumping on a couple of issues:
- The tests of btree and BRIN failed with 32-bit builds, because
MAXALIGN returns shorter special area sizes in those cases. This can
be fixed by abusing of \set VERBOSITY to mask the error details. We
already do that in some of the tests to make them portable.
- Some of the tests were not necessary, overlapping with stuff that
already existed.
- Some checks missed a MAXALIGN().
- Did some tweaks with the error messages.
- Some error messages used an incorrect term to define the index AM
type, aka s/gist/GiST/ or s/brin/BRIN/.
- errdetail() requires a sentence, finishing with a dot (some places
of hashfuncs.c missed that.
- Not your fault, but hash indexes would complain about corrupted
indexes which could be misleading for the user if passing down a
correct page from an incorrect index AM.
- While on it, I have made the error messages more generic in the
places where I could do so. What I have finished with seems to have a
good balance.
I'm a bit worried about the btree tests stability. I avoid emitting the level
found to help with that, but it still depends on what other AM will put in
their special page.
Well, the limit of the pageinspect model comes from the fact that it
is possible to pass down any bytea and all those code paths would
happily process the blobs as long as they are 8kB. Pages can be
crafted as well to bypass some of the checks. This is superuser-only,
so people have to be careful, but preventing out-of-bound reads is a
different class of problem, as long as these come from valid pages.
With all that in place, I get the attached. It is Friday, so I am not
going to take my bets on the buildfarm today with a rather-risky
patch. Monday/Tuesday will be fine.
--
Michael
Attachments:
v2-0001-pageinspect-add-more-sanity-checks-when-accessing.patchtext/x-diff; charset=us-asciiDownload
From f0f9a9cc10be30e10728bd99bc7425bf7f9d3a9e Mon Sep 17 00:00:00 2001
From: Michael Paquier <michael@paquier.xyz>
Date: Fri, 25 Mar 2022 11:42:28 +0900
Subject: [PATCH v2] pageinspect: add more sanity checks when accessing pages.
As seen in bug #16527, feeding an incorrect page can lead to ouf-of-bound
reads when trying to access the page special area. To fix it, check that the
special area has the expected size before trying to access in in the functions
that weren't already doing so.
While at it, add other checks when possible to validate that the given page is
from the expected format to avoid other possible invalid access.
Author: Julien Rouhaud
Reported-By: Alexander Lakhin
Discussion: https://postgr.es/m/16527-ef7606186f0610a1%40postgresql.org
Discussion: https://postgr.es/m/150535d6-65ac-16ee-b28a-851965fc485b%40gmail.com
---
contrib/pageinspect/brinfuncs.c | 18 +++++++++++++
contrib/pageinspect/btreefuncs.c | 14 +++++++++++
contrib/pageinspect/expected/brin.out | 10 ++++++++
contrib/pageinspect/expected/btree.out | 22 +++++++++++++---
contrib/pageinspect/expected/gin.out | 12 +++++++--
contrib/pageinspect/expected/gist.out | 12 +++++++--
contrib/pageinspect/expected/hash.out | 14 +++++++++--
contrib/pageinspect/ginfuncs.c | 22 +++++++++++++---
contrib/pageinspect/gistfuncs.c | 35 ++++++++++++++++++++++++++
contrib/pageinspect/hashfuncs.c | 11 +++++---
contrib/pageinspect/sql/brin.sql | 8 ++++++
contrib/pageinspect/sql/btree.sql | 18 ++++++++++---
contrib/pageinspect/sql/gin.sql | 9 +++++--
contrib/pageinspect/sql/gist.sql | 9 +++++--
contrib/pageinspect/sql/hash.sql | 10 ++++++--
15 files changed, 197 insertions(+), 27 deletions(-)
diff --git a/contrib/pageinspect/brinfuncs.c b/contrib/pageinspect/brinfuncs.c
index bf12901ac3..0387b2847a 100644
--- a/contrib/pageinspect/brinfuncs.c
+++ b/contrib/pageinspect/brinfuncs.c
@@ -58,6 +58,15 @@ brin_page_type(PG_FUNCTION_ARGS)
page = get_page_from_raw(raw_page);
+ /* verify the special space has the expected size */
+ if (PageGetSpecialSize(page) != MAXALIGN(sizeof(BrinSpecialSpace)))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("input page is not a valid %s page", "BRIN"),
+ errdetail("Expected special size %d, got %d.",
+ (int) MAXALIGN(sizeof(BrinSpecialSpace)),
+ (int) PageGetSpecialSize(page))));
+
switch (BrinPageType(page))
{
case BRIN_PAGETYPE_META:
@@ -86,6 +95,15 @@ verify_brin_page(bytea *raw_page, uint16 type, const char *strtype)
{
Page page = get_page_from_raw(raw_page);
+ /* verify the special space has the expected size */
+ if (PageGetSpecialSize(page) != MAXALIGN(sizeof(BrinSpecialSpace)))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("input page is not a valid %s page", "BRIN"),
+ errdetail("Expected special size %d, got %d.",
+ (int) MAXALIGN(sizeof(BrinSpecialSpace)),
+ (int) PageGetSpecialSize(page))));
+
/* verify the special space says this page is what we want */
if (BrinPageType(page) != type)
ereport(ERROR,
diff --git a/contrib/pageinspect/btreefuncs.c b/contrib/pageinspect/btreefuncs.c
index d9628dd664..edb7954bd3 100644
--- a/contrib/pageinspect/btreefuncs.c
+++ b/contrib/pageinspect/btreefuncs.c
@@ -613,6 +613,15 @@ bt_page_items_bytea(PG_FUNCTION_ARGS)
uargs->offset = FirstOffsetNumber;
+ /* verify the special space has the expected size */
+ if (PageGetSpecialSize(uargs->page) != MAXALIGN(sizeof(BTPageOpaqueData)))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("input page is not a valid %s page", "btree"),
+ errdetail("Expected special size %d, got %d.",
+ (int) MAXALIGN(sizeof(BTPageOpaqueData)),
+ (int) PageGetSpecialSize(uargs->page))));
+
opaque = (BTPageOpaque) PageGetSpecialPointer(uargs->page);
if (P_ISMETA(opaque))
@@ -620,6 +629,11 @@ bt_page_items_bytea(PG_FUNCTION_ARGS)
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("block is a meta page")));
+ if (P_ISLEAF(opaque) && opaque->btpo_level != 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("block is not a valid leaf page")));
+
if (P_ISDELETED(opaque))
elog(NOTICE, "page is deleted");
diff --git a/contrib/pageinspect/expected/brin.out b/contrib/pageinspect/expected/brin.out
index 10cd36c177..62ee783b60 100644
--- a/contrib/pageinspect/expected/brin.out
+++ b/contrib/pageinspect/expected/brin.out
@@ -52,4 +52,14 @@ SELECT * FROM brin_page_items(get_raw_page('test1_a_idx', 2), 'test1_a_idx')
CREATE INDEX test1_a_btree ON test1 (a);
SELECT brin_page_items(get_raw_page('test1_a_btree', 0), 'test1_a_btree');
ERROR: "test1_a_btree" is not a BRIN index
+-- Mask DETAIL messages as these are not portable across architectures.
+\set VERBOSITY terse
+-- Invalid special area size
+SELECT brin_page_type(get_raw_page('test1', 0));
+ERROR: input page is not a valid BRIN page
+SELECT * FROM brin_metapage_info(get_raw_page('test1', 0));
+ERROR: input page is not a valid BRIN page
+SELECT * FROM brin_revmap_data(get_raw_page('test1', 0));
+ERROR: input page is not a valid BRIN page
+\set VERBOSITY default
DROP TABLE test1;
diff --git a/contrib/pageinspect/expected/btree.out b/contrib/pageinspect/expected/btree.out
index 80b3dfe861..8f16891037 100644
--- a/contrib/pageinspect/expected/btree.out
+++ b/contrib/pageinspect/expected/btree.out
@@ -1,5 +1,5 @@
-CREATE TABLE test1 (a int8, b text);
-INSERT INTO test1 VALUES (72057594037927937, 'text');
+CREATE TABLE test1 (a int8, b int4range);
+INSERT INTO test1 VALUES (72057594037927937, '[0,1)');
CREATE INDEX test1_a_idx ON test1 USING btree (a);
\x
SELECT * FROM bt_metap('test1_a_idx');
@@ -78,11 +78,25 @@ SELECT bt_page_stats('test1_a_hash', 0);
ERROR: "test1_a_hash" is not a btree index
SELECT bt_page_items('test1_a_hash', 0);
ERROR: "test1_a_hash" is not a btree index
--- Failure with incorrect page size
+SELECT bt_page_items(get_raw_page('test1_a_hash', 0));
+ERROR: block is a meta page
+CREATE INDEX test1_b_gist ON test1 USING gist(b);
+-- Special area of GiST is the same as btree, this complains about inconsistent
+-- leaf data on the page.
+SELECT bt_page_items(get_raw_page('test1_b_gist', 0));
+ERROR: block is not a valid leaf page
+-- Several failure modes.
-- Suppress the DETAIL message, to allow the tests to work across various
--- page sizes.
+-- page sizes and architectures.
\set VERBOSITY terse
+-- invalid page size
SELECT bt_page_items('aaa'::bytea);
ERROR: invalid page size
+-- invalid special area size
+CREATE INDEX test1_a_brin ON test1 USING brin(a);
+SELECT bt_page_items(get_raw_page('test1', 0));
+ERROR: input page is not a valid btree page
+SELECT bt_page_items(get_raw_page('test1_a_brin', 0));
+ERROR: input page is not a valid btree page
\set VERBOSITY default
DROP TABLE test1;
diff --git a/contrib/pageinspect/expected/gin.out b/contrib/pageinspect/expected/gin.out
index 802f48284b..b7774105e0 100644
--- a/contrib/pageinspect/expected/gin.out
+++ b/contrib/pageinspect/expected/gin.out
@@ -36,14 +36,22 @@ FROM gin_leafpage_items(get_raw_page('test1_y_idx',
?column? | t
DROP TABLE test1;
--- Failure with incorrect page size
+-- Failure with various modes.
-- Suppress the DETAIL message, to allow the tests to work across various
--- page sizes.
+-- page sizes and architectures.
\set VERBOSITY terse
+-- invalid page size
SELECT gin_leafpage_items('aaa'::bytea);
ERROR: invalid page size
SELECT gin_metapage_info('bbb'::bytea);
ERROR: invalid page size
SELECT gin_page_opaque_info('ccc'::bytea);
ERROR: invalid page size
+-- invalid special area size
+SELECT * FROM gin_metapage_info(get_raw_page('test1', 0));
+ERROR: relation "test1" does not exist
+SELECT * FROM gin_page_opaque_info(get_raw_page('test1', 0));
+ERROR: relation "test1" does not exist
+SELECT * FROM gin_leafpage_items(get_raw_page('test1', 0));
+ERROR: relation "test1" does not exist
\set VERBOSITY default
diff --git a/contrib/pageinspect/expected/gist.out b/contrib/pageinspect/expected/gist.out
index 3f33e04066..c18a7091b2 100644
--- a/contrib/pageinspect/expected/gist.out
+++ b/contrib/pageinspect/expected/gist.out
@@ -68,15 +68,23 @@ SELECT itemoffset, ctid, itemlen FROM gist_page_items_bytea(get_raw_page('test_g
CREATE INDEX test_gist_btree on test_gist(t);
SELECT gist_page_items(get_raw_page('test_gist_btree', 0), 'test_gist_btree');
ERROR: "test_gist_btree" is not a GiST index
--- Failure with incorrect page size
+-- Failure with various modes.
-- Suppress the DETAIL message, to allow the tests to work across various
--- page sizes.
+-- page sizes and architectures.
\set VERBOSITY terse
+-- invalid page size
SELECT gist_page_items_bytea('aaa'::bytea);
ERROR: invalid page size
SELECT gist_page_items('aaa'::bytea, 'test_gist_idx'::regclass);
ERROR: invalid page size
SELECT gist_page_opaque_info('aaa'::bytea);
ERROR: invalid page size
+-- invalid special area size
+SELECT * FROM gist_page_opaque_info(get_raw_page('test_gist', 0));
+ERROR: input page is not a valid GiST page
+SELECT gist_page_items_bytea(get_raw_page('test_gist', 0));
+ERROR: input page is not a valid GiST page
+SELECT gist_page_items_bytea(get_raw_page('test_gist_btree', 0));
+ERROR: input page is not a valid GiST page
\set VERBOSITY default
DROP TABLE test_gist;
diff --git a/contrib/pageinspect/expected/hash.out b/contrib/pageinspect/expected/hash.out
index 6c606630dd..96c9511457 100644
--- a/contrib/pageinspect/expected/hash.out
+++ b/contrib/pageinspect/expected/hash.out
@@ -167,10 +167,11 @@ ERROR: page is not a hash bucket or overflow page
CREATE INDEX test_hash_a_btree ON test_hash USING btree (a);
SELECT hash_bitmap_info('test_hash_a_btree', 0);
ERROR: "test_hash_a_btree" is not a hash index
--- Failure with incorrect page size
+-- Failure with various modes.
-- Suppress the DETAIL message, to allow the tests to work across various
--- page sizes.
+-- page sizes and architectures.
\set VERBOSITY terse
+-- invalid page size
SELECT hash_metapage_info('aaa'::bytea);
ERROR: invalid page size
SELECT hash_page_items('bbb'::bytea);
@@ -179,5 +180,14 @@ SELECT hash_page_stats('ccc'::bytea);
ERROR: invalid page size
SELECT hash_page_type('ddd'::bytea);
ERROR: invalid page size
+-- invalid special area size
+SELECT hash_metapage_info(get_raw_page('test_hash', 0));
+ERROR: input page is not a valid hash page
+SELECT hash_page_items(get_raw_page('test_hash', 0));
+ERROR: input page is not a valid hash page
+SELECT hash_page_stats(get_raw_page('test_hash', 0));
+ERROR: input page is not a valid hash page
+SELECT hash_page_type(get_raw_page('test_hash', 0));
+ERROR: input page is not a valid hash page
\set VERBOSITY default
DROP TABLE test_hash;
diff --git a/contrib/pageinspect/ginfuncs.c b/contrib/pageinspect/ginfuncs.c
index f55128857e..7ad6d2d3bd 100644
--- a/contrib/pageinspect/ginfuncs.c
+++ b/contrib/pageinspect/ginfuncs.c
@@ -49,6 +49,14 @@ gin_metapage_info(PG_FUNCTION_ARGS)
page = get_page_from_raw(raw_page);
+ if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("input page is not a valid GIN metapage"),
+ errdetail("Expected special size %d, got %d.",
+ (int) MAXALIGN(sizeof(GinPageOpaqueData)),
+ (int) PageGetSpecialSize(page))));
+
opaq = (GinPageOpaque) PageGetSpecialPointer(page);
if (opaq->flags != GIN_META)
ereport(ERROR,
@@ -107,6 +115,14 @@ gin_page_opaque_info(PG_FUNCTION_ARGS)
page = get_page_from_raw(raw_page);
+ if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GinPageOpaqueData)))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("input page is not a valid GIN data leaf page"),
+ errdetail("Expected special size %d, got %d.",
+ (int) MAXALIGN(sizeof(GinPageOpaqueData)),
+ (int) PageGetSpecialSize(page))));
+
opaq = (GinPageOpaque) PageGetSpecialPointer(page);
/* Build a tuple descriptor for our result type */
@@ -188,9 +204,9 @@ gin_leafpage_items(PG_FUNCTION_ARGS)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
errmsg("input page is not a valid GIN data leaf page"),
- errdetail("Special size %d, expected %d",
- (int) PageGetSpecialSize(page),
- (int) MAXALIGN(sizeof(GinPageOpaqueData)))));
+ errdetail("Expected special size %d, got %d.",
+ (int) MAXALIGN(sizeof(GinPageOpaqueData)),
+ (int) PageGetSpecialSize(page))));
opaq = (GinPageOpaque) PageGetSpecialPointer(page);
if (opaq->flags != (GIN_DATA | GIN_LEAF | GIN_COMPRESSED))
diff --git a/contrib/pageinspect/gistfuncs.c b/contrib/pageinspect/gistfuncs.c
index a31cff47fe..6bb81ffb84 100644
--- a/contrib/pageinspect/gistfuncs.c
+++ b/contrib/pageinspect/gistfuncs.c
@@ -55,7 +55,23 @@ gist_page_opaque_info(PG_FUNCTION_ARGS)
page = get_page_from_raw(raw_page);
+ /* verify the special space has the expected size */
+ if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GISTPageOpaqueData)))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("input page is not a valid %s page", "GiST"),
+ errdetail("Expected special size %d, got %d.",
+ (int) MAXALIGN(sizeof(GISTPageOpaqueData)),
+ (int) PageGetSpecialSize(page))));
+
opaq = (GISTPageOpaque) PageGetSpecialPointer(page);
+ if (opaq->gist_page_id != GIST_PAGE_ID)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("input page is not a valid %s page", "GiST"),
+ errdetail("Expected %08x, got %08x.",
+ GIST_PAGE_ID,
+ opaq->gist_page_id)));
/* Build a tuple descriptor for our result type */
if (get_call_result_type(fcinfo, NULL, &tupdesc) != TYPEFUNC_COMPOSITE)
@@ -101,6 +117,7 @@ gist_page_items_bytea(PG_FUNCTION_ARGS)
bytea *raw_page = PG_GETARG_BYTEA_P(0);
ReturnSetInfo *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
Page page;
+ GISTPageOpaque opaq;
OffsetNumber offset;
OffsetNumber maxoff = InvalidOffsetNumber;
@@ -113,6 +130,24 @@ gist_page_items_bytea(PG_FUNCTION_ARGS)
page = get_page_from_raw(raw_page);
+ /* verify the special space has the expected size */
+ if (PageGetSpecialSize(page) != MAXALIGN(sizeof(GISTPageOpaqueData)))
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("input page is not a valid %s page", "GiST"),
+ errdetail("Expected special size %d, got %d.",
+ (int) MAXALIGN(sizeof(GISTPageOpaqueData)),
+ (int) PageGetSpecialSize(page))));
+
+ opaq = (GISTPageOpaque) PageGetSpecialPointer(page);
+ if (opaq->gist_page_id != GIST_PAGE_ID)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("input page is not a valid %s page", "GiST"),
+ errdetail("Expected %08x, got %08x.",
+ GIST_PAGE_ID,
+ opaq->gist_page_id)));
+
/* Avoid bogus PageGetMaxOffsetNumber() call with deleted pages */
if (GistPageIsDeleted(page))
elog(NOTICE, "page is deleted");
diff --git a/contrib/pageinspect/hashfuncs.c b/contrib/pageinspect/hashfuncs.c
index ff73c363fc..6de21d6608 100644
--- a/contrib/pageinspect/hashfuncs.c
+++ b/contrib/pageinspect/hashfuncs.c
@@ -66,14 +66,17 @@ verify_hash_page(bytea *raw_page, int flags)
if (PageGetSpecialSize(page) != MAXALIGN(sizeof(HashPageOpaqueData)))
ereport(ERROR,
- (errcode(ERRCODE_INDEX_CORRUPTED),
- errmsg("index table contains corrupted page")));
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("input page is not a valid %s page", "hash"),
+ errdetail("Expected special size %d, got %d.",
+ (int) MAXALIGN(sizeof(HashPageOpaqueData)),
+ (int) PageGetSpecialSize(page))));
pageopaque = (HashPageOpaque) PageGetSpecialPointer(page);
if (pageopaque->hasho_page_id != HASHO_PAGE_ID)
ereport(ERROR,
(errcode(ERRCODE_INVALID_PARAMETER_VALUE),
- errmsg("page is not a hash page"),
+ errmsg("input page is not a valid %s page", "hash"),
errdetail("Expected %08x, got %08x.",
HASHO_PAGE_ID, pageopaque->hasho_page_id)));
@@ -134,7 +137,7 @@ verify_hash_page(bytea *raw_page, int flags)
ereport(ERROR,
(errcode(ERRCODE_INDEX_CORRUPTED),
errmsg("invalid version for metadata"),
- errdetail("Expected %d, got %d",
+ errdetail("Expected %d, got %d.",
HASH_VERSION, metap->hashm_version)));
}
diff --git a/contrib/pageinspect/sql/brin.sql b/contrib/pageinspect/sql/brin.sql
index 8717229c5d..dc5d1661b6 100644
--- a/contrib/pageinspect/sql/brin.sql
+++ b/contrib/pageinspect/sql/brin.sql
@@ -19,4 +19,12 @@ SELECT * FROM brin_page_items(get_raw_page('test1_a_idx', 2), 'test1_a_idx')
CREATE INDEX test1_a_btree ON test1 (a);
SELECT brin_page_items(get_raw_page('test1_a_btree', 0), 'test1_a_btree');
+-- Mask DETAIL messages as these are not portable across architectures.
+\set VERBOSITY terse
+-- Invalid special area size
+SELECT brin_page_type(get_raw_page('test1', 0));
+SELECT * FROM brin_metapage_info(get_raw_page('test1', 0));
+SELECT * FROM brin_revmap_data(get_raw_page('test1', 0));
+\set VERBOSITY default
+
DROP TABLE test1;
diff --git a/contrib/pageinspect/sql/btree.sql b/contrib/pageinspect/sql/btree.sql
index fdda777b9e..44d83f90ba 100644
--- a/contrib/pageinspect/sql/btree.sql
+++ b/contrib/pageinspect/sql/btree.sql
@@ -1,5 +1,5 @@
-CREATE TABLE test1 (a int8, b text);
-INSERT INTO test1 VALUES (72057594037927937, 'text');
+CREATE TABLE test1 (a int8, b int4range);
+INSERT INTO test1 VALUES (72057594037927937, '[0,1)');
CREATE INDEX test1_a_idx ON test1 USING btree (a);
\x
@@ -26,12 +26,22 @@ CREATE INDEX test1_a_hash ON test1 USING hash(a);
SELECT bt_metap('test1_a_hash');
SELECT bt_page_stats('test1_a_hash', 0);
SELECT bt_page_items('test1_a_hash', 0);
+SELECT bt_page_items(get_raw_page('test1_a_hash', 0));
+CREATE INDEX test1_b_gist ON test1 USING gist(b);
+-- Special area of GiST is the same as btree, this complains about inconsistent
+-- leaf data on the page.
+SELECT bt_page_items(get_raw_page('test1_b_gist', 0));
--- Failure with incorrect page size
+-- Several failure modes.
-- Suppress the DETAIL message, to allow the tests to work across various
--- page sizes.
+-- page sizes and architectures.
\set VERBOSITY terse
+-- invalid page size
SELECT bt_page_items('aaa'::bytea);
+-- invalid special area size
+CREATE INDEX test1_a_brin ON test1 USING brin(a);
+SELECT bt_page_items(get_raw_page('test1', 0));
+SELECT bt_page_items(get_raw_page('test1_a_brin', 0));
\set VERBOSITY default
DROP TABLE test1;
diff --git a/contrib/pageinspect/sql/gin.sql b/contrib/pageinspect/sql/gin.sql
index aadb07856d..7a3bfdfae2 100644
--- a/contrib/pageinspect/sql/gin.sql
+++ b/contrib/pageinspect/sql/gin.sql
@@ -20,11 +20,16 @@ FROM gin_leafpage_items(get_raw_page('test1_y_idx',
DROP TABLE test1;
--- Failure with incorrect page size
+-- Failure with various modes.
-- Suppress the DETAIL message, to allow the tests to work across various
--- page sizes.
+-- page sizes and architectures.
\set VERBOSITY terse
+-- invalid page size
SELECT gin_leafpage_items('aaa'::bytea);
SELECT gin_metapage_info('bbb'::bytea);
SELECT gin_page_opaque_info('ccc'::bytea);
+-- invalid special area size
+SELECT * FROM gin_metapage_info(get_raw_page('test1', 0));
+SELECT * FROM gin_page_opaque_info(get_raw_page('test1', 0));
+SELECT * FROM gin_leafpage_items(get_raw_page('test1', 0));
\set VERBOSITY default
diff --git a/contrib/pageinspect/sql/gist.sql b/contrib/pageinspect/sql/gist.sql
index 8abeb19140..e76f6fa8d1 100644
--- a/contrib/pageinspect/sql/gist.sql
+++ b/contrib/pageinspect/sql/gist.sql
@@ -30,13 +30,18 @@ SELECT itemoffset, ctid, itemlen FROM gist_page_items_bytea(get_raw_page('test_g
CREATE INDEX test_gist_btree on test_gist(t);
SELECT gist_page_items(get_raw_page('test_gist_btree', 0), 'test_gist_btree');
--- Failure with incorrect page size
+-- Failure with various modes.
-- Suppress the DETAIL message, to allow the tests to work across various
--- page sizes.
+-- page sizes and architectures.
\set VERBOSITY terse
+-- invalid page size
SELECT gist_page_items_bytea('aaa'::bytea);
SELECT gist_page_items('aaa'::bytea, 'test_gist_idx'::regclass);
SELECT gist_page_opaque_info('aaa'::bytea);
+-- invalid special area size
+SELECT * FROM gist_page_opaque_info(get_raw_page('test_gist', 0));
+SELECT gist_page_items_bytea(get_raw_page('test_gist', 0));
+SELECT gist_page_items_bytea(get_raw_page('test_gist_btree', 0));
\set VERBOSITY default
DROP TABLE test_gist;
diff --git a/contrib/pageinspect/sql/hash.sql b/contrib/pageinspect/sql/hash.sql
index fcddd706ae..ccc984c086 100644
--- a/contrib/pageinspect/sql/hash.sql
+++ b/contrib/pageinspect/sql/hash.sql
@@ -82,14 +82,20 @@ SELECT * FROM hash_page_items(get_raw_page('test_hash_a_idx', 5));
CREATE INDEX test_hash_a_btree ON test_hash USING btree (a);
SELECT hash_bitmap_info('test_hash_a_btree', 0);
--- Failure with incorrect page size
+-- Failure with various modes.
-- Suppress the DETAIL message, to allow the tests to work across various
--- page sizes.
+-- page sizes and architectures.
\set VERBOSITY terse
+-- invalid page size
SELECT hash_metapage_info('aaa'::bytea);
SELECT hash_page_items('bbb'::bytea);
SELECT hash_page_stats('ccc'::bytea);
SELECT hash_page_type('ddd'::bytea);
+-- invalid special area size
+SELECT hash_metapage_info(get_raw_page('test_hash', 0));
+SELECT hash_page_items(get_raw_page('test_hash', 0));
+SELECT hash_page_stats(get_raw_page('test_hash', 0));
+SELECT hash_page_type(get_raw_page('test_hash', 0));
\set VERBOSITY default
DROP TABLE test_hash;
--
2.35.1
On Fri, Mar 25, 2022 at 11:44:26AM +0900, Michael Paquier wrote:
I have reviewed what you have sent, bumping on a couple of issues:
Thanks!
I'm happy with all the changes, except:
+ if (P_ISLEAF(opaque) && opaque->btpo_level != 0)
+ ereport(ERROR,
+ (errcode(ERRCODE_INVALID_PARAMETER_VALUE),
+ errmsg("block is not a valid leaf page")));
All other messages specify which kind of page it's about, so I think it would
be better to specify "btree" leaf page here, especially since some other AMs
also have leaf pages.
- The tests of btree and BRIN failed with 32-bit builds, because
MAXALIGN returns shorter special area sizes in those cases. This can
be fixed by abusing of \set VERBOSITY to mask the error details. We
already do that in some of the tests to make them portable.
Yeah, that's the other stability problem I was worried about. I should have
tried to compile with -m32.
I'm a bit worried about the btree tests stability. I avoid emitting the level
found to help with that, but it still depends on what other AM will put in
their special page.Well, the limit of the pageinspect model comes from the fact that it
is possible to pass down any bytea and all those code paths would
happily process the blobs as long as they are 8kB. Pages can be
crafted as well to bypass some of the checks. This is superuser-only,
so people have to be careful, but preventing out-of-bound reads is a
different class of problem, as long as these come from valid pages.
Agreed. Also pageinspect can be handy when debugging corruption, so I think it
shouldn't try too hard to discard buggy pages.
On Fri, Mar 25, 2022 at 11:03:47AM +0800, Julien Rouhaud wrote:
I'm happy with all the changes, except:
+ if (P_ISLEAF(opaque) && opaque->btpo_level != 0) + ereport(ERROR, + (errcode(ERRCODE_INVALID_PARAMETER_VALUE), + errmsg("block is not a valid leaf page")));All other messages specify which kind of page it's about, so I think it would
be better to specify "btree" leaf page here, especially since some other AMs
also have leaf pages.
Makes sense. Fine by me to stick an extra "btree" here.
--
Michael
On Wed, Mar 23, 2022 at 1:16 AM Michael Paquier <michael@paquier.xyz> wrote:
As far as I can see, this is
also possible in bt_page_items_bytea(), gist_page_opaque_info(),
gin_metapage_info() and gin_page_opaque_info(). All those code paths
should be protected with some checks on PageGetSpecialSize(), I
guess, before attempting to read the special area of the page. Hash
indexes are protected by checking the expected size of the special
area, and one code path of btree relies on the opened relation to be a
btree index.
amcheck's palloc_btree_page() function validates that an 8KiB page is
in fact an nbtree page, in a maximally paranoid way. Might be an
example worth following here.
--
Peter Geoghegan
On Thu, Mar 24, 2022 at 08:54:14PM -0700, Peter Geoghegan wrote:
amcheck's palloc_btree_page() function validates that an 8KiB page is
in fact an nbtree page, in a maximally paranoid way. Might be an
example worth following here.
Sure, we could do that. However, I don't think that going down to
that is something we have any need for in pageinspect, as the module
is also useful to look at the contents of the full page, even if
slightly corrupted, and too many checks would prevent a lookup at the
internal contents of a page.
--
Michael
On Fri, Mar 25, 2022 at 12:27:24PM +0900, Michael Paquier wrote:
Makes sense. Fine by me to stick an extra "btree" here.
I have been able to look at that again today (earlier than expected),
and except for one incorrect thing in the GIN tests where we were not
testing the correct pattern, the rest was correct. So applied and
backpatched. The buildfarm is not complaining based on the first
reports.
--
Michael
On Fri, Mar 25, 2022 at 12:57 AM Michael Paquier <michael@paquier.xyz> wrote:
On Thu, Mar 24, 2022 at 08:54:14PM -0700, Peter Geoghegan wrote:
amcheck's palloc_btree_page() function validates that an 8KiB page is
in fact an nbtree page, in a maximally paranoid way. Might be an
example worth following here.Sure, we could do that. However, I don't think that going down to
that is something we have any need for in pageinspect, as the module
is also useful to look at the contents of the full page, even if
slightly corrupted, and too many checks would prevent a lookup at the
internal contents of a page.
It's my impression that there are many ways of crashing the system
using pageinspect right now. We aren't very careful about making sure
that our code that reads from disk doesn't crash if the data on disk
is corrupted, and all of that systemic weakness is inherited by
pageinspect. But, with pageinspect, you can supply your own pages, not
just use the ones that actually exist on disk.
Fixing this seems like a lot of work and problematic for various
reasons. And I do agree that if we can allow people to look at what's
going on with a corrupt page, that's useful. On the other hand, if by
"looking" they crash the system, that sucks a lot. So overall I think
we need a lot more sanity checks here.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Sun, 27 Mar 2022 at 16:24, Robert Haas <robertmhaas@gmail.com> wrote:
On Fri, Mar 25, 2022 at 12:57 AM Michael Paquier <michael@paquier.xyz> wrote:
On Thu, Mar 24, 2022 at 08:54:14PM -0700, Peter Geoghegan wrote:
amcheck's palloc_btree_page() function validates that an 8KiB page is
in fact an nbtree page, in a maximally paranoid way. Might be an
example worth following here.Sure, we could do that. However, I don't think that going down to
that is something we have any need for in pageinspect, as the module
is also useful to look at the contents of the full page, even if
slightly corrupted, and too many checks would prevent a lookup at the
internal contents of a page.It's my impression that there are many ways of crashing the system
using pageinspect right now. We aren't very careful about making sure
that our code that reads from disk doesn't crash if the data on disk
is corrupted, and all of that systemic weakness is inherited by
pageinspect. But, with pageinspect, you can supply your own pages, not
just use the ones that actually exist on disk.Fixing this seems like a lot of work and problematic for various
reasons. And I do agree that if we can allow people to look at what's
going on with a corrupt page, that's useful. On the other hand, if by
"looking" they crash the system, that sucks a lot. So overall I think
we need a lot more sanity checks here.
I noticed this thread due to recent commit 291e517a, and noticed that
it has some overlap with one of my patches [0]https://commitfest.postgresql.org/37/3543/ /messages/by-id/CAEze2WjE3+tGO9Fs9+iZMU+z6mMZKo54W1Zt98WKqbEUHbHOBg@mail.gmail.com, in which I fix the
corresponding issue in core postgres by assuming (and in
assert-enabled builds, verifying) the size and location of the special
area. As such, you might be interested in that patch.
Note that currently in core postgres a corrupted special area pointer
will likely target neighbouring blocks in the buffer pool; resulting
in spreading corruption when the special area is updated. This
spreading corruption should be limited to only the corrupted block
with my patch.
Kind regards,
Matthias van de Meent
[0]: https://commitfest.postgresql.org/37/3543/ /messages/by-id/CAEze2WjE3+tGO9Fs9+iZMU+z6mMZKo54W1Zt98WKqbEUHbHOBg@mail.gmail.com
/messages/by-id/CAEze2WjE3+tGO9Fs9+iZMU+z6mMZKo54W1Zt98WKqbEUHbHOBg@mail.gmail.com
On Sun, Mar 27, 2022 at 7:24 AM Robert Haas <robertmhaas@gmail.com> wrote:
It's my impression that there are many ways of crashing the system
using pageinspect right now. We aren't very careful about making sure
that our code that reads from disk doesn't crash if the data on disk
is corrupted, and all of that systemic weakness is inherited by
pageinspect.
It's significantly worse than the core code, I'd say (before we even
consider the fact that you can supply your own pages). At least with
index AMs, where functions like _bt_checkpage() and gistcheckpage()
provide the most basic sanity checks for any page that is read from
disk.
Fixing this seems like a lot of work and problematic for various
reasons. And I do agree that if we can allow people to look at what's
going on with a corrupt page, that's useful. On the other hand, if by
"looking" they crash the system, that sucks a lot. So overall I think
we need a lot more sanity checks here.
I don't think it's particularly hard to do a little more. That's all
it would take to prevent practically all crashes. We're not dealing
with adversarial page images here.
Sure, it would be overkill to fully adapt something like
palloc_btree_page() for pageinspect, for the reason Michael gave. But
there are a couple of checks that it does that would avoid practically
all real world hard crashes, without any plausible downside:
* The basic _bt_checkpage() call that every nbtree page uses in the
core code (as well as in amcheck).
* The maxoffset > MaxIndexTuplesPerPage check.
You could even go a bit further, and care about individual index
tuples. My commit 93ee38eade added some basic sanity checks for index
tuples to bt_page_items(). That could go a bit further as well. In
particular, the sanity checks from amcheck's PageGetItemIdCareful()
function could be carried over. That would make it impossible for
bt_page_items() to read past the end of the page when given a page
that has an index tuple whose item pointer offset has some wildly
unreasonable value.
I'm not volunteering. Just saying that this is quite possible.
--
Peter Geoghegan
On Sun, Mar 27, 2022 at 4:26 PM Peter Geoghegan <pg@bowt.ie> wrote:
We're not dealing
with adversarial page images here.
I think it's bad that we have to make that assumption, considering
that there's nothing whatever to keep users from supplying arbitrary
page images to pageinspect. But I also agree that if we're unable or
unwilling to make things perfect, it's still good to make them better.
--
Robert Haas
EDB: http://www.enterprisedb.com
On Sun, Mar 27, 2022 at 2:02 PM Robert Haas <robertmhaas@gmail.com> wrote:
On Sun, Mar 27, 2022 at 4:26 PM Peter Geoghegan <pg@bowt.ie> wrote:
We're not dealing
with adversarial page images here.I think it's bad that we have to make that assumption, considering
that there's nothing whatever to keep users from supplying arbitrary
page images to pageinspect.
Maybe it isn't strictly necessary for bt_page_items(), but making that
level of guarantee is really hard, and not particularly useful. And
that's the easy case for pageinspect: gist_page_items() takes a raw
bytea, and puts it through the underlying types output functions.
I think that it might actually be fundamentally impossible to
guarantee that that'll be safe, because we have no idea what the
output function might be doing. It's arbitrary user-defined code that
could easily be implemented in C. Combined with an arbitrary page
image.
But I also agree that if we're unable or
unwilling to make things perfect, it's still good to make them better.
I think that most of the functions can approach being perfectly
robust, with a little work. In practical terms they can almost
certainly be made so robust that no real user of bt_page_items() will
ever crash the server. Somebody that goes out of their way to do that
*might* find a way (even with the easier cases), but that doesn't
particularly concern me.
--
Peter Geoghegan
On Sun, Mar 27, 2022 at 02:36:54PM -0700, Peter Geoghegan wrote:
I think that it might actually be fundamentally impossible to
guarantee that that'll be safe, because we have no idea what the
output function might be doing. It's arbitrary user-defined code that
could easily be implemented in C. Combined with an arbitrary page
image.
My guess that you'd bring in a lot more safety if we completely cut
the capacity to fetch and pass down raw pages across the SQL calls
because you can add checks for the wanted AM, replacing all that with
a (regclass,blkno) pair. I am however ready to buy that this may not
be worth rewriting 10~15 years after the fact.
I think that most of the functions can approach being perfectly
robust, with a little work. In practical terms they can almost
certainly be made so robust that no real user of bt_page_items() will
ever crash the server. Somebody that goes out of their way to do that
*might* find a way (even with the easier cases), but that doesn't
particularly concern me.
That does not concern me either, and my own take is to take as
realistic things that can be fetched from a get_raw_page() call rather
than a bytea specifically crafted. Now, I have found myself in cases
where it was useful to see the contents of a page, as long as you can
go through the initial checks, particularly in cases where corruptions
involved unaligned contents. Trigerring an error on a sanity check
for a specific block may be fine, now I have also found myself doing
some full scans to see in one shot the extent of a damaged relation
file using the functions of pageinspect. Fun times.
--
Michael
I've suddenly found that the test in this patch is based on a fact that
heap pages don't have PageSpecial or it is of different size with btree
pages Special area:
CREATE TABLE test1 (a int8, b int4range);
SELECT bt_page_items(get_raw_page('test1', 0));
In the current state is is so, but it is not guaranteed. For example, in
the proposed 64xid patch [1]/messages/by-id/CACG=ezZe1NQSCnfHOr78AtAZxJZeCvxrts0ygrxYwe=pyyjVWA@mail.gmail.com -- Best regards, Maxim Orlov. we introduce PageSpecial into heap pages and
its size on occasion equals to the size of btree page special so check from
above is false. Even if we pass heap page into pageinspect pretending it is
btree page. So the test will fail.
Generally it seems not only a wrong test but the consequence of a bigger
scale fact that we can not always distinguish different AM's pages. So I'd
propose at least that we don't come to a conclusion that a page is valid
based on PageSpecial size is right.
[1]: /messages/by-id/CACG=ezZe1NQSCnfHOr78AtAZxJZeCvxrts0ygrxYwe=pyyjVWA@mail.gmail.com -- Best regards, Maxim Orlov.
/messages/by-id/CACG=ezZe1NQSCnfHOr78AtAZxJZeCvxrts0ygrxYwe=pyyjVWA@mail.gmail.com
--
Best regards,
Maxim Orlov.
Hi,
On Mon, Mar 28, 2022 at 08:29:29PM +0300, Maxim Orlov wrote:
I've suddenly found that the test in this patch is based on a fact that
heap pages don't have PageSpecial or it is of different size with btree
pages Special area:CREATE TABLE test1 (a int8, b int4range);
SELECT bt_page_items(get_raw_page('test1', 0));
In the current state is is so, but it is not guaranteed. For example, in
the proposed 64xid patch [1] we introduce PageSpecial into heap pages and
its size on occasion equals to the size of btree page special so check from
above is false. Even if we pass heap page into pageinspect pretending it is
btree page. So the test will fail.Generally it seems not only a wrong test but the consequence of a bigger
scale fact that we can not always distinguish different AM's pages. So I'd
propose at least that we don't come to a conclusion that a page is valid
based on PageSpecial size is right.
We don't assume that the page is valid, or of the correct AM, just based on the
special area size. We do reject it if the size is invalid, but we also have
other tests based on what's actually possible to test (checking flag sanity,
magic values and so on).
Note that Peter G. suggested to add more checks for btree based on
palloc_btree_page(), did you check if those were enough to fix this test with
the 64bits xid patchset applied?
On Wed, Mar 16, 2022 at 05:43:48PM +0900, Michael Paquier wrote:
So, I have looked at this second part of the thread, and concluded
that we should not fail for empty pages. First, we fetch pages from
the buffer pool in normal mode, where empty pages are valid. There is
also a second point in favor of doing so: code paths dedicated to hash
indexes already do that, marking such pages as simply "unused". The
proposal from Julien upthread sounds cleaner to me though in the long
run, as NULL gives the user the possibility to do a full-table scan
with simple clauses to filter out anything found as NULL.
It has been a couple of weeks, but I have been able to come back to
this set of issues for all-zero pages, double-checked the whole and
applied a set of fixes down to 10. So we should be completely done
here.
--
Michael