bug in pageinspect's "tuple data" feature
If you have a sufficiently broken data page, pageinspect throws an
error when trying to examine the page:
ERROR: invalid memory alloc request size 18446744073709551451
This is pretty unhelpful; it would be better not to try to print the
data instead of dying. With that, at least you can know where the
problem is.
This was introduced in d6061f83a166 (2015). Proposed patch to fix it
(by having the code print a null "data" instead of dying) is attached.
--
�lvaro Herrera
Attachments:
pginspect.patchtext/x-diff; charset=us-asciiDownload
diff --git a/contrib/pageinspect/heapfuncs.c b/contrib/pageinspect/heapfuncs.c
index 64a6e351d5..909fdee406 100644
--- a/contrib/pageinspect/heapfuncs.c
+++ b/contrib/pageinspect/heapfuncs.c
@@ -228,11 +228,17 @@ heap_page_items(PG_FUNCTION_ARGS)
/* Copy raw tuple data into bytea attribute */
tuple_data_len = lp_len - tuphdr->t_hoff;
- tuple_data_bytea = (bytea *) palloc(tuple_data_len + VARHDRSZ);
- SET_VARSIZE(tuple_data_bytea, tuple_data_len + VARHDRSZ);
- memcpy(VARDATA(tuple_data_bytea), (char *) tuphdr + tuphdr->t_hoff,
- tuple_data_len);
- values[13] = PointerGetDatum(tuple_data_bytea);
+ if (tuple_data_len < BLCKSZ)
+ {
+ tuple_data_bytea = (bytea *) palloc(tuple_data_len + VARHDRSZ);
+ SET_VARSIZE(tuple_data_bytea, tuple_data_len + VARHDRSZ);
+ memcpy(VARDATA(tuple_data_bytea), (char *) tuphdr + tuphdr->t_hoff,
+ tuple_data_len);
+ values[13] = PointerGetDatum(tuple_data_bytea);
+ nulls[13] = false;
+ }
+ else
+ nulls[13] = true;
/*
* We already checked that the item is completely within the raw
On 21/11/2020 21:32, Alvaro Herrera wrote:
If you have a sufficiently broken data page, pageinspect throws an
error when trying to examine the page:ERROR: invalid memory alloc request size 18446744073709551451
This is pretty unhelpful; it would be better not to try to print the
data instead of dying. With that, at least you can know where the
problem is.This was introduced in d6061f83a166 (2015). Proposed patch to fix it
(by having the code print a null "data" instead of dying) is attached.
Null seems misleading. Maybe something like "invalid", or print a warning?
- Heikki
On Mon, Nov 23, 2020 at 09:11:26AM +0200, Heikki Linnakangas wrote:
On 21/11/2020 21:32, Alvaro Herrera wrote:
This is pretty unhelpful; it would be better not to try to print the
data instead of dying. With that, at least you can know where the
problem is.This was introduced in d6061f83a166 (2015). Proposed patch to fix it
(by having the code print a null "data" instead of dying) is attached.Null seems misleading. Maybe something like "invalid", or print a warning?
How did you get into this state to begin with? get_raw_page() uses
ReadBufferExtended() which gives some level of protection already, so
shouldn't it be better to return an ERROR with ERRCODE_DATA_CORRUPTED
and the block involved?
--
Michael
On 2020-Nov-24, Michael Paquier wrote:
On Mon, Nov 23, 2020 at 09:11:26AM +0200, Heikki Linnakangas wrote:
On 21/11/2020 21:32, Alvaro Herrera wrote:
This is pretty unhelpful; it would be better not to try to print the
data instead of dying. With that, at least you can know where the
problem is.This was introduced in d6061f83a166 (2015). Proposed patch to fix it
(by having the code print a null "data" instead of dying) is attached.Null seems misleading. Maybe something like "invalid", or print a warning?
Good idea, thanks.
How did you get into this state to begin with?
The data was corrupted for whatever reason. I don't care why or how, I
just need to fix it. If the data isn't corrupted, then I don't use
pageinspect in the first place.
get_raw_page() uses ReadBufferExtended() which gives some level of
protection already, so shouldn't it be better to return an ERROR with
ERRCODE_DATA_CORRUPTED and the block involved?
What would I gain from doing that? It's even more unhelpful, because it
is intentional rather than accidental.
On Sat, Nov 21, 2020 at 2:32 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
If you have a sufficiently broken data page, pageinspect throws an
error when trying to examine the page:ERROR: invalid memory alloc request size 18446744073709551451
This is pretty unhelpful; it would be better not to try to print the
data instead of dying. With that, at least you can know where the
problem is.This was introduced in d6061f83a166 (2015). Proposed patch to fix it
(by having the code print a null "data" instead of dying) is attached.
So I agree with the problem statement and I don't have any particular
objection to the patch as proposed, but I think it's just the tip of
the iceberg. These functions are tremendously careless about
validating the input data and will crash if you breath on them too
hard; we've just band-aided around that problem by making them
super-user only (e.g. see 3e1338475ffc2eac25de60a9de9ce689b763aced).
While that does address the security concern since superusers can do
tons of bad things anyway, it's not much help if you want to actually
be able to run them on damaged pages without having the world end, and
it's no help at all if you'd like to let them be run by a
non-superuser, since a constructed page can easily blow up the world.
The patch as proposed fixes one of many problems in this area and may
well be useful enough to commit without doing anything else, but I'd
actually really like to see us do the same sort of hardening here that
is present in the recently-committed amcheck-for-heap support, which
is robust against a wide variety of things of this sort rather than
just this one particularly. Again, this is not to say that you should
be on the hook for that; it's a general statement.
--
Robert Haas
EDB: http://www.enterprisedb.com