The documentation for storage type 'plain' actually allows single byte header
The following documentation comment has been logged on the website:
Page: https://www.postgresql.org/docs/15/index.html
Description:
https://www.postgresql.org/docs/devel/storage-toast.html - This is the
development version.
PLAIN prevents either compression or out-of-line storage; furthermore it
disables use of single-byte headers for varlena types. This is the only
possible strategy for columns of non-TOAST-able data types.
However, it does allow "single byte" headers. How to verify this?
CREATE EXTENSION pageinspect;
CREATE TABLE test(a VARCHAR(10000) STORAGE PLAIN);
INSERT INTO test VALUES (repeat('A',10));
Now peek into the page with pageinspect functions
SELECT left(encode(t_data, 'hex'), 40) FROM
heap_page_items(get_raw_page('test', 0));
This returned value of "1741414141414141414141".
Here the first byte 0x17 = 0001 0111 in binary.
Length + 1 is stored in the length bits (1-7). So Len = 0001011-1 = (11-1)
[base-10] = 10 [base-10]
which exactly matches the expected length. Further the data "41" repeated 10
times also indicates character A (65 or 0x41 in ASCII) repeated 10 times.
So....This does **not** disable 1-B header. That sentence should be removed
from the documentation unless this is a bug.
On Tue, 2023-01-10 at 15:53 +0000, PG Doc comments form wrote:
https://www.postgresql.org/docs/devel/storage-toast.html - This is the
development version.PLAIN prevents either compression or out-of-line storage; furthermore it
disables use of single-byte headers for varlena types. This is the only
possible strategy for columns of non-TOAST-able data types.However, it does allow "single byte" headers. How to verify this?
CREATE EXTENSION pageinspect;
CREATE TABLE test(a VARCHAR(10000) STORAGE PLAIN);
INSERT INTO test VALUES (repeat('A',10));Now peek into the page with pageinspect functions
SELECT left(encode(t_data, 'hex'), 40) FROM
heap_page_items(get_raw_page('test', 0));This returned value of "1741414141414141414141".
Here the first byte 0x17 = 0001 0111 in binary.
Length + 1 is stored in the length bits (1-7). So Len = 0001011-1 = (11-1)
[base-10] = 10 [base-10]
which exactly matches the expected length. Further the data "41" repeated 10
times also indicates character A (65 or 0x41 in ASCII) repeated 10 times.So....This does **not** disable 1-B header. That sentence should be removed
from the documentation unless this is a bug.
I think that the documentation is wrong. The attached patch removes the
offending half-sentence.
Yours,
Laurenz Albe
Attachments:
0001-Fix-documentation-for-STORAGE-PLAIN.patchtext/x-patch; charset=UTF-8; name=0001-Fix-documentation-for-STORAGE-PLAIN.patchDownload
From 5bf0b43fe73384a21f59d9ad1f7a8d7cbc81f8c4 Mon Sep 17 00:00:00 2001
From: Laurenz Albe <laurenz.albe@cybertec.at>
Date: Thu, 12 Jan 2023 15:41:56 +0100
Subject: [PATCH] Fix documentation for STORAGE PLAIN
Commit 3e23b68dac0, which introduced single-byte varlena headers,
added documentation that STORAGE PLAIN would prevent such single-byte
headers. This has never been true.
---
doc/src/sgml/storage.sgml | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)
diff --git a/doc/src/sgml/storage.sgml b/doc/src/sgml/storage.sgml
index e5b9f3f1ff..4795a485d0 100644
--- a/doc/src/sgml/storage.sgml
+++ b/doc/src/sgml/storage.sgml
@@ -456,9 +456,7 @@ for storing <acronym>TOAST</acronym>-able columns on disk:
<listitem>
<para>
<literal>PLAIN</literal> prevents either compression or
- out-of-line storage; furthermore it disables use of single-byte headers
- for varlena types.
- This is the only possible strategy for
+ out-of-line storage. This is the only possible strategy for
columns of non-<acronym>TOAST</acronym>-able data types.
</para>
</listitem>
--
2.39.0
Laurenz Albe <laurenz.albe@cybertec.at> writes:
On Tue, 2023-01-10 at 15:53 +0000, PG Doc comments form wrote:
PLAIN prevents either compression or out-of-line storage; furthermore it
disables use of single-byte headers for varlena types. This is the only
possible strategy for columns of non-TOAST-able data types.
However, it does allow "single byte" headers. How to verify this?
CREATE EXTENSION pageinspect;
CREATE TABLE test(a VARCHAR(10000) STORAGE PLAIN);
INSERT INTO test VALUES (repeat('A',10));Now peek into the page with pageinspect functions
SELECT left(encode(t_data, 'hex'), 40) FROM
heap_page_items(get_raw_page('test', 0));This returned value of "1741414141414141414141".
I think that the documentation is wrong. The attached patch removes the
offending half-sentence.
The documentation is correct, what is broken is the code. I'm not
sure when we broke it, but what I see in tracing through the INSERT
is that we are forming the tuple using a tupdesc with the wrong
value of attstorage. It looks like the tupdesc belongs to the
virtual slot representing the output of the INSERT statement,
which is not identical to the target relation's tupdesc.
(The virtual slot's tupdesc is probably reverse-engineered from
just the data types of the columns, so it'll have whatever is the
default attstorage for the data type. It's blind luck that this
attstorage value isn't used for anything more consequential,
like TOAST decisions.)
regards, tom lane
Hi,
On 2023-01-15 16:40:27 -0500, Tom Lane wrote:
The documentation is correct, what is broken is the code. I'm not
sure when we broke it
Looks to be an old issue, predating the slot type stuff. It reproduces at
least as far back as 10.
I've not thought through this fully. But after a first look, this might be
hard to fix without incuring a lot of overhead / complexity. We check whether
projection is needed between nodes with tlist_matches_tupdesc() - targetlists
don't know about storage. And we decide whether we need to project in
nodeModifyTuple solely based on
/* Extract non-junk columns of the subplan's result tlist. */
foreach(l, subplan->targetlist)
{
TargetEntry *tle = (TargetEntry *) lfirst(l);
if (!tle->resjunk)
insertTargetList = lappend(insertTargetList, tle);
else
need_projection = true;
}
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
On 2023-01-15 16:40:27 -0500, Tom Lane wrote:
The documentation is correct, what is broken is the code. I'm not
sure when we broke it
I've not thought through this fully. But after a first look, this might be
hard to fix without incuring a lot of overhead / complexity.
It appeared to me that it was failing at this step in
ExecGetInsertNewTuple:
if (relinfo->ri_newTupleSlot->tts_ops != planSlot->tts_ops)
{
ExecCopySlot(relinfo->ri_newTupleSlot, planSlot);
return relinfo->ri_newTupleSlot;
}
ri_newTupleSlot has the tupdesc we want, planSlot is a virtual slot
that has the bogus tupdesc, and for some reason heap_form_tuple is
getting called with planSlot's tupdesc not ri_newTupleSlot's. I'm
not quite sure if this is just a thinko somewhere or there's a
deficiency in the design of the slot APIs.
The UPDATE path seems to work fine, btw.
regards, tom lane
Hi,
On 2023-01-15 18:08:21 -0500, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
On 2023-01-15 16:40:27 -0500, Tom Lane wrote:
The documentation is correct, what is broken is the code. I'm not
sure when we broke itI've not thought through this fully. But after a first look, this might be
hard to fix without incuring a lot of overhead / complexity.It appeared to me that it was failing at this step in
ExecGetInsertNewTuple:if (relinfo->ri_newTupleSlot->tts_ops != planSlot->tts_ops)
{
ExecCopySlot(relinfo->ri_newTupleSlot, planSlot);
return relinfo->ri_newTupleSlot;
}ri_newTupleSlot has the tupdesc we want, planSlot is a virtual slot
that has the bogus tupdesc, and for some reason heap_form_tuple is
getting called with planSlot's tupdesc not ri_newTupleSlot's.
The way we copy a slot into a heap slot is to materialize the source slot and
copy the heap tuple into target slot. Which is also what happened before the
slot type abstraction (hence the problem also existing before that was
introduced).
I'm not quite sure if this is just a thinko somewhere or there's a
deficiency in the design of the slot APIs.
I think it's fairly fundamental that copying between two slots assumes a
compatible tupdescs.
I think the problem is more in the determination whether we need to project,
or not (i.e. ExecInitInsertProjection()). But we can't really make a good
decision, because we just determine the types of "incoming" tuples based on
targetlists, which don't contain information about the storage type.
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
On 2023-01-15 18:08:21 -0500, Tom Lane wrote:
ri_newTupleSlot has the tupdesc we want, planSlot is a virtual slot
that has the bogus tupdesc, and for some reason heap_form_tuple is
getting called with planSlot's tupdesc not ri_newTupleSlot's.
The way we copy a slot into a heap slot is to materialize the source slot and
copy the heap tuple into target slot. Which is also what happened before the
slot type abstraction (hence the problem also existing before that was
introduced).
Hmm. For the case of virtual->physical slot, that doesn't sound
terribly efficient.
I think it's fairly fundamental that copying between two slots assumes a
compatible tupdescs.
We could possibly make some effort to inject the desired attstorage
properties into the planSlot's tupdesc. Not sure where would be a
good place.
regards, tom lane
Hi,
On 2023-01-15 18:41:22 -0500, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
On 2023-01-15 18:08:21 -0500, Tom Lane wrote:
ri_newTupleSlot has the tupdesc we want, planSlot is a virtual slot
that has the bogus tupdesc, and for some reason heap_form_tuple is
getting called with planSlot's tupdesc not ri_newTupleSlot's.The way we copy a slot into a heap slot is to materialize the source slot and
copy the heap tuple into target slot. Which is also what happened before the
slot type abstraction (hence the problem also existing before that was
introduced).Hmm. For the case of virtual->physical slot, that doesn't sound
terribly efficient.
It's ok, I think. For virtual->heap we form the tuple in the context of the
destination heap slot. I don't think we could avoid creating a HeapTuple. I
guess we could try to avoid needing to deform the heap tuple again in the
target slot, but I'm not sure that's worth the complexity (we'd need to
readjust by-reference datums to point into the heap tuple). It might be worth
adding a version of ExecCopySlot() that explicitly does that, I think it could
be useful for some executor nodes that know that columns will be accessed
immediately after.
I think it's fairly fundamental that copying between two slots assumes a
compatible tupdescs.We could possibly make some effort to inject the desired attstorage
properties into the planSlot's tupdesc. Not sure where would be a
good place.
I'm not sure that'd get us very far. Consider the case of
INSERT INTO table_using_plain SELECT * FROM table_using_extended;
In that case we just deal with heap tuples coming in, without a need to
project, without a need to copy from one slot to another.
I don't see how we can fix this mess entirely without tracking the storage
type a lot more widely. Most importantly in targetlists, as we use the
targetlists to compute the tupledescs of executor nodes, which then influence
where we build projections.
Given that altering a column to PLAIN doesn't rewrite the table, we already
have to be prepared to receive short or compressed varlenas, even after
setting STORAGE to PLAIN.
I think we should consider just reformulating the "furthermore it disables use
of single-byte headers for varlena types" portion to say that short varlenas
are disabled for non-toastable datatypes. I don't see much point in investing
a lot of complexity making this a hard restriction. Afaict the only point in
changing to PLAIN is to disallow external storage and compression, which it
achieves eved when using short varlenas.
The compression bit is a bit worse, I guess. We probably have the same problem
with EXTERNAL, which supposedly doesn't allow compression - but I don't think
we have code ensuring that we decompress in-line datums. It'll end up
happening if there's other columns that get newly compressed or stored
externally, but not guaranteed.
Greetings,
Andres Freund
Hi,
On 2023-01-15 16:49:01 -0800, Andres Freund wrote:
I don't see how we can fix this mess entirely without tracking the storage
type a lot more widely. Most importantly in targetlists, as we use the
targetlists to compute the tupledescs of executor nodes, which then influence
where we build projections.Given that altering a column to PLAIN doesn't rewrite the table, we already
have to be prepared to receive short or compressed varlenas, even after
setting STORAGE to PLAIN.I think we should consider just reformulating the "furthermore it disables use
of single-byte headers for varlena types" portion to say that short varlenas
are disabled for non-toastable datatypes. I don't see much point in investing
a lot of complexity making this a hard restriction. Afaict the only point in
changing to PLAIN is to disallow external storage and compression, which it
achieves eved when using short varlenas.The compression bit is a bit worse, I guess. We probably have the same problem
with EXTERNAL, which supposedly doesn't allow compression - but I don't think
we have code ensuring that we decompress in-line datums. It'll end up
happening if there's other columns that get newly compressed or stored
externally, but not guaranteed.
One way we could deal with it would be to force the tuple to be processed by
heap_toast_insert_or_update() when there's a difference between typstorage and
attstorage. I think to make that cheap enough to determine, we'd have to cache
that information in the relcache. I haven't thought it through, but I suspect
it'd be problematic to add a pg_type lookup to RelationBuildTupleDesc(),
leading to building that information on demand later.
Greetings,
Andres Freund
On Sun, 2023-01-15 at 16:40 -0500, Tom Lane wrote:
Laurenz Albe <laurenz.albe@cybertec.at> writes:
On Tue, 2023-01-10 at 15:53 +0000, PG Doc comments form wrote:
PLAIN prevents either compression or out-of-line storage; furthermore it
disables use of single-byte headers for varlena types. This is the only
possible strategy for columns of non-TOAST-able data types.However, it does allow "single byte" headers. How to verify this?
CREATE EXTENSION pageinspect;
CREATE TABLE test(a VARCHAR(10000) STORAGE PLAIN);
INSERT INTO test VALUES (repeat('A',10));Now peek into the page with pageinspect functions
SELECT left(encode(t_data, 'hex'), 40) FROM
heap_page_items(get_raw_page('test', 0));This returned value of "1741414141414141414141".
I think that the documentation is wrong. The attached patch removes the
offending half-sentence.The documentation is correct, what is broken is the code.
I see. But what is the reason for that anyway? Why not allow short varlena
headers if TOAST storage is set to PLAIN?
Yours,
Laurenz Albe
Laurenz Albe <laurenz.albe@cybertec.at> writes:
On Sun, 2023-01-15 at 16:40 -0500, Tom Lane wrote:
The documentation is correct, what is broken is the code.
I see. But what is the reason for that anyway? Why not allow short varlena
headers if TOAST storage is set to PLAIN?
The original motivation for that whole mechanism was to protect data
types for which the C functions haven't been upgraded to support
non-traditional varlena headers. So I was worried that this behavior
would somehow break those cases (which still exist, eg oidvector and
int2vector). However, the thing that actually marks such a datatype
is that pg_type.typstorage is PLAIN, and as far as I can find we do
still honor that case in full. If that's the case then every tupdesc
we ever create for such a column will say PLAIN, so there's no
opportunity for the wrong thing to happen.
So maybe it's okay to move the goalposts and acknowledge that setting
attstorage to PLAIN isn't a complete block on applying toast-related
transformations. I wonder though whether short-header is the only
case that can slide through. In particular, for "INSERT ... SELECT
FROM othertable", I suspect it's possible for a compressed-in-line
datum to slide through without decompression. (We certainly must
fix out-of-line datums, but that doesn't necessarily mean we undo
compression.) So I'm not convinced that the proposed wording is
fully correct yet.
regards, tom lane
On Mon, 2023-01-16 at 11:50 -0500, Tom Lane wrote:
Laurenz Albe <laurenz.albe@cybertec.at> writes:
On Sun, 2023-01-15 at 16:40 -0500, Tom Lane wrote:
The documentation is correct, what is broken is the code.
I see. But what is the reason for that anyway? Why not allow short varlena
headers if TOAST storage is set to PLAIN?The original motivation for that whole mechanism was to protect data
types for which the C functions haven't been upgraded to support
non-traditional varlena headers. So I was worried that this behavior
would somehow break those cases (which still exist, eg oidvector and
int2vector). However, the thing that actually marks such a datatype
is that pg_type.typstorage is PLAIN, and as far as I can find we do
still honor that case in full. If that's the case then every tupdesc
we ever create for such a column will say PLAIN, so there's no
opportunity for the wrong thing to happen.So maybe it's okay to move the goalposts and acknowledge that setting
attstorage to PLAIN isn't a complete block on applying toast-related
transformations. I wonder though whether short-header is the only
case that can slide through. In particular, for "INSERT ... SELECT
FROM othertable", I suspect it's possible for a compressed-in-line
datum to slide through without decompression. (We certainly must
fix out-of-line datums, but that doesn't necessarily mean we undo
compression.) So I'm not convinced that the proposed wording is
fully correct yet.
I see, thanks for the explanation.
Since the only storage format I have ever had use for are EXTENDED
and EXTERNAL, it is not very important for me if PLAIN supports short
headers or not. Since single-byte headers are part of the TOAST
mechanism (and documented as such), it makes sense to disable them
in PLAIN. Then the documentation could describe PLAIN as
"skip all TOAST processing".
So we should probably go with the simplest fix that restores
consistency.
Yours,
Laurenz Albe
On Thu, Jan 12, 2023 at 03:43:57PM +0100, Laurenz Albe wrote:
On Tue, 2023-01-10 at 15:53 +0000, PG Doc comments form wrote:
https://www.postgresql.org/docs/devel/storage-toast.html - This is the
development version.PLAIN prevents either compression or out-of-line storage; furthermore it
disables use of single-byte headers for varlena types. This is the only
possible strategy for columns of non-TOAST-able data types.However, it does allow "single byte" headers. How to verify this?
CREATE EXTENSION pageinspect;
CREATE TABLE test(a VARCHAR(10000) STORAGE PLAIN);
INSERT INTO test VALUES (repeat('A',10));Now peek into the page with pageinspect functions
SELECT left(encode(t_data, 'hex'), 40) FROM
heap_page_items(get_raw_page('test', 0));This returned value of "1741414141414141414141".
Here the first byte 0x17 = 0001 0111 in binary.
Length + 1 is stored in the length bits (1-7). So Len = 0001011-1 = (11-1)
[base-10] = 10 [base-10]
which exactly matches the expected length. Further the data "41" repeated 10
times also indicates character A (65 or 0x41 in ASCII) repeated 10 times.So....This does **not** disable 1-B header. That sentence should be removed
from the documentation unless this is a bug.I think that the documentation is wrong. The attached patch removes the
offending half-sentence.Yours,
Laurenz Albe
From 5bf0b43fe73384a21f59d9ad1f7a8d7cbc81f8c4 Mon Sep 17 00:00:00 2001
From: Laurenz Albe <laurenz.albe@cybertec.at>
Date: Thu, 12 Jan 2023 15:41:56 +0100
Subject: [PATCH] Fix documentation for STORAGE PLAINCommit 3e23b68dac0, which introduced single-byte varlena headers,
added documentation that STORAGE PLAIN would prevent such single-byte
headers. This has never been true.
---
doc/src/sgml/storage.sgml | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)diff --git a/doc/src/sgml/storage.sgml b/doc/src/sgml/storage.sgml index e5b9f3f1ff..4795a485d0 100644 --- a/doc/src/sgml/storage.sgml +++ b/doc/src/sgml/storage.sgml @@ -456,9 +456,7 @@ for storing <acronym>TOAST</acronym>-able columns on disk: <listitem> <para> <literal>PLAIN</literal> prevents either compression or - out-of-line storage; furthermore it disables use of single-byte headers - for varlena types. - This is the only possible strategy for + out-of-line storage. This is the only possible strategy for columns of non-<acronym>TOAST</acronym>-able data types. </para> </listitem> -- 2.39.0
Where did we end with this? Is a doc patch the solution?
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
Only you can decide what is important to you.
On Fri, 2023-09-29 at 18:19 -0400, Bruce Momjian wrote:
On Thu, Jan 12, 2023 at 03:43:57PM +0100, Laurenz Albe wrote:
On Tue, 2023-01-10 at 15:53 +0000, PG Doc comments form wrote:
https://www.postgresql.org/docs/devel/storage-toast.html - This is the
development version.PLAIN prevents either compression or out-of-line storage; furthermore it
disables use of single-byte headers for varlena types. This is the only
possible strategy for columns of non-TOAST-able data types.However, it does allow "single byte" headers. How to verify this?
CREATE EXTENSION pageinspect;
CREATE TABLE test(a VARCHAR(10000) STORAGE PLAIN);
INSERT INTO test VALUES (repeat('A',10));Now peek into the page with pageinspect functions
SELECT left(encode(t_data, 'hex'), 40) FROM
heap_page_items(get_raw_page('test', 0));This returned value of "1741414141414141414141".
Here the first byte 0x17 = 0001 0111 in binary.
Length + 1 is stored in the length bits (1-7). So Len = 0001011-1 = (11-1)
[base-10] = 10 [base-10]
which exactly matches the expected length. Further the data "41" repeated 10
times also indicates character A (65 or 0x41 in ASCII) repeated 10 times.So....This does **not** disable 1-B header. That sentence should be removed
from the documentation unless this is a bug.I think that the documentation is wrong. The attached patch removes the
offending half-sentence.Yours,
Laurenz AlbeFrom 5bf0b43fe73384a21f59d9ad1f7a8d7cbc81f8c4 Mon Sep 17 00:00:00 2001
From: Laurenz Albe <laurenz.albe@cybertec.at>
Date: Thu, 12 Jan 2023 15:41:56 +0100
Subject: [PATCH] Fix documentation for STORAGE PLAINCommit 3e23b68dac0, which introduced single-byte varlena headers,
added documentation that STORAGE PLAIN would prevent such single-byte
headers. This has never been true.
---
doc/src/sgml/storage.sgml | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)diff --git a/doc/src/sgml/storage.sgml b/doc/src/sgml/storage.sgml index e5b9f3f1ff..4795a485d0 100644 --- a/doc/src/sgml/storage.sgml +++ b/doc/src/sgml/storage.sgml @@ -456,9 +456,7 @@ for storing <acronym>TOAST</acronym>-able columns on disk: <listitem> <para> <literal>PLAIN</literal> prevents either compression or - out-of-line storage; furthermore it disables use of single-byte headers - for varlena types. - This is the only possible strategy for + out-of-line storage. This is the only possible strategy for columns of non-<acronym>TOAST</acronym>-able data types. </para> </listitem> -- 2.39.0Where did we end with this? Is a doc patch the solution?
I don't think this went anywhere, and a doc patch is not the solution.
Tom has argued convincingly that single-byte headers are an effect of the TOAST
system, and that STORAGE PLAIN should disable all effects of TOAST.
So this would need a code patch.
Yours,
Laurenz Albe
Laurenz Albe <laurenz.albe@cybertec.at> writes:
On Fri, 2023-09-29 at 18:19 -0400, Bruce Momjian wrote:
Where did we end with this? Is a doc patch the solution?
I don't think this went anywhere, and a doc patch is not the solution.
Tom has argued convincingly that single-byte headers are an effect of the TOAST
system, and that STORAGE PLAIN should disable all effects of TOAST.
Well, that was the original idea: you could use STORAGE PLAIN if you
had C code that wasn't yet toast-aware. However, given the lack of
complaints, it seems there's no non-toast-aware code left anywhere.
And that's not too surprising, because the evolutionary pressure to
fix such code would be mighty strong, and a lot of time has passed.
I'm now inclined to think that changing the docs is better than
changing the code; we'd be more likely to create new problems than
fix anything useful.
I wonder though if there's really just one place claiming that
that's how it works. A trawl through the code comments might
be advisable.
regards, tom lane
On Fri, Sep 29, 2023 at 06:45:52PM -0400, Tom Lane wrote:
Laurenz Albe <laurenz.albe@cybertec.at> writes:
On Fri, 2023-09-29 at 18:19 -0400, Bruce Momjian wrote:
Where did we end with this? Is a doc patch the solution?
I don't think this went anywhere, and a doc patch is not the solution.
Tom has argued convincingly that single-byte headers are an effect of the TOAST
system, and that STORAGE PLAIN should disable all effects of TOAST.Well, that was the original idea: you could use STORAGE PLAIN if you
had C code that wasn't yet toast-aware. However, given the lack of
complaints, it seems there's no non-toast-aware code left anywhere.
And that's not too surprising, because the evolutionary pressure to
fix such code would be mighty strong, and a lot of time has passed.I'm now inclined to think that changing the docs is better than
changing the code; we'd be more likely to create new problems than
fix anything useful.I wonder though if there's really just one place claiming that
that's how it works. A trawl through the code comments might
be advisable.
[ Discussion moved to hackers, same subject. ]
Here is the original thread from pgsql-docs:
/messages/by-id/167336599095.2667301.15497893107226841625@wrigleys.postgresql.org
The report is about single-byte headers being used for varlena values
with PLAIN storage.
Here is the reproducible report:
CREATE EXTENSION pageinspect;
CREATE TABLE test(a VARCHAR(10000) STORAGE PLAIN);
INSERT INTO test VALUES (repeat('A',10));
Now peek into the page with pageinspect functions
SELECT left(encode(t_data, 'hex'), 40) FROM
heap_page_items(get_raw_page('test', 0));
This returned value of "1741414141414141414141".
Here the first byte 0x17 = 0001 0111 in binary.
Length + 1 is stored in the length bits (1-7). So Len = 0001011-1 = (11-1)
[base-10] = 10 [base-10]
which exactly matches the expected length. Further the data "41" repeated 10
times also indicates character A (65 or 0x41 in ASCII) repeated 10 times.
I researched this and thought it would be a case where we were lacking a
check before creating a single-byte header, but I couldn't find anything
missing. I think the problem is that the _source_ tupleDesc attstorage
attribute is being used to decide if we should use a short header, while
it is really the storage type of the destination that we should be
checking. Unfortunately, I don't think the destination is accessible at
the location were we are deciding about a short header.
I am confused how to proceed. I feel we need to fully understand why
this happening before we adjust anything. Here is a backtrace --- the
short header is being created in fill_val() and the attstorage value
there is 'x'/EXTENDED.
---------------------------------------------------------------------------
#0 fill_val (att=0x56306f61dae8, bit=0x0, bitmask=0x7ffcfcfc1fb4, dataP=0x7ffcfcfc1f90, infomask=0x56306f61e25c, datum=94766026487048, isnull=false) at heaptuple.c:278
#1 0x000056306e7800eb in heap_fill_tuple (tupleDesc=0x56306f61dad0, values=0x56306f61dc20, isnull=0x56306f61dc28, data=0x56306f61e260 "", data_size=11, infomask=0x56306f61e25c, bit=0x0) at heaptuple.c:427
#2 0x000056306e781708 in heap_form_tuple (tupleDescriptor=0x56306f61dad0, values=0x56306f61dc20, isnull=0x56306f61dc28) at heaptuple.c:1181
#3 0x000056306ea13dcb in tts_virtual_copy_heap_tuple (slot=0x56306f61dbd8) at execTuples.c:280
#4 0x000056306ea1346e in ExecCopySlotHeapTuple (slot=0x56306f61dbd8) at ../../../src/include/executor/tuptable.h:463
#5 0x000056306ea14928 in tts_buffer_heap_copyslot (dstslot=0x56306f61e1a8, srcslot=0x56306f61dbd8) at execTuples.c:798
#6 0x000056306ea4342e in ExecCopySlot (dstslot=0x56306f61e1a8, srcslot=0x56306f61dbd8) at ../../../src/include/executor/tuptable.h:487
#7 0x000056306ea44785 in ExecGetInsertNewTuple (relinfo=0x56306f61d678, planSlot=0x56306f61dbd8) at nodeModifyTable.c:685
#8 0x000056306ea49123 in ExecModifyTable (pstate=0x56306f61d470) at nodeModifyTable.c:3789
#9 0x000056306ea0ef3c in ExecProcNodeFirst (node=0x56306f61d470) at execProcnode.c:464
#10 0x000056306ea03702 in ExecProcNode (node=0x56306f61d470) at ../../../src/include/executor/executor.h:273
#11 0x000056306ea05fe2 in ExecutePlan (estate=0x56306f61d228, planstate=0x56306f61d470, use_parallel_mode=false, operation=CMD_INSERT, sendTuples=false, numberTuples=0, direction=ForwardScanDirection, dest=0x56306f588170, execute_once=true) at execMain.c:1670
#12 0x000056306ea03c63 in standard_ExecutorRun (queryDesc=0x56306f527888, direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:365
#13 0x000056306ea03aee in ExecutorRun (queryDesc=0x56306f527888, direction=ForwardScanDirection, count=0, execute_once=true) at execMain.c:309
#14 0x000056306ec70cf5 in ProcessQuery (plan=0x56306f588020, sourceText=0x56306f552a98 "INSERT INTO test VALUES (repeat('A',10));", params=0x0, queryEnv=0x0, dest=0x56306f588170, qc=0x7ffcfcfc25c0) at pquery.c:160
#15 0x000056306ec72514 in PortalRunMulti (portal=0x56306f5cccf8, isTopLevel=true, setHoldSnapshot=false, dest=0x56306f588170, altdest=0x56306f588170, qc=0x7ffcfcfc25c0) at pquery.c:1277
#16 0x000056306ec71b3a in PortalRun (portal=0x56306f5cccf8, count=9223372036854775807, isTopLevel=true, run_once=true, dest=0x56306f588170, altdest=0x56306f588170, qc=0x7ffcfcfc25c0) at pquery.c:791
#17 0x000056306ec6b465 in exec_simple_query (query_string=0x56306f552a98 "INSERT INTO test VALUES (repeat('A',10));") at postgres.c:1273
#18 0x000056306ec6fdd3 in PostgresMain (dbname=0x56306f58ab88 "test", username=0x56306f50ee68 "postgres") at postgres.c:4657
#19 0x000056306ebb304c in BackendRun (port=0x56306f57dc20) at postmaster.c:4423
#20 0x000056306ebb26fc in BackendStartup (port=0x56306f57dc20) at postmaster.c:4108
#21 0x000056306ebaf134 in ServerLoop () at postmaster.c:1767
#22 0x000056306ebaead5 in PostmasterMain (argc=1, argv=0x56306f50ce30) at postmaster.c:1466
#23 0x000056306ea8108c in main (argc=1, argv=0x56306f50ce30) at main.c:198
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
Only you can decide what is important to you.
On Fri, Oct 20, 2023 at 09:48:05PM -0400, Bruce Momjian wrote:
Here is the original thread from pgsql-docs:
/messages/by-id/167336599095.2667301.15497893107226841625@wrigleys.postgresql.org
The report is about single-byte headers being used for varlena values
with PLAIN storage.Here is the reproducible report:
CREATE EXTENSION pageinspect;
CREATE TABLE test(a VARCHAR(10000) STORAGE PLAIN);
INSERT INTO test VALUES (repeat('A',10));Now peek into the page with pageinspect functions
SELECT left(encode(t_data, 'hex'), 40) FROM
heap_page_items(get_raw_page('test', 0));This returned value of "1741414141414141414141".
Here the first byte 0x17 = 0001 0111 in binary.
Length + 1 is stored in the length bits (1-7). So Len = 0001011-1 = (11-1)
[base-10] = 10 [base-10]
which exactly matches the expected length. Further the data "41" repeated 10
times also indicates character A (65 or 0x41 in ASCII) repeated 10 times.I researched this and thought it would be a case where we were lacking a
check before creating a single-byte header, but I couldn't find anything
missing. I think the problem is that the _source_ tupleDesc attstorage
attribute is being used to decide if we should use a short header, while
it is really the storage type of the destination that we should be
checking. Unfortunately, I don't think the destination is accessible at
the location were we are deciding about a short header.I am confused how to proceed. I feel we need to fully understand why
this happening before we adjust anything. Here is a backtrace --- the
short header is being created in fill_val() and the attstorage value
there is 'x'/EXTENDED.
I did some more research. It turns out that the source slot/planSlot is
populating its pg_attribute information via makeTargetEntry() and it
has no concept of a storage type.
Digging further, I found that we cannot get rid of the the use of
att->attstorage != TYPSTORAGE_PLAIN in macros ATT_IS_PACKABLE and
VARLENA_ATT_IS_PACKABLE macros in src/backend/access/common/heaptuple.c
because there are internal uses of fill_val() that can't handle packed
varlena headers.
I ended up with a doc patch that adds a C comment about this odd
behavior and removes doc text about PLAIN storage not using packed
headers.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
Only you can decide what is important to you.
On Sat, Oct 21, 2023 at 09:56:13PM -0400, Bruce Momjian wrote:
I did some more research. It turns out that the source slot/planSlot is
populating its pg_attribute information via makeTargetEntry() and it
has no concept of a storage type.Digging further, I found that we cannot get rid of the the use of
att->attstorage != TYPSTORAGE_PLAIN in macros ATT_IS_PACKABLE and
VARLENA_ATT_IS_PACKABLE macros in src/backend/access/common/heaptuple.c
because there are internal uses of fill_val() that can't handle packed
varlena headers.I ended up with a doc patch that adds a C comment about this odd
behavior and removes doc text about PLAIN storage not using packed
headers.
Oops, patch attached.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
Only you can decide what is important to you.
Attachments:
varlena_short.difftext/x-diff; charset=us-asciiDownload
diff --git a/doc/src/sgml/storage.sgml b/doc/src/sgml/storage.sgml
index 148fb1b49d..3ea4e5526d 100644
--- a/doc/src/sgml/storage.sgml
+++ b/doc/src/sgml/storage.sgml
@@ -456,9 +456,7 @@ for storing <acronym>TOAST</acronym>-able columns on disk:
<listitem>
<para>
<literal>PLAIN</literal> prevents either compression or
- out-of-line storage; furthermore it disables use of single-byte headers
- for varlena types.
- This is the only possible strategy for
+ out-of-line storage. This is the only possible strategy for
columns of non-<acronym>TOAST</acronym>-able data types.
</para>
</listitem>
diff --git a/src/backend/access/common/heaptuple.c b/src/backend/access/common/heaptuple.c
index d6a4ddfd51..c52d40dce0 100644
--- a/src/backend/access/common/heaptuple.c
+++ b/src/backend/access/common/heaptuple.c
@@ -68,7 +68,16 @@
#include "utils/memutils.h"
-/* Does att's datatype allow packing into the 1-byte-header varlena format? */
+/*
+ * Does att's datatype allow packing into the 1-byte-header varlena format?
+ * While functions that use TupleDescAttr() and assign attstorage =
+ * TYPSTORAGE_PLAIN cannot use packed varlena headers, functions that call
+ * TupleDescInitEntry() use typeForm->typstorage (TYPSTORAGE_EXTENDED) and
+ * can use packed varlena headers, e.g.:
+ * CREATE TABLE test(a VARCHAR(10000) STORAGE PLAIN);
+ * INSERT INTO test VALUES (repeat('A',10));
+ * This can be verified with pageinspect.
+ */
#define ATT_IS_PACKABLE(att) \
((att)->attlen == -1 && (att)->attstorage != TYPSTORAGE_PLAIN)
/* Use this if it's already known varlena */
diff --git a/src/backend/utils/adt/rangetypes.c b/src/backend/utils/adt/rangetypes.c
index d65e5625c7..24bad52923 100644
--- a/src/backend/utils/adt/rangetypes.c
+++ b/src/backend/utils/adt/rangetypes.c
@@ -2608,7 +2608,8 @@ range_contains_elem_internal(TypeCacheEntry *typcache, const RangeType *r, Datum
* values into a range object. They are modeled after heaptuple.c's
* heap_compute_data_size() and heap_fill_tuple(), but we need not handle
* null values here. TYPE_IS_PACKABLE must test the same conditions as
- * heaptuple.c's ATT_IS_PACKABLE macro.
+ * heaptuple.c's ATT_IS_PACKABLE macro. See the comments thare for more
+ * details.
*/
/* Does datatype allow packing into the 1-byte-header varlena format? */
On Sat, Oct 21, 2023 at 09:59:04PM -0400, Bruce Momjian wrote:
On Sat, Oct 21, 2023 at 09:56:13PM -0400, Bruce Momjian wrote:
I did some more research. It turns out that the source slot/planSlot is
populating its pg_attribute information via makeTargetEntry() and it
has no concept of a storage type.Digging further, I found that we cannot get rid of the the use of
att->attstorage != TYPSTORAGE_PLAIN in macros ATT_IS_PACKABLE and
VARLENA_ATT_IS_PACKABLE macros in src/backend/access/common/heaptuple.c
because there are internal uses of fill_val() that can't handle packed
varlena headers.I ended up with a doc patch that adds a C comment about this odd
behavior and removes doc text about PLAIN storage not using packed
headers.Oops, patch attached.
Patch applied to all supported versions.
--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com
Only you can decide what is important to you.