Save a few bytes in pg_attribute
After the discussion in [0]/messages/by-id/20230313204119.4mkepdvixcxrwpsc@awork3.anarazel.de ff., I was looking around in pg_attribute
and noticed that we could possibly save 4 bytes. We could change both
attstattarget and attinhcount from int4 to int2, which together with
some reordering would save 4 bytes from the fixed portion.
attstattarget is already limited to 10000, so this wouldn't lose
anything. For attinhcount, I don't see any documented limits. But it
seems unlikely to me that someone would need more than 32k immediate
inheritance parents on a column. (Maybe an overflow check would be
useful, though, to prevent shenanigans.)
The attached patch seems to work. Thoughts?
[0]: /messages/by-id/20230313204119.4mkepdvixcxrwpsc@awork3.anarazel.de
/messages/by-id/20230313204119.4mkepdvixcxrwpsc@awork3.anarazel.de
Attachments:
0001-Save-a-few-bytes-in-pg_attribute.patchtext/plain; charset=UTF-8; name=0001-Save-a-few-bytes-in-pg_attribute.patchDownload+31-32
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
After the discussion in [0] ff., I was looking around in pg_attribute
and noticed that we could possibly save 4 bytes. We could change both
attstattarget and attinhcount from int4 to int2, which together with
some reordering would save 4 bytes from the fixed portion.
attstattarget is already limited to 10000, so this wouldn't lose
anything. For attinhcount, I don't see any documented limits. But it
seems unlikely to me that someone would need more than 32k immediate
inheritance parents on a column. (Maybe an overflow check would be
useful, though, to prevent shenanigans.)
The attached patch seems to work. Thoughts?
I agree that attinhcount could be narrowed, but I have some concern
about attstattarget. IIRC, the limit on attstattarget was once 1000
and then we raised it to 10000. Is it inconceivable that we might
want to raise it to 100000 someday?
regards, tom lane
On 3/20/23 15:37, Tom Lane wrote:
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes:
After the discussion in [0] ff., I was looking around in pg_attribute
and noticed that we could possibly save 4 bytes. We could change both
attstattarget and attinhcount from int4 to int2, which together with
some reordering would save 4 bytes from the fixed portion.attstattarget is already limited to 10000, so this wouldn't lose
anything. For attinhcount, I don't see any documented limits. But it
seems unlikely to me that someone would need more than 32k immediate
inheritance parents on a column. (Maybe an overflow check would be
useful, though, to prevent shenanigans.)The attached patch seems to work. Thoughts?
I agree that attinhcount could be narrowed, but I have some concern
about attstattarget. IIRC, the limit on attstattarget was once 1000
and then we raised it to 10000. Is it inconceivable that we might
want to raise it to 100000 someday?
Yeah, I don't think it'd be wise to make it harder to increase the
statistics target limit.
IMHO it'd be much better to just not store the statistics target for
attributes that have it default (which we now identify by -1), or for
system attributes (where we store 0). I'd bet vast majority of systems
will just use the default / GUC value. So if we're interested in saving
these bytes, we could just store NULL in these cases, no?
regards
--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company
Tomas Vondra <tomas.vondra@enterprisedb.com> writes:
IMHO it'd be much better to just not store the statistics target for
attributes that have it default (which we now identify by -1), or for
system attributes (where we store 0). I'd bet vast majority of systems
will just use the default / GUC value. So if we're interested in saving
these bytes, we could just store NULL in these cases, no?
Hmm, we'd have to move it to the nullable part of the row and expend
more code to fetch it; but I don't think it's touched in many places,
so that might be a good tradeoff. Couple of notes:
* As things stand I think we have a null bitmap in every row of
pg_attribute already (surely attfdwoptions and attmissingval are never
both filled), so there's no extra cost there.
* Putting it in the variable part of the row means it wouldn't appear
in tuple descriptors, but that seems fine.
I wonder if the same is true of attinhcount. Since it's nonzero for
partitions' attributes, it might be non-null in a fairly large fraction
of pg_attribute rows in some use-cases, but it still seems like we'd not
be losing anything. It wouldn't need to be touched in any
high-performance code paths AFAICS.
regards, tom lane
On Mon, 20 Mar 2023, 11:00 pm Peter Eisentraut,
<peter.eisentraut@enterprisedb.com> wrote:
After the discussion in [0] ff., I was looking around in pg_attribute
and noticed that we could possibly save 4 bytes. We could change both
attstattarget and attinhcount from int4 to int2, which together with
some reordering would save 4 bytes from the fixed portion.
I just want to highlight 1ef61ddce9, which fixed a very long-standing
bug that meant that pg_inherits.inhseqno was effectively just 16-bit.
Perhaps because nobody seemed to report that as a limitation 16 bits
is enough space. I only noticed that as a bug from code reading.
David
Hi,
On 2023-03-20 10:37:36 -0400, Tom Lane wrote:
I agree that attinhcount could be narrowed, but I have some concern
about attstattarget. IIRC, the limit on attstattarget was once 1000
and then we raised it to 10000. Is it inconceivable that we might
want to raise it to 100000 someday?
Hard to believe that'd happen in a minor version - and I don't think there'd
an issue with widening it again in a major version?
I doubt we'll ever go to 100k without a major redesign of stats storage/access
- the size of the stats datums would make that pretty impractical right now.
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
On 2023-03-20 10:37:36 -0400, Tom Lane wrote:
I agree that attinhcount could be narrowed, but I have some concern
about attstattarget. IIRC, the limit on attstattarget was once 1000
and then we raised it to 10000. Is it inconceivable that we might
want to raise it to 100000 someday?
Hard to believe that'd happen in a minor version - and I don't think there'd
an issue with widening it again in a major version?
True. However, I think Tomas' idea of making these columns nullable
is even better than narrowing them.
regards, tom lane
Hi,
On 2023-03-20 11:00:12 +0100, Peter Eisentraut wrote:
After the discussion in [0] ff., I was looking around in pg_attribute and
noticed that we could possibly save 4 bytes. We could change both
attstattarget and attinhcount from int4 to int2, which together with some
reordering would save 4 bytes from the fixed portion.
attndims seems like another good candidate to shrink.
Greetings,
Andres Freund
On 21.03.23 00:51, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
On 2023-03-20 10:37:36 -0400, Tom Lane wrote:
I agree that attinhcount could be narrowed, but I have some concern
about attstattarget. IIRC, the limit on attstattarget was once 1000
and then we raised it to 10000. Is it inconceivable that we might
want to raise it to 100000 someday?Hard to believe that'd happen in a minor version - and I don't think there'd
an issue with widening it again in a major version?True. However, I think Tomas' idea of making these columns nullable
is even better than narrowing them.
The context of my message was to do the proposed change for PG16 to buy
back a few bytes that are being added by another feature, and then
consider doing a larger detangling of pg_attribute and tuple descriptors
in PG17, which might well involve taking the attstattarget out of the
hot path. Making attstattarget nullable (i.e., not part of the fixed
part of pg_attribute) would require fairly significant surgery, so I
think it would be better done as part of a more comprehensive change
that would allow the same treatment for other columns as well.
Hi,
On 2023-03-21 17:36:48 +0100, Peter Eisentraut wrote:
On 21.03.23 00:51, Tom Lane wrote:
Andres Freund <andres@anarazel.de> writes:
On 2023-03-20 10:37:36 -0400, Tom Lane wrote:
I agree that attinhcount could be narrowed, but I have some concern
about attstattarget. IIRC, the limit on attstattarget was once 1000
and then we raised it to 10000. Is it inconceivable that we might
want to raise it to 100000 someday?Hard to believe that'd happen in a minor version - and I don't think there'd
an issue with widening it again in a major version?True. However, I think Tomas' idea of making these columns nullable
is even better than narrowing them.
Why not do both?
The context of my message was to do the proposed change for PG16 to buy back
a few bytes that are being added by another feature
How much would you need to buy back to "reach parity"?
Greetings,
Andres Freund
On 21.03.23 17:43, Andres Freund wrote:
The context of my message was to do the proposed change for PG16 to buy back
a few bytes that are being added by another featureHow much would you need to buy back to "reach parity"?
I don't think we can find enough to make the impact zero bytes. It's
also not clear exactly what the impact of each byte would be (compared
to possible complications in other parts of the code, for example). But
if there are a few low-hanging fruit, it seems like we could pick them,
to old us over until we have a better solution to the underlying issue.
Hi,
On 2023-03-21 18:15:40 +0100, Peter Eisentraut wrote:
On 21.03.23 17:43, Andres Freund wrote:
The context of my message was to do the proposed change for PG16 to buy back
a few bytes that are being added by another featureHow much would you need to buy back to "reach parity"?
I don't think we can find enough to make the impact zero bytes. It's also
not clear exactly what the impact of each byte would be (compared to
possible complications in other parts of the code, for example). But if
there are a few low-hanging fruit, it seems like we could pick them, to old
us over until we have a better solution to the underlying issue.
attndims 4->2
attstattarget 4->2
attinhcount 4->2
+ some reordering only gets you from 112->108 unfortunately, due to a 1 byte
alignment hole and 2 bytes of trailing padding.
before:
/* size: 112, cachelines: 2, members: 22 */
/* sum members: 111, holes: 1, sum holes: 1 */
/* last cacheline: 48 bytes */
after:
/* size: 108, cachelines: 2, members: 22 */
/* sum members: 105, holes: 1, sum holes: 1 */
/* padding: 2 */
/* last cacheline: 44 bytes */
You might be able to fill the hole + padding with your data - but IIRC that
was 3 4byte integers?
FWIW, I think we should consider getting rid of attcacheoff. I doubt it's
worth its weight these days, because deforming via slots starts at the
beginning anyway. The overhead of maintaining it is not insubstantial, and
it's just architecturally ugly to to update tupledescs continually.
Not for your current goal, but I do wonder how hard it'd be to make it work to
store multiple booleans as bitmasks. Probably ties into the discussion around
not relying on struct "mapping" for catalog tables (which we IIRC decided is
the sensible way the NAMEDATALEN restriction).
E.g. pg_attribute has 6 booleans, and attgenerated effectively is a boolean
too, and attidentity could easily be modeled as such as well.
If were to not rely on struct mapping anymore, we could possibly transparently
do this as part of forming/deforming heap tuples. Using something like
TYPALIGN_BIT. The question is whether it'd be too expensive to decode...
Greetings,
Andres Freund
Andres Freund <andres@anarazel.de> writes:
FWIW, I think we should consider getting rid of attcacheoff. I doubt it's
worth its weight these days, because deforming via slots starts at the
beginning anyway. The overhead of maintaining it is not insubstantial, and
it's just architecturally ugly to to update tupledescs continually.
I'd be for that if we can convince ourselves there's not a material
speed penalty. As you say, it's quite ugly.
regards, tom lane
On Tue, 21 Mar 2023 at 19:55, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andres Freund <andres@anarazel.de> writes:
FWIW, I think we should consider getting rid of attcacheoff. I doubt it's
worth its weight these days, because deforming via slots starts at the
beginning anyway. The overhead of maintaining it is not insubstantial, and
it's just architecturally ugly to to update tupledescs continually.I'd be for that if we can convince ourselves there's not a material
speed penalty. As you say, it's quite ugly.
Yes, attcacheoff is a tremendous performance boon in many cases. But
all is not lost:
When I was working on other improvements I experimented with storing
the attributes used in (de)serializing tuples to disk in a separate
structured array in the TupleDesc, a prototype patch of which I shared
here [0]/messages/by-id/CAEze2Wh8-metSryZX_Ubj-uv6kb+2YnzHAejmEdubjhmGusBAg@mail.gmail.com. I didn't see a speed difference back then so I didn't
further venture into that path (as it adds complexity without
performance benefits), but I think it can be relevant to this thread
because with that patch we actually don't need the attcacheoff in the
pg_atttribute struct: it only needs to be present in the derived
"TupleAttrAlignData" structs which carry the
length/alignment/storage/byval info.
Kind regards,
Matthias van de Meent
[0]: /messages/by-id/CAEze2Wh8-metSryZX_Ubj-uv6kb+2YnzHAejmEdubjhmGusBAg@mail.gmail.com
Matthias van de Meent <boekewurm+postgres@gmail.com> writes:
... with that patch we actually don't need the attcacheoff in the
pg_atttribute struct: it only needs to be present in the derived
"TupleAttrAlignData" structs which carry the
length/alignment/storage/byval info.
Yeah, I was wondering about that too: keeping attcacheoff as local
state in slots might get us all its win without so much conceptual
dirtiness.
regards, tom lane
Hi,
On 2023-03-21 20:20:40 +0100, Matthias van de Meent wrote:
On Tue, 21 Mar 2023 at 19:55, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andres Freund <andres@anarazel.de> writes:
FWIW, I think we should consider getting rid of attcacheoff. I doubt it's
worth its weight these days, because deforming via slots starts at the
beginning anyway. The overhead of maintaining it is not insubstantial, and
it's just architecturally ugly to to update tupledescs continually.I'd be for that if we can convince ourselves there's not a material
speed penalty. As you say, it's quite ugly.Yes, attcacheoff is a tremendous performance boon in many cases.
Which? We don't use fastgetattr() in many places these days. And in some quick
measurements it's a wash or small loss when deforming slot tuples, even when
the attcacheoff optimization would apply, because the branches for managing it
add more overhead than they safe.
Greetings,
Andres Freund
On Tue, 21 Mar 2023 at 20:58, Andres Freund <andres@anarazel.de> wrote:
Hi,
On 2023-03-21 20:20:40 +0100, Matthias van de Meent wrote:
On Tue, 21 Mar 2023 at 19:55, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Andres Freund <andres@anarazel.de> writes:
FWIW, I think we should consider getting rid of attcacheoff. I doubt it's
worth its weight these days, because deforming via slots starts at the
beginning anyway. The overhead of maintaining it is not insubstantial, and
it's just architecturally ugly to to update tupledescs continually.I'd be for that if we can convince ourselves there's not a material
speed penalty. As you say, it's quite ugly.Yes, attcacheoff is a tremendous performance boon in many cases.
Which? We don't use fastgetattr() in many places these days. And in some quick
measurements it's a wash or small loss when deforming slot tuples, even when
the attcacheoff optimization would apply, because the branches for managing it
add more overhead than they safe.
My experience with attcacheoff performance is in indexes, specifically
index_getattr(). Sure, multi-column indexes are uncommon, but the
difference between have and have-not for cached attribute offsets is
several %.
Kind regards,
Matthias van de Meent
Hi,
On 2023-03-21 15:26:38 -0400, Tom Lane wrote:
Matthias van de Meent <boekewurm+postgres@gmail.com> writes:
... with that patch we actually don't need the attcacheoff in the
pg_atttribute struct: it only needs to be present in the derived
"TupleAttrAlignData" structs which carry the
length/alignment/storage/byval info.Yeah, I was wondering about that too: keeping attcacheoff as local
state in slots might get us all its win without so much conceptual
dirtiness.
It's also the place where it's the least likely to help - afaict attcacheoff
is only really beneficial for fastgetattr(). Which conditions it's use more
strictly - not only can there not be any NULLs before the accessed column,
there may not be any NULLs in the tuple at all.
Greetings,
Andres Freund
Hi,
On 2023-03-21 21:02:08 +0100, Matthias van de Meent wrote:
On Tue, 21 Mar 2023 at 20:58, Andres Freund <andres@anarazel.de> wrote:
On 2023-03-21 20:20:40 +0100, Matthias van de Meent wrote:
Yes, attcacheoff is a tremendous performance boon in many cases.
Which? We don't use fastgetattr() in many places these days. And in some quick
measurements it's a wash or small loss when deforming slot tuples, even when
the attcacheoff optimization would apply, because the branches for managing it
add more overhead than they safe.My experience with attcacheoff performance is in indexes, specifically
index_getattr(). Sure, multi-column indexes are uncommon, but the
difference between have and have-not for cached attribute offsets is
several %.
I did indeed not think of index_getattr(), just heap related things.
Do you have a good test workload handy - I'm kinda curious to compare the cost
of removing attcacheoff vs the gain of not maintaining it for index workloads.
It looks like many of the index_getattr() cases could be made faster without
attcacheoff. A lot of places seem to loop over all attributes, and the key to
accelerating that is to keep state between the iterations. Attcacheoff is
that, but quite stunted, because it only works if there aren't any NULLs (even
if the NULL is in a later column).
Greetings,
Andres Freund
On Tue, 21 Mar 2023 at 23:05, Andres Freund <andres@anarazel.de> wrote:
Hi,
On 2023-03-21 21:02:08 +0100, Matthias van de Meent wrote:
On Tue, 21 Mar 2023 at 20:58, Andres Freund <andres@anarazel.de> wrote:
On 2023-03-21 20:20:40 +0100, Matthias van de Meent wrote:
Yes, attcacheoff is a tremendous performance boon in many cases.
Which? We don't use fastgetattr() in many places these days. And in some quick
measurements it's a wash or small loss when deforming slot tuples, even when
the attcacheoff optimization would apply, because the branches for managing it
add more overhead than they safe.My experience with attcacheoff performance is in indexes, specifically
index_getattr(). Sure, multi-column indexes are uncommon, but the
difference between have and have-not for cached attribute offsets is
several %.I did indeed not think of index_getattr(), just heap related things.
Do you have a good test workload handy - I'm kinda curious to compare the cost
of removing attcacheoff vs the gain of not maintaining it for index workloads.
Rebuilding indexes has been my go-to workload for comparing
attribute-related btree performance optimizations in [0]/messages/by-id/CAEze2WhyBT2bKZRdj_U0KS2Sbewa1XoO_BzgpzLC09sa5LUROg@mail.gmail.com and [1]/messages/by-id/CAEze2Wg52tsSWA9Fy7OCXx-K7pPLMNxA_fmQ6-+_pzR-AoODDA@mail.gmail.com.
Results of tests from '21 in which we're always calculating offsets
from 0 show a slowdown of 4-18% in attcacheoff-enabled workloads if
we're calculating offsets dynamically.
It looks like many of the index_getattr() cases could be made faster without
attcacheoff. A lot of places seem to loop over all attributes, and the key to
accelerating that is to keep state between the iterations.
Indeed, it's not great. You can take a look at [1]/messages/by-id/CAEze2Wg52tsSWA9Fy7OCXx-K7pPLMNxA_fmQ6-+_pzR-AoODDA@mail.gmail.com, which is where I'm
trying to optimize btree's handling of comparing tuples; which
includes work on reducing overhead for attribute accesses.
Note that each btree page should be able to do with comparing at most
2*log(ntups) columns, where this is currently natts * log(ntups).
Attcacheoff is
that, but quite stunted, because it only works if there aren't any NULLs (even
if the NULL is in a later column).
Yes, that isn't great either, but most indexes I've seen have tuples
that are either all NULL, or have no nulls; only seldom I see indexes
that have mixed NULL/not-null index tuple attributes.
Kind regards,
Matthias van de Meent.
[0]: /messages/by-id/CAEze2WhyBT2bKZRdj_U0KS2Sbewa1XoO_BzgpzLC09sa5LUROg@mail.gmail.com
[1]: /messages/by-id/CAEze2Wg52tsSWA9Fy7OCXx-K7pPLMNxA_fmQ6-+_pzR-AoODDA@mail.gmail.com