Refactor how we form HeapTuples for CatalogTuple(Insert|Update)

Started by Greg Burd5 months ago15 messageshackers

greg@burd.me

5 months ago

Hello,

I've been working on expanding HOT updates [1]/messages/by-id/78574B24-BE0A-42C5-8075-3FA9FA63B8FC@amazon.com and have recently had
some success (will be posted soon to the thread) moving the work done
within HeapDetermineColumnsInfo() into the executor. I found that
heap_update() is used in two paths: heap_tuple_update() and
simple_tuple_update(), both call into heap_update() which then uses
HeapDetermineColumnsInfo() to find the set of indexed attributes that
have been modified during the update. This makes sense on the normal
path from the executor, but the simple_tuple_update() path is only
called from CatalogTupleUpdate() and when I looked at how catalog tuples
were formed/mutated on that path I discovered something that I had to fix.

Catalog tuples have knowledge of what attributes are mutated implicitly,
but they don't preserve that information and so
HeapDetermineColumnsInfo() has to re-discover that later on (while the
page is locked). While on the surface this isn't such a big deal, it's
been that way and worked well for years I have other motivations (see
[1]: /messages/by-id/78574B24-BE0A-42C5-8075-3FA9FA63B8FC@amazon.com

So, I set about to create a way to retain the knowledge of what was
modified while mutating catalog tuples and pass that information on to
CatalogTupleUpdate() as a Bitmapset. That is the underlying goal of
this patch set. To make no behavioral changes, only syntactic ones, and
end up with something better overall.

That turned out to be a bit of a challenge as we have a lot of places
where catalog information is updated. We also have two different
methods for updating said information. And we intermix them. At times
we hack our way to a solution and ignore convention. I went down the
rabbit hole on this one, but I'm back up for a cup of tea because what
I've done seems materially better to me and I've accomplished the goal
(and can eventually make use of the result in [1]/messages/by-id/78574B24-BE0A-42C5-8075-3FA9FA63B8FC@amazon.com). Is this patch
useful even without that other work? I think so, just on the basis of
cleanup. Let me explain.

Historically there have been two methods for modifying heap tuples
destined for the catalog; Form/GETSTRUCT, and values/nulls/replaces.
This new method provides a more intuitive and less error-prone approach
without changing the fundamentals of the process. In addition, it is
now possible to retain knowledge of the set of mutated attributes when
working with catalog tuples. This isn't used in practice (yet [1]/messages/by-id/78574B24-BE0A-42C5-8075-3FA9FA63B8FC@amazon.com), but
could be a way to avoid the overhead of HeapDetermineColumnsInfo() in
heap_update() where (while holding a lock) we re-discover that set by
comparing old/new HeapTuple datums when trying to find what indexed
attributes have new values and prevent HOT updates.

Method 1: Form/GETSTRUCT

The "Form/GETSTRUCT" model allows for direct access to the tuple data
that is then modified, copied, and then updated via CatalogTupleUpdate().

Old:
Form_pg_index relform = (Form_pg_index) GETSTRUCT(tuple);
relform->inisclustered = false;
CatalogTupleUpdate(pg_index, &tuple->t_self, tuple);

New:
Bitmapset *updated = NULL;
Form_pg_index relform = (Form_pg_index) GETSTRUCT(tuple);
HeapTupleUpdateField(pg_index, indisclustered, false, indexForm, updated);
CatalogTupleUpdate(pg_index, &indexTuple->t_self, indexTuple, updated, NULL);
bms_free(updated);

Method 2: values/nulls/replaces

The "values/nulls/replaces" model collects the necessary information to
either update or create a heap tuple using heap_modify_tuple() or
heap_form_tuple() or sometimes heap_modify_tuple_by_cols(). While all
those functions remain unchanged now the prefered function for catalog
tuple modification is heap_update_tuple().

Old:
bool nulls[Natts_pg_type];
bool replaces[Natts_pg_type];
Datum values[Natts_pg_type];

values[Anum_pg_type_typtype - 1] = CharGetDatum(typeType);
nulls[Anum_pg_type_typdefaultbin - 1] = true;
replaces[Anum_pg_type_oid - 1] = false;

tuple = heap_modify_tuple(tuple, desc, values, nulls, replaces);
CatalogTupleUpdate(pg_type_desc, &tuple->t_self, tuple);

New:
Datum values[Natts_pg_type] = {0};
bool nulls[Natts_pg_type] = {false};
Bitmapset *updated = NULL;

HeapTupleUpdateValue(pg_type, typtype, CharGetDatum(typeType), values,
nulls, updated);
HeapTupleUpdateValueNull(pg_type, typdefaultbin, values, nulls, updated);
HeapTupleSetColumnNotUpdated(pg_type, oid, updated);

tuple = heap_update_tuple(tuple, desc, values, nulls, updated);
CatalogTupleUpdate(pg_type_desc, &tuple->t_self, tuple, updated, NULL);

In addition CatalogTupleUpdate and CatalogTupleInsert now have an
optional argument for the CatalogIndexState, when not provided it is
created and then released for the caller. This eliminates the need for
their "WithInfo" companion functions.

heap_update_tuple() is functionally equivalent to heap_modify_tuple(),
but takes a Bitmapset called "updated" rather than an array of bool
(generally) called "replaces" as a method for indicating what was
modified. Additionally, this new function tries to balance the
trade-offs of calling heap_getattr() versus heap_deform_tuple() based on
the ratio of attributes updated and their known runtime complexities.
Both paths are functionally equivalent.

The changes also include initialization of the values/nulls arrays
rather than loops or memset(). Some effort was also given to unify
naming of these variables and their order so as to lower cognitive
overhead.

So, let me dive into the attached patches. On advice from my mentor
Noah Misch I've divided the changes in two. The first patch contains
all the infrastructure and the set of files where applying the changes
got "interesting" and should be highlighted. The second is the
mechanical changes to the remaining files, and it's HUGE, apologies for
the blast radius here but it couldn't be helped.

0001 - Refactor how we form HeapTuples for CatalogTuple(Insert|Update)

htup.h: is where you'll find the new macros. There is a "getter" called
HeapTupleValue() to find the field in the values array and not worry
about "-1". After that, they break down into a two cases, insert and
update, with forms addressing method 1 (Form/GETSTRUCT) and method 2 (values/nulls/updated).

Insert macros take the form "HeapTupleSet...()" as in
"HeapTupleSetField(...)" or "HeapTupleSetValue(...)". There is the
assignment to NULL which is "HeapTupleSetFieldNull(...)" and
"HeapTupleSetValueNull(...)" as well.

Arguments to these macros are:
table_name, field, value, (nulls), (form_ptr)

These macros simply translate back to the forms that we use now and in
some cases combine work or ensure correctness.

Update macros follow a similar pattern "HeapTupleUpdate...()" and then
have "HeapTupleUpdateField..." and "HeapTupleUpdateValue..." cases to
cover the two methods we use.

Arguments are again uniform:
table_name, field, value, values, nulls, updated

There are additional macros for a few update-related special cases as well:

* HeapTupleMarkColumnUpdated() - not something you'll normally use as
the update methods all capture this information already, but there are
occasions where this is useful.

* HeapTupleUpdateSetAllColumnsUpdated() - 99.9% of the time we have
historically started off with a "replaces" array where every value in
that array is "false", but there are a few places where we had occasion
to do the opposite, start with the assumption that all attributes were
modified. This covers that case setting all the bits in the "updated"
Bitmapset correctly.

heaptuple.c: you'll find here my new heap_update_tuple() function. This
one is interesting as it tries to balance out the cost of heap_getattr()
vs heap_deform_tuple(). Why did I bother? Read the comment on
heap_modify_tuple(), and I was there... so why not? You may disagree
with my approach, here's a portion of the comment that explains it:

* Performance strategy:
* - If updating many attributes (> 2*natts/3), use heap_getattr() to extract
* only the few non-updated attributes. This is O(k*n) where k is the number
* of non-updated attributes, which is small when updating many.
* - If updating few attributes (<= 2*natts/3), use heap_deform_tuple() to
* extract all attributes at once (O(n)), then replace the updated ones.
* This avoids the O(n^2) cost of many heap_getattr() calls.
*
* The threshold of 2*natts/3 balances the fixed O(n) cost of heap_deform_tuple
* against the variable O(k*n) cost of heap_getattr, where k = natts - num_updated.

indexing.c: has the change to CatalogTupleUpdate/Insert and removal of
their "WithInfo" companions. If that second part is controversial then
I don't mind reverting it, but IMO this is cleaner and might either
inspire more call sites to create and reuse CatalogIndexState or for
someone (me?) to address the comment that pre-dates my work:

* XXX: At some point we might cache the CatalogIndexState data
somewhere (perhaps
* in the relcache) so that callers needn't trouble over this.

The remaining files in this patch (we're still on 0001) are in the
category of *interesting* or non-obvious changes I ran into that were
"fun" to fix...

pg_aggregate.c: in AggregateCreate() we run into a "compromise" pattern
where the code code either update or insert. In such cases hackers
should use the macros for update, not insert, everywhere. This records
the additional information for the update path and doesn't burden the
insert path with much more than a Bitmapset to free.

This reminds me, there is also the case where a function is looping and
updating a number of catalog tuples. Care must be taken to scope the
Bitmapset and free it every iteration as in:

while( doing all the things )
{
Bitmapset updated = NULL;

... update a tuple
CatalogTupleUpdate();
bms_free(updated);
}

This also comes up when a function has multiple updates not in a loop.
Don't reuse the Bitmapset, just scope it. If you do reuse it then:

bms_free(updated);
updated = NULL;

and later... when you start using it again:

Assert(bms_is_empty(updated));

pg_constraint.c: brings us to namestrcpy(), take a look at
RenameConstraintById(). When you run into namestrcpy() you've likely
just done the mutation to the form directly but you'll want to mark that
field as updated, as in:

namestrcpy(&(con->conname), newname);
HeapTupleMarkColumnUpdated(pg_constraint, conname, updated);

This pattern is also needed after functions like pg_add_s32_overflow(),
just HeapTupleMarkColumnUpdated() and you're fine. I used the regex:
"namestrcpy|pg_.*_.*flow" to find these cases, YMMV.

alter.c: take a look at AlterObjectRename_internal() which uses
get_object_attnum_name() to get the AttrNumber that is to be altered.
This is fine, but it's not something we can just plug into a macro and
get the proper name expansion. So, we do it the manual way and add a comment:

/*
* NOTE: We can't use the HeapTupleMarkColumnUpdated() macro here because
* 'Anum_name' isn't a table/column name, it's a index for the relation
* passed into the function as an argument.
*/
namestrcpy(&nameattrdata, new_name);
values[Anum_name - 1] = NameGetDatum(&nameattrdata);
updated = bms_add_member(updated, Anum_name - FirstLowInvalidHeapAttributeNumber);

analyze.c: uses the HeapTupleSetAllColumnsUpdated() at the start of the
update_attstats() function to match the replaced code. Later there's
another interesting deviation from the normal pattern in that function
where we want to update the nth stakind/opt/coll this was frustrating at
first until I found that this worked:

HeapTupleSetValue(pg_statistic, stakind1 + k,
Int16GetDatum(stats->stakind[k]), values);

I'll be "cpp", here's what that turns into:

(values)[Anum_pg_statistic_stakind1 + k - 1] = (Int16GetDatum(stats->stakind[k]));

So, we're taking the attribute number adding "k" then subtracting 1, and
voila "Bob's your uncle." While a bit unconventional, I think the
resulting form is understandable, much more than before which was:

i = Anum_pg_statistic_stakind1 - 1;
for (k = 0; k < STATISTIC_NUM_SLOTS; k++)
values[i++] = Int16GetDatum(stats->stakind[k]); /* stakindN */

attribute_stats.c: take a look at set_stats_slot() where I have a
comment explaining this "field + n" method adding a "Don't Panic" might
help too. Here's that comment:

/*
* NOTE: This might seem odd, to be adding 'i' to the name of the field on
* these macros, but that's what we need to do here to find the proper
* slot offset and record the proper value into the updated bitmap.
*
* Example: HeapTupleValue(pg_statistic, stakind1 + i, values);
*
* Becomes: values[Anum_pg_statistic_stakind1 + i - 1];
*
* The result is you're indexing the i'th value, exactly what we needed.
*/

The really amazing and valuable piece here is how that plays out in the following:

HeapTupleUpdateValue(pg_statistic, stakind1 + i, Int16GetDatum(stakind),
values, nulls, updated);

Playing the role of "cpp" again we get:

do { \
(values)[Anum_statistic_stakind1 + i - 1] = (Datum)
(Int16GetDatum(stakind)); \
(nulls)[Anum_statistic_stakind1 + i - 1] = false; \
Anum_statistic_stakind1 - (-7))
} while(0);

The result is that we've set the value/isnull at the correct position in
both the values and nulls arrays and then added the correct attribute
number into the Bitmapset (drops the mic).

Okay, and that about does it for "0001" with the foundation and then the
odd cases.

0002 - Update the remainder of catalog updates using the new APIs

This is a lot of applying the pattern over and over until it all works.
That's it. Nothing much more interesting here.

This is a lot of change, I get that, I think there's value in change
when the result is cleaner and more maintainable. The one place I'll
agree up front where this might case problems is back-patching fixes.
That'll have to have the new approach in some branches and the old
approach in others. It's a 5-year commitment, I don't take that lightly.

If you made it this far, I owe you a beer/coffee/token-of-appreciation.
Without this patch my [1]/messages/by-id/78574B24-BE0A-42C5-8075-3FA9FA63B8FC@amazon.com will have an ugly wart
(HeapDetermineColumnsInfo()) for the CatalogTupleUpdate path only. I
can live with that, I'd rather not.

best,

-greg

[1]: /messages/by-id/78574B24-BE0A-42C5-8075-3FA9FA63B8FC@amazon.com

Michael Paquier

michael@paquier.xyz

5 months ago

In reply to: Greg Burd (#1)

Re: Refactor how we form HeapTuples for CatalogTuple(Insert|Update)

On Fri, Nov 14, 2025 at 10:31:28AM -0500, Greg Burd wrote:

Catalog tuples have knowledge of what attributes are mutated implicitly,
but they don't preserve that information and so
HeapDetermineColumnsInfo() has to re-discover that later on (while the
page is locked). While on the surface this isn't such a big deal, it's
been that way and worked well for years I have other motivations (see
[1] again) for improving this.

Yeah, I've heard that this has been also discussed at the New York
pgconf, where designs were mentioned so as we do not execute
expressions while locking pages. One issue is that this could easily
deadlock if an expression has the idea to re-read the same page we are
locking. So I have a summary of what you have been doing in mind, a
very rough one I suspect.

That turned out to be a bit of a challenge as we have a lot of places
where catalog information is updated. We also have two different
methods for updating said information. And we intermix them. At times
we hack our way to a solution and ignore convention. I went down the
rabbit hole on this one, but I'm back up for a cup of tea because what
I've done seems materially better to me and I've accomplished the goal
(and can eventually make use of the result in [1]). Is this patch
useful even without that other work? I think so, just on the basis of
cleanup. Let me explain.

It's a mix of historical and individual commit style, so having things
slightly different depending on the catalog code path is not a
surprise, at least here.

[... Two methods for heap tuple updates ... ]

tuple = heap_modify_tuple(tuple, desc, values, nulls, replaces);
CatalogTupleUpdate(pg_type_desc, &tuple->t_self, tuple);

As far as I can see after applying 0002, 90%-ish of the existing
callers of heap_modify_tuple() are updated to use heap_update_tuple().
That's interesting to see.

heap_update_tuple() is functionally equivalent to heap_modify_tuple(),
but takes a Bitmapset called "updated" rather than an array of bool
(generally) called "replaces" as a method for indicating what was
modified. Additionally, this new function tries to balance the
trade-offs of calling heap_getattr() versus heap_deform_tuple() based on
the ratio of attributes updated and their known runtime complexities.
Both paths are functionally equivalent.

I don't think that we can completely remove heap_update_tuple() as
code path used for the catalogs updates, but we can get very close to
that. Perhaps we should document at the top of heap_update_tuple()
that developers should not use this API for catalog changes, switching
to the bitmap interface instead in priority?

* Performance strategy:
* - If updating many attributes (> 2*natts/3), use heap_getattr() to extract
* only the few non-updated attributes. This is O(k*n) where k is the number
* of non-updated attributes, which is small when updating many.
* - If updating few attributes (<= 2*natts/3), use heap_deform_tuple() to
* extract all attributes at once (O(n)), then replace the updated ones.
* This avoids the O(n^2) cost of many heap_getattr() calls.
*
* The threshold of 2*natts/3 balances the fixed O(n) cost of heap_deform_tuple
* against the variable O(k*n) cost of heap_getattr, where k = natts - num_updated.

This change has been puzzling me quite a bit. Why this choice of
formula with 2/3 of arguments? The maximum number of arguments is
1400 by design, so would one choice matter more than the other in most
cases? Catalogs don't have that many arguments either, so I am not
sure if we need to care with this amount of tuning, TBH, but I am OK
to be wrong.

indexing.c: has the change to CatalogTupleUpdate/Insert and removal of
their "WithInfo" companions. If that second part is controversial then
I don't mind reverting it, but IMO this is cleaner and might either
inspire more call sites to create and reuse CatalogIndexState or for
someone (me?) to address the comment that pre-dates my work:

I'm OK with the simplification on this one, but let's make that a
separate patch extracted from 0001. The changes with
CatalogTupleUpdate() and CatalogTupleInsert() don't have to be
included with the changes that introduce a bitmap integration to track
the attributes that are updated. This is pointing to the fact that
there are not that many callers of the WithInfo() flavors in the tree,
compared to the non-info ones (24 AFAIK, vs 160-ish for the Insert
path it seems, for example).

pg_aggregate.c: in AggregateCreate() we run into a "compromise" pattern
where the code code either update or insert. In such cases hackers
should use the macros for update, not insert, everywhere. This records
the additional information for the update path and doesn't burden the
insert path with much more than a Bitmapset to free.

Hmm. Fine by me. The bitmap is not used for an INSERT path, so it's
a bit of a waste, but I can see why you have done so to satisfy the
"replace" case, from CREATE OR REPLACE (never liked this kind of
grammar as it can be impredictible implementation-wise with syscache
lookups required, just a personal rant).

Another thing that you could do here is enforce a check in a couple of
places where CatalogTupleInsert() is called for the "updated" bitmap:
all the bits in the map should be set, like in OperatorCreate(). That
may catch bugs at runtime perhaps when adding fields when dealing with
an insert instead of an update?

pg_constraint.c: brings us to namestrcpy(), take a look at
RenameConstraintById(). When you run into namestrcpy() you've likely
just done the mutation to the form directly but you'll want to mark that
field as updated, as in:

alter.c: take a look at AlterObjectRename_internal() which uses
get_object_attnum_name() to get the AttrNumber that is to be altered.
This is fine, but it's not something we can just plug into a macro and
get the proper name expansion. So, we do it the manual way and add a comment:

Not surprising to have exceptions like that. This is a global path
used for the renames of a bunch of objects. This makes me wonder if
we really have to absolutely change all the code paths that currently
use heap_modify_tuple(). Like this one in alter.c for renames, the
change is not an improvement in readability. Same for
AlterObjectNamespace_internal(). These would live better if still
based on heap_modify_tuple(), because of the fact that they can be fed
several object types.

analyze.c: uses the HeapTupleSetAllColumnsUpdated() at the start of the
update_attstats() function to match the replaced code. Later there's
another interesting deviation from the normal pattern in that function
where we want to update the nth stakind/opt/coll this was frustrating at
first until I found that this worked:

HeapTupleSetValue(pg_statistic, stakind1 + k,
Int16GetDatum(stats->stakind[k]), values);

I'm getting concerned about HeapTupleSetAllColumnsUpdated() while
reaching this part of your message, FWIW.

So, we're taking the attribute number adding "k" then subtracting 1, and
voila "Bob's your uncle." While a bit unconventional, I think the
resulting form is understandable, much more than before which was:

i = Anum_pg_statistic_stakind1 - 1;
for (k = 0; k < STATISTIC_NUM_SLOTS; k++)
values[i++] = Int16GetDatum(stats->stakind[k]); /* stakindN */

And more concerns. This change makes me question how wise it is to
have HeapTupleUpdateSetAllColumnsUpdated(), actually. Couldn't this
encourage bugs if somebody forgets to update some of the fields in the
new tuple, still the bitmap has been updated with a range to mark all
the attributes as updated. Having HeapTupleUpdateField() do an update
of the bitmap makes sure that the bitmap and the fields updated are
never out-of-sync. I'm OK with marking the individual attributes as
updated. Having a more aggressive API that marks all of them as
updated sounds dangerous to me.

0002 - Update the remainder of catalog updates using the new APIs

This is a lot of applying the pattern over and over until it all works.
That's it. Nothing much more interesting here.

This is a lot of change, I get that, I think there's value in change
when the result is cleaner and more maintainable. The one place I'll
agree up front where this might case problems is back-patching fixes.
That'll have to have the new approach in some branches and the old
approach in others. It's a 5-year commitment, I don't take that lightly.

Backpatches imply catalog tuple manipulations from time to time, so
that would create noise, yes.

If you made it this far, I owe you a beer/coffee/token-of-appreciation.

Sentence noted.

Without this patch my [1] will have an ugly wart
(HeapDetermineColumnsInfo()) for the CatalogTupleUpdate path only. I
can live with that, I'd rather not.

It seems to me that this answer is better answered as "rather not",
especially if this improves the code maintenance in the long-term.

Patch 0001 could be split into more pieces, particularly regarding the
argument it makes about the fact that there are few callers of
CatalogTuplesMultiInsertWithInfo(), CatalogTupleInsertWithInfo() and
CatalogTupleUpdateWithInfo(), where we can just plug NULL for the
index state. I'd suggest to make these routines leaner in a first
patch.

             /* No, insert new tuple */
             stup = heap_form_tuple(RelationGetDescr(sd), values, nulls);
-            CatalogTupleInsertWithInfo(sd, stup, indstate);
+            CatalogTupleInsert(sd, stup, NULL);
         }

This one in update_attstats() looks wrong. Why are you passing NULL
instead of an opened index state? We keep track of indstate for all
the pg_statistic tuples updated.

HeapTupleSetField() is used nowhere. Do we really need it?
--
Michael

Greg Burd

greg@burd.me

5 months ago

In reply to: Michael Paquier (#2)

Re: Refactor how we form HeapTuples for CatalogTuple(Insert|Update)

On Dec 2 2025, at 2:09 am, Michael Paquier <michael@paquier.xyz> wrote:

On Fri, Nov 14, 2025 at 10:31:28AM -0500, Greg Burd wrote:

Catalog tuples have knowledge of what attributes are mutated implicitly,
but they don't preserve that information and so
HeapDetermineColumnsInfo() has to re-discover that later on (while the
page is locked). While on the surface this isn't such a big deal, it's
been that way and worked well for years I have other motivations (see
[1] again) for improving this.

Yeah, I've heard that this has been also discussed at the New York
pgconf, where designs were mentioned so as we do not execute
expressions while locking pages. One issue is that this could easily
deadlock if an expression has the idea to re-read the same page we are
locking. So I have a summary of what you have been doing in mind, a
very rough one I suspect.

Hi Michael,

Thanks for taking the time to review this chunky patch. I've been
quietly reworking it a bit, I'll go over that in a bit because I'd
appreciate opinions before I'm too far into the changes.

Yes, the motivation for this patch started off as a secondary effect of
the other work I've mentioned. It's morphing a bit given some feedback
off-list. There are a few reasons to standardize catalog modification
code and move it away from direct Form/GETSTRUCT or
values/nulls/replaces-style access. First, these current methods can be
error prone and require attention to a variety of details that can
easily be overlooked. Second, at present while we know what fields are
modified we don't preserve that information and pass it on to the
heap_update() code. Third, I understand that at some point we'll need
in-memory representations completely decoupled from the disk to support
upgradeable catalogs. I started off just trying to make
HeapDetermineColumnsInfo() go away, these seem like good goals as well
and touch the same area of code. I think, if possible, it would make
sense to ensure that a large-ish change like this doesn't need to be
undone/redone to accomplish these other goals.

That turned out to be a bit of a challenge as we have a lot of places
where catalog information is updated. We also have two different
methods for updating said information. And we intermix them. At times
we hack our way to a solution and ignore convention. I went down the
rabbit hole on this one, but I'm back up for a cup of tea because what
I've done seems materially better to me and I've accomplished the goal
(and can eventually make use of the result in [1]). Is this patch
useful even without that other work? I think so, just on the basis of
cleanup. Let me explain.

It's a mix of historical and individual commit style, so having things
slightly different depending on the catalog code path is not a
surprise, at least here.

Got it, I don't begrudge anyone's style or how it's done now. I just
want to address the goals I've laid out and hopefully make things a bit
better along the way.

[... Two methods for heap tuple updates ... ]

tuple = heap_modify_tuple(tuple, desc, values, nulls, replaces);
CatalogTupleUpdate(pg_type_desc, &tuple->t_self, tuple);

As far as I can see after applying 0002, 90%-ish of the existing
callers of heap_modify_tuple() are updated to use heap_update_tuple().
That's interesting to see.

Yes, that was intentional.

heap_update_tuple() is functionally equivalent to heap_modify_tuple(),
but takes a Bitmapset called "updated" rather than an array of bool
(generally) called "replaces" as a method for indicating what was
modified. Additionally, this new function tries to balance the
trade-offs of calling heap_getattr() versus heap_deform_tuple() based on
the ratio of attributes updated and their known runtime complexities.
Both paths are functionally equivalent.

[NOTE: below I think you meant to say "heap_modify_tuple()", so I
correct it]

I don't think that we can completely remove heap_modify_tuple() as
code path used for the catalogs updates, but we can get very close to
that. Perhaps we should document at the top of heap_modify_tuple()
that developers should not use this API for catalog changes, switching
to the bitmap interface instead in priority?

I agree that it's hard to remove heap_modify_tuple() function straight
away which is why I kept it in and created a new function
heap_update_tuple(). I agree, comments are needed to explain why and
point out the preferred way.

* Performance strategy:
* - If updating many attributes (> 2*natts/3), use heap_getattr() to extract
* only the few non-updated attributes. This is O(k*n) where k is
the number
* of non-updated attributes, which is small when updating many.
* - If updating few attributes (<= 2*natts/3), use
heap_deform_tuple() to
* extract all attributes at once (O(n)), then replace the updated ones.
* This avoids the O(n^2) cost of many heap_getattr() calls.
*
* The threshold of 2*natts/3 balances the fixed O(n) cost of heap_deform_tuple
* against the variable O(k*n) cost of heap_getattr, where k = natts
- num_updated.

This change has been puzzling me quite a bit. Why this choice of
formula with 2/3 of arguments? The maximum number of arguments is
1400 by design, so would one choice matter more than the other in most
cases? Catalogs don't have that many arguments either, so I am not
sure if we need to care with this amount of tuning, TBH, but I am OK
to be wrong.

Good points, I saw the comment about optimizing it and I took a swing.
There are two categories for catalog updates, changing a few or setting
them all. My hope was to choose the less computationally onerous path
based on the ratio of the amount updated vs the number in the relation.

indexing.c: has the change to CatalogTupleUpdate/Insert and removal of
their "WithInfo" companions. If that second part is controversial then
I don't mind reverting it, but IMO this is cleaner and might either
inspire more call sites to create and reuse CatalogIndexState or for
someone (me?) to address the comment that pre-dates my work:

I'm OK with the simplification on this one, but let's make that a
separate patch extracted from 0001. The changes with
CatalogTupleUpdate() and CatalogTupleInsert() don't have to be
included with the changes that introduce a bitmap integration to track
the attributes that are updated. This is pointing to the fact that
there are not that many callers of the WithInfo() flavors in the tree,
compared to the non-info ones (24 AFAIK, vs 160-ish for the Insert
path it seems, for example).

That makes sense, but another approach is to do the work mentioned in
the comment and somehow cache the CatalogIndexState (which is a somewhat
redacted ResultRelInfo by another name) somewhere.

pg_aggregate.c: in AggregateCreate() we run into a "compromise" pattern
where the code code either update or insert. In such cases hackers
should use the macros for update, not insert, everywhere. This records
the additional information for the update path and doesn't burden the
insert path with much more than a Bitmapset to free.

Hmm. Fine by me. The bitmap is not used for an INSERT path, so it's
a bit of a waste, but I can see why you have done so to satisfy the
"replace" case, from CREATE OR REPLACE (never liked this kind of
grammar as it can be impredictible implementation-wise with syscache
lookups required, just a personal rant).

Right, this isn't ideal. The "waste" of alloc/free of a Bitmapset is
real, and different people will have different views on that, but IMO
it's worth it. That said, I did rethink this a bit more.

Another thing that you could do here is enforce a check in a couple of
places where CatalogTupleInsert() is called for the "updated" bitmap:
all the bits in the map should be set, like in OperatorCreate(). That
may catch bugs at runtime perhaps when adding fields when dealing with
an insert instead of an update?

That's not a bad idea, catching bugs early is a good outcome.

pg_constraint.c: brings us to namestrcpy(), take a look at
RenameConstraintById(). When you run into namestrcpy() you've likely
just done the mutation to the form directly but you'll want to mark that
field as updated, as in:

alter.c: take a look at AlterObjectRename_internal() which uses
get_object_attnum_name() to get the AttrNumber that is to be altered.
This is fine, but it's not something we can just plug into a macro and
get the proper name expansion. So, we do it the manual way and add a comment:

Not surprising to have exceptions like that. This is a global path
used for the renames of a bunch of objects. This makes me wonder if
we really have to absolutely change all the code paths that currently
use heap_modify_tuple(). Like this one in alter.c for renames, the
change is not an improvement in readability. Same for
AlterObjectNamespace_internal(). These would live better if still
based on heap_modify_tuple(), because of the fact that they can be fed
several object types.

I'd agree except that the paths using heap_update_tuple() won't benefit
from HOT updates.

I think readability is debatable, but I am playing games with macros
which at times makes me feel a bit dirty.

analyze.c: uses the HeapTupleSetAllColumnsUpdated() at the start of the
update_attstats() function to match the replaced code. Later there's
another interesting deviation from the normal pattern in that function
where we want to update the nth stakind/opt/coll this was frustrating at
first until I found that this worked:

HeapTupleSetValue(pg_statistic, stakind1 + k,
Int16GetDatum(stats->stakind[k]), values);

I'm getting concerned about HeapTupleSetAllColumnsUpdated() while
reaching this part of your message, FWIW.

The reason for HeapTupleSetAllColumnsUpdated() was to mimic existing
logic and change as little as possible by simply adding a layer of
friendly macros into the mix.

So, we're taking the attribute number adding "k" then subtracting 1, and
voila "Bob's your uncle." While a bit unconventional, I think the
resulting form is understandable, much more than before which was:

i = Anum_pg_statistic_stakind1 - 1;
for (k = 0; k < STATISTIC_NUM_SLOTS; k++)
values[i++] = Int16GetDatum(stats->stakind[k]); /* stakindN */

And more concerns. This change makes me question how wise it is to
have HeapTupleUpdateSetAllColumnsUpdated(), actually. Couldn't this
encourage bugs if somebody forgets to update some of the fields in the
new tuple, still the bitmap has been updated with a range to mark all
the attributes as updated. Having HeapTupleUpdateField() do an update
of the bitmap makes sure that the bitmap and the fields updated are
never out-of-sync. I'm OK with marking the individual attributes as
updated. Having a more aggressive API that marks all of them as
updated sounds dangerous to me.

Well, sure I suppose so but again this was simply a replacement for the
existing code which initialized the "replaces" array to all true rather
than the normal case of all false.

0002 - Update the remainder of catalog updates using the new APIs

This is a lot of applying the pattern over and over until it all
works.
That's it. Nothing much more interesting here.

This is a lot of change, I get that, I think there's value in change
when the result is cleaner and more maintainable. The one place I'll
agree up front where this might case problems is back-patching fixes.
That'll have to have the new approach in some branches and the old
approach in others. It's a 5-year commitment, I don't take that lightly.

Backpatches imply catalog tuple manipulations from time to time, so
that would create noise, yes.

If you made it this far, I owe you a
beer/coffee/token-of-appreciation.

Sentence noted.

Excellent!

Without this patch my [1] will have an ugly wart
(HeapDetermineColumnsInfo()) for the CatalogTupleUpdate path only. I
can live with that, I'd rather not.

It seems to me that this answer is better answered as "rather not",
especially if this improves the code maintenance in the long-term.

Patch 0001 could be split into more pieces, particularly regarding the
argument it makes about the fact that there are few callers of
CatalogTuplesMultiInsertWithInfo(), CatalogTupleInsertWithInfo() and
CatalogTupleUpdateWithInfo(), where we can just plug NULL for the
index state. I'd suggest to make these routines leaner in a first
patch.
/* No, insert new tuple */
stup = heap_form_tuple(RelationGetDescr(sd), values, nulls);
-            CatalogTupleInsertWithInfo(sd, stup, indstate);
+            CatalogTupleInsert(sd, stup, NULL);
}
This one in update_attstats() looks wrong. Why are you passing NULL
instead of an opened index state? We keep track of indstate for all
the pg_statistic tuples updated.

HeapTupleSetField() is used nowhere. Do we really need it?
--
Michael

As I hinted at earlier on, I've continue to develop this patch [1]https://github.com/gburd/postgres/pull/18.
It's not ready to post as patches just yet. I've paused making all the
changes as the time required is non-trivial. Here's some of what I
thought through and a taste so that I can (hopefully) get feedback (pos
or neg) before investing the rest of a week into updating every case.

bool replaces[Natts...] was static, Bitmapset * is a heap alloc/free
====================================================================

Yes, and I tried a few ideas out to avoid this. If you recall my
earlier work to test Bitmapset[2]/messages/by-id/7BD1ABDB-B03A-464A-9BA9-A73B55AD8A1F@getmailspring.com was in part to see if I could extend
that datatype to allow for a static version. Closest I could come was this

+#define CatalogUpdateContext(table_name, var, tuple) \
+	do { \
+		struct { \
+			bool            nulls[Natts_##table_name]; \
+			Datum           values[Natts_##table_name]; \
+			Bitmapset       updated; \
+			bitmapword      bits[BITMAPSET_SIZE(Natts_##table_name)]; \
+		} _##table_name##_##var##_ = {.nulls = {false}, .values = {0}, \
+				.updated.nwords = BITMAPSET_SIZE(Natts_##table_name), \
+				.updated.type = T_Bitmapset, .bits = {0} }; \
+		CatalogTupleContext *(var##_ctx) = \
+			(CatalogTupleContext *)&_##table_name##_##var##_; \
+		Form_##table_name (var##_form) = \
+			(Form_##table_name) GETSTRUCT(tuple); \
+	} while(0)

Which would create the right size statically allocated Bitmapset, but it
wouldn't be "valid" because the last word in the set is all zeros and so
calling into any function that calls bms_is_valid() would assert. I
could simply embed the logic for bms_add_member() etc into the macros,
but then I'd be creating a mess for the future. The only advantage is
that you'd be avoiding a palloc()/free() possibly a realloc(). There's
something to be said for that, but this felt a bit over the line.

When I thought about adding a patch that extended/changed Bitmapset to
allow for static allocation I felt that too would become the objection
that sinks this whole idea, so for now I'm not going to try that. Maybe
some other time.

In the end, after trying a few things (including pre-sizing the
Bitmapset by creating it as a singleton setting the Natts+1 value at
initialization to avoid realloc) I went back to accepting the
palloc/free model as simple enough.

E_TOO_MANY_MACROS
=======================================================================

I'm not thrilled with the multitude of macros for somewhat similar
cases. So I tried using _Generic(), but that's not supported on MSVC
and only allowed to be generic at compile time over a single type and
dispatch to a function, not another macro, so you couldn't simply
de-reference in the Form case.

+static inline void
+_CatalogTupleValueSet(CatalogTupleContext *ctx, int attnum, Datum
value, size_t off, size_t sz)
+{
+	ctx->values[attnum] = value;
+	ctx->nulls[attnum] = false;
+}
+
+static inline void
+_CatalogTupleFormSet(void *form, int attnum, Datum value,
+                     size_t off, size_t sz)
+{
+	memcpy((char *)form + off, &value, sz);
+}
+
+#define CatalogTupleSet(ctx_or_form, table_name, field, value) \
+	_Generic((ctx_or_form), \
+		CatalogTupleContext *: _CatalogTupleValueSet, \
+		Form_##table_name *  : _CatalogTupleFormSet \
+	)(ctx_or_form, \
+	  Anum_##table_name##_##field - 1, \
+	  value, \
+	  offsetof(FormData_##table_name, field), \
+	  sizeof(((FormData_##table_name *)0)->field))

Another method for macros was to depend on sizeof() at compile time to
adjust what was built.

+#define _CAT_CTX_SET(ctx, attnum, value) \
+	((ctx)->values[(attnum)] = (value), \
+	 (ctx)->nulls[(attnum)]  = false, \
+	 (ctx)->updated = bms_add_member((ctx)->updated, (attnum)))
+
+#define _CAT_FORM_SET(form, table_name, field, value) \
+	(form)[Anum_##table_name##_##field - 1] = (value)
+
+#define CatalogTupleSet_turnary(ctx_or_form, table_name, field, value) \
+	(sizeof(*(ctx_or_form)) == sizeof(CatalogTupleContext) \
+	 ? (_CAT_CTX_SET((CatalogTupleContext *)(ctx_or_form), \
+	                 Anum_##table_name##_##field - 1, value), \
+	    (void)0)                      /* force statement context */ \
+	 : (_CAT_FORM_SET(((Anum_##table_name *)(ctx_or_form)), table_name,
field, value), \
+	    (void)0))

This feels wrong as who knows if today or someday the size of a form
just happens to be the same as the CatalogTupleContext? So I put that aside.

I looked into some wild ideas that included tuple arity to encode the
action type as Thomas mentioned would be innovative and cool something like:

HeapTupleSet((pg_type, values, nulls),
(typname, NameGetDatum(&name)),
(typnamespace, ObjectIdGetDatum(typeNamespace)),
(typowner, ObjectIdGetDatum(ownerId)),
(typlen, Int16GetDatum(sizeof(int32),
(typdefaultbi), /* list of 1 means null! */
(typdefault),
(some_attr, FooGetDatum(...), foo > bar), /* list of 3 =
conditional null! */
...);

But a) I couldn't ever get that to work as I'd hoped it would and b)
inventing a DSL in the C pre-processor isn't a goal of mine.

Everything needs you to pass in "values, nulls, updated"!
=====================================================================

I thought about this a bit too, the only way to avoid this was to own
the declaration as well as the manipulation of these things. I don't
want to change the entirety of the way we manipulate catalog tuples for
insert/update, I just wanted to capture what attributes changed on
update so I could use that later to avoid HeapDetermineColumnsInfo().

But, I've taken a swing at it and here's where I've landed:

Old "Form/GETSTRUCT" model:
Form_pg_index form = (Form_pg_index) GETSTRUCT(tuple);

form->inisclustered = false;
CatalogTupleUpdate(relation, &tuple->t_self, tuple);

New:
CatalogUpdateFormContext(pg_index, ctx);
CatalogSetForm(pg_index, ctx, tuple);

CatalogTupleUpdateField(ctx, pg_index, indisclustered, false);
ModifyCatalogTupleField(relation, tuple, ctx);

Old "values/nulls/replaces" model:
bool nulls[Natts_pg_type];
bool replaces[Natts_pg_type];
Datum values[Natts_pg_type];

values[Anum_pg_type_typtype - 1] = CharGetDatum(typeType);
nulls[Anum_pg_type_typdefaultbin - 1] = true;
replaces[Anum_pg_type_oid - 1] = false;

tup = heap_modify_tuple(tuple, desc, values, nulls, replaces);
CatalogTupleUpdate(relation, &tuple->t_self, tuple);

New:
CatalogUpdateValuesContext(pg_type, ctx);

CatalogTupleUpdateValue(ctx, pg_type, typtype, CharGetDatum(typeType));
ModifyCatalogTupleValues(relation, tuple, ctx);

I'm about a third of the way done converting to this new pattern and
it's better. As I said earlier I'm concerned that after a lot of effort
the pattern would need to change again, so I'd rather shake out those
concerns/ideas now early on. You can see the changes in [1]https://github.com/gburd/postgres/pull/18, take a
look at htup.h, aclchk.c, pg_aggregate.c, and a few other places.

Now there are macros for: 1) declaration, 2) setting/mutating, 3)
modifying/inserting. I guess I was starting to feel like I was digging
a hole no one would appreciate or agree was necessary so I'm asking for
early feedback because rule #1 when you find yourself digging a hole is
to stop digging.

best.

-greg

[1]: https://github.com/gburd/postgres/pull/18
[2]: /messages/by-id/7BD1ABDB-B03A-464A-9BA9-A73B55AD8A1F@getmailspring.com

Michael Paquier

michael@paquier.xyz

5 months ago

In reply to: Greg Burd (#3)

Re: Refactor how we form HeapTuples for CatalogTuple(Insert|Update)

On Tue, Dec 02, 2025 at 10:21:36AM -0500, Greg Burd wrote:

On Dec 2 2025, at 2:09 am, Michael Paquier <michael@paquier.xyz> wrote:
Thanks for taking the time to review this chunky patch. I've been
quietly reworking it a bit, I'll go over that in a bit because I'd
appreciate opinions before I'm too far into the changes.

Noted.

I started off just trying to make
HeapDetermineColumnsInfo() go away, these seem like good goals as well
and touch the same area of code.

This one is an interesting piece of argument.

indexing.c: has the change to CatalogTupleUpdate/Insert and removal of
their "WithInfo" companions. If that second part is controversial then
I don't mind reverting it, but IMO this is cleaner and might either
inspire more call sites to create and reuse CatalogIndexState or for
someone (me?) to address the comment that pre-dates my work:

I'm OK with the simplification on this one, but let's make that a
separate patch extracted from 0001. The changes with
CatalogTupleUpdate() and CatalogTupleInsert() don't have to be
included with the changes that introduce a bitmap integration to track
the attributes that are updated. This is pointing to the fact that
there are not that many callers of the WithInfo() flavors in the tree,
compared to the non-info ones (24 AFAIK, vs 160-ish for the Insert
path it seems, for example).

That makes sense, but another approach is to do the work mentioned in
the comment and somehow cache the CatalogIndexState (which is a somewhat
redacted ResultRelInfo by another name) somewhere.

Yes, perhaps we should just do that. That may simplify some of the
existing catalog paths, where we decide to open directly the indexes,
as you have already noticed while working on 0002.

As I hinted at earlier on, I've continue to develop this patch [1].
It's not ready to post as patches just yet. I've paused making all the
changes as the time required is non-trivial. Here's some of what I
thought through and a taste so that I can (hopefully) get feedback (pos
or neg) before investing the rest of a week into updating every case.

Okay, noted.

+#define CatalogUpdateContext(table_name, var, tuple) \
+	do { \
+		struct { \
+			bool            nulls[Natts_##table_name]; \
+			Datum           values[Natts_##table_name]; \
+			Bitmapset       updated; \
+			bitmapword      bits[BITMAPSET_SIZE(Natts_##table_name)]; \
+		} _##table_name##_##var##_ = {.nulls = {false}, .values = {0}, \
+				.updated.nwords = BITMAPSET_SIZE(Natts_##table_name), \
+				.updated.type = T_Bitmapset, .bits = {0} }; \
+		CatalogTupleContext *(var##_ctx) = \
+			(CatalogTupleContext *)&_##table_name##_##var##_; \
+		Form_##table_name (var##_form) = \
+			(Form_##table_name) GETSTRUCT(tuple); \
+	} while(0)
Which would create the right size statically allocated Bitmapset, but it
wouldn't be "valid" because the last word in the set is all zeros and so
calling into any function that calls bms_is_valid() would assert. I
could simply embed the logic for bms_add_member() etc into the macros,
but then I'd be creating a mess for the future. The only advantage is
that you'd be avoiding a palloc()/free() possibly a realloc(). There's
something to be said for that, but this felt a bit over the line.

Hmm. That feels like too much magic to me. We don't have such
complex macros in the tree, currently. And do we really need to
sacrifice readability over some extra palloc()/free() in what are
non-hot code paths?

In the end, after trying a few things (including pre-sizing the
Bitmapset by creating it as a singleton setting the Natts+1 value at
initialization to avoid realloc) I went back to accepting the
palloc/free model as simple enough.

Yeah, I'd agree that this is better.

+#define CatalogTupleSet_turnary(ctx_or_form, table_name, field, value) \
+	(sizeof(*(ctx_or_form)) == sizeof(CatalogTupleContext) \
+	 ? (_CAT_CTX_SET((CatalogTupleContext *)(ctx_or_form), \
+	                 Anum_##table_name##_##field - 1, value), \
+	    (void)0)                      /* force statement context */ \
+	 : (_CAT_FORM_SET(((Anum_##table_name *)(ctx_or_form)), table_name,
field, value), \
+	    (void)0))

This feels wrong as who knows if today or someday the size of a form
just happens to be the same as the CatalogTupleContext? So I put that aside.

I wouldn't do that either, yes.

New:
CatalogUpdateValuesContext(pg_type, ctx);

CatalogTupleUpdateValue(ctx, pg_type, typtype, CharGetDatum(typeType));
ModifyCatalogTupleValues(relation, tuple, ctx);

I'm about a third of the way done converting to this new pattern and
it's better. As I said earlier I'm concerned that after a lot of effort
the pattern would need to change again, so I'd rather shake out those
concerns/ideas now early on. You can see the changes in [1], take a
look at htup.h, aclchk.c, pg_aggregate.c, and a few other places.

One thing not mentioned here is CatalogUpdateValuesDecl(), macro that
hides the bitmap with the updated values, so you'd need one to ensure
that the rest works. Perhaps all that should not be in htup.h. We
may want to header reordering to show that there is a split with heap
tuples, also one of your goals.

Now there are macros for: 1) declaration, 2) setting/mutating, 3)
modifying/inserting. I guess I was starting to feel like I was digging
a hole no one would appreciate or agree was necessary so I'm asking for
early feedback because rule #1 when you find yourself digging a hole is
to stop digging.

I'm not completely sure about this one. Some of the directions and
decisions that need to be taken here also depend on your other work
posted at [1]/messages/by-id/78574B24-BE0A-42C5-8075-3FA9FA63B8FC@amazon.com -- Michael. Looking at v23 from this other thread, the code paths
touched there are different from what's touched here (heaptuple.c vs
heapam.c). You have mentioned that both share some ground. Is there
a specific part in the patch of the other thread I should look at to
get an idea of the bigger picture you have in mind?

[1]: /messages/by-id/78574B24-BE0A-42C5-8075-3FA9FA63B8FC@amazon.com -- Michael
--
Michael

Greg Burd

greg@burd.me

4 months ago

In reply to: Michael Paquier (#4)

Re: Refactor how we form HeapTuples for CatalogTuple(Insert|Update)

On Dec 2 2025, at 11:16 pm, Michael Paquier <michael@paquier.xyz> wrote:

On Tue, Dec 02, 2025 at 10:21:36AM -0500, Greg Burd wrote:

On Dec 2 2025, at 2:09 am, Michael Paquier <michael@paquier.xyz> wrote:
Thanks for taking the time to review this chunky patch. I've been
quietly reworking it a bit, I'll go over that in a bit because I'd
appreciate opinions before I'm too far into the changes.

Michael,

Thanks for continuing to help me along with this patch, I appreciate
your attention and time.

Noted.

I started off just trying to make
HeapDetermineColumnsInfo() go away, these seem like good goals as well
and touch the same area of code.

This one is an interesting piece of argument.

I've looked in the email archives a bit and didn't come up with much
related to catalog upgrades. I don't know if there is much on the
record information on this idea, but from what I've been told the idea
of decoupling heap from catalogs with the goal of getting closer to
making online upgrades possible has been referenced a few times in
hallway tracks at various conferences.

indexing.c: has the change to CatalogTupleUpdate/Insert and removal of
their "WithInfo" companions. If that second part is controversial then
I don't mind reverting it, but IMO this is cleaner and might either
inspire more call sites to create and reuse CatalogIndexState or for
someone (me?) to address the comment that pre-dates my work:

I'm OK with the simplification on this one, but let's make that a
separate patch extracted from 0001. The changes with
CatalogTupleUpdate() and CatalogTupleInsert() don't have to be
included with the changes that introduce a bitmap integration to track
the attributes that are updated. This is pointing to the fact that
there are not that many callers of the WithInfo() flavors in the tree,
compared to the non-info ones (24 AFAIK, vs 160-ish for the Insert
path it seems, for example).

That makes sense, but another approach is to do the work mentioned in
the comment and somehow cache the CatalogIndexState (which is a somewhat
redacted ResultRelInfo by another name) somewhere.

Yes, perhaps we should just do that. That may simplify some of the
existing catalog paths, where we decide to open directly the indexes,
as you have already noticed while working on 0002.

I'll see what I can do to extract this piece into a separate patch.

New:
CatalogUpdateValuesContext(pg_type, ctx);

CatalogTupleUpdateValue(ctx, pg_type, typtype, CharGetDatum(typeType));
ModifyCatalogTupleValues(relation, tuple, ctx);

I'm about a third of the way done converting to this new pattern and
it's better. As I said earlier I'm concerned that after a lot of effort
the pattern would need to change again, so I'd rather shake out those
concerns/ideas now early on. You can see the changes in [1], take a
look at htup.h, aclchk.c, pg_aggregate.c, and a few other places.

One thing not mentioned here is CatalogUpdateValuesDecl(), macro that
hides the bitmap with the updated values, so you'd need one to ensure
that the rest works. Perhaps all that should not be in htup.h. We
may want to header reordering to show that there is a split with heap
tuples, also one of your goals.

Yes, my current thinking for htup.h is here [1]https://github.com/gburd/postgres/blob/cf-6221/src/include/access/htup.h#L90-L277 L90-L277 in a GitHub
branch on my fork of the Postgres repo.

Places I've converted so as to see the new pattern:
* aclchk.c [2]https://github.com/gburd/postgres/blob/cf-6221/src/backend/catalog/aclchk.c#L1319-L1341 shows a insert or modify case
* heap.c [3]https://github.com/gburd/postgres/blob/cf-6221/src/backend/catalog/heap.c#L1686C2-L1746 modify a catalog tuple
* attribute_stats.c [4]https://github.com/gburd/postgres/blob/cf-6221/src/backend/statistics/attribute_stats.c#L750-L806 a bit more tricky use case

There are more.

Now there are macros for: 1) declaration, 2) setting/mutating, 3)
modifying/inserting. I guess I was starting to feel like I was digging
a hole no one would appreciate or agree was necessary so I'm asking for
early feedback because rule #1 when you find yourself digging a hole is
to stop digging.

I'm not completely sure about this one. Some of the directions and
decisions that need to be taken here also depend on your other work
posted at [1]. Looking at v23 from this other thread, the code paths
touched there are different from what's touched here (heaptuple.c vs
heapam.c). You have mentioned that both share some ground. Is there
a specific part in the patch of the other thread I should look at to
get an idea of the bigger picture you have in mind?

Yes, the links referenced above on that branch. I'll post an
updated/rebased patch set here again for reference, but understand it's
only partially done, as a way to capture the idea (and not depend on
links into GitHub).

[1]: /messages/by-id/78574B24-BE0A-42C5-8075-3FA9FA63B8FC@amazon.com

Right, within that thread there is this message [5]/messages/by-id/BCF1BF66-7A1B-4E01-87DC-0BE45EDF2F98@greg.burd.me that notes:

Ideally, my patch to restructure how catalog tuples are updated [<this
thread>] is committed and we can fully remove
HeapDetermineColumnsInfo() and likely speed up all catalog updates in
the process. That's what motivated [<this thread>], please take a
look, it required a huge number of changes so I thought it deserved a
life/thread of its own.

There are two major paths into heap_update(), one from
heapam_tuple_update() and one from simple_heap_update(). The first is
from the executor side via the table AM API and my other thread [6]/messages/by-id/78574B24-BE0A-42C5-8075-3FA9FA63B8FC@amazon.com that
you referenced is trying to address that. This thread is focused on the
catalog updates that flow into simple_heap_update(). Really that
function should be renamed to catalog_heap_update() as that's the only
purpose AFAICT.

Does that connect the dots a bit more clearly?

--
Michael

Included in the attached patch set is that first patch from the other
thread as the 0003 patch here and shows the ultimate goal, but it is a work-in-progress.

This patch set removes the need to call HeapDetermineColumnsInfo() on
the simple_heap_update() path because the Bitmapset of modified columns
is passed in and intersected against the indexed attributes
accomplishing the same outcome.

I haven't yet split out the "WithInfo" functions or worked out how to
cache the CatalogIndexState. I thought I'd finalize the approach, then
work that out as it might be impacted by refinements to proposed approach.

* 0001 Reorganize heap update logic and update simple_heap_update()

Focus on the macros in htup.h and the changes to the files that use
these macros in this commit, my ask of reviewers is that we agree on the
pattern established in this commit before I fully implement it across
all files in the 0002 patch.

* 0002 Update the remainder of catalog updates using the new APIs

Most files in this patch are still using the first version of the
macros. Only aclchk.c, heap.c, and index.c have been converted to show
how this approach plays out.

* 0003 Refactor how we form HeapTuples for CatalogTuple(Insert|Update)

This commit is borrowed from the other thread to demonstrate how these
changes impact heap updates. Thanks for your consideration, I
appreciate your time.

best.

-greg

[1]: https://github.com/gburd/postgres/blob/cf-6221/src/include/access/htup.h#L90-L277
[2]: https://github.com/gburd/postgres/blob/cf-6221/src/backend/catalog/aclchk.c#L1319-L1341
[3]: https://github.com/gburd/postgres/blob/cf-6221/src/backend/catalog/heap.c#L1686C2-L1746
[4]: https://github.com/gburd/postgres/blob/cf-6221/src/backend/statistics/attribute_stats.c#L750-L806
[5]: /messages/by-id/BCF1BF66-7A1B-4E01-87DC-0BE45EDF2F98@greg.burd.me
[6]: /messages/by-id/78574B24-BE0A-42C5-8075-3FA9FA63B8FC@amazon.com

Michael Paquier

michael@paquier.xyz

4 months ago

In reply to: Greg Burd (#5)

Re: Refactor how we form HeapTuples for CatalogTuple(Insert|Update)

On Wed, Dec 03, 2025 at 09:11:03AM -0500, Greg Burd wrote:

On Dec 2 2025, at 11:16 pm, Michael Paquier <michael@paquier.xyz> wrote:
I've looked in the email archives a bit and didn't come up with much
related to catalog upgrades. I don't know if there is much on the
record information on this idea, but from what I've been told the idea
of decoupling heap from catalogs with the goal of getting closer to
making online upgrades possible has been referenced a few times in
hallway tracks at various conferences.

John Naylor has been looking at this problem at some point, if I
recall correctly. Adding him in CC here for comments and opinions, or
perhaps I am just wrong in assuming that he has looked at this area.

I'll see what I can do to extract this piece into a separate patch.

Thanks. I suspect that it would help with the clarity of the changes.
At least I am getting the same impression after reading v2, which is
close enough to v1 except for the various renames.

There are two major paths into heap_update(), one from
heapam_tuple_update() and one from simple_heap_update(). The first is
from the executor side via the table AM API and my other thread [6] that
you referenced is trying to address that. This thread is focused on the
catalog updates that flow into simple_heap_update(). Really that
function should be renamed to catalog_heap_update() as that's the only
purpose AFAICT.

You have a point based on how things are presented in this patch set,
where there is only one caller of simple_heap_update(), versus two on
HEAD.

Does that connect the dots a bit more clearly?

* 0003 Refactor how we form HeapTuples for CatalogTuple(Insert|Update)

This commit is borrowed from the other thread to demonstrate how these
changes impact heap updates. Thanks for your consideration, I
appreciate your time.

Now I see the connection, thanks to 0003 where simple_heap_update()
gains its Bitmapset that tracks the set of updated tuples, for the
business that happens in CatalogTupleUpdate(). All that is going to
need a careful lookup.
--
Michael

John Naylor

john.naylor@enterprisedb.com

4 months ago

In reply to: Michael Paquier (#6)

Re: Refactor how we form HeapTuples for CatalogTuple(Insert|Update)

On Thu, Dec 4, 2025 at 12:50 PM Michael Paquier <michael@paquier.xyz> wrote:

On Wed, Dec 03, 2025 at 09:11:03AM -0500, Greg Burd wrote:

On Dec 2 2025, at 11:16 pm, Michael Paquier <michael@paquier.xyz> wrote:
I've looked in the email archives a bit and didn't come up with much
related to catalog upgrades. I don't know if there is much on the
record information on this idea, but from what I've been told the idea
of decoupling heap from catalogs with the goal of getting closer to
making online upgrades possible has been referenced a few times in
hallway tracks at various conferences.

John Naylor has been looking at this problem at some point, if I
recall correctly. Adding him in CC here for comments and opinions, or
perhaps I am just wrong in assuming that he has looked at this area.

Hmm, I think the decoupling on-disk format from in-memory format was
most directly relevant for the idea of changing the "name" type from a
fixed length type, to a domain over text.

Catalog manipulation as speculated for online upgrades could probably
operate on the values/nulls array.

--
John Naylor
Amazon Web Services

Greg Burd

greg@burd.me

3 months ago

In reply to: John Naylor (#7)

Re: Refactor how we form HeapTuples for CatalogTuple(Insert|Update)

Rebased.

I had started down a road where I was creating a number of new macros to
cover up more of the implementation details related to catalog tuple
updates. Then I felt that I was too far down the path and reverted to
the design in v1 with a simple layer that tracks what's been updated in
a Bitmapset and then passes that along to simple_heap_update() and then
to heapam_update(). This avoids the need to call
HeapDetermineColumnsInfo() inside heapam_update() when updating catalog
tuples. The rest of the calls to heapam_handle_update() still use
HeapDetermineColumnsInfo(), but if you look at my cf-5556 thread [1]https://commitfest.postgresql.org/patch/5556/
you'll see that I have fixed that side of the equation as well. cf-6221
and cf-5556 are two pieces of a puzzle.

0001 - adds macros to htup.h that cover the two models we have for
catalog tuple updates; Form/GETSTRUCT, and values/nulls/replaces. Also
included in this patch are a few of the more tricky cases as a way to
call them out.

0002 - the remaining files containing catalog tuple updates converted to
the use the new macros

0003 - a patch borrowed from cf-5556 that splits the top half of
heap_update() into heapam_handle_update() and simple_heap_update(). The
only change here vs in cf-5556 is that we pass on the updated bitmap
into heap_update().

thanks for your time and feedback,

best.

-greg

[1]: https://commitfest.postgresql.org/patch/5556/

Greg Burd

greg@burd.me

3 months ago

In reply to: Greg Burd (#8)

Re: Refactor how we form HeapTuples for CatalogTuple(Insert|Update)

Looks like I missed something that headerscheck found. :)

-greg

#10

Greg Burd

greg@burd.me

3 months ago

In reply to: Greg Burd (#9)

Re: Refactor how we form HeapTuples for CatalogTuple(Insert|Update)

Updated to address conflicts (e3094679b98).

-greg

#11

Greg Burd

greg@burd.me

3 months ago

In reply to: Greg Burd (#10)

Re: Refactor how we form HeapTuples for CatalogTuple(Insert|Update)

Updated to address conflicts introduced by efbebb4e858, rebased onto 71c1136989b.

-greg

#12

Andres Freund

andres@anarazel.de

3 months ago

In reply to: Greg Burd (#11)

Re: Refactor how we form HeapTuples for CatalogTuple(Insert|Update)

Hi,

Sorry for being a downer - but: Is the gain here really worth the squeeze?
This is a *lot* of changes just to avoid a bunch of comparisons when doing
catalog changes.

Greetings,

Andres Freund

#13

Greg Burd

greg@burd.me

3 months ago

In reply to: Andres Freund (#12)

Re: Refactor how we form HeapTuples for CatalogTuple(Insert|Update)

On Fri, Jan 30, 2026, at 12:20 PM, Andres Freund wrote:

Hi,

Hello, thanks for taking a peek at this one.

Sorry for being a downer - but: Is the gain here really worth the squeeze?

You are certainly not wrong, there are a lot of changes in this patch set. To be honest, I'm not sure if it is "worth the squeeze" either.

What this patch does: it removes the need to rediscover the set of indexed attributes that changed during catalog tuple updates.

Impact of that change: as-yet-unmeasured performance gains due to not having to redo work while holding a lock on the heap page.

So, the "squeeze" is not critical. This work grew out of CF-5556 where I move into the executor the equivalent logic as HeapDetermineColumnsInfo() and open the door to expanding the cases where we can have HOT updates. The two use cases for heap_update() are from heapam_tuple_update() called from the executor and simple_heap_update() called when updating a catalog tuple. The other patch set (5556) covers one side, this covers the other.

This is a *lot* of changes just to avoid a bunch of comparisons when doing
catalog changes.

I was working on 5556, trying to provide a more detailed performance analysis, considering how I might fold the common bits at the top of heap_update() into a common function used by both heapam_tuple_update() and simple_heap_update() and then fold HeapDetermineColumnsInfo() into simple_heap_update() entirely with a comment that says something like, "it would take a lot of churn to fix up all the places we're updating catalog tuples..." etc.

If I do that, then this patch set is optional and as you said likely not worth the squeeze. That said, I do find our current patterns for catalog tuples a bit... well, they aren't what I'd choose to show people first as examples of programming excellence in Postgres. But they work, and have done so for a long time and that's worth something.

Maybe down the road I can revisit this and revamp things a bit more completely and maybe do so in a more incremental/piecemeal way.

best, and thanks for taking the time to express your thoughts, :)

-greg

Show quoted text

Greetings,

Andres Freund

#14

Tom Lane

tgl@sss.pgh.pa.us

3 months ago

In reply to: Greg Burd (#13)

Re: Refactor how we form HeapTuples for CatalogTuple(Insert|Update)

"Greg Burd" <greg@burd.me> writes:

On Fri, Jan 30, 2026, at 12:20 PM, Andres Freund wrote:

Sorry for being a downer - but: Is the gain here really worth the squeeze?

You are certainly not wrong, there are a lot of changes in this patch set. To be honest, I'm not sure if it is "worth the squeeze" either.

What this patch does: it removes the need to rediscover the set of indexed attributes that changed during catalog tuple updates.

Impact of that change: as-yet-unmeasured performance gains due to not having to redo work while holding a lock on the heap page.

Keep in mind that a patch like this

* requires a ton of work to review
* breaks a lot of pending patches
* breaks a lot of extensions
* causes significant back-patching pain for the next five years

With that in mind, you'd have to show *massive* performance
improvements for it to have any chance of getting accepted.
Frankly I doubt it can possibly clear that bar.

You could lower the bar considerably if, instead of forcing a
system-wide API change, you maintained compatibility with the
existing API and introduced a new one alongside it, using the
new one only in selected code paths that seem particularly
performance-critical. That would remove the objection of
breaking extensions and pending patches, and probably greatly
reduce the other two costs as well, depending on how selective
you are about where to introduce the new API.

With that in mind, I really wonder why it's so critical to change
from replacement flags represented as array-of-bool to replacement
flags represented as a Bitmapset. Seems like those are fundamentally
equivalent. A Bitmapset is more compact, sure, but I fail to see
why we care about that in this context: the replacement flags are
typically function-local variables that won't live very long.
I'm also concerned that the palloc required to create a Bitmapset
will be a net drag on performance compared to a local array.

regards, tom lane

#15

Greg Burd

greg@burd.me

3 months ago

In reply to: Tom Lane (#14)

Re: Refactor how we form HeapTuples for CatalogTuple(Insert|Update)

On Fri, Jan 30, 2026, at 1:54 PM, Tom Lane wrote:

"Greg Burd" <greg@burd.me> writes:

On Fri, Jan 30, 2026, at 12:20 PM, Andres Freund wrote:

Sorry for being a downer - but: Is the gain here really worth the squeeze?

You are certainly not wrong, there are a lot of changes in this patch set. To be honest, I'm not sure if it is "worth the squeeze" either.

What this patch does: it removes the need to rediscover the set of indexed attributes that changed during catalog tuple updates.

Impact of that change: as-yet-unmeasured performance gains due to not having to redo work while holding a lock on the heap page.

Hey Tom, thanks for chiming in on this one!

Keep in mind that a patch like this

* requires a ton of work to review
* breaks a lot of pending patches
* breaks a lot of extensions
* causes significant back-patching pain for the next five years

Yes, I figured as much. And that's a big ask, I get that and I think I acknowledged that in my opening email on this thread. :)

With that in mind, you'd have to show *massive* performance
improvements for it to have any chance of getting accepted.
Frankly I doubt it can possibly clear that bar.

You and me both, there won't be *massive* performance gains from this.

You could lower the bar considerably if, instead of forcing a
system-wide API change, you maintained compatibility with the
existing API and introduced a new one alongside it, using the
new one only in selected code paths that seem particularly
performance-critical. That would remove the objection of
breaking extensions and pending patches, and probably greatly
reduce the other two costs as well, depending on how selective
you are about where to introduce the new API.

Certainly, that's one path forward. I don't think I went down this path targeting performance. This was more a way to show that my other patch (CF-5556) didn't overlook the catalog tuple side of things.

I'll review the core of this idea and see if I can boil it down to something we can use on performance-critical paths, maybe updating statistics would benefit.

With that in mind, I really wonder why it's so critical to change
from replacement flags represented as array-of-bool to replacement
flags represented as a Bitmapset. Seems like those are fundamentally
equivalent. A Bitmapset is more compact, sure, but I fail to see
why we care about that in this context: the replacement flags are
typically function-local variables that won't live very long.
I'm also concerned that the palloc required to create a Bitmapset
will be a net drag on performance compared to a local array.

regards, tom lane

Tom, I wasn't perfectly upfront with my intentions for this patch. I'll be better at that in the future. The use of a Bitmapset is really a foreshadowing of how I'd like to eventually restructure the HOT updates and at some point introduce a refreshed PHOT (or WARM-like) solution for heap updates. The thought was to use the Bitmapset as a filter in when choosing which indexes required updates.

As for the palloc(), yes that's unfortunate when substituting in a Bitmapset and does add some overhead. I'd toyed with a way to statically allocate a Bitmapset, but that just violated some of the basics of that data structure (NULL for empty being the primary one).

The lesson there is to not get ahead of myself. I'm going to pull the plug on this patch.

thanks all.

-greg

Refactor how we form HeapTuples for CatalogTuple(Insert|Update)

Attachments:

Attachments:

Attachments:

Attachments:

Attachments:

Attachments: