Table AM Interface Enhancements

Started by Alexander Korotkovover 2 years ago104 messages
Jump to latest
#1Alexander Korotkov
aekorotkov@gmail.com

Hello PostgreSQL Hackers,

I am pleased to submit a series of patches related to the Table Access
Method (AM) interface, which I initially announced during my talk at
PGCon 2023 [1]. These patches are primarily designed to support the
OrioleDB engine, but I believe they could be beneficial for other
table AM implementations as well.

The focus of these patches is to introduce more flexibility and
capabilities into the Table AM interface. This is particularly
relevant for advanced use cases like index-organized tables,
alternative MVCC implementations, etc.

Here's a brief overview of the patches included in this set:

0001-Allow-locking-updated-tuples-in-tuple_update-and--v1.patch

Optimizes the process of locking concurrently updated tuples during
update and delete operations. Helpful for table AMs where refinding
existing tuples is expensive.

0002-Add-EvalPlanQual-delete-returning-isolation-test-v1.patch

The new isolation test is related to the previous patch. These two
patches were previously discussed in [2].

0003-Allow-table-AM-to-store-complex-data-structures-i-v1.patch

Allows table AM to store complex data structure in rd_amcache rather
than a single chunk of memory.

0004-Add-table-AM-tuple_is_current-method-v1.patch

This allows us to abstract how/whether table AM uses transaction identifiers.

0005-Generalize-relation-analyze-in-table-AM-interface-v1.patch

Provides a more flexible API for sampling tuples, beneficial for
non-standard table types like index-organized tables.

0006-Generalize-table-AM-API-for-INSERT-.-ON-CONFLICT-v1.patch

Provides a new table AM API method to encapsulate the whole INSERT ...
ON CONFLICT ... algorithm rather than just implementation of
speculative tokens.

0007-Allow-table-AM-tuple_insert-method-to-return-the--v1.patch

This allows table AM to return a native tuple slot, which is aware of
table AM-specific system attributes.

0008-Let-table-AM-insertion-methods-control-index-inse-v1.patch

Allows table AM to skip index insertions in the executor and handle
those insertions itself.

0009-Custom-reloptions-for-table-AM-v1.patch

Enables table AMs to define and override reloptions for tables and indexes.

0010-Notify-table-AM-about-index-creation-v1.patch

Allows table AMs to prepare or update specific meta-information during
index creation.

011-Introduce-RowRefType-which-describes-the-table-ro-v1.patch

Separates the row identifier type from the lock mode in RowMarkType,
providing clearer semantics and more flexibility.

0012-Introduce-RowID-bytea-tuple-identifier-v1.patch

`This patch introduces 'RowID', a new bytea tuple identifier, to
overcome the limitations of the current 32-bit block number and 16-bit
offset-based tuple identifier. This is particularly useful for
index-organized tables and other advanced use cases.

Each commit message contains a detailed explanation of the changes and
their rationale. I believe these enhancements will significantly
improve the flexibility and capabilities of the PostgreSQL Table AM
interface.

I am looking forward to your feedback and suggestions on these patches.

Links

1. https://www.pgcon.org/events/pgcon_2023/schedule/session/470-future-of-table-access-methods/
2. /messages/by-id/CAPpHfdua-YFw3XTprfutzGp28xXLigFtzNbuFY8yPhqeq6X5kg@mail.gmail.com

------
Regards,
Alexander Korotkov

Attachments:

0002-Add-EvalPlanQual-delete-returning-isolation-test-v1.patchapplication/octet-stream; name=0002-Add-EvalPlanQual-delete-returning-isolation-test-v1.patchDownload+34-1
0001-Allow-locking-updated-tuples-in-tuple_update-and--v1.patchapplication/octet-stream; name=0001-Allow-locking-updated-tuples-in-tuple_update-and--v1.patchDownload+476-349
0003-Allow-table-AM-to-store-complex-data-structures-i-v1.patchapplication/octet-stream; name=0003-Allow-table-AM-to-store-complex-data-structures-i-v1.patchDownload+44-13
0004-Add-table-AM-tuple_is_current-method-v1.patchapplication/octet-stream; name=0004-Add-table-AM-tuple_is_current-method-v1.patchDownload+31-8
0005-Generalize-relation-analyze-in-table-AM-interface-v1.patchapplication/octet-stream; name=0005-Generalize-relation-analyze-in-table-AM-interface-v1.patchDownload+317-363
0006-Generalize-table-AM-API-for-INSERT-.-ON-CONFLICT-v1.patchapplication/octet-stream; name=0006-Generalize-table-AM-API-for-INSERT-.-ON-CONFLICT-v1.patchDownload+348-291
0007-Allow-table-AM-tuple_insert-method-to-return-the--v1.patchapplication/octet-stream; name=0007-Allow-table-AM-tuple_insert-method-to-return-the--v1.patchDownload+17-14
0009-Custom-reloptions-for-table-AM-v1.patchapplication/octet-stream; name=0009-Custom-reloptions-for-table-AM-v1.patchDownload+169-31
0010-Notify-table-AM-about-index-creation-v1.patchapplication/octet-stream; name=0010-Notify-table-AM-about-index-creation-v1.patchDownload+69-19
0008-Let-table-AM-insertion-methods-control-index-inse-v1.patchapplication/octet-stream; name=0008-Let-table-AM-insertion-methods-control-index-inse-v1.patchDownload+58-25
0012-Introduce-RowID-bytea-tuple-identifier-v1.patchapplication/octet-stream; name=0012-Introduce-RowID-bytea-tuple-identifier-v1.patchDownload+508-151
0011-Introduce-RowRefType-which-describes-the-table-ro-v1.patchapplication/octet-stream; name=0011-Introduce-RowRefType-which-describes-the-table-ro-v1.patchDownload+57-24
#2Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Alexander Korotkov (#1)
Re: Table AM Interface Enhancements

Hi,

On Thu, 23 Nov 2023 at 13:43, Alexander Korotkov <aekorotkov@gmail.com> wrote:

Hello PostgreSQL Hackers,

I am pleased to submit a series of patches related to the Table Access
Method (AM) interface, which I initially announced during my talk at
PGCon 2023 [1]. These patches are primarily designed to support the
OrioleDB engine, but I believe they could be beneficial for other
table AM implementations as well.

The focus of these patches is to introduce more flexibility and
capabilities into the Table AM interface. This is particularly
relevant for advanced use cases like index-organized tables,
alternative MVCC implementations, etc.

Here's a brief overview of the patches included in this set:

Note: no significant review of the patches, just a first response on
the cover letters and oddities I noticed:

Overall, this patchset adds significant API area to TableAmRoutine,
without adding the relevant documentation on how it's expected to be
used. With the overall size of the patchset also being very
significant, I don't think this patch is reviewable as is; the goal
isn't clear enough, the APIs aren't well explained, and the
interactions with the index API are left up in the air.

0001-Allow-locking-updated-tuples-in-tuple_update-and--v1.patch

Optimizes the process of locking concurrently updated tuples during
update and delete operations. Helpful for table AMs where refinding
existing tuples is expensive.

Is this essentially an optimized implementation of the "DELETE FROM
... RETURNING *" per-tuple primitive?

0003-Allow-table-AM-to-store-complex-data-structures-i-v1.patch

Allows table AM to store complex data structure in rd_amcache rather
than a single chunk of memory.

I don't think we should allow arbitrarily large and arbitrarily many
chunks of data in the relcache or table caches. Why isn't one chunk
enough?

0004-Add-table-AM-tuple_is_current-method-v1.patch

This allows us to abstract how/whether table AM uses transaction identifiers.

I'm not a fan of the indirection here. Also, assuming that table slots
don't outlive transactions, wouldn't this be a more appropriate fit
with the table tuple slot API?

0005-Generalize-relation-analyze-in-table-AM-interface-v1.patch

Provides a more flexible API for sampling tuples, beneficial for
non-standard table types like index-organized tables.

0006-Generalize-table-AM-API-for-INSERT-.-ON-CONFLICT-v1.patch

Provides a new table AM API method to encapsulate the whole INSERT ...
ON CONFLICT ... algorithm rather than just implementation of
speculative tokens.

Does this not still require speculative inserts, with speculative
tokens, for secondary indexes? Why make AMs implement that all over
again?

0007-Allow-table-AM-tuple_insert-method-to-return-the--v1.patch

This allows table AM to return a native tuple slot, which is aware of
table AM-specific system attributes.

This seems reasonable.

0008-Let-table-AM-insertion-methods-control-index-inse-v1.patch

Allows table AM to skip index insertions in the executor and handle
those insertions itself.

Who handles index tuple removal then? I don't see a patch that
describes index AM changes for this...

0009-Custom-reloptions-for-table-AM-v1.patch

Enables table AMs to define and override reloptions for tables and indexes.

0010-Notify-table-AM-about-index-creation-v1.patch

Allows table AMs to prepare or update specific meta-information during
index creation.

I don't think the described use case of this API is OK - a table AM
cannot know about the internals of index AMs, and is definitely not
allowed to overwrite the information of that index.
If I ask for an index that uses the "btree" index, then that needs to
be the AM actually used, or an error needs to be raised if it is
somehow incompatible with the table AM used. It can't be that we
silently update information and create an index that is explicitly not
what the user asked to create.

I also don't see updates in documentation, which I think is quite a
shame as I have trouble understanding some parts.

0012-Introduce-RowID-bytea-tuple-identifier-v1.patch

`This patch introduces 'RowID', a new bytea tuple identifier, to
overcome the limitations of the current 32-bit block number and 16-bit
offset-based tuple identifier. This is particularly useful for
index-organized tables and other advanced use cases.

We don't have any index methods that can handle anything but
block+offset TIDs, and I don't see any changes to the IndexAM APIs to
support these RowID tuples, so what's the plan here? I don't see any
of that in the commit message, nor in the rest of this patchset.

Each commit message contains a detailed explanation of the changes and
their rationale. I believe these enhancements will significantly
improve the flexibility and capabilities of the PostgreSQL Table AM
interface.

I've noticed there is not a lot of rationale for several of the
changes as to why PostgreSQL needs these changes implemented like
this, amongst which the index-related tableAM changes.

I understand that index-organized tables can be quite useful, but I
don't see design solutions to the more complex questions that would
still be required before we could host such table AMs like OreoleDB's
as a first-party citizen: For index-organized tables, you also need
index AM APIs that support TIDS with more than 48 bits of data
(assuming we actually want primary keys with >48 bits of usable
space), and for undo-based logging you would probably need index APIs
for retail index tuple deletion. Neither is supplied here, nor is
described why these APIs were omitted.

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)

#3Mark Dilger
mark.dilger@enterprisedb.com
In reply to: Alexander Korotkov (#1)
Re: Table AM Interface Enhancements

On Nov 23, 2023, at 4:42 AM, Alexander Korotkov <aekorotkov@gmail.com> wrote:

0006-Generalize-table-AM-API-for-INSERT-.-ON-CONFLICT-v1.patch

Provides a new table AM API method to encapsulate the whole INSERT ...
ON CONFLICT ... algorithm rather than just implementation of
speculative tokens.

I *think* I understand that you are taking the part of INSERT..ON CONFLICT that lives outside the table AM and pulling it inside so that table AM authors are free to come up with whatever implementation is more suited for them. The most straightforward way of doing so results in an EState parameter in the interface definition. That seems not so good, as the EState is a fairly complicated structure, and future changes in the executor might want to rearrange what EState tracks, which would change which values tuple_insert_with_arbiter() can depend on. Should the patch at least document which parts of the EState are expected to be in which states, and which parts should be viewed as undefined? If the implementors of table AMs rely on any/all aspects of EState, doesn't that prevent future changes to how that structure is used?

0008-Let-table-AM-insertion-methods-control-index-inse-v1.patch

Allows table AM to skip index insertions in the executor and handle
those insertions itself.

The new parameter could use more documentation.

0009-Custom-reloptions-for-table-AM-v1.patch

Enables table AMs to define and override reloptions for tables and indexes.

This could use some regression tests to exercise the custom reloptions.


Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#4Alexander Korotkov
aekorotkov@gmail.com
In reply to: Mark Dilger (#3)
Re: Table AM Interface Enhancements

On Fri, Nov 24, 2023 at 5:18 PM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:

On Nov 23, 2023, at 4:42 AM, Alexander Korotkov <aekorotkov@gmail.com> wrote:

0006-Generalize-table-AM-API-for-INSERT-.-ON-CONFLICT-v1.patch

Provides a new table AM API method to encapsulate the whole INSERT ...
ON CONFLICT ... algorithm rather than just implementation of
speculative tokens.

I *think* I understand that you are taking the part of INSERT..ON CONFLICT that lives outside the table AM and pulling it inside so that table AM authors are free to come up with whatever implementation is more suited for them. The most straightforward way of doing so results in an EState parameter in the interface definition. That seems not so good, as the EState is a fairly complicated structure, and future changes in the executor might want to rearrange what EState tracks, which would change which values tuple_insert_with_arbiter() can depend on.

I think this is the correct understanding.

Should the patch at least document which parts of the EState are expected to be in which states, and which parts should be viewed as undefined? If the implementors of table AMs rely on any/all aspects of EState, doesn't that prevent future changes to how that structure is used?

New tuple tuple_insert_with_arbiter() table AM API method needs EState
argument to call executor functions: ExecCheckIndexConstraints(),
ExecUpdateLockMode(), and ExecInsertIndexTuples(). I think we
probably need to invent some opaque way to call this executor function
without revealing EState to table AM. Do you think this could work?

0008-Let-table-AM-insertion-methods-control-index-inse-v1.patch

Allows table AM to skip index insertions in the executor and handle
those insertions itself.

The new parameter could use more documentation.

0009-Custom-reloptions-for-table-AM-v1.patch

Enables table AMs to define and override reloptions for tables and indexes.

This could use some regression tests to exercise the custom reloptions.

Thank you for these notes. I'll take this into account for the next
patchset version.

------
Regards,
Alexander Korotkov

#5Mark Dilger
mark.dilger@enterprisedb.com
In reply to: Alexander Korotkov (#4)
Re: Table AM Interface Enhancements

On Nov 25, 2023, at 9:47 AM, Alexander Korotkov <aekorotkov@gmail.com> wrote:

Should the patch at least document which parts of the EState are expected to be in which states, and which parts should be viewed as undefined? If the implementors of table AMs rely on any/all aspects of EState, doesn't that prevent future changes to how that structure is used?

New tuple tuple_insert_with_arbiter() table AM API method needs EState
argument to call executor functions: ExecCheckIndexConstraints(),
ExecUpdateLockMode(), and ExecInsertIndexTuples(). I think we
probably need to invent some opaque way to call this executor function
without revealing EState to table AM. Do you think this could work?

We're clearly not accessing all of the EState, just some specific fields, such as es_per_tuple_exprcontext. I think you could at least refactor to pass the minimum amount of state information through the table AM API.


Mark Dilger
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#6Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Mark Dilger (#5)
Re: Table AM Interface Enhancements

Hi, Alexander!

I think table AM extensibility is a very good idea generally, not only in
the scope of APIs that are needed in OrioleDB. Thanks for your proposals!

For patches

0001-Allow-locking-updated-tuples-in-tuple_update-and--v1.patch

0002-Add-EvalPlanQual-delete-returning-isolation-test-v1.patch

The new isolation test is related to the previous patch. These two

patches were previously discussed in [2]. /messages/by-id/CAPpHfdua-YFw3XTprfutzGp28xXLigFtzNbuFY8yPhqeq6X5kg@mail.gmail.com.

As discussion in [2]. /messages/by-id/CAPpHfdua-YFw3XTprfutzGp28xXLigFtzNbuFY8yPhqeq6X5kg@mail.gmail.com seems close to the patches being committed and the
only thing it is not in v16 yet is that it was too close to feature freeze,
I've copied these most recent versions of patches 0001 and 0002 from this
thread in [2]. /messages/by-id/CAPpHfdua-YFw3XTprfutzGp28xXLigFtzNbuFY8yPhqeq6X5kg@mail.gmail.com to finish and commit them there.

I'm planning to review some of the other patches from the current patchset
soon.

[2]: . /messages/by-id/CAPpHfdua-YFw3XTprfutzGp28xXLigFtzNbuFY8yPhqeq6X5kg@mail.gmail.com
/messages/by-id/CAPpHfdua-YFw3XTprfutzGp28xXLigFtzNbuFY8yPhqeq6X5kg@mail.gmail.com

Kind regards,
Pavel Borisov

#7Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Pavel Borisov (#6)
Re: Table AM Interface Enhancements

Hi, Alexander!

I'm planning to review some of the other patches from the current patchset
soon.

I've looked into the patch 0003.
The patch looks in good shape and is uncontroversial to me. Making memory
structures to be dynamically allocated is simple enough and it allows to
store complex data like lists etc. I consider places like this that expect
memory structures to be flat and allocated at once are because the was no
need in more complex ones previously. If there is a need for them, I think
they could be added without much doubt, provided the simplicity of the
change.

For the code:
+static inline void
+table_free_rd_amcache(Relation rel)
+{
+ if (rel->rd_tableam && rel->rd_tableam->free_rd_amcache)
+ {
+ rel->rd_tableam->free_rd_amcache(rel);
+ }
+ else
+ {
+ if (rel->rd_amcache)
+ pfree(rel->rd_amcache);
+ rel->rd_amcache = NULL;
+ }

here I suggest adding Assert(rel->rd_amcache == NULL) (or maybe better an
error report) after calling free_rd_amcache to be sure the custom
implementation has done what it should do.

Also, I think some brief documentation about writing this custom method is
quite relevant maybe based on already existing comments in the code.

Kind regards,
Pavel

#8Nikita Malakhov
hukutoc@gmail.com
In reply to: Pavel Borisov (#7)
Re: Table AM Interface Enhancements

Hi,

Pavel, as far as I understand Alexander's idea assertion and especially
ereport
here does not make any sense - this method is not considered to report
error, it
silently calls if there is underlying [free] function and simply falls
through otherwise,
also, take into account that it could be located in the uninterruptible
part of the code.

On the whole topic I have to

On Wed, Nov 29, 2023 at 4:56 PM Pavel Borisov <pashkin.elfe@gmail.com>
wrote:

Hi, Alexander!

I'm planning to review some of the other patches from the current
patchset soon.

I've looked into the patch 0003.
The patch looks in good shape and is uncontroversial to me. Making memory
structures to be dynamically allocated is simple enough and it allows to
store complex data like lists etc. I consider places like this that expect
memory structures to be flat and allocated at once are because the was no
need in more complex ones previously. If there is a need for them, I think
they could be added without much doubt, provided the simplicity of the
change.

For the code:
+static inline void
+table_free_rd_amcache(Relation rel)
+{
+ if (rel->rd_tableam && rel->rd_tableam->free_rd_amcache)
+ {
+ rel->rd_tableam->free_rd_amcache(rel);
+ }
+ else
+ {
+ if (rel->rd_amcache)
+ pfree(rel->rd_amcache);
+ rel->rd_amcache = NULL;
+ }

here I suggest adding Assert(rel->rd_amcache == NULL) (or maybe better an
error report) after calling free_rd_amcache to be sure the custom
implementation has done what it should do.

Also, I think some brief documentation about writing this custom method is
quite relevant maybe based on already existing comments in the code.

Kind regards,
Pavel

--
Regards,
Nikita Malakhov
Postgres Professional
The Russian Postgres Company
https://postgrespro.ru/

#9Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Nikita Malakhov (#8)
Re: Table AM Interface Enhancements

Hi, Nikita!

On Wed, 29 Nov 2023 at 18:27, Nikita Malakhov <hukutoc@gmail.com> wrote:

Hi,

Pavel, as far as I understand Alexander's idea assertion and especially
ereport
here does not make any sense - this method is not considered to report
error, it
silently calls if there is underlying [free] function and simply falls
through otherwise,
also, take into account that it could be located in the uninterruptible
part of the code.

On the whole topic I have to

On Wed, Nov 29, 2023 at 4:56 PM Pavel Borisov <pashkin.elfe@gmail.com>
wrote:

Hi, Alexander!

I'm planning to review some of the other patches from the current
patchset soon.

I've looked into the patch 0003.
The patch looks in good shape and is uncontroversial to me. Making memory
structures to be dynamically allocated is simple enough and it allows to
store complex data like lists etc. I consider places like this that expect
memory structures to be flat and allocated at once are because the was no
need in more complex ones previously. If there is a need for them, I think
they could be added without much doubt, provided the simplicity of the
change.

For the code:
+static inline void
+table_free_rd_amcache(Relation rel)
+{
+ if (rel->rd_tableam && rel->rd_tableam->free_rd_amcache)
+ {
+ rel->rd_tableam->free_rd_amcache(rel);
+ }
+ else
+ {
+ if (rel->rd_amcache)
+ pfree(rel->rd_amcache);
+ rel->rd_amcache = NULL;
+ }

here I suggest adding Assert(rel->rd_amcache == NULL) (or maybe better an
error report) after calling free_rd_amcache to be sure the custom
implementation has done what it should do.

Also, I think some brief documentation about writing this custom method
is quite relevant maybe based on already existing comments in the code.

Kind regards,
Pavel

When we do default single chunk routine we invalidate rd_amcache pointer,

+ if (rel->rd_amcache)
+ pfree(rel->rd_amcache);
+ rel->rd_amcache = NULL;

If we delegate this to method, my idea is to check the method
implementation don't leave this pointer valid.
If it's not needed, I'm ok with it, but to me it seems that the check I
proposed makes sense.

Regards,
Pavel

#10Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Pavel Borisov (#9)
Re: Table AM Interface Enhancements

Hi, Alexander!

I've reviewed patch 0004. It's clear enough and I think does what it's
supposed.
One thing, in function signature
+bool (*tuple_is_current) (Relation rel, TupleTableSlot *slot);
there is a Relation agrument, which is unused in both existing heapam
method. Also it's unused in OrioleDb implementation of tuple_is_current.
For what goal it is needed in the interface?

No other objections around this patch.

I've also looked at 0005-0007. Although it is not a thorough review, they
seem to depend on previous patch 0004.
Additionally changes in 0007 looks dependent from 0005. Does replacement of
slot inside ExecInsert, that is already used in the code below the call of

/* insert the tuple normally */
- table_tuple_insert(resultRelationDesc, slot,
- estate->es_output_cid,
- 0, NULL);

could be done without side effects?

Kind regards,
Pavel.

Show quoted text
#11Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Pavel Borisov (#10)
Re: Table AM Interface Enhancements

Additionally changes in 0007 looks dependent from 0005. Does replacement
of slot inside ExecInsert, that is already used in the code below the call
of

/* insert the tuple normally */
- table_tuple_insert(resultRelationDesc, slot,
- estate->es_output_cid,
- 0, NULL);

could be done without side effects?

I'm sorry that I inserter not all relevant code in the previous message:

    /* insert the tuple normally */
- table_tuple_insert(resultRelationDesc, slot,
-   estate->es_output_cid,
-   0, NULL);
+ slot = table_tuple_insert(resultRelationDesc, slot,
+  estate->es_output_cid,
+
(Previously slot variable that exists in the ExecInsert() and could be used
later was not modified at the quoted code block)

Pavel.

#12Alexander Korotkov
aekorotkov@gmail.com
In reply to: Matthias van de Meent (#2)
Re: Table AM Interface Enhancements

Hi, Matthias!

On Fri, Nov 24, 2023 at 1:07 AM Matthias van de Meent
<boekewurm+postgres@gmail.com> wrote:

On Thu, 23 Nov 2023 at 13:43, Alexander Korotkov <aekorotkov@gmail.com> wrote:

Hello PostgreSQL Hackers,

I am pleased to submit a series of patches related to the Table Access
Method (AM) interface, which I initially announced during my talk at
PGCon 2023 [1]. These patches are primarily designed to support the
OrioleDB engine, but I believe they could be beneficial for other
table AM implementations as well.

The focus of these patches is to introduce more flexibility and
capabilities into the Table AM interface. This is particularly
relevant for advanced use cases like index-organized tables,
alternative MVCC implementations, etc.

Here's a brief overview of the patches included in this set:

Note: no significant review of the patches, just a first response on
the cover letters and oddities I noticed:

Overall, this patchset adds significant API area to TableAmRoutine,
without adding the relevant documentation on how it's expected to be
used.

I have to note that, unlike documentation for index access methods,
our documentation for table access methods doesn't have an explanation
of API functions. Instead, it just refers to tableam.h for details.
The patches touching tableam.h also revise relevant comments. These
comments are for sure a target for improvements.

With the overall size of the patchset also being very
significant

I wouldn't say that volume is very significant. It's just 2K lines,
not the great size of a patchset. But it for sure represents changes
of great importance.

I don't think this patch is reviewable as is; the goal
isn't clear enough,

The goal is to revise table AM API so that new full-featured
implementations could exist. AFAICS, the current API was designed
keeping zheap in mind, but even zheap was always shipped with the core
patch. All other implementations of table AM, which I've seen, are
very limited. Basically, there is still no real alternative and
functional OLTP table AM. I believe API limitation is one of the
reasons for that.

the APIs aren't well explained, and

As I mentioned before, the table AM API is documented by the comments
in tableam.h. The comments in the patchset aren't perfect for sure,
but a subject for the incremental improvements.

the interactions with the index API are left up in the air.

Right. These patches bring more control on interactions with indexes
to table AMs without touching the index API. In my PGCon 2016 talk I
proposed that table AM could have its own implementation of index AM.

As you mentioned before, this patchset isn't very small already.
Considering it all together with a patchset for index AM redesign
would make it a mess. I propose we can consider here the patches,
which are usable by themselves even without index AM changes. And the
patches tightly coupled with index AM API changes could be considered
later together with those changes.

0001-Allow-locking-updated-tuples-in-tuple_update-and--v1.patch

Optimizes the process of locking concurrently updated tuples during
update and delete operations. Helpful for table AMs where refinding
existing tuples is expensive.

Is this essentially an optimized implementation of the "DELETE FROM
... RETURNING *" per-tuple primitive?

Not really. The test for "DELETE FROM ... RETURNING *" was used just
to reproduce one of the bugs stopped in [2]. The general idea is to
avoid repeated calls for tuple lock.

0003-Allow-table-AM-to-store-complex-data-structures-i-v1.patch

Allows table AM to store complex data structure in rd_amcache rather
than a single chunk of memory.

I don't think we should allow arbitrarily large and arbitrarily many
chunks of data in the relcache or table caches.

Hmm.. It seems to be far out of control of API what and how large
PostgreSQL extensions could actually cache.

Why isn't one chunk
enough?

It's generally possible to fit everything into one chunk, but that's
extremely unhandy when your cache contains something at least as
complex as tuple slots and descriptors. I think the reason that we
still have one chunk restriction is that we don't have a full-featured
implementation fitting API yet. If we had it, I can't imagine there
would be one chunk for a cache.

0004-Add-table-AM-tuple_is_current-method-v1.patch

This allows us to abstract how/whether table AM uses transaction identifiers.

I'm not a fan of the indirection here. Also, assuming that table slots
don't outlive transactions, wouldn't this be a more appropriate fit
with the table tuple slot API?

This is a good idea. I will update the patch accordingly.

0005-Generalize-relation-analyze-in-table-AM-interface-v1.patch

Provides a more flexible API for sampling tuples, beneficial for
non-standard table types like index-organized tables.

0006-Generalize-table-AM-API-for-INSERT-.-ON-CONFLICT-v1.patch

Provides a new table AM API method to encapsulate the whole INSERT ...
ON CONFLICT ... algorithm rather than just implementation of
speculative tokens.

Does this not still require speculative inserts, with speculative
tokens, for secondary indexes? Why make AMs implement that all over
again?

The idea here is to generalize upsert and leave speculative tokens as
details of one particular implementation. Imagine an index-organized
table and upsert on primary key. For that you need to just locate the
relevant page in a tree and do insert or update. Speculative tokens
would rather be an unreasonable complication for this case.

0007-Allow-table-AM-tuple_insert-method-to-return-the--v1.patch

This allows table AM to return a native tuple slot, which is aware of
table AM-specific system attributes.

This seems reasonable.

0008-Let-table-AM-insertion-methods-control-index-inse-v1.patch

Allows table AM to skip index insertions in the executor and handle
those insertions itself.

Who handles index tuple removal then?

Table AM implementation decides what actions to perform on tuple
update/delete. The reason why it can't really care about updating
indexes is that the executor already does it.
The situation is different with deletes, because the executor doesn't
do something immediately about the corresponding index tuples. They
are deleted later by vacuum, which is also controlled by table AM
implementation.

I don't see a patch that describes index AM changes for this...

Yes, index AM should be revised for that. See my comment about that earlier.

0009-Custom-reloptions-for-table-AM-v1.patch

Enables table AMs to define and override reloptions for tables and indexes.

0010-Notify-table-AM-about-index-creation-v1.patch

Allows table AMs to prepare or update specific meta-information during
index creation.

I don't think the described use case of this API is OK - a table AM
cannot know about the internals of index AMs, and is definitely not
allowed to overwrite the information of that index.
If I ask for an index that uses the "btree" index, then that needs to
be the AM actually used, or an error needs to be raised if it is
somehow incompatible with the table AM used. It can't be that we
silently update information and create an index that is explicitly not
what the user asked to create.

I agree that this currently looks more like workarounds rather than
proper API changes. I propose these two should be considered later
together with relevant index API changes.

I also don't see updates in documentation, which I think is quite a
shame as I have trouble understanding some parts.

Sorry for this. I hope I gave some answers in this message and I'll
update the patchset comments and commit messages accordingly. And I'm
open to answer any further questions.

0012-Introduce-RowID-bytea-tuple-identifier-v1.patch

`This patch introduces 'RowID', a new bytea tuple identifier, to
overcome the limitations of the current 32-bit block number and 16-bit
offset-based tuple identifier. This is particularly useful for
index-organized tables and other advanced use cases.

We don't have any index methods that can handle anything but
block+offset TIDs, and I don't see any changes to the IndexAM APIs to
support these RowID tuples, so what's the plan here? I don't see any
of that in the commit message, nor in the rest of this patchset.

Each commit message contains a detailed explanation of the changes and
their rationale. I believe these enhancements will significantly
improve the flexibility and capabilities of the PostgreSQL Table AM
interface.

I've noticed there is not a lot of rationale for several of the
changes as to why PostgreSQL needs these changes implemented like
this, amongst which the index-related tableAM changes.

I understand that index-organized tables can be quite useful, but I
don't see design solutions to the more complex questions that would
still be required before we could host such table AMs like OreoleDB's
as a first-party citizen: For index-organized tables, you also need
index AM APIs that support TIDS with more than 48 bits of data
(assuming we actually want primary keys with >48 bits of usable
space), and for undo-based logging you would probably need index APIs
for retail index tuple deletion. Neither is supplied here, nor is
described why these APIs were omitted.

As I mentioned before, I agree that index AM changes haven't been
presented yet. And yes, for bytea rowID there is currently no way to
use the current index API. However, I think this exact patch could be
useful even without index AM implements. This allows table AMs to
identify rows by custom bytea, even though these tables couldn't be
indexed yet. So, if we allow a custom table AM to implement an
index-organized table, that would have use cases even if secondary
indexes are not supported yet.

Links
1. https://pgconf.ru/media/2016/02/19/06_Korotkov%20Extendability.pdf

2. /messages/by-id/CAPpHfdua-YFw3XTprfutzGp28xXLigFtzNbuFY8yPhqeq6X5kg@mail.gmail.com

------
Regards,
Alexander Korotkov

#13Alexander Korotkov
aekorotkov@gmail.com
In reply to: Mark Dilger (#5)
Re: Table AM Interface Enhancements

On Mon, Nov 27, 2023 at 10:18 PM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:

On Nov 25, 2023, at 9:47 AM, Alexander Korotkov <aekorotkov@gmail.com> wrote:

Should the patch at least document which parts of the EState are expected to be in which states, and which parts should be viewed as undefined? If the implementors of table AMs rely on any/all aspects of EState, doesn't that prevent future changes to how that structure is used?

New tuple tuple_insert_with_arbiter() table AM API method needs EState
argument to call executor functions: ExecCheckIndexConstraints(),
ExecUpdateLockMode(), and ExecInsertIndexTuples(). I think we
probably need to invent some opaque way to call this executor function
without revealing EState to table AM. Do you think this could work?

We're clearly not accessing all of the EState, just some specific fields, such as es_per_tuple_exprcontext. I think you could at least refactor to pass the minimum amount of state information through the table AM API.

Yes, the table AM doesn't need the full EState, just the ability to do
specific manipulation with tuples. I'll refactor the patch to make a
better isolation for this.

------
Regards,
Alexander Korotkov

#14Alexander Korotkov
aekorotkov@gmail.com
In reply to: Alexander Korotkov (#13)
Re: Table AM Interface Enhancements

On Sun, Mar 3, 2024 at 1:50 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:

On Mon, Nov 27, 2023 at 10:18 PM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:

On Nov 25, 2023, at 9:47 AM, Alexander Korotkov <aekorotkov@gmail.com> wrote:

Should the patch at least document which parts of the EState are expected to be in which states, and which parts should be viewed as undefined? If the implementors of table AMs rely on any/all aspects of EState, doesn't that prevent future changes to how that structure is used?

New tuple tuple_insert_with_arbiter() table AM API method needs EState
argument to call executor functions: ExecCheckIndexConstraints(),
ExecUpdateLockMode(), and ExecInsertIndexTuples(). I think we
probably need to invent some opaque way to call this executor function
without revealing EState to table AM. Do you think this could work?

We're clearly not accessing all of the EState, just some specific fields, such as es_per_tuple_exprcontext. I think you could at least refactor to pass the minimum amount of state information through the table AM API.

Yes, the table AM doesn't need the full EState, just the ability to do
specific manipulation with tuples. I'll refactor the patch to make a
better isolation for this.

Please find the revised patchset attached. The changes are following:
1. Patchset is rebase. to the current master.
2. Patchset is reordered. I tried to put less debatable patches to the top.
3. tuple_is_current() method is moved from the Table AM API to the
slot as proposed by Matthias van de Meent.
4. Assert added to the table_free_rd_amcache() as proposed by Pavel Borisov.

------
Regards,
Alexander Korotkov

Attachments:

0010-Notify-table-AM-about-index-creation-v2.patchapplication/octet-stream; name=0010-Notify-table-AM-about-index-creation-v2.patchDownload+69-19
0012-Introduce-RowRefType-which-describes-the-table-ro-v2.patchapplication/octet-stream; name=0012-Introduce-RowRefType-which-describes-the-table-ro-v2.patchDownload+57-24
0011-Let-table-AM-insertion-methods-control-index-inse-v2.patchapplication/octet-stream; name=0011-Let-table-AM-insertion-methods-control-index-inse-v2.patchDownload+58-25
0001-Allow-locking-updated-tuples-in-tuple_update-and--v2.patchapplication/octet-stream; name=0001-Allow-locking-updated-tuples-in-tuple_update-and--v2.patchDownload+474-347
0013-Introduce-RowID-bytea-tuple-identifier-v2.patchapplication/octet-stream; name=0013-Introduce-RowID-bytea-tuple-identifier-v2.patchDownload+509-154
0009-Let-table-AM-override-reloptions-for-indexes-buil-v2.patchapplication/octet-stream; name=0009-Let-table-AM-override-reloptions-for-indexes-buil-v2.patchDownload+65-6
0005-Add-table-AM-tuple_is_current-method-v2.patchapplication/octet-stream; name=0005-Add-table-AM-tuple_is_current-method-v2.patchDownload+101-8
0007-Custom-reloptions-for-table-AM-v2.patchapplication/octet-stream; name=0007-Custom-reloptions-for-table-AM-v2.patchDownload+106-26
0006-Generalize-relation-analyze-in-table-AM-interface-v2.patchapplication/octet-stream; name=0006-Generalize-relation-analyze-in-table-AM-interface-v2.patchDownload+317-363
0008-Generalize-table-AM-API-for-INSERT-.-ON-CONFLICT-v2.patchapplication/octet-stream; name=0008-Generalize-table-AM-API-for-INSERT-.-ON-CONFLICT-v2.patchDownload+348-291
0004-Allow-table-AM-tuple_insert-method-to-return-the--v2.patchapplication/octet-stream; name=0004-Allow-table-AM-tuple_insert-method-to-return-the--v2.patchDownload+17-14
0002-Add-EvalPlanQual-delete-returning-isolation-test-v2.patchapplication/octet-stream; name=0002-Add-EvalPlanQual-delete-returning-isolation-test-v2.patchDownload+34-1
0003-Allow-table-AM-to-store-complex-data-structures-i-v2.patchapplication/octet-stream; name=0003-Allow-table-AM-to-store-complex-data-structures-i-v2.patchDownload+50-13
#15Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Alexander Korotkov (#14)
Re: Table AM Interface Enhancements

Hi, Alexander!

On Tue, 19 Mar 2024 at 03:34, Alexander Korotkov <aekorotkov@gmail.com>
wrote:

On Sun, Mar 3, 2024 at 1:50 PM Alexander Korotkov <aekorotkov@gmail.com>
wrote:

On Mon, Nov 27, 2023 at 10:18 PM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:

On Nov 25, 2023, at 9:47 AM, Alexander Korotkov <

aekorotkov@gmail.com> wrote:

Should the patch at least document which parts of the EState are

expected to be in which states, and which parts should be viewed as
undefined? If the implementors of table AMs rely on any/all aspects of
EState, doesn't that prevent future changes to how that structure is used?

New tuple tuple_insert_with_arbiter() table AM API method needs

EState

argument to call executor functions: ExecCheckIndexConstraints(),
ExecUpdateLockMode(), and ExecInsertIndexTuples(). I think we
probably need to invent some opaque way to call this executor

function

without revealing EState to table AM. Do you think this could work?

We're clearly not accessing all of the EState, just some specific

fields, such as es_per_tuple_exprcontext. I think you could at least
refactor to pass the minimum amount of state information through the table
AM API.

Yes, the table AM doesn't need the full EState, just the ability to do
specific manipulation with tuples. I'll refactor the patch to make a
better isolation for this.

Please find the revised patchset attached. The changes are following:
1. Patchset is rebase. to the current master.
2. Patchset is reordered. I tried to put less debatable patches to the
top.
3. tuple_is_current() method is moved from the Table AM API to the
slot as proposed by Matthias van de Meent.
4. Assert added to the table_free_rd_amcache() as proposed by Pavel
Borisov.

Patches 0001-0002 are unchanged compared to the last version in thread [1]. /messages/by-id/CAPpHfdua-YFw3XTprfutzGp28xXLigFtzNbuFY8yPhqeq6X5kg@mail.gmail.com.
In my opinion, it's still ready to be committed, which was not done for
time were too close to feature freeze one year ago.

0003 - Assert added from previous version. I still have a strong opinion
that allowing multi-chunked data structures instead of single chunks is
completely safe and makes natural process of Postgres improvement that is
self-justified. The patch is simple enough and ready to be pushed.

0004 (previously 0007) - Have not changed, and there is consensus that
this is reasonable. I've re-checked the current code. Looks safe
considering returning a different slot, which I doubted before. So consider
this patch also ready.

0005 (previously 0004) - Unused argument in the is_current_xact_tuple()
signature is removed. Also comparing to v1 the code shifted from tableam
methods to TTS's level.

I'd propose to remove Assert(!TTS_EMPTY(slot))
for tts_minimal_is_current_xact_tuple()
and tts_virtual_is_current_xact_tuple() as these are only error reporting
functions that don't use slot actually.

Comment similar to:
+/*
+ * VirtualTupleTableSlots never have a storage tuples.  We generally
+ * shouldn't get here, but provide a user-friendly message if we do.
+ */
also applies to tts_minimal_is_current_xact_tuple()

I'd propose changes for clarity of this comment:
%s/a storage tuples/storage tuples/g
%s/generally//g

Otherwise patch 0005 also looks good to me.

I'm planning to review the remaining patches. Meanwhile think pushing what
is now ready and uncontroversial is a good intention.
Thank you for the work done on this patchset!

Regards,
Pavel Borisov,
Supabase.

[1]: . /messages/by-id/CAPpHfdua-YFw3XTprfutzGp28xXLigFtzNbuFY8yPhqeq6X5kg@mail.gmail.com
/messages/by-id/CAPpHfdua-YFw3XTprfutzGp28xXLigFtzNbuFY8yPhqeq6X5kg@mail.gmail.com

#16Alexander Korotkov
aekorotkov@gmail.com
In reply to: Pavel Borisov (#15)
Re: Table AM Interface Enhancements

Hi, Pavel!

On Tue, Mar 19, 2024 at 11:34 AM Pavel Borisov <pashkin.elfe@gmail.com> wrote:

On Tue, 19 Mar 2024 at 03:34, Alexander Korotkov <aekorotkov@gmail.com> wrote:

On Sun, Mar 3, 2024 at 1:50 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:

On Mon, Nov 27, 2023 at 10:18 PM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:

On Nov 25, 2023, at 9:47 AM, Alexander Korotkov <aekorotkov@gmail.com> wrote:

Should the patch at least document which parts of the EState are expected to be in which states, and which parts should be viewed as undefined? If the implementors of table AMs rely on any/all aspects of EState, doesn't that prevent future changes to how that structure is used?

New tuple tuple_insert_with_arbiter() table AM API method needs EState
argument to call executor functions: ExecCheckIndexConstraints(),
ExecUpdateLockMode(), and ExecInsertIndexTuples(). I think we
probably need to invent some opaque way to call this executor function
without revealing EState to table AM. Do you think this could work?

We're clearly not accessing all of the EState, just some specific fields, such as es_per_tuple_exprcontext. I think you could at least refactor to pass the minimum amount of state information through the table AM API.

Yes, the table AM doesn't need the full EState, just the ability to do
specific manipulation with tuples. I'll refactor the patch to make a
better isolation for this.

Please find the revised patchset attached. The changes are following:
1. Patchset is rebase. to the current master.
2. Patchset is reordered. I tried to put less debatable patches to the top.
3. tuple_is_current() method is moved from the Table AM API to the
slot as proposed by Matthias van de Meent.
4. Assert added to the table_free_rd_amcache() as proposed by Pavel Borisov.

Patches 0001-0002 are unchanged compared to the last version in thread [1]. In my opinion, it's still ready to be committed, which was not done for time were too close to feature freeze one year ago.

0003 - Assert added from previous version. I still have a strong opinion that allowing multi-chunked data structures instead of single chunks is completely safe and makes natural process of Postgres improvement that is self-justified. The patch is simple enough and ready to be pushed.

0004 (previously 0007) - Have not changed, and there is consensus that this is reasonable. I've re-checked the current code. Looks safe considering returning a different slot, which I doubted before. So consider this patch also ready.

0005 (previously 0004) - Unused argument in the is_current_xact_tuple() signature is removed. Also comparing to v1 the code shifted from tableam methods to TTS's level.

I'd propose to remove Assert(!TTS_EMPTY(slot)) for tts_minimal_is_current_xact_tuple() and tts_virtual_is_current_xact_tuple() as these are only error reporting functions that don't use slot actually.

Comment similar to:
+/*
+ * VirtualTupleTableSlots never have a storage tuples.  We generally
+ * shouldn't get here, but provide a user-friendly message if we do.
+ */
also applies to tts_minimal_is_current_xact_tuple()

I'd propose changes for clarity of this comment:
%s/a storage tuples/storage tuples/g
%s/generally//g

Otherwise patch 0005 also looks good to me.

I'm planning to review the remaining patches. Meanwhile think pushing what is now ready and uncontroversial is a good intention.
Thank you for the work done on this patchset!

Thank you, Pavel!

Regarding 0005, I did apply "a storage tuples" grammar fix. Regarding
the rest of the things, I'd like to keep methods
tts_*_is_current_xact_tuple() to be similar to nearby
tts_*_getsysattr(). This is why I'm keeping the rest unchanged. I
think we could refactor that later, but together with
tts_*_getsysattr() methods.

I'm going to push 0003, 0004 and 0005 if there are no objections.

And I'll update 0001 and 0002 in their dedicated thread.

------
Regards,
Alexander Korotkov

Attachments:

0002-Add-EvalPlanQual-delete-returning-isolation-test-v3.patchapplication/octet-stream; name=0002-Add-EvalPlanQual-delete-returning-isolation-test-v3.patchDownload+34-1
0005-Add-TupleTableSlotOps.is_current_xact_tuple-metho-v3.patchapplication/octet-stream; name=0005-Add-TupleTableSlotOps.is_current_xact_tuple-metho-v3.patchDownload+101-8
0001-Allow-locking-updated-tuples-in-tuple_update-and--v3.patchapplication/octet-stream; name=0001-Allow-locking-updated-tuples-in-tuple_update-and--v3.patchDownload+474-347
0003-Allow-table-AM-to-store-complex-data-structures-i-v3.patchapplication/octet-stream; name=0003-Allow-table-AM-to-store-complex-data-structures-i-v3.patchDownload+50-13
0004-Allow-table-AM-tuple_insert-method-to-return-the--v3.patchapplication/octet-stream; name=0004-Allow-table-AM-tuple_insert-method-to-return-the--v3.patchDownload+17-14
0006-Generalize-relation-analyze-in-table-AM-interface-v3.patchapplication/octet-stream; name=0006-Generalize-relation-analyze-in-table-AM-interface-v3.patchDownload+317-363
0007-Custom-reloptions-for-table-AM-v3.patchapplication/octet-stream; name=0007-Custom-reloptions-for-table-AM-v3.patchDownload+106-26
0010-Notify-table-AM-about-index-creation-v3.patchapplication/octet-stream; name=0010-Notify-table-AM-about-index-creation-v3.patchDownload+69-19
0008-Generalize-table-AM-API-for-INSERT-.-ON-CONFLICT-v3.patchapplication/octet-stream; name=0008-Generalize-table-AM-API-for-INSERT-.-ON-CONFLICT-v3.patchDownload+348-291
0009-Let-table-AM-override-reloptions-for-indexes-buil-v3.patchapplication/octet-stream; name=0009-Let-table-AM-override-reloptions-for-indexes-buil-v3.patchDownload+65-6
0011-Let-table-AM-insertion-methods-control-index-inse-v3.patchapplication/octet-stream; name=0011-Let-table-AM-insertion-methods-control-index-inse-v3.patchDownload+58-25
0012-Introduce-RowRefType-which-describes-the-table-ro-v3.patchapplication/octet-stream; name=0012-Introduce-RowRefType-which-describes-the-table-ro-v3.patchDownload+57-24
0013-Introduce-RowID-bytea-tuple-identifier-v3.patchapplication/octet-stream; name=0013-Introduce-RowID-bytea-tuple-identifier-v3.patchDownload+509-154
#17Japin Li
japinli@hotmail.com
In reply to: Alexander Korotkov (#16)
Re: Table AM Interface Enhancements

On Tue, 19 Mar 2024 at 21:05, Alexander Korotkov <aekorotkov@gmail.com> wrote:

Hi, Pavel!

On Tue, Mar 19, 2024 at 11:34 AM Pavel Borisov <pashkin.elfe@gmail.com> wrote:

On Tue, 19 Mar 2024 at 03:34, Alexander Korotkov <aekorotkov@gmail.com> wrote:

On Sun, Mar 3, 2024 at 1:50 PM Alexander Korotkov <aekorotkov@gmail.com> wrote:

On Mon, Nov 27, 2023 at 10:18 PM Mark Dilger
<mark.dilger@enterprisedb.com> wrote:

On Nov 25, 2023, at 9:47 AM, Alexander Korotkov <aekorotkov@gmail.com> wrote:

Should the patch at least document which parts of the EState are expected to be in which states, and which parts should be viewed as undefined? If the implementors of table AMs rely on any/all aspects of EState, doesn't that prevent future changes to how that structure is used?

New tuple tuple_insert_with_arbiter() table AM API method needs EState
argument to call executor functions: ExecCheckIndexConstraints(),
ExecUpdateLockMode(), and ExecInsertIndexTuples(). I think we
probably need to invent some opaque way to call this executor function
without revealing EState to table AM. Do you think this could work?

We're clearly not accessing all of the EState, just some specific fields, such as es_per_tuple_exprcontext. I think you could at least refactor to pass the minimum amount of state information through the table AM API.

Yes, the table AM doesn't need the full EState, just the ability to do
specific manipulation with tuples. I'll refactor the patch to make a
better isolation for this.

Please find the revised patchset attached. The changes are following:
1. Patchset is rebase. to the current master.
2. Patchset is reordered. I tried to put less debatable patches to the top.
3. tuple_is_current() method is moved from the Table AM API to the
slot as proposed by Matthias van de Meent.
4. Assert added to the table_free_rd_amcache() as proposed by Pavel Borisov.

Patches 0001-0002 are unchanged compared to the last version in thread [1]. In my opinion, it's still ready to be committed, which was not done for time were too close to feature freeze one year ago.

0003 - Assert added from previous version. I still have a strong opinion that allowing multi-chunked data structures instead of single chunks is completely safe and makes natural process of Postgres improvement that is self-justified. The patch is simple enough and ready to be pushed.

0004 (previously 0007) - Have not changed, and there is consensus that this is reasonable. I've re-checked the current code. Looks safe considering returning a different slot, which I doubted before. So consider this patch also ready.

0005 (previously 0004) - Unused argument in the is_current_xact_tuple() signature is removed. Also comparing to v1 the code shifted from tableam methods to TTS's level.

I'd propose to remove Assert(!TTS_EMPTY(slot)) for tts_minimal_is_current_xact_tuple() and tts_virtual_is_current_xact_tuple() as these are only error reporting functions that don't use slot actually.

Comment similar to:
+/*
+ * VirtualTupleTableSlots never have a storage tuples.  We generally
+ * shouldn't get here, but provide a user-friendly message if we do.
+ */
also applies to tts_minimal_is_current_xact_tuple()

I'd propose changes for clarity of this comment:
%s/a storage tuples/storage tuples/g
%s/generally//g

Otherwise patch 0005 also looks good to me.

I'm planning to review the remaining patches. Meanwhile think pushing what is now ready and uncontroversial is a good intention.
Thank you for the work done on this patchset!

Thank you, Pavel!

Regarding 0005, I did apply "a storage tuples" grammar fix. Regarding
the rest of the things, I'd like to keep methods
tts_*_is_current_xact_tuple() to be similar to nearby
tts_*_getsysattr(). This is why I'm keeping the rest unchanged. I
think we could refactor that later, but together with
tts_*_getsysattr() methods.

I'm going to push 0003, 0004 and 0005 if there are no objections.

And I'll update 0001 and 0002 in their dedicated thread.

When I try to test the patch on Ubuntu 22.04 with GCC 11.4.0. There are some
warnings as following:

/home/japin/Codes/postgres/build/../src/backend/access/heap/heapam_handler.c: In function ‘heapam_acquire_sample_rows’:
/home/japin/Codes/postgres/build/../src/backend/access/heap/heapam_handler.c:1603:28: warning: implicit declaration of function ‘get_tablespace_maintenance_io_concurrency’ [-Wimplicit-function-declaration]
 1603 |         prefetch_maximum = get_tablespace_maintenance_io_concurrency(onerel->rd_rel->reltablespace);
      |                            ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
/home/japin/Codes/postgres/build/../src/backend/access/heap/heapam_handler.c:1757:30: warning: implicit declaration of function ‘floor’ [-Wimplicit-function-declaration]
 1757 |                 *totalrows = floor((liverows / bs.m) * totalblocks + 0.5);
      |                              ^~~~~
/home/japin/Codes/postgres/build/../src/backend/access/heap/heapam_handler.c:49:1: note: include ‘<math.h>’ or provide a declaration of ‘floor’
   48 | #include "utils/sampling.h"
  +++ |+#include <math.h>
   49 |
/home/japin/Codes/postgres/build/../src/backend/access/heap/heapam_handler.c:1757:30: warning: incompatible implicit declaration of built-in function ‘floor’ [-Wbuiltin-declaration-mismatch]
 1757 |                 *totalrows = floor((liverows / bs.m) * totalblocks + 0.5);
      |                              ^~~~~
/home/japin/Codes/postgres/build/../src/backend/access/heap/heapam_handler.c:1757:30: note: include ‘<math.h>’ or provide a declaration of ‘floor’
/home/japin/Codes/postgres/build/../src/backend/access/heap/heapam_handler.c:1603:21: warning: implicit declaration of function 'get_tablespace_maintenance_io_concurrency' is invalid in C99 [-Wimplicit-function-declaration]
        prefetch_maximum = get_tablespace_maintenance_io_concurrency(onerel->rd_rel->reltablespace);
                           ^

It seems you forgot to include math.h and utils/spccache.h header files
in heapam_handler.c.

diff --git a/src/backend/access/heap/heapam_handler.c b/src/backend/access/heap/heapam_handler.c
index ac24691bd2..04365394f1 100644
--- a/src/backend/access/heap/heapam_handler.c
+++ b/src/backend/access/heap/heapam_handler.c
@@ -19,6 +19,8 @@
  */
 #include "postgres.h"

+#include <math.h>
+
#include "access/genam.h"
#include "access/heapam.h"
#include "access/heaptoast.h"
@@ -46,6 +48,7 @@
#include "utils/builtins.h"
#include "utils/rel.h"
#include "utils/sampling.h"
+#include "utils/spccache.h"

static TM_Result heapam_tuple_lock(Relation relation, Datum tupleid,
Snapshot snapshot, TupleTableSlot *slot,

#18Alexander Korotkov
aekorotkov@gmail.com
In reply to: Japin Li (#17)
Re: Table AM Interface Enhancements

On Tue, Mar 19, 2024 at 4:26 PM Japin Li <japinli@hotmail.com> wrote:

On Tue, 19 Mar 2024 at 21:05, Alexander Korotkov <aekorotkov@gmail.com> wrote:

Regarding 0005, I did apply "a storage tuples" grammar fix. Regarding
the rest of the things, I'd like to keep methods
tts_*_is_current_xact_tuple() to be similar to nearby
tts_*_getsysattr(). This is why I'm keeping the rest unchanged. I
think we could refactor that later, but together with
tts_*_getsysattr() methods.

I'm going to push 0003, 0004 and 0005 if there are no objections.

And I'll update 0001 and 0002 in their dedicated thread.

When I try to test the patch on Ubuntu 22.04 with GCC 11.4.0. There are some
warnings as following:

Thank you for catching this!
Please, find the revised patchset attached.

------
Regards,
Alexander Korotkov

Attachments:

0010-Notify-table-AM-about-index-creation-v4.patchapplication/octet-stream; name=0010-Notify-table-AM-about-index-creation-v4.patchDownload+69-19
0012-Introduce-RowRefType-which-describes-the-table-ro-v4.patchapplication/octet-stream; name=0012-Introduce-RowRefType-which-describes-the-table-ro-v4.patchDownload+57-24
0013-Introduce-RowID-bytea-tuple-identifier-v4.patchapplication/octet-stream; name=0013-Introduce-RowID-bytea-tuple-identifier-v4.patchDownload+509-154
0011-Let-table-AM-insertion-methods-control-index-inse-v4.patchapplication/octet-stream; name=0011-Let-table-AM-insertion-methods-control-index-inse-v4.patchDownload+58-25
0001-Allow-locking-updated-tuples-in-tuple_update-and--v4.patchapplication/octet-stream; name=0001-Allow-locking-updated-tuples-in-tuple_update-and--v4.patchDownload+502-347
0009-Let-table-AM-override-reloptions-for-indexes-buil-v4.patchapplication/octet-stream; name=0009-Let-table-AM-override-reloptions-for-indexes-buil-v4.patchDownload+65-6
0008-Generalize-table-AM-API-for-INSERT-.-ON-CONFLICT-v4.patchapplication/octet-stream; name=0008-Generalize-table-AM-API-for-INSERT-.-ON-CONFLICT-v4.patchDownload+348-291
0007-Custom-reloptions-for-table-AM-v4.patchapplication/octet-stream; name=0007-Custom-reloptions-for-table-AM-v4.patchDownload+106-26
0006-Generalize-relation-analyze-in-table-AM-interface-v4.patchapplication/octet-stream; name=0006-Generalize-relation-analyze-in-table-AM-interface-v4.patchDownload+320-363
0005-Add-TupleTableSlotOps.is_current_xact_tuple-metho-v4.patchapplication/octet-stream; name=0005-Add-TupleTableSlotOps.is_current_xact_tuple-metho-v4.patchDownload+101-8
0004-Allow-table-AM-tuple_insert-method-to-return-the--v4.patchapplication/octet-stream; name=0004-Allow-table-AM-tuple_insert-method-to-return-the--v4.patchDownload+17-14
0003-Allow-table-AM-to-store-complex-data-structures-i-v4.patchapplication/octet-stream; name=0003-Allow-table-AM-to-store-complex-data-structures-i-v4.patchDownload+50-13
0002-Add-EvalPlanQual-delete-returning-isolation-test-v4.patchapplication/octet-stream; name=0002-Add-EvalPlanQual-delete-returning-isolation-test-v4.patchDownload+34-1
#19Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Alexander Korotkov (#18)
Re: Table AM Interface Enhancements

Hi, Alexander!

For 0007:

Code inside

+heapam_reloptions(char relkind, Datum reloptions, bool validate)
+{
+   if (relkind == RELKIND_RELATION ||
+       relkind == RELKIND_TOASTVALUE ||
+       relkind == RELKIND_MATVIEW)
+       return heap_reloptions(relkind, reloptions, validate);
+
+   return NULL;

looks redundant to what is done inside heap_reloptions(). Was this on
purpose? Is it possible to leave only "return heap_reloptions()" ?

This looks like a duplicate:
src/include/access/reloptions.h:extern bytea
*index_reloptions(amoptions_function amoptions, Datum reloptions,
src/include/access/tableam.h:extern bytea
*index_reloptions(amoptions_function amoptions, Datum reloptions,

Otherwise the patch looks good and doing what it's proposed to do.

Regards,
Pavel Borisov.

#20Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Pavel Borisov (#19)
Re: Table AM Interface Enhancements

Hi, Alexander!

On Wed, 20 Mar 2024 at 09:22, Pavel Borisov <pashkin.elfe@gmail.com> wrote:

Hi, Alexander!

For 0007:

Code inside

+heapam_reloptions(char relkind, Datum reloptions, bool validate)
+{
+   if (relkind == RELKIND_RELATION ||
+       relkind == RELKIND_TOASTVALUE ||
+       relkind == RELKIND_MATVIEW)
+       return heap_reloptions(relkind, reloptions, validate);
+
+   return NULL;

looks redundant to what is done inside heap_reloptions(). Was this on
purpose? Is it possible to leave only "return heap_reloptions()" ?

This looks like a duplicate:
src/include/access/reloptions.h:extern bytea
*index_reloptions(amoptions_function amoptions, Datum reloptions,
src/include/access/tableam.h:extern bytea
*index_reloptions(amoptions_function amoptions, Datum reloptions,

Otherwise the patch looks good and doing what it's proposed to do.

For patch 0006:

The change for analyze is in the same style as for previous table am
extensibility patches.

table_scan_analyze_next_tuple/table_scan_analyze_next_block existing
extensibility is dropped in favour of more general method
table_relation_analyze. I haven't found existing extensions on a GitHub
that use these table am's, so probably it's quite ok to remove the
extensibility that didn't get any traction for many years.

The patch contains a big block of code copy-paste. I've checked that the
code is the same with only function name replacement in favor to using
table am instead of heap am. I'd propose restoring the static functions
declaration in the head of the file, which was removed in the patch and
place heapam_acquire_sample_rows() above compare_rows() to make functions
copied as the whole code block. This is for better patch look only, not a
principal change.

-static int acquire_sample_rows(Relation onerel, int elevel,
- HeapTuple *rows, int targrows,
- double *totalrows, double *totaldeadrows);
-static int compare_rows(const void *a, const void *b, void *arg)

May it also be a better place than vacuum.h for
typedef int (*AcquireSampleRowsFunc) ? Maybe sampling.h ?

The other patch that I'd like to review is 0012:

For a
typedef enum RowRefType
 I think some comments would be useful to avoid confusion about the changes
like
-               newrc->allMarkTypes = (1 << newrc->markType);
+              newrc->allRefTypes = (1 << refType);

Also I think the semantical difference between ROW_REF_COPY
and ROW_MARK_COPY is better to be mentioned in the comments and/or commit
message. This may include a description of assigning different reftypes in
parse_relation.c

In a comment there is a small confusion between markType and refType:

* The parent's allRefTypes field gets the OR of (1<<refType) across all
* its children (this definition allows children to use different
markTypes).

Both patches look good to me and are ready, though they may need minimal
comments/cosmetic work.

Regards,
Pavel Borisov

#21Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Pavel Borisov (#19)
#22Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Pavel Borisov (#21)
#23Alexander Korotkov
aekorotkov@gmail.com
In reply to: Pavel Borisov (#22)
#24Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Alexander Korotkov (#23)
#25Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Pavel Borisov (#24)
#26Alexander Korotkov
aekorotkov@gmail.com
In reply to: Pavel Borisov (#24)
#27Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Alexander Korotkov (#26)
#28Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Pavel Borisov (#27)
#29Alexander Korotkov
aekorotkov@gmail.com
In reply to: Pavel Borisov (#28)
#30Japin Li
japinli@hotmail.com
In reply to: Alexander Korotkov (#29)
#31Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Japin Li (#30)
#32Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Pavel Borisov (#31)
#33Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Pavel Borisov (#32)
#34Alexander Korotkov
aekorotkov@gmail.com
In reply to: Pavel Borisov (#33)
#35Andrei Lepikhov
lepihov@gmail.com
In reply to: Alexander Korotkov (#34)
#36Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrei Lepikhov (#35)
#37Alexander Korotkov
aekorotkov@gmail.com
In reply to: Tom Lane (#36)
#38Alexander Korotkov
aekorotkov@gmail.com
In reply to: Tom Lane (#36)
#39Japin Li
japinli@hotmail.com
In reply to: Alexander Korotkov (#38)
#40Jeff Davis
pgsql@j-davis.com
In reply to: Alexander Korotkov (#34)
#41Alexander Korotkov
aekorotkov@gmail.com
In reply to: Jeff Davis (#40)
#42Jeff Davis
pgsql@j-davis.com
In reply to: Alexander Korotkov (#41)
#43Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Jeff Davis (#42)
#44Alexander Korotkov
aekorotkov@gmail.com
In reply to: Pavel Borisov (#43)
#45Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Alexander Korotkov (#44)
#46Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Pavel Borisov (#45)
#47Andres Freund
andres@anarazel.de
In reply to: Alexander Korotkov (#34)
#48Alexander Korotkov
aekorotkov@gmail.com
In reply to: Andres Freund (#47)
#49Alexander Korotkov
aekorotkov@gmail.com
In reply to: Alexander Korotkov (#48)
#50Andres Freund
andres@anarazel.de
In reply to: Alexander Korotkov (#48)
#51Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Alexander Korotkov (#48)
#52Alexander Korotkov
aekorotkov@gmail.com
In reply to: Pavel Borisov (#51)
#53Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Alexander Korotkov (#52)
#54Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Pavel Borisov (#53)
#55Andres Freund
andres@anarazel.de
In reply to: Pavel Borisov (#51)
#56Andres Freund
andres@anarazel.de
In reply to: Andres Freund (#55)
#57Alexander Korotkov
aekorotkov@gmail.com
In reply to: Andres Freund (#56)
#58Robert Haas
robertmhaas@gmail.com
In reply to: Alexander Korotkov (#57)
#59Alexander Korotkov
aekorotkov@gmail.com
In reply to: Robert Haas (#58)
#60Alexander Korotkov
aekorotkov@gmail.com
In reply to: Robert Haas (#58)
#61Robert Haas
robertmhaas@gmail.com
In reply to: Alexander Korotkov (#60)
#62Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Alexander Korotkov (#60)
#63Joe Conway
mail@joeconway.com
In reply to: Robert Haas (#61)
#64Alexander Korotkov
aekorotkov@gmail.com
In reply to: Robert Haas (#61)
#65Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#58)
#66Robert Haas
robertmhaas@gmail.com
In reply to: Alexander Korotkov (#64)
In reply to: Robert Haas (#66)
#68Bruce Momjian
bruce@momjian.us
In reply to: Pavel Borisov (#62)
#69Andres Freund
andres@anarazel.de
In reply to: Alexander Korotkov (#60)
#70Melanie Plageman
melanieplageman@gmail.com
In reply to: Andres Freund (#69)
#71Andres Freund
andres@anarazel.de
In reply to: Melanie Plageman (#70)
#72Melanie Plageman
melanieplageman@gmail.com
In reply to: Andres Freund (#71)
#73Andres Freund
andres@anarazel.de
In reply to: Melanie Plageman (#72)
#74Alexander Korotkov
aekorotkov@gmail.com
In reply to: Andres Freund (#65)
#75Melanie Plageman
melanieplageman@gmail.com
In reply to: Andres Freund (#73)
#76Jeff Davis
pgsql@j-davis.com
In reply to: Alexander Korotkov (#60)
#77Alexander Korotkov
aekorotkov@gmail.com
In reply to: Jeff Davis (#76)
#78Alexander Korotkov
aekorotkov@gmail.com
In reply to: Melanie Plageman (#75)
#79Melanie Plageman
melanieplageman@gmail.com
In reply to: Melanie Plageman (#75)
#80Robert Haas
robertmhaas@gmail.com
In reply to: Alexander Korotkov (#78)
#81Andres Freund
andres@anarazel.de
In reply to: Alexander Korotkov (#78)
#82Alexander Korotkov
aekorotkov@gmail.com
In reply to: Andres Freund (#81)
#83Alexander Korotkov
aekorotkov@gmail.com
In reply to: Robert Haas (#80)
#84Melanie Plageman
melanieplageman@gmail.com
In reply to: Alexander Korotkov (#82)
#85Alexander Korotkov
aekorotkov@gmail.com
In reply to: Melanie Plageman (#84)
#86Robert Haas
robertmhaas@gmail.com
In reply to: Alexander Korotkov (#85)
#87Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Robert Haas (#86)
#88Nazir Bilal Yavuz
byavuz81@gmail.com
In reply to: Robert Haas (#86)
#89Robert Haas
robertmhaas@gmail.com
In reply to: Pavel Borisov (#87)
#90Robert Haas
robertmhaas@gmail.com
In reply to: Nazir Bilal Yavuz (#88)
#91Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Robert Haas (#89)
#92Andres Freund
andres@anarazel.de
In reply to: Pavel Borisov (#91)
#93Robert Haas
robertmhaas@gmail.com
In reply to: Andres Freund (#92)
#94Andres Freund
andres@anarazel.de
In reply to: Alexander Korotkov (#82)
#95Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#93)
#96Alexander Korotkov
aekorotkov@gmail.com
In reply to: Andres Freund (#94)
#97Alexander Korotkov
aekorotkov@gmail.com
In reply to: Andres Freund (#95)
#98Pavel Borisov
pashkin.elfe@gmail.com
In reply to: Alexander Korotkov (#97)
#99Robert Haas
robertmhaas@gmail.com
In reply to: Alexander Korotkov (#97)
#100Andres Freund
andres@anarazel.de
In reply to: Alexander Korotkov (#96)
#101Andres Freund
andres@anarazel.de
In reply to: Robert Haas (#99)
#102Mats Kindahl
mats@timescale.com
In reply to: Alexander Korotkov (#1)
#103Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Alexander Korotkov (#96)
#104Alexander Korotkov
aekorotkov@gmail.com
In reply to: Matthias van de Meent (#103)