Reading all tuples in Index Access Method

Started by Sushrut Shivaswamyover 1 year ago3 messages

sushrut.shivaswamy@gmail.com

over 1 year ago

Hi,

I'm trying to create an Index Access Method Roting.
Building the index requires iterating over all rows and calculating,
which is then used during index construction.

The methods in the IndexAmRoutine seem to deal with insertion / index build
one row at a time.
Is there any workaround you can suggest that would allow me to calculate
the median of a column,
store it someplace and then use it during Inserts to the index?

Thanks,
Sushrut

Matthias van de Meent

boekewurm+postgres@gmail.com

over 1 year ago

In reply to: Sushrut Shivaswamy (#1)

Re: Reading all tuples in Index Access Method

On Wed, 28 Aug 2024 at 16:21, Sushrut Shivaswamy
<sushrut.shivaswamy@gmail.com> wrote:

Hi,

I'm trying to create an Index Access Method Roting.
Building the index requires iterating over all rows and calculating,
which is then used during index construction.

The methods in the IndexAmRoutine seem to deal with insertion / index build one row at a time.
Is there any workaround you can suggest that would allow me to calculate the median of a column,
store it someplace and then use it during Inserts to the index?

I'm not sure what to say. Index insertions through indam->aminsert
happen as users insert new values into the table, so I don't see how a
once-calculated median would remain correct across an index's
lifespan: every time I insert a new value (or delete a tuple) the
median will change. Furthermore, indexes will not know about deletions
and updates until significantly after the deleting or updating
transaction got committed, so transactionally consistent aggregates
are likely impossible to keep consistent while staying inside the
index AM API.

However, if you only need this median (or other aggregate) at index
build time, you should probably look at various indexes'
indam->ambuild functions, as that function's purpose is to build a new
index from an existing table's dataset, usually by scanning the table
with table_index_build_scan.

As for storing such data more permanently: Practically all included
indexes currently have a metapage at block 0 of the main data fork,
which contains metadata and bookkeeping info about the index's
structure, and you're free to do the same for your index.

I hope that helps?

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)

Sushrut Shivaswamy

sushrut.shivaswamy@gmail.com

over 1 year ago

In reply to: Matthias van de Meent (#2)

Re: Reading all tuples in Index Access Method

Thanks Matthias, table_index_build_scan sounds like what I"m looking for.

On Wed, Aug 28, 2024 at 9:29 PM Matthias van de Meent <
boekewurm+postgres@gmail.com> wrote:

Show quoted text

On Wed, 28 Aug 2024 at 16:21, Sushrut Shivaswamy
<sushrut.shivaswamy@gmail.com> wrote:

Hi,

I'm trying to create an Index Access Method Roting.
Building the index requires iterating over all rows and calculating,
which is then used during index construction.

The methods in the IndexAmRoutine seem to deal with insertion / index

build one row at a time.

Is there any workaround you can suggest that would allow me to calculate

the median of a column,

store it someplace and then use it during Inserts to the index?

I'm not sure what to say. Index insertions through indam->aminsert
happen as users insert new values into the table, so I don't see how a
once-calculated median would remain correct across an index's
lifespan: every time I insert a new value (or delete a tuple) the
median will change. Furthermore, indexes will not know about deletions
and updates until significantly after the deleting or updating
transaction got committed, so transactionally consistent aggregates
are likely impossible to keep consistent while staying inside the
index AM API.

However, if you only need this median (or other aggregate) at index
build time, you should probably look at various indexes'
indam->ambuild functions, as that function's purpose is to build a new
index from an existing table's dataset, usually by scanning the table
with table_index_build_scan.

As for storing such data more permanently: Practically all included
indexes currently have a metapage at block 0 of the main data fork,
which contains metadata and bookkeeping info about the index's
structure, and you're free to do the same for your index.

I hope that helps?

Kind regards,

Matthias van de Meent
Neon (https://neon.tech)