Inquiry on Generating Bitmaps from Filter Conditions in Index Scans

Started by Jinjing Zhouabout 2 years ago5 messages
#1Jinjing Zhou
allenzhou@tensorchord.ai

Hi hackers, 

I hope this message finds you well. I am reaching out to seek guidance on a specific aspect of PostgreSQL's index scanning functionality.

I am currently working on a vector search extension for postgres, where I need to generate bitmaps based on filter conditions during an index scan. The goal is to optimize the query performance by efficiently identifying the rows that meet the given criteria.

The query plan looks like this

Index Scan using products_feature_idx on products  (cost=0.00..27.24 rows=495 width=12)
         Order By: (feature <-> '[0.5, 0.5, 0.5]'::vector)
         Filter: ((price > '0.2'::double precision) AND (price <= '0.7'::double precision))

We have a custom index for the order by clause on the feature column. Now we want to utilize the index on other columns like price column. We want to access the bitmap of price column's filter condition in the feature column index. Is there any way I can achieve this goal?

Any help or guidance is appreciated!

Thanks.
Jinjing Zhou

#2Tomas Vondra
tomas.vondra@enterprisedb.com
In reply to: Jinjing Zhou (#1)
Re: Inquiry on Generating Bitmaps from Filter Conditions in Index Scans

On 11/19/23 18:19, Jinjing Zhou wrote:

Hi hackers, 

I hope this message finds you well. I am reaching out to seek guidance
on a specific aspect of PostgreSQL's index scanning functionality.

I am currently working on a vector search extension for postgres, where
I need to generate bitmaps based on filter conditions during an index
scan. The goal is to optimize the query performance by efficiently
identifying the rows that meet the given criteria.

The query plan looks like this

Index Scan using products_feature_idx on products  (cost=0.00..27.24
rows=495 width=12)
         Order By: (feature <-> '[0.5, 0.5, 0.5]'::vector)
         Filter: ((price > '0.2'::double precision) AND (price <=
'0.7'::double precision))

We have a custom index for the order by clause on the feature column.
Now we want to utilize the index on other columns like price column. We
want to access the bitmap of price column's filter condition in the
feature column index. Is there any way I can achieve this goal?

Any help or guidance is appreciated!

I suppose you'll need to give more details about what exactly are you
trying to achieve, what you tried, maybe some code examples, etc. Your
question is quite vague, and it's unclear what "bitmaps generated on
filter conditions" or "custom index" means.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

#3Matthias van de Meent
boekewurm+postgres@gmail.com
In reply to: Jinjing Zhou (#1)
Re: Inquiry on Generating Bitmaps from Filter Conditions in Index Scans

On Mon, 20 Nov 2023 at 09:30, Jinjing Zhou <allenzhou@tensorchord.ai> wrote:

Hi hackers,

I hope this message finds you well. I am reaching out to seek guidance on a specific aspect of PostgreSQL's index scanning functionality.

I am currently working on a vector search extension for postgres, where I need to generate bitmaps based on filter conditions during an index scan. The goal is to optimize the query performance by efficiently identifying the rows that meet the given criteria.

The query plan looks like this

Index Scan using products_feature_idx on products (cost=0.00..27.24 rows=495 width=12)
Order By: (feature <-> '[0.5, 0.5, 0.5]'::vector)
Filter: ((price > '0.2'::double precision) AND (price <= '0.7'::double precision))

We have a custom index for the order by clause on the feature column. Now we want to utilize the index on other columns like price column. We want to access the bitmap of price column's filter condition in the feature column index. Is there any way I can achieve this goal?

If you mean "I'd like to use bitmaps generated by combining filter
results from index A, B, and C for (pre-)filtering the ordered index
lookups in index D",
then there is no current infrastructure to do this. Bitmap scans
currently generate a data structure that is not indexable, and can
thus not be used efficiently to push an index's generated bitmap into
another bitmap's scans.

There are efforts to improve the data structures we use for storing
TIDs during vacuum [0]/messages/by-id/CANWCAZbrZ58-w1W_3pg-0tOfbx8K41_n_03_0ndGV78hJWswBA@mail.gmail.com which could extend to the TID bitmap structure,
but even then we'd need some significant effort to rewire Postgres'
internals to push down the bitmap filters; and that is even under the
assumption that pushing the bitmap down into the index AM is more
efficient than doing the merges above the index AM and then re-sorting
the data.

So, in short, it's not currently available in community PostgreSQL.
You could probably create a planner hook + custom executor node that
does this, but it won't be able to use much of the features available
inside PostgreSQL.

Kind regards,

Matthias van de Meent

[0]: /messages/by-id/CANWCAZbrZ58-w1W_3pg-0tOfbx8K41_n_03_0ndGV78hJWswBA@mail.gmail.com

#4Jinjing Zhou
allenzhou@tensorchord.ai
In reply to: Matthias van de Meent (#3)
Re: Inquiry on Generating Bitmaps from Filter Conditions in Index Scans

Thanks a lot! This is exactly what I'm asking. We've tried the CustomScanAPI at https://github.com/tensorchord/pgvecto.rs/pull/126, but met error with "variable not found in subplan target list". We're still investigating the root cause and thanks for your guidance!

Best
Jinjing Zhou

Show quoted text

From: "Matthias van de Meent"<boekewurm+postgres@gmail.com>
Date: Mon, Nov 20, 2023, 19:33
Subject: Re: Inquiry on Generating Bitmaps from Filter Conditions in Index Scans
To: "Jinjing Zhou"<allenzhou@tensorchord.ai>
Cc: "pgsql-hackers@lists.postgresql.org"<pgsql-hackers@lists.postgresql.org>
On Mon, 20 Nov 2023 at 09:30, Jinjing Zhou <allenzhou@tensorchord.ai> wrote:

Hi hackers,

I hope this message finds you well. I am reaching out to seek guidance on a specific aspect of PostgreSQL's index scanning functionality.

I am currently working on a vector search extension for postgres, where I need to generate bitmaps based on filter conditions during an index scan. The goal is to optimize the query performance by efficiently identifying the rows that meet the given criteria.

The query plan looks like this

Index Scan using products_feature_idx on products (cost=0.00..27.24 rows=495 width=12)
Order By: (feature <-> '[0.5, 0.5, 0.5]'::vector)
Filter: ((price > '0.2'::double precision) AND (price <= '0.7'::double precision))

We have a custom index for the order by clause on the feature column. Now we want to utilize the index on other columns like price column. We want to access the bitmap of price column's filter condition in the feature column index. Is there any way I can achieve this goal?

If you mean "I'd like to use bitmaps generated by combining filter
results from index A, B, and C for (pre-)filtering the ordered index
lookups in index D",
then there is no current infrastructure to do this. Bitmap scans
currently generate a data structure that is not indexable, and can
thus not be used efficiently to push an index's generated bitmap into
another bitmap's scans.

There are efforts to improve the data structures we use for storing
TIDs during vacuum [0] which could extend to the TID bitmap structure,
but even then we'd need some significant effort to rewire Postgres'
internals to push down the bitmap filters; and that is even under the
assumption that pushing the bitmap down into the index AM is more
efficient than doing the merges above the index AM and then re-sorting
the data.

So, in short, it's not currently available in community PostgreSQL.
You could probably create a planner hook + custom executor node that
does this, but it won't be able to use much of the features available
inside PostgreSQL.

Kind regards,

Matthias van de Meent

[0] /messages/by-id/CANWCAZbrZ58-w1W_3pg-0tOfbx8K41_n_03_0ndGV78hJWswBA@mail.gmail.com

#5Jinjing Zhou
allenzhou@tensorchord.ai
In reply to: Tomas Vondra (#2)
Re: Inquiry on Generating Bitmaps from Filter Conditions in Index Scans

Thanks. Our project is at https://github.com/tensorchord/pgvecto.rs. A custom index is implemented for the vector similarity search, which implements `amgettuples` with direction support to provide candidates for the order by clause. 

And we want to inject the filter condition using bitmap into the amgettuples process, instead of checking the tuples one by one to accelerate the whole process. 

Best
Jinjing Zhou

Show quoted text

From: "Tomas Vondra"<tomas.vondra@enterprisedb.com>
Date: Mon, Nov 20, 2023, 18:52
Subject: Re: Inquiry on Generating Bitmaps from Filter Conditions in Index Scans
To: "Jinjing Zhou"<allenzhou@tensorchord.ai>, "pgsql-hackers@lists.postgresql.org"<pgsql-hackers@lists.postgresql.org>
On 11/19/23 18:19, Jinjing Zhou wrote:

Hi hackers, 

I hope this message finds you well. I am reaching out to seek guidance
on a specific aspect of PostgreSQL's index scanning functionality.

I am currently working on a vector search extension for postgres, where
I need to generate bitmaps based on filter conditions during an index
scan. The goal is to optimize the query performance by efficiently
identifying the rows that meet the given criteria.

The query plan looks like this

Index Scan using products_feature_idx on products  (cost=0.00..27.24
rows=495 width=12)
         Order By: (feature <-> '[0.5, 0.5, 0.5]'::vector)
         Filter: ((price > '0.2'::double precision) AND (price <=
'0.7'::double precision))

We have a custom index for the order by clause on the feature column.
Now we want to utilize the index on other columns like price column. We
want to access the bitmap of price column's filter condition in the
feature column index. Is there any way I can achieve this goal?

Any help or guidance is appreciated!

I suppose you'll need to give more details about what exactly are you
trying to achieve, what you tried, maybe some code examples, etc. Your
question is quite vague, and it's unclear what "bitmaps generated on
filter conditions" or "custom index" means.

regards

--
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company