Question on indexes

Started by sudover 1 year ago11 messagesgeneral

suds1434@gmail.com

over 1 year ago

Hi,
I have never used any 'hash index' but saw documents in the past suggesting
issues around hash index , like WAL doesnt generate for "hash index" which
means we can't get the hash index back after crash also they are not
applied to replicas etc. And also these indexes can not be used for range
queries , for sorting etc.

However, we are seeing that one of the databases has multiple hash indexes
created. So I wanted to understand from experts here, if it's advisable in
any specific scenarios over B-tre despite such downsides?
Note- Its version 15.4 database.

Regards
Sud

Christophe Pettus

xof@thebuild.com

over 1 year ago

In reply to: sud (#1)

Re: Question on indexes

On Oct 10, 2024, at 11:49, sud <suds1434@gmail.com> wrote:

Hi,
I have never used any 'hash index' but saw documents in the past suggesting issues around hash index , like WAL doesnt generate for "hash index" which means we can't get the hash index back after crash also they are not applied to replicas etc.

That's very old information. Hash indexes are correctly WAL-logged since (IIRC) version 10.

Erik Wienhold

ewie@ewie.name

over 1 year ago

In reply to: sud (#1)

Re: Question on indexes

On 2024-10-10 20:49 +0200, sud wrote:

However, we are seeing that one of the databases has multiple hash indexes
created. So I wanted to understand from experts here, if it's advisable in
any specific scenarios over B-tre despite such downsides?

Two things come to my mind:

1. Btree puts a limit on the size of indexed values, whereas hash
indexes only store the 32-bit hash code.

2. Of the core index types, only btree supports unique indexes.

Example of btree's size limit:

CREATE TABLE b (s text);
CREATE INDEX ON b USING btree (s);
INSERT INTO b (s) VALUES (repeat('x', 1000000));
ERROR: index row requires 11464 bytes, maximum size is 8191

The docs have more details:
https://www.postgresql.org/docs/current/btree.html
https://www.postgresql.org/docs/current/hash-index.html

--
Erik

sud

suds1434@gmail.com

over 1 year ago

In reply to: Erik Wienhold (#3)

Re: Question on indexes

On Fri, Oct 11, 2024 at 12:51 AM Erik Wienhold <ewie@ewie.name> wrote:

On 2024-10-10 20:49 +0200, sud wrote:

However, we are seeing that one of the databases has multiple hash

indexes

created. So I wanted to understand from experts here, if it's advisable

in

any specific scenarios over B-tre despite such downsides?

Two things come to my mind:

1. Btree puts a limit on the size of indexed values, whereas hash
indexes only store the 32-bit hash code.

2. Of the core index types, only btree supports unique indexes.

Example of btree's size limit:

CREATE TABLE b (s text);
CREATE INDEX ON b USING btree (s);
INSERT INTO b (s) VALUES (repeat('x', 1000000));
ERROR: index row requires 11464 bytes, maximum size is 8191

The docs have more details:
https://www.postgresql.org/docs/current/btree.html
https://www.postgresql.org/docs/current/hash-index.html

Thank you.

Not yet confirmed, but actually somehow we see the DB crashed repetitively
a few times and teammates suspecting the cause while it tried extending
this hash index. Did you experience any such thing with hash index?
However, as you mentioned ,if we have any column with large string/text
values and we want it to be indexed then there is no choice but to go for a
hash index. Please correct me if I'm wrong.

Erik Wienhold

ewie@ewie.name

over 1 year ago

In reply to: sud (#4)

Re: Question on indexes

On 2024-10-10 21:44 +0200, sud wrote:

Not yet confirmed, but actually somehow we see the DB crashed repetitively
a few times and teammates suspecting the cause while it tried extending
this hash index.

Your first mail says that you're using version 15.4. You should
consider upgrading to 15.8 to get the latest bugfixes.

Did you experience any such thing with hash index?

No. But I can't remember ever seeing a hash index in the databases that
I've worked on.

However, as you mentioned ,if we have any column with large string/text
values and we want it to be indexed then there is no choice but to go for a
hash index. Please correct me if I'm wrong.

Define "large".

What kind of text (natural, JSON, XML, base64, DNA sequences, etc.) is
stored in those columns? Why do you want/need to index those columns?
Because hash indexes only support the equal operator, one can only use
that index to search for exact matches (i.e. values with identical hash
code) which I find strange for values that are so large that btree
cannot be used. But maybe you have solid use case for that.

If it's natural text and you're using tsvector for full-text search,
then GiST or GIN indexes are a better choice:
https://www.postgresql.org/docs/current/textsearch-indexes.html

--
Erik

Efrain J. Berdecia

ejberdecia@yahoo.com

over 1 year ago

In reply to: Christophe Pettus (#2)

Re: Question on indexes

They are extremely efficient for joins!!!

Yahoo Mail: Search, Organize, Conquer

On Thu, Oct 10, 2024 at 2:52 PM, Christophe Pettus<xof@thebuild.com> wrote:

On Oct 10, 2024, at 11:49, sud <suds1434@gmail.com> wrote:

Hi,
I have never used any 'hash index' but saw documents in the past suggesting issues around hash index , like WAL doesnt generate for "hash index" which means we can't get the hash index back after crash also they are not applied to replicas etc.

That's very old information. Hash indexes are correctly WAL-logged since (IIRC) version 10.

Laurenz Albe

laurenz.albe@cybertec.at

over 1 year ago

In reply to: sud (#1)

Re: Question on indexes

On Fri, 2024-10-11 at 00:19 +0530, sud wrote:

I have never used any 'hash index' but saw documents in the past suggesting issues
around hash index , like WAL doesnt generate for "hash index" which means we can't
get the hash index back after crash also they are not applied to replicas etc.
And also these indexes can not be used for range queries , for sorting etc.

However, we are seeing that one of the databases has multiple hash indexes created.
So I wanted to understand from experts here, if it's advisable in any specific
scenarios over B-tre despite such downsides?
Note- Its version 15.4 database.

It is safe to use them, but in my tests I didn't find a realistic case where the were
better than a B-tree index:
https://www.cybertec-postgresql.com/en/postgresql-hash-index-performance/

Keep them if they do the trick for you, but I'd use B-tree indexes instead.

Yours,
Laurenz Albe

Greg Sabino Mullane

greg@turnstep.com

over 1 year ago

In reply to: sud (#4)

Re: Question on indexes

if we have any column with large string/text values and we want it to be
indexed then there is no choice but to go for a hash index. Please correct
me if I'm wrong.

There are other strategies / solutions, but we would need to learn more
about your use case.

Cheers,
Greg

Durgamahesh Manne

maheshpostgres9@gmail.com

over 1 year ago

In reply to: Greg Sabino Mullane (#8)

Re: Question on indexes

On Fri, Oct 11, 2024 at 5:00 PM Greg Sabino Mullane <htamfids@gmail.com>
wrote:

if we have any column with large string/text values and we want it to be

indexed then there is no choice but to go for a hash index. Please correct
me if I'm wrong.

There are other strategies / solutions, but we would need to learn more
about your use case.

Cheers,
Greg

Hi Respected Team

How do we enforce the secondary column of composite index to index scan on
concurrent activity in postgres?
Second column of composite index not in use effectively with index scan
when using second column at where clause

I have composite index on (placedon,id) of test
When querying select * from test where id = '4234';
Value of id changes and during concurrent activity and cpu utilization
increased too much that i have observed which means query plan changed why

I could see index scan with explain for it on singal call or double calls

Is there any way to keep an index scan for it during concurrency rather
than a separate index on the second column of the composite index ?

Regards,
Durga Mahesh

#10

Greg Sabino Mullane

greg@turnstep.com

over 1 year ago

In reply to: Durgamahesh Manne (#9)

Re: Question on indexes

(please start a new thread in the future rather than replying to an
existing one)

You cannot query on b and use an index on (a,b) as you observed. However,
you can have two indexes:

index1(a)
index2(b)

Postgres will be able to combine those when needed in the case where your
WHERE clause needs to filter by both columns. So then you no longer need
the two-column index.

Cheers,
Greg

#11

Durgamahesh Manne

maheshpostgres9@gmail.com

over 1 year ago

In reply to: Greg Sabino Mullane (#10)

Re: Question on indexes

On Fri, Oct 11, 2024 at 6:18 PM Greg Sabino Mullane <htamfids@gmail.com>
wrote:

(please start a new thread in the future rather than replying to an
existing one)

You cannot query on b and use an index on (a,b) as you observed. However,
you can have two indexes:

index1(a)
index2(b)

Postgres will be able to combine those when needed in the case where your
WHERE clause needs to filter by both columns. So then you no longer need
the two-column index.

Cheers,
Greg

Hi greg

Mail sent you with a new thread. composite key is on partitioned table

Regards,
Durga Mahesh