Query with rightmost function does not use index

Started by - -over 14 years ago7 messagesgeneral
Jump to latest
#1- -
loh.law@hotmail.com

For a table where column col has an index, the query:
SELECT ... FROM ... WHERE col = MD5('')
doesn't seem to use an index but
SELECT ... FROM ... WHERE col = 'd41d8cd98f00b204e9800998ecf8427e'
does.

Is this a gotcha?
I'm using PostgreSQL 8.4.8 on i686-pc-linux-gnu, compiled by GCC gcc-4.5.real (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2, 32-bit

#2Pavel Stehule
pavel.stehule@gmail.com
In reply to: - - (#1)
Re: Query with rightmost function does not use index

Hello

use a functional index

http://www.postgresql.org/docs/8.4/interactive/indexes-expressional.html

Regards

Pavel Stehule

2011/8/8 - - <loh.law@hotmail.com>:

Show quoted text

For a table where column col has an index, the query:
     SELECT ... FROM ... WHERE col = MD5('')
doesn't seem to use an index but
     SELECT ... FROM ... WHERE col = 'd41d8cd98f00b204e9800998ecf8427e'
does.

Is this a gotcha?
I'm using PostgreSQL 8.4.8 on i686-pc-linux-gnu, compiled by GCC
gcc-4.5.real (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2, 32-bit

#3- -
loh.law@hotmail.com
In reply to: Pavel Stehule (#2)
Re: Query with rightmost function does not use index

But why? The expression is not on the left side of the WHERE clause.

Show quoted text

Hello

use a functional index

http://www.postgresql.org/docs/8.4/interactive/indexes-expressional.html

Regards

Pavel Stehule

2011/8/8 - - <loh.law@hotmail.com>:

For a table where column col has an index, the query:
SELECT ... FROM ... WHERE col = MD5('')
doesn't seem to use an index but
SELECT ... FROM ... WHERE col = 'd41d8cd98f00b204e9800998ecf8427e'
does.

Is this a gotcha?
I'm using PostgreSQL 8.4.8 on i686-pc-linux-gnu, compiled by GCC
gcc-4.5.real (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2, 32-bit

#4pasman pasmański
pasman.p@gmail.com
In reply to: Pavel Stehule (#2)
Re: Query with rightmost function does not use index

Here may be other problem.
Show us the result of EXPLAIN Analyze.

2011/8/8, Pavel Stehule <pavel.stehule@gmail.com>:

Hello

use a functional index

http://www.postgresql.org/docs/8.4/interactive/indexes-expressional.html

Regards

Pavel Stehule

2011/8/8 - - <loh.law@hotmail.com>:

For a table where column col has an index, the query:
     SELECT ... FROM ... WHERE col = MD5('')
doesn't seem to use an index but
     SELECT ... FROM ... WHERE col = 'd41d8cd98f00b204e9800998ecf8427e'
does.

Is this a gotcha?
I'm using PostgreSQL 8.4.8 on i686-pc-linux-gnu, compiled by GCC
gcc-4.5.real (Ubuntu/Linaro 4.5.2-8ubuntu4) 4.5.2, 32-bit

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

--
------------
pasman

#5- -
loh.law@hotmail.com
In reply to: Pavel Stehule (#2)
Re: Query with rightmost function does not use index

Here are the EXPLAIN ANALYZE outputs:

explain analyze select * from filter_item where filter_hash = MD5('');

QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
Seq Scan on filter_item (cost=0.00..424644.96 rows=86108 width=49) (actual time=8177.807..12421.921 rows=77 loops=1)
Filter: ((filter_hash)::text = 'd41d8cd98f00b204e9800998ecf8427e'::text)
Total runtime: 12421.959 ms
(3 rows)

explain analyze select * from filter_item where filter_hash = 'd41d8cd98f00b204e9800998ecf8427e'

QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on filter_item (cost=77.92..6609.02 rows=3534 width=49) (actual time=0.055..0.100 rows=77 loops=1)
Recheck Cond: (filter_hash = 'd41d8cd98f00b204e9800998ecf8427e'::bpchar)
-> Bitmap Index Scan on filter_item__filter_hash (cost=0.00..77.04 rows=3534 width=0) (actual time=0.049..0.049 rows=77 loops=1)
Index Cond: (filter_hash = 'd41d8cd98f00b204e9800998ecf8427e'::bpchar)
Total runtime: 0.130 ms
(5 rows)

#6David G. Johnston
david.g.johnston@gmail.com
In reply to: - - (#5)
Re: Query with rightmost function does not use index

From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of - -
Sent: Monday, August 08, 2011 3:40 AM
To: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Query with rightmost function does not use index

Here are the EXPLAIN ANALYZE outputs:

explain analyze select * from filter_item where filter_hash = MD5('');

QUERY PLAN

----------------------------------------------------------------------------
-------------------------------------------
Seq Scan on filter_item (cost=0.00..424644.96 rows=86108 width=49) (ac
tual time=8177.807..12421.921 rows=77 loops=1)
Filter: ((filter_hash)::text = 'd41d8cd98f00b204e9800998ecf8427e'::text)
Total runtime: 12421.959 ms
(3 rows)

explain analyze select * from filter_item where filter_hash =
'd41d8cd98f00b204e9800998ecf8427e'

QUERY PLAN

----------------------------------------------------------------------------
----------------------------------------------------------
Bitmap Heap Scan on filter_item (cost=77.92..6609.02 rows=3534 width=49)
(actual time=0.055..0.100 rows=77 loops=1)
Recheck Cond: (filter_hash = 'd41d8cd98f00b204e9800998ecf8427e'::bpchar)
-> Bitmap Index Scan on filter_item__filter_hash (cost=0.00..77.04
rows=3534 width=0) (actual time=0.049..0.049 rows=77 loops=1)
Index Cond: (filter_hash =
'd41d8cd98f00b204e9800998ecf8427e'::bpchar)
Total runtime: 0.130 ms
(5 rows)

The filter_hash index uses a "character(n)" data type - the ::bpchar. The
second query is of unknown type and thus is converted to "character" and
then used in the index. The first query use a function that outputs a
"text". Since the output type is known the left-side of the equals is
casted to that known type. Since the index is one the "character" version
of the filter_hash but the comparison requires a "text" version the index
cannot be used. You would need to manually cast the result of the md5
function call to "character" in order to get the index usage; or convert the
filter_hash column to text, the latter option probably being preferred.

It is not a bug, in cases of uncertainty the types of the value and the
indexed field must be the same, but it could possibly be more user-friendly.

I'll leave it to other to comment on whether this is different in more
recent versions. Text-character are binary compatible and so it is not be
unreasonable to assume, like you did, that indexes of one should be usable
by the other.

David J.

#7Andres Freund
andres@anarazel.de
In reply to: - - (#5)
Re: Query with rightmost function does not use index

Hi,

On Monday, August 08, 2011 15:40:20 - - wrote:

explain analyze select * from filter_item where filter_hash = MD5('');

QUERY PLAN
----------------------------------------------------------------------------
------------------------------------------- Seq Scan on filter_item
(cost=0.00..424644.96 rows=86108 width=49) (actual time=8177.807..12421.921
rows=77 loops=1) Filter: ((filter_hash)::text =
'd41d8cd98f00b204e9800998ecf8427e'::text) Total runtime: 12421.959 ms
(3 rows)

explain analyze select * from filter_item where filter_hash =
'd41d8cd98f00b204e9800998ecf8427e'

QUERY PLAN
----------------------------------------------------------------------------
---------------------------------------------------------- Bitmap Heap Scan
on filter_item (cost=77.92..6609.02 rows=3534 width=49) (actual
time=0.055..0.100 rows=77 loops=1) Recheck Cond: (filter_hash =
'd41d8cd98f00b204e9800998ecf8427e'::bpchar) -> Bitmap Index Scan on
filter_item__filter_hash (cost=0.00..77.04 rows=3534 width=0) (actual
time=0.049..0.049 rows=77 loops=1) Index Cond: (filter_hash =
'd41d8cd98f00b204e9800998ecf8427e'::bpchar) Total runtime: 0.130 ms
(5 rows)

The problem is that your filter_hash columns seems to be of type char(n). Thats
not directly compatible with text (which is the type returned by the md5
function).
So either change the column type or cast the return type of md5 to char(n).

I do have to admit that this is somewhat strange.

Greetings,

Andres