Why is PostgreSQL 9.1 not using index for simple equality select

Started by Yang Zhangalmost 13 years ago9 messagesgeneral
Jump to latest
#1Yang Zhang
yanghatespam@gmail.com

Any hints with this question I had posted to SO?

http://stackoverflow.com/questions/15965785/why-is-postgresql-9-1-not-using-index-for-simple-equality-select

Pasted here as well. Thanks.

My table `lead` has an index:

\d lead
...
Indexes:
"lead_pkey" PRIMARY KEY, btree (id)
"lead_account__c" btree (account__c)
...
"lead_email" btree (email)
"lead_id_prefix" btree (id text_pattern_ops)

Why doesn't PG (9.1) use the index for this straightforward equality
selection? Emails are almost all unique....

db=> explain select * from lead where email = 'blah';
QUERY PLAN
------------------------------------------------------------
Seq Scan on lead (cost=0.00..319599.38 rows=1 width=5108)
Filter: (email = 'blah'::text)
(2 rows)

Other index-hitting queries seem to be OK (though I don't know why
this one doesn't just use the pkey index):

db=> explain select * from lead where id = '';
QUERY PLAN
------------------------------------------------------------------------------
Index Scan using lead_id_prefix on lead (cost=0.00..8.57 rows=1
width=5108)
Index Cond: (id = ''::text)
(2 rows)

db=> explain select * from lead where account__c = '';
QUERY PLAN
----------------------------------------------------------------------------------
Index Scan using lead_account__c on lead (cost=0.00..201.05
rows=49 width=5108)
Index Cond: (account__c = ''::text)
(2 rows)

At first I thought it may be due to not enough distinct values of
`email`. For instance, if the stats claim that `email` is `blah` for
most of the table, then a seq scan is faster. But that's not the
case:

db=> select count(*), count(distinct email) from lead;
count | count
--------+--------
749148 | 733416
(1 row)

Even if I force seq scans to be off, the planner behaves as if it has
no other choice:

db=> set enable_seqscan = off;
SET
db=> show enable_seqscan;
enable_seqscan
----------------
off
(1 row)

db=> explain select * from lead where email = 'foo@blah.com';
QUERY PLAN
---------------------------------------------------------------------------
Seq Scan on lead (cost=10000000000.00..10000319599.38 rows=1 width=5108)
Filter: (email = 'foo@blah.com'::text)
(2 rows)

I searched over a good number of past SO questions but none were about
a simple equality query like this one.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#2John R Pierce
pierce@hogranch.com
In reply to: Yang Zhang (#1)
Re: Why is PostgreSQL 9.1 not using index for simple equality select

On 4/12/2013 1:03 AM, Yang Zhang wrote:

db=> explain select * from lead where email = 'blah';
QUERY PLAN
------------------------------------------------------------
Seq Scan on lead (cost=0.00..319599.38 rows=1 width=5108)
Filter: (email = 'blah'::text)

try EXPLAIN ANALYZE ..... its more useful.

my guess is, there's no statistics on this table, and doing an ANALYZE
lead; would rectify this.

--
john r pierce 37N 122W
somewhere on the middle of the left coast

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3Yang Zhang
yanghatespam@gmail.com
In reply to: John R Pierce (#2)
Re: Why is PostgreSQL 9.1 not using index for simple equality select

Doesn't seem to be the case. This table has been around for a while
and should have been auto-analyzed by now. But anyway:

db=> analyze lead;
ANALYZE
db=> explain select * from lead where email = 'foo@blah.com';
QUERY PLAN
---------------------------------------------------------------------------
Seq Scan on lead (cost=10000000000.00..10000319666.99 rows=1 width=5208)
Filter: (email = 'foo@blah.com'::text)
(2 rows)

On Fri, Apr 12, 2013 at 1:13 AM, John R Pierce <pierce@hogranch.com> wrote:

On 4/12/2013 1:03 AM, Yang Zhang wrote:

db=> explain select * from lead where email = 'blah';
QUERY PLAN
------------------------------------------------------------
Seq Scan on lead (cost=0.00..319599.38 rows=1 width=5108)
Filter: (email = 'blah'::text)

try EXPLAIN ANALYZE ..... its more useful.

my guess is, there's no statistics on this table, and doing an ANALYZE lead;
would rectify this.

--
john r pierce 37N 122W
somewhere on the middle of the left coast

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

--
Yang Zhang
http://yz.mit.edu/

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#4Alban Hertroys
haramrae@gmail.com
In reply to: Yang Zhang (#3)
Re: Why is PostgreSQL 9.1 not using index for simple equality select

On 12 April 2013 10:45, Yang Zhang <yanghatespam@gmail.com> wrote:

explain select * from lead where email = 'foo@blah.com';

What about:
explain analyze select * from lead where email = 'foo@blah.com';

--
If you can't see the forest for the trees,
Cut the trees and you'll see there is no forest.

#5John R Pierce
pierce@hogranch.com
In reply to: Yang Zhang (#3)
Re: Why is PostgreSQL 9.1 not using index for simple equality select

On 4/12/2013 1:45 AM, Yang Zhang wrote:

db=> explain select * from lead where email = 'foo@blah.com';

can you try

explain analyze select * from lead where email = 'foo@blah.com';

?

--
john r pierce 37N 122W
somewhere on the middle of the left coast

#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Yang Zhang (#1)
Re: Why is PostgreSQL 9.1 not using index for simple equality select

Yang Zhang <yanghatespam@gmail.com> writes:

db=> explain select * from lead where email = 'blah';
QUERY PLAN
------------------------------------------------------------
Seq Scan on lead (cost=0.00..319599.38 rows=1 width=5108)
Filter: (email = 'blah'::text)
(2 rows)

That's awfully odd. What data type is the "email" column?

It seems possible also that the index on it is marked invalid. I'd have
expected \d to tell you so, but maybe you're using a version of psql that
doesn't know about that. It'd be interesting to look at
select * from pg_index where indexrelid = 'index name here'::regclass;

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#7Yang Zhang
yanghatespam@gmail.com
In reply to: Tom Lane (#6)
Re: Why is PostgreSQL 9.1 not using index for simple equality select

It's actually just `text`.

I updated my SO question with some more info including explain analyze
(no difference), \d,
and your last incantation.

Thanks!

On Fri, Apr 12, 2013 at 7:11 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Yang Zhang <yanghatespam@gmail.com> writes:

db=> explain select * from lead where email = 'blah';
QUERY PLAN
------------------------------------------------------------
Seq Scan on lead (cost=0.00..319599.38 rows=1 width=5108)
Filter: (email = 'blah'::text)
(2 rows)

That's awfully odd. What data type is the "email" column?

It seems possible also that the index on it is marked invalid. I'd have
expected \d to tell you so, but maybe you're using a version of psql that
doesn't know about that. It'd be interesting to look at
select * from pg_index where indexrelid = 'index name here'::regclass;

regards, tom lane

--
Yang Zhang
http://yz.mit.edu/

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Yang Zhang (#7)
Re: Why is PostgreSQL 9.1 not using index for simple equality select

Yang Zhang <yanghatespam@gmail.com> writes:

I updated my SO question with some more info including explain analyze
(no difference), \d,
and your last incantation.

The question is being asked here, not in SO, and I find it rather
impolite of you to expect me to go chasing off to some other forum
to answer your question.

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#9Yang Zhang
yanghatespam@gmail.com
In reply to: Tom Lane (#8)
Re: Why is PostgreSQL 9.1 not using index for simple equality select

Apologies for that Tom. I will paste the information in line once I'm
back at my computer. I do appreciate your help.

On Fri, Apr 12, 2013 at 10:24 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

Yang Zhang <yanghatespam@gmail.com> writes:

I updated my SO question with some more info including explain analyze
(no difference), \d,
and your last incantation.

The question is being asked here, not in SO, and I find it rather
impolite of you to expect me to go chasing off to some other forum
to answer your question.

regards, tom lane

--
Yang Zhang
http://yz.mit.edu/

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general