Weird indices

Started by Jean-Christophe Boggio · about 25 years ago · 28 messages · general
#1 Jean-Christophe Boggio
cat@thefreecat.org

Hi,

I'm trying to optimize our databases, and I found a query that's not very
efficient:

sitefr=# explain select nomsession from session where nomsession='xxx';
NOTICE: QUERY PLAN:

Seq Scan on session (cost=0.00..16275.95 rows=10113 width=12)

EXPLAIN

Phew! I think there's an index missing but...

sitefr=# \d session
Table "session"
Attribute | Type | Modifier
------------+-----------+-------------------------------------------------
idsession | integer | not null default nextval('seq_idsession'::text)
nomsession | text |
idmembre | text |
referer | text |
ip | text |
datelog | timestamp |
Indices: ix_session_idmembre,
ix_session_nomsession,
session_idsession_key

So I look at the index itself:

sitefr=# \d ix_session_nomsession
Index "ix_session_nomsession"
Attribute | Type
------------+------
nomsession | text
btree

Did I miss something, or can't 'text' attributes (fields) be indexed?
That sounds crazy! (I have vacuum analyzed many times.)

Just in case 'nomsession' is not as dispersed as I think...

sitefr=# select count(nomsession) from session;
count
--------
510069
(1 row)

sitefr=# select count(distinct nomsession) from session;
count
--------
401094
(1 row)

Does anyone have an idea?

Thanks!

--
Jean-Christophe Boggio
cat@thefreecat.org
Independent Consultant and Developer
Delphi, Linux, Perl, PostgreSQL

#2 Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Jean-Christophe Boggio (#1)
Re: Weird indices

Do you have a value that is not null that is very common?
It's estimating that there will be 10113 rows that match
nomsession='xxx' which makes a seq scan a much less bad plan.

On Sat, 17 Feb 2001, Jean-Christophe Boggio wrote:


#3 Jean-Christophe Boggio
cat@thefreecat.org
In reply to: Stephan Szabo (#2)
Re[2]: Weird indices

Stephan,

Ref : Saturday, February 17, 2001 1:50:32 AM

SS> Do you have a value that is not null that is very common?
SS> It's estimating that there will be 10113 rows that match
SS> nomsession='xxx' which makes a seq scan a much less bad plan.

At first, I thought that couldn't be the case, but we happen to have
100000 records where nomsession=''.

Just updated them so that their value is null, and everything runs
fine. Thanks for your help!

--
Jean-Christophe Boggio
cat@thefreecat.org
Independent Consultant and Developer
Delphi, Linux, Perl, PostgreSQL

#4 Joseph Shraibman
jks@selectacast.net
In reply to: Stephan Szabo (#2)
Re: Weird indices

Stephan Szabo wrote:

Do you have a value that is not null that is very common?
It's estimating that there will be 10113 rows that match
nomsession='xxx' which makes a seq scan a much less bad plan.

Err, why? There is an index, isn't there? Shouldn't the index allow
postgres to quickly find the 2% of rows that would match?

On Sat, 17 Feb 2001, Jean-Christophe Boggio wrote:


--
Joseph Shraibman
jks@selectacast.net
Increase signal to noise ratio. http://www.targabot.com

#5 Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Joseph Shraibman (#4)
Re: Weird indices

On Mon, 19 Feb 2001, Joseph Shraibman wrote:

Stephan Szabo wrote:

Do you have a value that is not null that is very common?
It's estimating that there will be 10113 rows that match
nomsession='xxx' which makes a seq scan a much less bad plan.

Err, why? There is an index, isn't there? Shouldn't the index allow
postgres to quickly find the 2% of rows that would match?

Right now it has to go to the heap file to find out whether or not
a row is currently visible to the transaction, which means potentially
a lot of seeks and reads from the heap file. That can be more expensive
than just sequentially reading the heap file, depending on a bunch of
things such as how wide the rows are: if there are 100 rows per block
in the heap file and 500000 rows, a sequential scan needs to do 5000
reads, but if you are looking for 10000 rows in that file, you're
likely (always?) going to end up doing 10000 heap file reads plus the
reads on the index file.
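Stephan's arithmetic can be sketched numerically. The figures below (100 rows per block, 500000 rows, 10000 matches) are the ones from his example; the block counts are idealized back-of-envelope numbers, not actual planner output.

```python
# Back-of-envelope version of the heap-access argument above.
# Figures are taken from the example in this message; the model
# (one random heap read per matching row) is an idealization.

rows_in_table = 500_000
rows_per_block = 100
matching_rows = 10_000

# A sequential scan reads every heap block exactly once.
seq_scan_reads = rows_in_table // rows_per_block  # 5000 block reads

# An index scan must visit the heap for each candidate row to check
# visibility; with matches scattered randomly, that is roughly one
# random block read per matching row, plus the index reads themselves.
index_scan_heap_reads = matching_rows  # ~10000 random block reads

print(seq_scan_reads, index_scan_heap_reads)
```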

#6 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joseph Shraibman (#4)
Re: Weird indices

Joseph Shraibman <jks@selectacast.net> writes:

Stephan Szabo wrote:

Do you have a value that is not null that is very common?
It's estimating that there will be 10113 rows that match
nomsession='xxx' which makes a seq scan a much less bad plan.

Err, why? There is an index, isn't there? Shouldn't the index allow
postgres to quickly find the 2% of rows that would match?

Define "quickly".

sitefr=# explain select nomsession from session where nomsession='xxx';
NOTICE: QUERY PLAN:

Seq Scan on session (cost=0.00..16275.95 rows=10113 width=12)

We have here an estimate that 10113 rows will be matched (out of the
510069 in the table). The table contains something on the order of
16000 pages (guesstimate from the seqscan cost estimate). The planner
is assuming that the 10113 rows are randomly scattered in the table,
and therefore that the executor will have to fetch the majority of the
pages in the table. Under these circumstances a seqscan is cheaper
than an indexscan, because it works with the Unix kernel's preference
for sequential reads (to say nothing of the disk drive's ;-)), instead
of fighting that optimization. Random fetches are more than twice as
expensive as sequential fetches.

Of course, if the 10113-match estimate is wildly off (as it was in this
case), then the wrong plan may be chosen. But it IS NOT CORRECT to
suppose that indexscans always beat seqscans. The planner's job would
be a lot easier if that were true.

regards, tom lane
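As a rough illustration of the trade-off Tom describes, here is a toy cost comparison. The cost constants and the assumption that every matching row lives on a different heap page are invented for illustration; this is not PostgreSQL's actual cost model.

```python
# Toy seq-scan vs index-scan comparison, using the numbers from this
# thread. Constants are illustrative: random reads are assumed to be
# 4x the cost of sequential reads (the message only says "more than
# twice as expensive").

SEQ_PAGE_COST = 1.0     # assumed relative cost of a sequential page read
RANDOM_PAGE_COST = 4.0  # assumed cost of a random page read

def seq_scan_cost(table_pages):
    # A seq scan reads every page once, sequentially.
    return table_pages * SEQ_PAGE_COST

def index_scan_cost(estimated_matches):
    # Worst case: each matching row lives on a different heap page,
    # so each match costs one random page read.
    return estimated_matches * RANDOM_PAGE_COST

table_pages = 16_000        # guesstimated from the 16275.95 seqscan cost
estimated_matches = 10_113  # the planner's (inflated) row estimate

# With the inflated estimate, the seq scan looks cheaper...
print(seq_scan_cost(table_pages) < index_scan_cost(estimated_matches))  # True
# ...but with a realistic row count, the index scan wins easily.
print(index_scan_cost(1) < seq_scan_cost(table_pages))                  # True
```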

#7 Joseph Shraibman
jks@selectacast.net
In reply to: Stephan Szabo (#2)
Re: Weird indices

Tom Lane wrote:

Joseph Shraibman <jks@selectacast.net> writes:

Stephan Szabo wrote:

Do you have a value that is not null that is very common?
It's estimating that there will be 10113 rows that match
nomsession='xxx' which makes a seq scan a much less bad plan.

Err, why? There is an index, isn't there? Shouldn't the index allow
postgres to quickly find the 2% of rows that would match?

Define "quickly".

sitefr=# explain select nomsession from session where nomsession='xxx';
NOTICE: QUERY PLAN:

Seq Scan on session (cost=0.00..16275.95 rows=10113 width=12)

We have here an estimate that 10113 rows will be matched (out of the
510069 in the table). The table contains something on the order of
16000 pages (guesstimate from the seqscan cost estimate). The planner
is assuming that the 10113 rows are randomly scattered in the table,
and therefore that the executor will have to fetch the majority of the
pages in the table. Under these circumstances a seqscan is cheaper
than an indexscan, because it works with the Unix kernel's preference
for sequential reads (to say nothing of the disk drive's ;-)), instead
of fighting that optimization. Random fetches are more than twice as
expensive as sequential fetches.

Of course, if the 10113-match estimate is wildly off (as it was in this
case), then the wrong plan may be chosen. But it IS NOT CORRECT to
suppose that indexscans always beat seqscans. The planner's job would
be a lot easier if that were true.

regards, tom lane

Can't postgres do the index lookup first and find out there are only a
few tuples that might match?

--
Joseph Shraibman
jks@selectacast.net
Increase signal to noise ratio. http://www.targabot.com

#8 Joseph Shraibman
jks@selectacast.net
In reply to: Stephan Szabo (#2)
Re: Weird indices

Joseph Shraibman wrote:

Can't postgres do the index lookup first and find out there are only a
few tuples that might match?

Actually it looks like postgres is doing this:

o=# explain select * from usertable where p = 33;
NOTICE: QUERY PLAN:

Seq Scan on usertable (cost=0.00..30.54 rows=502 width=72)

EXPLAIN
o=# explain select * from usertable where p = 1;
NOTICE: QUERY PLAN:

Index Scan using usertable_p_key on usertable (cost=0.00..25.68 rows=50
width=72)

EXPLAIN
o=# explain select count(*) from usertable where p = 1;
NOTICE: QUERY PLAN:

Aggregate (cost=25.81..25.81 rows=1 width=4)
-> Index Scan using usertable_p_key on usertable (cost=0.00..25.68
rows=50 width=4)

EXPLAIN
o=# explain select count(*) from usertable where p = 33;
NOTICE: QUERY PLAN:

Aggregate (cost=31.79..31.79 rows=1 width=4)
-> Seq Scan on usertable (cost=0.00..30.54 rows=502 width=4)

o=# select count(*) from usertable where p in(1,33) group by p;
count
-------
16
502
(2 rows)

This raises some other questions. Why can't postgres get the count(*)
from the index? Why doesn't it predict the correct number of rows in
the planner? (50 estimated vs 16 actual).

--
Joseph Shraibman
jks@selectacast.net
Increase signal to noise ratio. http://www.targabot.com

#9 Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Joseph Shraibman (#7)
Re: Weird indices

On Mon, 19 Feb 2001, Joseph Shraibman wrote:

Of course, if the 10113-match estimate is wildly off (as it was in this
case), then the wrong plan may be chosen. But it IS NOT CORRECT to
suppose that indexscans always beat seqscans. The planner's job would
be a lot easier if that were true.

Can't postgres do the index lookup first and find out there are only a
few tuples that might match?

Well, theoretically the estimate is supposed to match reality. There are
still some cases where there isn't enough information kept to allow that
to be true (the case where there is a single very common non-NULL value is
one such case).

#10 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joseph Shraibman (#8)
Re: Weird indices

Joseph Shraibman <jks@selectacast.net> writes:

This raises some other questions. Why can't postgres get the count(*)
from the index? Why doesn't it predict the correct number of rows in
the planner? (50 estimated vs 16 actual).

The name of the game here is to make a plan *without* actually going
out and expending large amounts of time to find out the true state of
affairs; by the time you know for sure, you've already done the query.
We have to do a certain amount of guessing, otherwise the planner will
be a net drag on performance. Accordingly, the estimates will never be
perfectly accurate.

regards, tom lane

#11 Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Joseph Shraibman (#8)
Re: Weird indices

On Mon, 19 Feb 2001, Joseph Shraibman wrote:

Joseph Shraibman wrote:

Can't postgres do the index lookup first and find out there are only a
few tuples that might match?

o=# select count(*) from usertable where p in(1,33) group by p;
count
-------
16
502
(2 rows)

This raises some other questions. Why can't postgres get the count(*)
from the index? Why doesn't it predict the correct number of rows in
the planner? (50 estimated vs 16 actual).

First question: Mostly the same reason. Not all of the index entries
are necessarily real active rows that you can see, so you would still
have to hit the heap file to get that data, so I'd guess you're already
hitting the entire heap file.

[Tom answered the second]
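A toy model may make the visibility point concrete: the index can contain entries for row versions that are not visible to your transaction, so counting index entries alone overstates count(*), and each entry has to be checked against the heap. The structures and names below are invented for illustration; they are not PostgreSQL internals.

```python
# Toy illustration of why count(*) can't come straight from the index:
# index entries may point at dead (invisible) row versions.

heap = {
    # tid -> (value, visible_to_our_transaction)
    1: ("xxx", True),
    2: ("xxx", False),  # an expired version not yet reclaimed by VACUUM
    3: ("yyy", True),
}

# btree-style index: value -> heap tids (includes the dead version)
index = {"xxx": [1, 2], "yyy": [3]}

def count_from_index_only(value):
    # Counting index entries alone: fast, but counts dead versions too.
    return len(index.get(value, []))

def count_with_heap_check(value):
    # Correct count: every index entry requires a heap visibility check.
    return sum(1 for tid in index.get(value, []) if heap[tid][1])

print(count_from_index_only("xxx"))   # 2 -- overstated
print(count_with_heap_check("xxx"))   # 1 -- correct, but needed heap reads
```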

#12 Joseph Shraibman
jks@selectacast.net
In reply to: Stephan Szabo (#2)
Re: Weird indices

Tom Lane wrote:

Joseph Shraibman <jks@selectacast.net> writes:

This raises some other questions. Why can't postgres get the count(*)
from the index? Why doesn't it predict the correct number of rows in
the planner? (50 estimated vs 16 actual).

The name of the game here is to make a plan *without* actually going
out and expending large amounts of time to find out the true state of
affairs; by the time you know for sure, you've already done the query.

Well, I'd hope that extracting the count from the index would be very
low cost. That is what indices are for.

We have to do a certain amount of guessing, otherwise the planner will
be a net drag on performance. Accordingly, the estimates will never be
perfectly accurate.

But certain things could be done, like planning for the case of there
being a single not-null value, and updating the indices not to point at
expired rows. Isn't the point of a vacuum to get rid of old rows? Then
why doesn't it update the index as well?

I mean, the explain shows that getting the count(*) from the field that
is indexed has to do a seq scan, presumably to determine whether the
rows are in fact valid. That is ridiculous.

--
Joseph Shraibman
jks@selectacast.net
Increase signal to noise ratio. http://www.targabot.com

#13 Joseph Shraibman
jks@selectacast.net
In reply to: Stephan Szabo (#9)
Re: Weird indices

Stephan Szabo wrote:

On Mon, 19 Feb 2001, Joseph Shraibman wrote:

Of course, if the 10113-match estimate is wildly off (as it was in this
case), then the wrong plan may be chosen. But it IS NOT CORRECT to
suppose that indexscans always beat seqscans. The planner's job would
be a lot easier if that were true.

Can't postgres do the index lookup first and find out there are only a
few tuples that might match?

Well, theoretically the estimate is supposed to match reality. There are
still some cases where there isn't enough information kept to allow that
to be true (the case where there is a single very common non-NULL value is
one such case).

But the index should give the upper bound of the query and show that
this query is not going to return 10113 rows. It appeared to work like
this in my query. I don't really know what his database is like or how
many times it was updated since he last vacuumed, but it seems that
postgres should have been able to tell that the query would have
returned far fewer than 10113 entries.

--
Joseph Shraibman
jks@selectacast.net
Increase signal to noise ratio. http://www.targabot.com

#14 Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Joseph Shraibman (#13)
Re: Weird indices

On Mon, 19 Feb 2001, Joseph Shraibman wrote:

Stephan Szabo wrote:

On Mon, 19 Feb 2001, Joseph Shraibman wrote:

Of course, if the 10113-match estimate is wildly off (as it was in this
case), then the wrong plan may be chosen. But it IS NOT CORRECT to
suppose that indexscans always beat seqscans. The planner's job would
be a lot easier if that were true.

Can't postgres do the index lookup first and find out there are only a
few tuples that might match?

Well, theoretically the estimate is supposed to match reality. There are
still some cases where there isn't enough information kept to allow that
to be true (the case where there is a single very common non-NULL value is
one such case).

But the index should give the upper bound of the query and show that
this query is not going to return 10113 rows. It appeared to work like
this in my query. I don't really know what his database is like or how
many times it was updated since he last vacuumed, but it seems that
postgres should have been able to tell that the query would have
returned far fewer than 10113 entries.

The problem is that the stats that are kept are woefully inadequate for
these cases. Basically, IIRC, it takes the most common value's number
of appearances and uses a fraction of that as the estimate for any
other value. That is not a really meaningful estimate of the number of
rows to return, and there's been talk of how to add more detailed
statistics to make this number more meaningful. And btree indexes
really aren't all that good for getting the exact number of entries;
you'd be better off keeping that number somewhere else, but MVCC would
probably make that difficult, since I'd guess that the different
versions of rows each have index entries and not all of them apply to
your transaction, which is why I think it goes to the heap to test for
visibility.

#15 Joseph Shraibman
jks@selectacast.net
In reply to: Stephan Szabo (#14)
Re: Weird indices

Stephan Szabo wrote:

On Mon, 19 Feb 2001, Joseph Shraibman wrote:

Stephan Szabo wrote:

On Mon, 19 Feb 2001, Joseph Shraibman wrote:

Of course, if the 10113-match estimate is wildly off (as it was in this
case), then the wrong plan may be chosen. But it IS NOT CORRECT to
suppose that indexscans always beat seqscans. The planner's job would
be a lot easier if that were true.

Can't postgres do the index lookup first and find out there are only a
few tuples that might match?

Well, theoretically the estimate is supposed to match reality. There are
still some cases where there isn't enough information kept to allow that
to be true (the case where there is a single very common non-NULL value is
one such case).

But the index should give the upper bound of the query and show that
this query is not going to return 10113 rows. It appeared to work like
this in my query. I don't really know what his database is like or how
many times it was updated since he last vacuumed, but it seems that
postgres should have been able to tell that the query would have
returned far fewer than 10113 entries.

The problem is that the stats that are kept are woefully inadequate for
these cases. Basically, IIRC, it takes the most common value's number
of appearances and uses a fraction of that as the estimate for any
other value. That is not a really meaningful estimate of the number of
rows to return, and there's been talk of how to add more detailed
statistics to make this number more meaningful. And btree indexes
really aren't all that good for getting the exact number of entries;
you'd be better off keeping that number somewhere else, but MVCC would
probably make that difficult, since I'd guess that the different
versions of rows each have index entries and not all of them apply to
your transaction, which is why I think it goes to the heap to test for
visibility.

You didn't address my point. The point was that an explain shows that
it is using the index to get an upper bound, so why isn't it using
that?

--
Joseph Shraibman
jks@selectacast.net
Increase signal to noise ratio. http://www.targabot.com

#16 Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Joseph Shraibman (#15)
Re: Weird indices

On Tue, 20 Feb 2001, Joseph Shraibman wrote:

But the index should give the upper bound of the query and show that
this query is not going to return 10113 rows. It appeared to work like
this in my query. I don't really know what his database is like or how
many times it was updated since he last vacuumed, but it seems that
postgres should have been able to tell that the query would have
returned far fewer than 10113 entries.

The problem is that the stats that are kept are woefully inadequate for
these cases. Basically, IIRC, it takes the most common value's number
of appearances and uses a fraction of that as the estimate for any
other value. That is not a really meaningful estimate of the number of
rows to return, and there's been talk of how to add more detailed
statistics to make this number more meaningful. And btree indexes
really aren't all that good for getting the exact number of entries;
you'd be better off keeping that number somewhere else, but MVCC would
probably make that difficult, since I'd guess that the different
versions of rows each have index entries and not all of them apply to
your transaction, which is why I think it goes to the heap to test for
visibility.

You didn't address my point. The point was that an explain shows that
it is using the index to get an upper bound, so why isn't it using
that?

Where are you seeing something that says the estimator/planner is using
the index to get an upper bound? The estimator shouldn't be asking
either the index or the heap for anything; it should be working
entirely with the statistics that were generated from vacuum.

#17 Joseph Shraibman
jks@selectacast.net
In reply to: Stephan Szabo (#16)
Re: Weird indices

Stephan Szabo wrote:

Where are you seeing something that says the estimator/planner is using
the index to get an upper bound? The estimator shouldn't be asking
either the index or the heap for anything; it should be working
entirely with the statistics that were generated from vacuum.

Index Scan using usertable_p_key on usertable (cost=0.00..25.68 rows=50
width=72)

That rows=50, which is an overestimate by the way.

--
Joseph Shraibman
jks@selectacast.net
Increase signal to noise ratio. http://www.targabot.com

#18 Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Joseph Shraibman (#17)
Re: Weird indices

On Tue, 20 Feb 2001, Joseph Shraibman wrote:

Stephan Szabo wrote:

Where are you seeing something that says the estimator/planner is using
the index to get an upper bound? The estimator shouldn't be asking
either the index or the heap for anything; it should be working
entirely with the statistics that were generated from vacuum.

Index Scan using usertable_p_key on usertable (cost=0.00..25.68 rows=50
width=72)

That rows=50, which is an overestimate by the way.

That's because the estimate in this case was 50, and so it's estimating
that going through the index and checking the heap is faster than a
sequential scan. The *estimator* didn't use the index to figure that
out; it's just saying that the best plan to actually *run* the query
uses the index.

IIRC, there's something which is effectively:

    estimated rows = <most common value's frequency> * <fraction>

I think the fraction defaults to (is always?) 1/10 for the standard
index type. That's where the 50 comes from. And the frequency is
probably from the last vacuum analyze.
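The estimation rule Stephan describes can be sketched as follows. The 1/10 fraction and the counts (p = 33 appearing 502 times) mirror the numbers in this thread, but the function is a simplification for illustration, not the actual estimator.

```python
# Sketch of the most-common-value estimation rule described above:
# the planner knows (from the last VACUUM ANALYZE) the most common
# value and its count, and estimates any other value at a fixed
# fraction of that count.

MCV, MCV_COUNT = 33, 502  # p = 33 appears 502 times in the example table
FRACTION = 1 / 10         # assumed default selectivity fraction

def estimated_rows(value):
    if value == MCV:
        return MCV_COUNT                 # special-cased: exact stat
    return round(MCV_COUNT * FRACTION)   # any other value: a fixed guess

print(estimated_rows(33))  # 502 -- matches the actual count
print(estimated_rows(1))   # 50  -- vs the 16 rows actually present
```

This also explains the behavior Joseph observes next: the estimate is exact only for the most common value, and the same flat guess is used for everything else.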

#19 Joseph Shraibman
jks@selectacast.net
In reply to: Stephan Szabo (#18)
Re: Weird indices

Stephan Szabo wrote:

On Tue, 20 Feb 2001, Joseph Shraibman wrote:

Stephan Szabo wrote:

Where are you seeing something that says the estimator/planner using the
index to get an upper bound? The estimator shouldn't be asking either the
index or the heap for anything, it should be working entirely with the
statistics that were generated from vacuum.

Index Scan using usertable_p_key on usertable (cost=0.00..25.68 rows=50
width=72)

That rows=50, which is an overestimate by the way.

That's because the estimate in this case was 50, and so it's estimating
that going through the index and checking the heap is faster than a
sequential scan. The *estimator* didn't use the index to figure that
out; it's just saying that the best plan to actually *run* the query
uses the index.

IIRC, there's something which is effectively:

    estimated rows = <most common value's frequency> * <fraction>

I think the fraction defaults to (is always?) 1/10 for the standard
index type. That's where the 50 comes from. And the frequency is
probably from the last vacuum analyze.

Then it should do the same thing no matter what value I use, but when I
do different searches, in one case it estimates 50 when there are 16,
and in the other it estimates 502 when there are 502.

--
Joseph Shraibman
jks@selectacast.net
Increase signal to noise ratio. http://www.targabot.com

#20 Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Joseph Shraibman (#19)
Re: Weird indices

On Tue, 20 Feb 2001, Joseph Shraibman wrote:

That's because the estimate in this case was 50, and so it's estimating
that going through the index and checking the heap is faster than a
sequential scan. The *estimator* didn't use the index to figure that
out; it's just saying that the best plan to actually *run* the query
uses the index.

IIRC, there's something which is effectively:

    estimated rows = <most common value's frequency> * <fraction>

I think the fraction defaults to (is always?) 1/10 for the standard
index type. That's where the 50 comes from. And the frequency is
probably from the last vacuum analyze.

Then it should do the same thing no matter what value I use, but when I
do different searches, in one case it estimates 50 when there are 16,
and in the other it estimates 502 when there are 502.

It knows enough to do the special case where you are searching for the
most common value; I'd guess that's what's happening with the 502.
I think it stores the most common value and the fraction of rows it
represents as of the last vacuum analyze.

#21 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Joseph Shraibman (#19)
#22 Martijn van Oosterhout
kleptog@svana.org
In reply to: Stephan Szabo (#18)
#23 Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Martijn van Oosterhout (#22)
#24 Martijn van Oosterhout
kleptog@svana.org
In reply to: Stephan Szabo (#23)
#25 Stephan Szabo
sszabo@megazone23.bigpanda.com
In reply to: Martijn van Oosterhout (#24)
#26 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Martijn van Oosterhout (#22)
#27 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Stephan Szabo (#25)
#28 Richard Huxton
dev@archonet.com
In reply to: Joseph Shraibman (#17)