Why is explain horribly optimistic for sorts?
Hello all. We are logging our web server traffic to postgresql 7.0.3, and
that's working well. What's not working so well is retrieving our data in
reasonable times if I try to order it. When I run our queries through
explain, it *looks* like they will run in reasonable times, but in fact
they take several minutes. That's not so good. I'm wondering why explain
is so horribly wrong when it comes to sorts? For that matter, I'm
wondering why sorts take so incredibly long.
Some background.....
- We only have ~120,000 records.
- The relevant parts of the table are:
Table "jennyann"
Attribute | Type | Modifier
----------------+-------------+----------
ClientHost | text |
LogTime | timestamp |
target | text |
host | text |
Indices: jennyan_host_key,
jennyann_clienthost_key,
jennyann_logtime_key,
jennyann_target_key
- All indices are normal btrees.
- ClientHost is (for the most part) an IP address.
Here's what explain tells me:
explain SELECT * FROM jennyann where target like '/music/%' order by "LogTime" limit 1000;
NOTICE: QUERY PLAN:
Sort (cost=119.88..119.88 rows=2085 width=136)
-> Index Scan using jennyann_target_key on jennyann (cost=0.00..4.94 rows=2085 width=136)
A cost of 119 seems pretty good, and usually takes just a couple seconds
for other queries I've made. Unfortunately, it's completely wrong. This
query takes several minutes to complete. If I drop the "order by" clause
then things get to be reasonable speeds, but I rather need that clause
there.
Help? Please?
On Sat, Mar 03, 2001 at 09:44:27AM -0800, Ben wrote:
Hello all. We are logging our web server traffic to postgresql 7.0.3, and
that's working well. What's not working so well is retrieving our data in
reasonable times if I try to order it. When I run our queries through
explain, it *looks* like they will run in reasonable times, but in fact
they take several minutes. That's not so good. I'm wondering why explain
is so horribly wrong when it comes to sorts? For that matter, I'm
wondering why sorts take so incredibly long.
[snip]
explain SELECT * FROM jennyann where target like '/music/%' order by "LogTime" limit 1000;
NOTICE: QUERY PLAN:
Sort (cost=119.88..119.88 rows=2085 width=136)
-> Index Scan using jennyann_target_key on jennyann (cost=0.00..4.94 rows=2085 width=136)
A cost of 119 seems pretty good, and usually takes just a couple seconds
for other queries I've made. Unfortunately, it's completely wrong. This
query takes several minutes to complete. If I drop the "order by" clause
then things get to be reasonable speeds, but I rather need that clause
there.
<guessing>
would this work?
create temp table QRY as select * from jennyann where ...;
create index ORD on qry("LogTime");
select * from QRY order by "LogTime" limit 1000;
if the select is fast but the order is slow, maybe this'd help?
</guessing>
--
It is always hazardous to ask "Why?" in science, but it is often
interesting to do so just the same.
-- Isaac Asimov, 'The Genetic Code'
will@serensoft.com
http://groups.yahoo.com/group/newbieDoc -- we need your brain!
http://www.dontUthink.com/ -- your brain needs us!
Ben <bench@silentmedia.com> writes:
Here's what explain tells me:
explain SELECT * FROM jennyann where target like '/music/%' order by "LogTime" limit 1000;
NOTICE: QUERY PLAN:
Sort (cost=119.88..119.88 rows=2085 width=136)
-> Index Scan using jennyann_target_key on jennyann (cost=0.00..4.94 rows=2085 width=136)
How many records are actually getting sorted here --- ie, how many would
you get back if you didn't have the LIMIT? It seems to be estimating
2085, but I suspect that must be wildly wrong ...
regards, tom lane
You VACUUM and VACUUM ANALYZE regularly, right?
-Mitch
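For reference, a minimal sketch of the maintenance Mitch is asking about, using the table name from this thread (on 7.0 the planner's row estimates come from these statistics):

```sql
-- Reclaim dead space and refresh the min/max statistics the planner
-- uses for its row-count estimates.
VACUUM ANALYZE jennyann;
```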
----- Original Message -----
From: "Ben" <bench@silentmedia.com>
To: <pgsql-general@postgresql.org>
Sent: Saturday, March 03, 2001 12:44 PM
Subject: Why is explain horribly optimistic for sorts?
[snip]
---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly
nightly.
On Sat, 3 Mar 2001, Mitch Vincent wrote:
You VACUUM and VACUUM ANALYZE regularly, right?
-Mitch
[snip]
Yes, it is horribly wrong.
select count(*) FROM jennyann where target like '/music/%'
gives me 93686 rows.
On Sat, 3 Mar 2001, Tom Lane wrote:
Ben <bench@silentmedia.com> writes:
Here's what explain tells me:
explain SELECT * FROM jennyann where target like '/music/%' order by "LogTime" limit 1000;
NOTICE: QUERY PLAN:
Sort (cost=119.88..119.88 rows=2085 width=136)
-> Index Scan using jennyann_target_key on jennyann (cost=0.00..4.94 rows=2085 width=136)
How many records are actually getting sorted here --- ie, how many would
you get back if you didn't have the LIMIT? It seems to be estimating
2085, but I suspect that must be wildly wrong ...
regards, tom lane
Ben <bench@silentmedia.com> writes:
Yes, it is horribly wrong.
select count(*) FROM jennyann where target like '/music/%'
gives me 93686 rows.
Well, that misestimation is the source of the problem, then, not any
misestimation of the cost of a sort.
7.0 didn't have very good estimation rules for LIKE clauses, at least
not by default. Have you tried the new LIKE estimator (see
contrib/likeplanning/README in the source tree)? I'm not sure it will
do any better, given that your data appears to be mightily nonuniform
;-), but it's worth a try.
regards, tom lane
While this does get me results faster, it still takes a couple minutes,
and I don't see why this should be the case.
On Sat, 3 Mar 2001, will trillich wrote:
On Sat, Mar 03, 2001 at 09:44:27AM -0800, Ben wrote:
[snip]
<guessing>
would this work?
create temp table QRY as select * from jennyann where ...;
create index ORD on qry("LogTime");
select * from QRY order by "LogTime" limit 1000;
if the select is fast but the order is slow, maybe this'd help?
</guessing>
After enabling the new likeplanning code, things only get worse... the
estimated cost for the same query is now:
Sort (cost=4.95..4.95 rows=1 width=136)
-> Index Scan using jennyann_target_key on jennyann (cost=0.00..4.94 rows=1 width=136)
So this doesn't really seem to help much. :)
Out of curiosity, why does it take so long to order data by a datetime
field?
On Sat, 3 Mar 2001, Tom Lane wrote:
[snip]
Ben <bench@silentmedia.com> writes:
After enabling the new likeplanning code, things only get worse... the
estimated cost for the same query is now:
Sort (cost=4.95..4.95 rows=1 width=136)
-> Index Scan using jennyann_target_key on jennyann (cost=0.00..4.94 rows=1 width=136)
Yeah, I was afraid of that: the new code is smarter but also subject to
the same range-estimation problem that plagues us in cases like "WHERE a
BETWEEN b AND c". The only stats we have for range estimation are the
column minimum and maximum values collected by VACUUM ANALYZE, so we
just do a linear interpolation. If your data isn't fairly uniformly
distributed between the min and max then range estimates will be all
wet. In this example, it seems that 80% of your target entries start
with "/music/", but I'll bet the min and max cover a much wider range.
So the estimator mistakenly thinks that LIKE '/music/%' will select
a fairly small proportion of rows, rather than 80%, and this leads the
planner to choose a plan that would be appropriate if that were true.
There has been talk of keeping more extensive stats to allow better
estimation in cases like this, but I don't foresee it happening for
a release or two yet...
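Tom's point about linear interpolation can be checked directly. A hedged sketch, using the table and column from this thread: a LIKE with a fixed prefix behaves roughly like the range condition below, and comparing its row count against the whole table shows the true selectivity (~80%) that interpolating between min(target) and max(target) misses.

```sql
-- LIKE '/music/%' is roughly this range ('0' is the character after '/'):
SELECT count(*) AS matching,
       (SELECT count(*) FROM jennyann) AS total
FROM jennyann
WHERE target >= '/music/' AND target < '/music0';
```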
Out of curiosity, why does it take so long to order data by a datetime
field?
AFAIK it shouldn't be materially different from sorting on a float8
field. You're still blaming the wrong thing: this isn't the sorter's
fault, it's doing the best it can. The reason this is so slow, I
believe, is that an indexscan is being used to select 80% of the data.
That results in a lot of useless thrashing to visit tuples in strict
order by target. A simple sequential scan would've been much faster.
It's possible that an indexscan on logtime would work better, because it
could stop as soon as it had retrieved 1000 matching tuples. This would
be particularly likely if the tuples are more or less in logtime order,
so that there's not a lot of disk thrashing to retrieve them. Otherwise
a plain old sequential scan and sort is likely to be the fastest way.
Unfortunately the planner is never going to figure this out as long as
it's so far off about the selectivity of the LIKE. You could try
forcing its hand with various combinations of
SET enable_sort TO off;
SET enable_indexscan TO off;
... just remember to turn these things back on for subsequent queries,
else you'll get really bad plans ...
regards, tom lane
will trillich <will@serensoft.com> writes:
create temp table QRY as select * from jennyann where ...;
I'm still learning SQL so I looked up the 'temp' keyword in the
7.0 docs. They describe 'TEMP' in the 'CREATE TABLE' command but
not in the 'CREATE TABLE AS' command.
Well, I just tried it and it worked so I guess it's just an oversight
in the docs.
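For the record, the form under discussion, which works on 7.0 despite the documentation gap:

```sql
-- TEMP is accepted by CREATE TABLE AS even though the 7.0 reference page
-- for CREATE TABLE AS doesn't mention it; the table vanishes at session end.
CREATE TEMP TABLE qry AS
  SELECT * FROM jennyann WHERE target LIKE '/music/%';
```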
Regards,
--
matthew rice <matt@starnix.com> starnix inc.
tollfree: 1-87-pro-linux thornhill, ontario, canada
http://www.starnix.com professional linux services & products
Matthew Rice <matt@starnix.com> writes:
I'm still learning SQL so I looked up the 'temp' keyword in the
7.0 docs. They describe 'TEMP' in the 'CREATE TABLE' command but
not in the 'CREATE TABLE AS' command.
Well, I just tried it and it worked so I guess it's just an oversight
in the docs.
Good point. Fixed, thanks.
regards, tom lane
On Sat, 3 Mar 2001, Tom Lane wrote:
Yeah, I was afraid of that: the new code is smarter but also subject to
the same range-estimation problem that plagues us in cases like "WHERE a
BETWEEN b AND c". The only stats we have for range estimation are the
column minimum and maximum values collected by VACUUM ANALYZE, so we
just do a linear interpolation. If your data isn't fairly uniformly
distributed between the min and max then range estimates will be all
wet. In this example, it seems that 80% of your target entries start
with "/music/", but I'll bet the min and max cover a much wider range.
So the estimator mistakenly thinks that LIKE '/music/%' will select
a fairly small proportion of rows, rather than 80%, and this leads the
planner to choose a plan that would be appropriate if that were true.
Ah, I think I understand the problem here.
Out of curiosity, why does it take so long to order data by a datetime
field?
AFAIK it shouldn't be materially different from sorting on a float8
field. You're still blaming the wrong thing: this isn't the sorter's
fault, it's doing the best it can. The reason this is so slow, I
believe, is that an indexscan is being used to select 80% of the data.
That results in a lot of useless thrashing to visit tuples in strict
order by target. A simple sequential scan would've been much faster.
I think you're correct, because this did speed things up quite a bit....
the query time dropped to about a minute.
It's possible that an indexscan on logtime would work better, because it
could stop as soon as it had retrieved 1000 matching tuples. This would
be particularly likely if the tuples are more or less in logtime order,
so that there's not a lot of disk thrashing to retrieve them. Otherwise
a plain old sequential scan and sort is likely to be the fastest way.
And *this* sped things up even more.... not using the target key and using
the LogTime key instead dropped the time to about 30 seconds.
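A hedged guess at how to steer 7.0 toward the "LogTime" index here: with explicit sorts disabled, the only ordered path left to the planner is an index scan on "LogTime", which can stop after 1000 rows.

```sql
-- Forbid explicit sorts so the ORDER BY can only be satisfied via the
-- jennyann_logtime_key index; restore the default afterwards.
SET enable_sort TO off;
SELECT * FROM jennyann
 WHERE target LIKE '/music/%' ORDER BY "LogTime" LIMIT 1000;
SET enable_sort TO on;
```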
Unfortunately the planner is never going to figure this out as long as
it's so far off about the selectivity of the LIKE. You could try
forcing its hand with various combinations of
SET enable_sort TO off;
SET enable_indexscan TO off;
... just remember to turn these things back on for subsequent queries,
else you'll get really bad plans ...
Having no notion of what's involved, it seems like it would be nice to let
the query optimizer select which index to use based on the query and the
indices available. Or, if it already does this, to figure out exactly why
it's not doing the right thing here.
Thanks for the help!