The planner chooses seqscan+sort when there is an index on the sort column

Started by Csaba Nagyalmost 20 years ago15 messagesgeneral

nagy@ecircle-ag.com

almost 20 years ago

Hi all,

I wonder why this happens:

- postgres: 8.1.3
- the table has ~200 million rows;
- there is a primary key on (col_1, col_2);
- the table was ANALYZEd;
- the planner chooses seqscan+sort for the following query even with
enable_seqscan=off:

select * from table order by col_1;

Isn't it supposed to choose the index scan at least when
enable_seqscan=off ? Even if it is indeed not faster to do the index
scan than seqscan+sort.

The actual plan looks like (names changed):

db=# set enable_seqscan = off;
SET
db=# explain select * from table order by col_1;
QUERY PLAN
-----------------------------------------------------------------------------------------
Sort (cost=165585198.70..166102386.06 rows=206874944 width=60)
Sort Key: col_1
-> Seq Scan on table (cost=100000000.00..104552091.44
rows=206874944 width=60)
(3 rows)

Cheers,
Csaba.

John D. Burger

john@mitre.org

almost 20 years ago

In reply to: Csaba Nagy (#1)

Re: The planner chooses seqscan+sort when there is an index on the sort column

Csaba Nagy wrote:

select * from table order by col_1;

Isn't it supposed to choose the index scan at least when
enable_seqscan=off ? Even if it is indeed not faster to do the index
scan than seqscan+sort.

I think because you've asked for every row, it's going to have to scan
the whole table anyway, to determine MVCC "liveness" of the rows
(sorry, dunno what the correct word is).

- John Burger
MITRE

Andreas Kretschmer

akretschmer@spamfence.net

almost 20 years ago

In reply to: Csaba Nagy (#1)

Re: The planner chooses seqscan+sort when there is an index on the sort column

Csaba Nagy <nagy@ecircle-ag.com> schrieb:

select * from table order by col_1;

Without a WHERE you get the whole table.

A Index-Scan is, in this case, expensive.

HTH, Andreas
--
Really, I'm not out to destroy Microsoft. That will just be a completely
unintentional side effect. (Linus Torvalds)
"If I was god, I would recompile penguin with --enable-fly." (unknow)
Kaufbach, Saxony, Germany, Europe. N 51.05082ï¿½, E 13.56889ï¿½

Csaba Nagy

nagy@ecircle-ag.com

almost 20 years ago

In reply to: John D. Burger (#2)

Re: The planner chooses seqscan+sort when there is an

On Wed, 2006-05-03 at 17:48, John D. Burger wrote:

Csaba Nagy wrote:

select * from table order by col_1;

Isn't it supposed to choose the index scan at least when
enable_seqscan=off ? Even if it is indeed not faster to do the index
scan than seqscan+sort.

I think because you've asked for every row, it's going to have to scan
the whole table anyway, to determine MVCC "liveness" of the rows
(sorry, dunno what the correct word is).

But I also asked for _ordered_ results, which the seq scan is not
covering, but the index does... and I specifically disabled sequential
scan. That means the planner is not even considering the primary key
index, and I would like to know why...

Actually this is a problem for me in a more complex query, which also
needs this table ordered by that column, and it results in the same plan
fragment with sequential scan + sort.

Thanks,
Csaba.

Csaba Nagy

nagy@ecircle-ag.com

almost 20 years ago

In reply to: Andreas Kretschmer (#3)

Re: The planner chooses seqscan+sort when there is an

Without a WHERE you get the whole table.

A Index-Scan is, in this case, expensive.

I know that, and I would agree if I wouldn't have asked for _ordered_
result, and wouldn't have turned enable_seqscan=off.

In these conditions I would have expected an index scan, even if it is
more expensive than the sequential scan + sort. Actually I wanted to see
how expensive it thinks it is, but I can't get to that plan at all.

Thanks,
Csaba.

John D. Burger

john@mitre.org

almost 20 years ago

In reply to: Csaba Nagy (#5)

Re: The planner chooses seqscan+sort when there is an

But I also asked for _ordered_ results, which the seq scan is not
covering, but the index does... and I specifically disabled sequential
scan.

Docs say:

Enables or disables the query planner's use of sequential scan plan
types. It's not possible to suppress sequential scans entirely, but
turning this variable off discourages the planner from using one if
there are other methods available.

Note the second sentence. Again, I think it will have to scan the
whole table anyway, because that's what you've asked for, and given
that, enable_seqscan=off doesn't apply.

- John Burger
MITRE

Csaba Nagy

nagy@ecircle-ag.com

almost 20 years ago

In reply to: John D. Burger (#6)

Re: The planner chooses seqscan+sort when there is an

Docs say:

Enables or disables the query planner's use of sequential scan plan
types. It's not possible to suppress sequential scans entirely, but
turning this variable off discourages the planner from using one if
there are other methods available.

Note the second sentence. Again, I think it will have to scan the
whole table anyway, because that's what you've asked for, and given
that, enable_seqscan=off doesn't apply.

OK, maybe that's the point... the "cost bust" given to the sequential
scan by enable_seqscan=off is not enough in this case to exceed the cost
of the index scan ? The table is quite big, might be possible. I still
wonder why would be seqscan+sort faster than index scan... the sort will
for sure have to write to disk too given the size of the table...

Cheers,
Csaba.

Martijn van Oosterhout

kleptog@svana.org

almost 20 years ago

In reply to: Csaba Nagy (#7)

Re: The planner chooses seqscan+sort when there is an

On Wed, May 03, 2006 at 06:42:00PM +0200, Csaba Nagy wrote:

OK, maybe that's the point... the "cost bust" given to the sequential
scan by enable_seqscan=off is not enough in this case to exceed the cost
of the index scan ? The table is quite big, might be possible. I still
wonder why would be seqscan+sort faster than index scan... the sort will
for sure have to write to disk too given the size of the table...

Have you tuned the values of effective_cache_size and random_page_cost?
These have significant effects on index scans.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

Show quoted text

From each according to his ability. To each according to his ability to litigate.

Florian Pflug

fgp@phlo.org

almost 20 years ago

In reply to: Csaba Nagy (#7)

Re: The planner chooses seqscan+sort when there is an

Csaba Nagy wrote:

Docs say:

Enables or disables the query planner's use of sequential scan plan
types. It's not possible to suppress sequential scans entirely, but
turning this variable off discourages the planner from using one if
there are other methods available.

Note the second sentence. Again, I think it will have to scan the
whole table anyway, because that's what you've asked for, and given
that, enable_seqscan=off doesn't apply.

OK, maybe that's the point... the "cost bust" given to the sequential
scan by enable_seqscan=off is not enough in this case to exceed the cost
of the index scan ? The table is quite big, might be possible. I still
wonder why would be seqscan+sort faster than index scan... the sort will
for sure have to write to disk too given the size of the table...

When using an indexscan, postgres will read the actual rows in
index-order, rathen then in the order they appear on-disk.
For 200 million rows this means doing at least 200 million
disk seeks. Assuming that each seek takes just 1ms, thats
still amount to 200.000 seconds.

greetings, Florian Pflug

#10

Tom Lane

tgl@sss.pgh.pa.us

almost 20 years ago

In reply to: Csaba Nagy (#7)

Re: The planner chooses seqscan+sort when there is an

Csaba Nagy <nagy@ecircle-ag.com> writes:

OK, maybe that's the point... the "cost bust" given to the sequential
scan by enable_seqscan=off is not enough in this case to exceed the cost
of the index scan ?

Looks that way to me. You could try setting enable_sort off as well,
which will penalize the seqscan+sort plan another 100million cost units.
And maybe try reducing random_page_cost to make the indexscan look
cheaper. However, if there's a 100million delta between the two plans,
I suspect you really really don't want the indexscan anyway ;-)

regards, tom lane

#11

Scott Marlowe

smarlowe@g2switchworks.com

almost 20 years ago

In reply to: Tom Lane (#10)

Re: The planner chooses seqscan+sort when there is an

On Wed, 2006-05-03 at 13:34, Tom Lane wrote:

Csaba Nagy <nagy@ecircle-ag.com> writes:

OK, maybe that's the point... the "cost bust" given to the sequential
scan by enable_seqscan=off is not enough in this case to exceed the cost
of the index scan ?

Looks that way to me. You could try setting enable_sort off as well,
which will penalize the seqscan+sort plan another 100million cost units.
And maybe try reducing random_page_cost to make the indexscan look
cheaper. However, if there's a 100million delta between the two plans,
I suspect you really really don't want the indexscan anyway ;-)

I imagine the followup post:

So, I've had this query running for six weeks now, and...

#12

A. Kretschmer

andreas.kretschmer@schollglas.com

almost 20 years ago

In reply to: Florian Pflug (#9)

Re: The planner chooses seqscan+sort when there is an

am 03.05.2006, um 20:20:55 +0200 mailte Florian G. Pflug folgendes:

of the index scan ? The table is quite big, might be possible. I still
wonder why would be seqscan+sort faster than index scan... the sort will
for sure have to write to disk too given the size of the table...

When using an indexscan, postgres will read the actual rows in index-order,
rathen then in the order they appear on-disk.
For 200 million rows this means doing at least 200 million
disk seeks. Assuming that each seek takes just 1ms, thats
still amount to 200.000 seconds.

Yepp, it is much cheaper to read the table seq and order later.

Andreas
--
Andreas Kretschmer (Kontakt: siehe Header)
Heynitz: 035242/47215, D1: 0160/7141639
GnuPG-ID 0x3FFF606C http://wwwkeys.de.pgp.net
=== Schollglas Unternehmensgruppe ===

#13

Csaba Nagy

nagy@ecircle-ag.com

almost 20 years ago

In reply to: Scott Marlowe (#11)

Re: The planner chooses seqscan+sort when there is an

Looks that way to me. You could try setting enable_sort off as well,
which will penalize the seqscan+sort plan another 100million cost units.
And maybe try reducing random_page_cost to make the indexscan look
cheaper. However, if there's a 100million delta between the two plans,
I suspect you really really don't want the indexscan anyway ;-)

I imagine the followup post:

So, I've had this query running for six weeks now, and...

Well, I guess that's it then... I will let the query run with the
seqscan+sort. It will still run 1-2 days, yesterday I stopped it after 6
hours ;-) Actually it would be nice to have some kind of feedback on
what is it doing so I can estimate how long it will still take... cause
I'm not sure the seqscan+sort won't run itself for 6 weeks...

Thanks,
Csaba.

#14

Martijn van Oosterhout

kleptog@svana.org

almost 20 years ago

In reply to: Csaba Nagy (#13)

Re: The planner chooses seqscan+sort when there is an

On Thu, May 04, 2006 at 10:09:33AM +0200, Csaba Nagy wrote:

Well, I guess that's it then... I will let the query run with the
seqscan+sort. It will still run 1-2 days, yesterday I stopped it after 6
hours ;-) Actually it would be nice to have some kind of feedback on
what is it doing so I can estimate how long it will still take... cause
I'm not sure the seqscan+sort won't run itself for 6 weeks...

There are a number of ways you can (indirectly) see how far it has got.
If it's in the first phase (the seq scan), by looking at which file it
has open you should be able to see how far along the table it is. Once
it's in the sort stage you should be able to see from the tape files
approimately how far it is. As for the number of sort stages required,
that depends on the size of the data to be sorted, but you should be
able to estimate that...

At least, you should be able to tell the difference between a 6 hour and
6 day query.

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

Show quoted text

From each according to his ability. To each according to his ability to litigate.

#15

Csaba Nagy

nagy@ecircle-ag.com

almost 20 years ago

In reply to: Martijn van Oosterhout (#14)

Re: The planner chooses seqscan+sort when there is an

There are a number of ways you can (indirectly) see how far it has got.
If it's in the first phase (the seq scan), by looking at which file it
has open you should be able to see how far along the table it is. Once
it's in the sort stage you should be able to see from the tape files
approimately how far it is. As for the number of sort stages required,
that depends on the size of the data to be sorted, but you should be
able to estimate that...

At least, you should be able to tell the difference between a 6 hour and
6 day query.

Thanks for the hints, I'll try to look up on the details...

Cheers,
Csaba.