Yet again on indices...
Ok,
I'm working on query analysis for a program in ecpg for business puposes. Look
at what I found on with PG 7.2: Please be cool with my french2english processor,
I got few bogomips in my brain dedicated to english (should have listen more in
class..):
----
line 962 (in the ecpg source..)
EXPLAIN SELECT t12_bskid, t12_pnb, t12_lne, t12_tck
FROM T12_20011231
WHERE t12_bskid >= 1
ORDER BY t12_bskid, t12_pnb, t12_tck, t12_lne;
NOTICE: QUERY PLAN:
Sort (cost=3006.13..3006.13 rows=25693 width=46)
-> Seq Scan on t12_20011231 (cost=0.00..1124.20 rows=25693 width=46)
=> not good, table t12_20011231 as 26K tuples :-(
=> create index t12_idx_bskid_20011231 on t12_20011231 (t12_bskid);
Sort (cost=3006.13..3006.13 rows=25693 width=46)
-> Seq Scan on t12_20011231 (cost=0.00..1124.20 rows=25693 width=46)
=> probably statistic refresh to be done:
$ /usr/local/pgsql/bin/vacuumdb --analyze dbks
Sort (cost=3006.13..3006.13 rows=25693 width=46)
-> Seq Scan on t12_20011231 (cost=0.00..1124.20 rows=25693 width=46)
=> Uh? Seq scan cheaper than index???
=> let's disable seqscan to read cost of index:
postgresql.conf : enable_seqscan = false
Sort (cost=3126.79..3126.79 rows=25693 width=46)
-> Index Scan using t12_idx_bskid_20011231 on t12_20011231
(cost=0.00..1244.86 rows=25693 width=46)
=> Uh? seq scan'cost is lower than index scan?? => mailto hackers
----
What's your opinion?
I have to tell that this select opperates in a forloop statment .
I hardly believe reading 26K tuples is cheaper thant index reading, but maybe
you'll ask me about buffers that should store de 26K tuples?...
But just after this query, there is another one that maybe will put data in
buffers, kicking t12_20011231 data blocks...
Well I feel a little stuck there. I'll continue with enable_scans=false, but
I feel bad beeing forced to do so... and still asking myself if this is good
idea.
Thanks for support, best regards.
--
Jean-Paul ARGUDO
On Wed, 27 Feb 2002, Jean-Paul ARGUDO wrote:
EXPLAIN SELECT t12_bskid, t12_pnb, t12_lne, t12_tck
FROM T12_20011231
WHERE t12_bskid >= 1
ORDER BY t12_bskid, t12_pnb, t12_tck, t12_lne;NOTICE: QUERY PLAN:
Sort (cost=3006.13..3006.13 rows=25693 width=46)
-> Seq Scan on t12_20011231 (cost=0.00..1124.20 rows=25693 width=46)=> not good, table t12_20011231 as 26K tuples :-(
=> create index t12_idx_bskid_20011231 on t12_20011231 (t12_bskid);
Sort (cost=3006.13..3006.13 rows=25693 width=46)
-> Seq Scan on t12_20011231 (cost=0.00..1124.20 rows=25693 width=46)=> probably statistic refresh to be done:
$ /usr/local/pgsql/bin/vacuumdb --analyze dbksSort (cost=3006.13..3006.13 rows=25693 width=46)
-> Seq Scan on t12_20011231 (cost=0.00..1124.20 rows=25693 width=46)=> Uh? Seq scan cheaper than index???
=> let's disable seqscan to read cost of index:
postgresql.conf : enable_seqscan = falseSort (cost=3126.79..3126.79 rows=25693 width=46)
-> Index Scan using t12_idx_bskid_20011231 on t12_20011231
(cost=0.00..1244.86 rows=25693 width=46)=> Uh? seq scan'cost is lower than index scan?? => mailto hackers
----
What's your opinion?
Well you didn't send the schema, or explain analyze results to show
which is actually faster, but...
Sequence scan *can be* faster than index scan when a large portion of the
table is going to be read. If the data is randomly distributed,
eventually you end up reading most/all of the table blocks anyway to get
the validity information for the rows and you're doing it in random order,
plus you're reading parts of the index as well. How many rows are in
the table, and how many match t12_bskid >=1?
Jean-Paul ARGUDO <jean-paul.argudo@idealx.com> writes:
EXPLAIN SELECT t12_bskid, t12_pnb, t12_lne, t12_tck
FROM T12_20011231
WHERE t12_bskid >= 1
ORDER BY t12_bskid, t12_pnb, t12_tck, t12_lne;
Sort (cost=3006.13..3006.13 rows=25693 width=46)
-> Seq Scan on t12_20011231 (cost=0.00..1124.20 rows=25693 width=46)
=> Uh? Seq scan cheaper than index???
For that kind of query, very probably. How much of the table is
actually selected by "WHERE t12_bskid >= 1"?
regards, tom lane
<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN">
<html>
<head>
<title></title>
</head>
<body>
Tom Lane wrote:<br>
<blockquote type="cite" cite="mid22262.1014823120@sss.pgh.pa.us">
<pre wrap="">Jean-Paul ARGUDO <a class="moz-txt-link-rfc2396E" href="mailto:jean-paul.argudo@idealx.com"><jean-paul.argudo@idealx.com></a> writes:<br></pre>
<blockquote type="cite">
<pre wrap="">EXPLAIN SELECT t12_bskid, t12_pnb, t12_lne, t12_tck<br>FROM T12_20011231<br>WHERE t12_bskid >= 1 <br>ORDER BY t12_bskid, t12_pnb, t12_tck, t12_lne;<br></pre>
</blockquote>
<pre wrap=""><!----><br></pre>
<blockquote type="cite">
<pre wrap="">Sort (cost=3006.13..3006.13 rows=25693 width=46)<br> -> Seq Scan on t12_20011231 (cost=0.00..1124.20 rows=25693 width=46)<br></pre>
</blockquote>
<pre wrap=""><!----></pre>
</blockquote>
Try the following:<br>
<br>
EXPLAIN ANALYZE SELECT t12_bskid, t12_pnb, t12_lne, t12_tck FROM T12_20011231
WHERE t12_bskid >= 1 ORDER BY t12_bskid, t12_pnb, t12_tck, t12_lne;<br>
<pre wrap=""><!----></pre>
and see what the actual results are. Then turn the seq_scans off and do
the same thing.<br>
<blockquote type="cite" cite="mid22262.1014823120@sss.pgh.pa.us">
<pre wrap=""><br></pre>
<blockquote type="cite">
<pre wrap="">=> Uh? Seq scan cheaper than index??? <br></pre>
</blockquote>
<pre wrap=""><!----><br>For that kind of query, very probably. How much of the table is<br>actually selected by "WHERE t12_bskid >= 1"?<br><br> regards, tom lane<br><br>---------------------------(end of broadcast)---------------------------<br>TIP 2: you can get off all lists at once with the unregister command<br> (send "unregister YourEmailAddressHere" to <a class="moz-txt-link-abbreviated" href="mailto:majordomo@postgresql.org">majordomo@postgresql.org</a>)<br></pre>
</blockquote>
<br>
<br>
</body>
</html>