again on index usage

Started by Daniel Kalchev over 24 years ago; 48 messages; pgsql-hackers
#1 Daniel Kalchev
daniel@digsys.bg

Hello,

I have an table with ca 1.7 million records, with the following structure:

      Table "iplog_gate200112"
 Attribute |   Type    | Modifier
-----------+-----------+----------
 ipaddr    | inet      |
 input     | bigint    |
 output    | bigint    |
 router    | text      |
 ipdate    | timestamp |
Indices: iplog_gate200112_ipaddr_idx,
         iplog_gate200112_ipdate_idx

the following query

explain
SELECT sum(input) FROM iplog_gate200112
WHERE
'2001-12-01 00:00:00+02' <= ipdate AND ipdate < '2001-12-02 00:00:00+02' AND
'2001-12-01 00:00:00+02' <= ipdate AND ipdate < '2002-01-01 00:00:00+02' AND
network_subeq(ipaddr, '193.68.240.0/20') AND 'uni-gw' ~ router;

results in

NOTICE: QUERY PLAN:

Aggregate (cost=51845.51..51845.51 rows=1 width=8)
-> Seq Scan on iplog_gate200112 (cost=0.00..51845.04 rows=190 width=8)

Why would it not want to use index scan?

Statistics for the table are as follows (from pg_statistic s, pg_attribute a,
pg_class c
where starelid = c.oid and attrelid = c.oid and staattnum = attnum
and relname = 'iplog_gate200112')

 attname | attdispersion | starelid  | staattnum | staop | stanullfrac | stacommonfrac | stacommonval           | staloval               | stahival
---------+---------------+-----------+-----------+-------+-------------+---------------+------------------------+------------------------+------------------------
 ipaddr  | 8.85397e-05   | 190565949 | 1         | 1203  | 0           | 0.000441917   | 192.92.129.1           | 192.92.129.0           | 212.72.197.154
 input   | 0.0039343     | 190565949 | 2         | 412   | 0           | 0.0183278     | 0                      | 0                      | 5929816798
 output  | 0.724808      | 190565949 | 3         | 412   | 0           | 0.835018      | 0                      | 0                      | 2639435033
 router  | 0.222113      | 190565949 | 4         | 664   | 0           | 0.416541      | sofia5                 | bourgas1               | varna3
 ipdate  | 0.014311      | 190565949 | 5         | 1322  | 0           | 0.0580676     | 2001-12-04 00:00:00+02 | 2001-12-01 00:00:00+02 | 2001-12-31 00:00:00+02
(5 rows)

The query

explain
SELECT sum(input) FROM iplog_gate200112
WHERE
ipdate < '2001-12-01 00:00:00+02' AND
network_subeq(ipaddr, '193.68.240.0/20') AND 'uni-gw' ~ router;

produces

Aggregate (cost=4.91..4.91 rows=1 width=8)
-> Index Scan using iplog_gate200112_ipdate_idx on iplog_gate200112
(cost=0.00..4.91 rows=1 width=8)

Note there are no records with ipdate < '2001-12-01 00:00:00+02' in the table.

Could anyone shed some light? This is on 7.1.3.

Daniel

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Daniel Kalchev (#1)
Re: again on index usage

Daniel Kalchev <daniel@digsys.bg> writes:

explain
SELECT sum(input) FROM iplog_gate200112
WHERE
'2001-12-01 00:00:00+02' <= ipdate AND ipdate < '2001-12-02 00:00:00+02' AND
'2001-12-01 00:00:00+02' <= ipdate AND ipdate < '2002-01-01 00:00:00+02' AND
network_subeq(ipaddr, '193.68.240.0/20') AND 'uni-gw' ~ router;

results in

NOTICE: QUERY PLAN:

Aggregate (cost=51845.51..51845.51 rows=1 width=8)
-> Seq Scan on iplog_gate200112 (cost=0.00..51845.04 rows=190 width=8)

Why would it not want to use index scan?

It's difficult to tell from this what it thinks the selectivity of the
ipdate index would be, since the rows estimate includes the effect of
the ipaddr and router restrictions. What do you get from just

explain
SELECT sum(input) FROM iplog_gate200112
WHERE
'2001-12-01 00:00:00+02' <= ipdate AND ipdate < '2001-12-02 00:00:00+02' AND
'2001-12-01 00:00:00+02' <= ipdate AND ipdate < '2002-01-01 00:00:00+02';

If you say "set enable_seqscan to off", does that change the plan?

BTW, the planner does not associate function calls with indexes. If you
want to have the ipaddr index considered for this query, you need to write
ipaddr <<= '193.68.240.0/20' not network_subeq(ipaddr, '193.68.240.0/20').
(But IIRC, that only works in 7.2 anyway, not earlier releases :-()

regards, tom lane

#3 Sean Chittenden
sean@chittenden.org
In reply to: Tom Lane (#2)
Re: again on index usage

It's difficult to tell from this what it thinks the selectivity of the
ipdate index would be, since the rows estimate includes the effect of
the ipaddr and router restrictions. What do you get from just

explain
SELECT sum(input) FROM iplog_gate200112
WHERE
'2001-12-01 00:00:00+02' <= ipdate AND ipdate < '2001-12-02 00:00:00+02' AND
'2001-12-01 00:00:00+02' <= ipdate AND ipdate < '2002-01-01 00:00:00+02';

I don't know if this'd help, but if you're summing the data and running
this query often, see if a functional index would help:

CREATE INDEX sum_input_fnc_idx ON iplog_gate200112 (sum(input));

-sc

--
Sean Chittenden

#4 Daniel Kalchev
daniel@digsys.bg
In reply to: Tom Lane (#2)
Re: again on index usage

Tom Lane said:

It's difficult to tell from this what it thinks the selectivity of the
ipdate index would be, since the rows estimate includes the effect of
the ipaddr and router restrictions. What do you get from just

explain
SELECT sum(input) FROM iplog_gate200112
WHERE
'2001-12-01 00:00:00+02' <= ipdate AND ipdate < '2001-12-02 00:00:00+02' AND
'2001-12-01 00:00:00+02' <= ipdate AND ipdate < '2002-01-01 00:00:00+02';

Same result (sorry, should have included this originally):

Aggregate (cost=47721.72..47721.72 rows=1 width=8)
-> Seq Scan on iplog_gate200112 (cost=0.00..47579.54 rows=56873 width=8)

If you say "set enable_seqscan to off", does that change the plan?

Yes. As expected (I no longer have the problem of NaN estimates :)

Aggregate (cost=100359.71..100359.71 rows=1 width=8)
-> Index Scan using iplog_gate200112_ipdate_idx on iplog_gate200112
(cost=0.00..100217.52 rows=56873 width=8)

My belief is that the planner does not want to use the index due to the low
value dispersion of the indexed attribute. When the table is split into
several smaller tables, the index is used.

This bites me, because each such query takes at least 3 minutes and the script
that generates these needs to execute a few thousand queries.

BTW, the planner does not associate function calls with indexes. If you
want to have the ipaddr index considered for this query, you need to write
ipaddr <<= '193.68.240.0/20' not network_subeq(ipaddr, '193.68.240.0/20').
(But IIRC, that only works in 7.2 anyway, not earlier releases :-()

This is what I thought too, but using the ipdate index will be sufficient.

I understand my complaint is not a bug, but rather a question of planner
optimization (it worked 'as expected' in 7.0). Perhaps the planner should
consider the total number of rows as well as the dispersion factor. With the
dispersion being around 1.5% and 1.7 million total rows, this gives about 25k
rows with the same value - large enough to trigger a sequential scan, as far
as I understand it, but the cost of scanning 1.7 million rows sequentially is
just too high.

By the way, the same query takes approx 10 sec with set enable_seqscan to off.

Daniel
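
The back-of-the-envelope arithmetic above can be checked with a short sketch
(illustrative Python using only the numbers quoted in the thread; this is not
the planner's actual model):

```python
# Rough sketch of the estimate above: rows per common indexed value,
# from the ~1.5% dispersion and ~1.7M total rows quoted in the thread.
total_rows = 1_700_000
dispersion = 0.015  # ~1.5% value dispersion on ipdate

rows_per_common_value = total_rows * dispersion
print(f"~{rows_per_common_value:,.0f} rows per common value")  # ~25,500
```

Around 25k rows per common value is, as Daniel notes, large enough for the
7.1 planner to assume an index scan would be expensive.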

#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Daniel Kalchev (#4)
Re: again on index usage

Daniel Kalchev <daniel@digsys.bg> writes:

Same result (sorry, should have included this originally):

Aggregate (cost=47721.72..47721.72 rows=1 width=8)
-> Seq Scan on iplog_gate200112 (cost=0.00..47579.54 rows=56873 width=8)

If you say "set enable_seqscan to off", does that change the plan?

Aggregate (cost=100359.71..100359.71 rows=1 width=8)
-> Index Scan using iplog_gate200112_ipdate_idx on iplog_gate200112
(cost=0.00..100217.52 rows=56873 width=8)

So, what we've got here is a difference of opinion: the planner thinks
that the seqscan will be faster. How many rows are actually selected
by this WHERE clause? How long does each plan actually take?

regards, tom lane

#6 Daniel Kalchev
daniel@digsys.bg
In reply to: Tom Lane (#5)
Re: again on index usage

Tom Lane said:

Daniel Kalchev <daniel@digsys.bg> writes:

Same result (sorry, should have included this originally):

Aggregate (cost=47721.72..47721.72 rows=1 width=8)
-> Seq Scan on iplog_gate200112 (cost=0.00..47579.54 rows=56873 width=8)

If you say "set enable_seqscan to off", does that change the plan?

Aggregate (cost=100359.71..100359.71 rows=1 width=8)
-> Index Scan using iplog_gate200112_ipdate_idx on iplog_gate200112
(cost=0.00..100217.52 rows=56873 width=8)

So, what we've got here is a difference of opinion: the planner thinks
that the seqscan will be faster. How many rows are actually selected
by this WHERE clause? How long does each plan actually take?

regards, tom lane

3-5 minutes with sequential scan; 10-15 sec with index scan. The query returns
4062 rows. Out of ca 1700000 rows.

With only the datetime constraints (which relates to the index), the number of
rows is 51764.

In any case, sequential scan of millions of rows cannot be faster than index
scan. The average number of records for each index key is around 25000 -
perhaps the planner thinks because the number of tuples in this case is
higher, it should prefer sequential scan. I guess the planner will do better
if there is some scaling of these values with respect to the total number of
rows.

Daniel

#7 Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: Daniel Kalchev (#6)
Re: again on index usage

Daniel Kalchev wrote:

Tom Lane said:

Daniel Kalchev <daniel@digsys.bg> writes:

Same result (sorry, should have included this originally):

If you say "set enable_seqscan to off", does that change the plan?

Aggregate (cost=100359.71..100359.71 rows=1 width=8)
-> Index Scan using iplog_gate200112_ipdate_idx on iplog_gate200112
(cost=0.00..100217.52 rows=56873 width=8)

So, what we've got here is a difference of opinion: the planner thinks
that the seqscan will be faster. How many rows are actually selected
by this WHERE clause? How long does each plan actually take?

regards, tom lane

3-5 minutes with sequential scan; 10-15 sec with index scan. The query returns
4062 rows. Out of ca 1700000 rows.

With only the datetime constraints (which relates to the index), the number of
rows is 51764.

The planner estimates 56873 rows, which is not that bad.
A plausible reason is that your table is nearly clustered
on ipdate.

regards,
Hiroshi Inoue

#8 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Daniel Kalchev (#6)
Re: again on index usage

Daniel Kalchev <daniel@digsys.bg> writes:

Aggregate (cost=47721.72..47721.72 rows=1 width=8)
-> Seq Scan on iplog_gate200112 (cost=0.00..47579.54 rows=56873 width=8)

If you say "set enable_seqscan to off", does that change the plan?

Aggregate (cost=100359.71..100359.71 rows=1 width=8)
-> Index Scan using iplog_gate200112_ipdate_idx on iplog_gate200112
(cost=0.00..100217.52 rows=56873 width=8)

So, what we've got here is a difference of opinion: the planner thinks
that the seqscan will be faster. How many rows are actually selected
by this WHERE clause? How long does each plan actually take?

3-5 minutes with sequential scan; 10-15 sec with index scan. The query returns
4062 rows. Out of ca 1700000 rows.

With only the datetime constraints (which relates to the index), the number of
rows is 51764.

Hm. Okay, so the number-of-rows estimate is not too far off. I concur
with Hiroshi's comment: the reason the indexscan is so fast must be that
the table is clustered (physical order largely agrees with index order).
This would obviously hold if the records were entered in order by
ipdate; is that true?

The 7.2 planner does try to estimate index order correlation, and would
be likely to price this indexscan much lower, so that it would make the
right choice. I'd suggest updating to 7.2 as soon as we have RC1 out.
(Don't do it yet, though, since we've got some timestamp bugs to fix
that are probably going to require another initdb.)

In any case, sequential scan of millions of rows cannot be faster than index
scan.

Snort. If that were true, we could get rid of most of the complexity
in the planner.

regards, tom lane

#9 Daniel Kalchev
daniel@digsys.bg
In reply to: Tom Lane (#8)
Re: again on index usage

Tom Lane said:

Hm. Okay, so the number-of-rows estimate is not too far off. I concur
with Hiroshi's comment: the reason the indexscan is so fast must be that
the table is clustered (physical order largely agrees with index order).
This would obviously hold if the records were entered in order by
ipdate; is that true?

Yes. But... do you want me to cluster it by ipaddr for example and try it
again? I understand the clustering might help with sequential scans, but why
would it help with index scans?

Daniel

#10 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Daniel Kalchev (#9)
Re: again on index usage

Daniel Kalchev <daniel@digsys.bg> writes:

I understand the clustering might help with sequential scans, but why
would it help with index scans?

No, the other way around: it makes no difference for seq scans, but can
speed up index scans quite a lot. With a clustered table, successive
index-driven fetches tend to hit the same pages rather than hitting
random pages throughout the table. That saves I/O.

Given the numbers you were quoting, if the table were in perfectly
random order by ipdate then there would probably have been about three
rows per page that the indexscan would've had to fetch. This would mean
touching each page three times in some random order. Unless the table
is small enough to fit in Postgres' shared buffer cache, that's going to
represent a lot of extra I/O --- a lot more than reading each page only
once, as a seqscan would do. At the other extreme, if the table is
perfectly ordered by ipdate then the indexscan need only hit a small
number of pages (all the rows we want are in a narrow range) and we
touch each page many times before moving on to the next. Very few I/O
requests in that case.

7.1 does not have any statistics about table order, so it uses the
conservative assumption that the ordering is random. 7.2 has more
statistical data and perhaps will make better estimates about the
cost of indexscans.

regards, tom lane
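
The page-touch arithmetic Tom describes can be sketched with the table sizes
reported elsewhere in this thread (17722 pages, ~1.7M rows, 51764 matching
rows). A rough illustration only, not the planner's cost functions:

```python
# Expected page reads for the ipdate range scan, random vs clustered
# physical order, using sizes quoted in this thread.
table_pages  = 17_722
table_rows   = 1_706_202
rows_fetched = 51_764            # rows matching the ipdate range

rows_per_page = table_rows / table_pages          # ~96

# Random order: matching rows are spread ~3 per page over the whole
# table, so with no caching the scan pays one random read per row.
random_case_reads = rows_fetched                  # worst case

# Clustered order: matching rows occupy a contiguous run of pages.
clustered_case_reads = rows_fetched / rows_per_page

print(f"{rows_fetched / table_pages:.1f} matching rows per page if random")
print(f"~{random_case_reads:,} page reads in random order (uncached)")
print(f"~{clustered_case_reads:,.0f} page reads if clustered")
```

That is Tom's "about three rows per page" figure, and roughly a hundred-fold
gap in page reads between the random and clustered layouts.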

#11 Daniel Kalchev
daniel@digsys.bg
In reply to: Tom Lane (#10)
Re: again on index usage

Tom Lane said:

Daniel Kalchev <daniel@digsys.bg> writes:

I understand the clustering might help with sequential scans, but why
would it help with index scans?

No, the other way around: it makes no difference for seq scans, but can
speed up index scans quite a lot. With a clustered table, successive
index-driven fetches tend to hit the same pages rather than hitting
random pages throughout the table. That saves I/O.

Ok, time to go home :-), but...

Given the numbers you were quoting, if the table were in perfectly
random order by ipdate then there would probably have been about three
rows per page that the indexscan would've had to fetch. This would mean
touching each page three times in some random order. Unless the table
is small enough to fit in Postgres' shared buffer cache, that's going to
represent a lot of extra I/O --- a lot more than reading each page only
once, as a seqscan would do. At the other extreme, if the table is
perfectly ordered by ipdate then the indexscan need only hit a small
number of pages (all the rows we want are in a narrow range) and we
touch each page many times before moving on to the next. Very few I/O
requests in that case.

In any case, if we need to hit 50k pages (assuming the indexed data is
randomly scattered in the file), and having to read these three times each, it
will be less I/O than having to read 1.7 million records. The table will never
be laid sequentially on the disk, at least not in this case (which adds to the
table day after day - and this is why data is almost ordered by ipdate).

What I am arguing about is the scaling - is 50k random reads worse than 1.7
million sequential reads? Eventually considering the tuple size, disk block
size etc.

I will wait patiently for 7.2 to release and see how this same table performs.
:-)

Daniel

#12 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Daniel Kalchev (#11)
Re: again on index usage

Daniel Kalchev <daniel@digsys.bg> writes:

In any case, if we need to hit 50k pages (assuming the indexed data is
randomly scattered in the file), and having to read these three times each, it
will be less I/O than having to read 1.7 million records.

How do you arrive at that? Assuming 100 records per page (probably the
right order of magnitude), the seqscan alternative is 17k page reads.
Yes, you examine more tuples, but CPUs are lots faster than disks.

That doesn't even address the fact that Unix systems reward sequential
reads and penalize random access. In a seqscan, we can expect that the
kernel will schedule the next page read before we ask for it, so that
our CPU time to examine a page is overlapped with I/O for the next page.
In an indexscan that advantage goes away, because neither we nor the
kernel know which page will be touched next. On top of the loss of
read-ahead, the filesystem is probably laid out in a way that rewards
sequential access with fewer and shorter seeks.

The tests I've made suggest that the penalty involved is about a factor
of four -- ie, a seqscan can scan four pages in the same amount of time
that it takes to bring in one randomly-accessed page.

regards, tom lane
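
Applying Tom's roughly four-to-one penalty to the round numbers in this
exchange makes the planner's preference easy to reproduce (a sketch under
those assumptions; these are not PostgreSQL's cost functions):

```python
# Seqscan vs uncached indexscan, with Tom's ~4x random-access penalty.
table_pages  = 17_000     # ~1.7M rows at ~100 rows per page
random_reads = 50_000     # one random page fetch per matching row (uncached)

seq_page_cost    = 1.0
random_page_cost = 4.0    # one random page costs ~ four sequential pages

seqscan_cost   = table_pages * seq_page_cost       # 17000.0
indexscan_cost = random_reads * random_page_cost   # 200000.0

print(seqscan_cost, indexscan_cost)
```

On those assumptions the sequential scan is roughly 12x cheaper, which is why
the "an index scan must always win" intuition fails for a randomly ordered
table.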

#13 Daniel Kalchev
daniel@digsys.bg
In reply to: Tom Lane (#12)
Re: again on index usage

Tom Lane said:

Daniel Kalchev <daniel@digsys.bg> writes:

In any case, if we need to hit 50k pages (assuming the indexed data is
randomly scattered in the file), and having to read these three times each, it
will be less I/O than having to read 1.7 million records.

How do you arrive at that? Assuming 100 records per page (probably the
right order of magnitude), the seqscan alternative is 17k page reads.
Yes, you examine more tuples, but CPUs are lots faster than disks.

I tried this:

db=# select * into iplog_test from iplog_gate200112;
SELECT
db=# create index iplog_test_ipaddr_idx on iplog_test(ipaddr);
CREATE
db=# cluster iplog_test_ipaddr_idx on iplog_test;
CLUSTER
db=# create index iplog_test_ipdate_idx on iplog_test(ipdate);
CREATE
db=# vacuum verbose analyze iplog_test;
NOTICE: --Relation iplog_test--
NOTICE: Pages 17722: Changed 0, reaped 0, Empty 0, New 0; Tup 1706202: Vac 0,
Keep/VTL 0/0, Crash 0, UnUsed 0, MinLen 80, MaxLen 88; Re-using: Free/Avail.
Space 0/0; EndEmpty/Avail. Pages 0/0. CPU 1.48s/-1.86u sec.
NOTICE: Index iplog_test_ipaddr_idx: Pages 5621; Tuples 1706202. CPU
0.51s/1.80u sec.
NOTICE: Index iplog_test_ipdate_idx: Pages 4681; Tuples 1706202. CPU
0.36s/1.92u sec.
NOTICE: --Relation pg_toast_253297758--
NOTICE: Pages 0: Changed 0, reaped 0, Empty 0, New 0; Tup 0: Vac 0, Keep/VTL
0/0, Crash 0, UnUsed 0, MinLen 0, MaxLen 0; Re-using: Free/Avail. Space 0/0;
EndEmpty/Avail. Pages 0/0. CPU 0.00s/0.00u sec.
NOTICE: Index pg_toast_253297758_idx: Pages 1; Tuples 0. CPU 0.00s/0.00u sec.
NOTICE: Analyzing...
VACUUM
db=# explain
db-# SELECT sum(input), sum(output) FROM iplog_test
db-# WHERE
db-# '2001-12-01 00:00:00+02' <= ipdate AND ipdate < '2001-12-02 00:00:00+02'
AND
db-# '2001-12-01 00:00:00+02' <= ipdate AND ipdate < '2002-01-01 00:00:00+02'
AND
db-# ipaddr <<= '193.68.240.0/20' AND 'uni-gw' ~ router;
NOTICE: QUERY PLAN:

Aggregate (cost=56112.97..56112.97 rows=1 width=16)
-> Seq Scan on iplog_test (cost=0.00..56111.54 rows=284 width=16)

EXPLAIN

query runs for ca 3.5 minutes.

db=# set enable_seqscan to off;

the query plan is

Aggregate (cost=100507.36..100507.36 rows=1 width=16)
-> Index Scan using iplog_test_ipdate_idx on iplog_test
(cost=0.00..100505.94 rows=284 width=16)

query runs for ca 2.2 minutes.

Moves closer to your point :-)

Anyway, the platform is a dual Pentium III 550 MHz (Intel) machine with 512
MB RAM and a 15000 RPM Cheetah for the database, running BSD/OS 4.2. The
machine is reasonably loaded all the time, so this is very much a real-world
test.

I agree, that with the 'wrong' clustering the index scan is not so much faster
than the sequential scan.

Perhaps I need to tune this machine's costs to prefer more disk intensive
operations over CPU intensive operations?

Let's see what 7.2 will result in.

Daniel

#14 Zeugswetter Andreas SB SD
ZeugswetterA@spardat.at
In reply to: Daniel Kalchev (#13)
Re: again on index usage

Daniel wrote: (stripped to info I used)

NOTICE: Pages 17722: Changed 0, reaped 0, Empty 0, New 0;
Tup 1706202: Vac 0,
NOTICE: Index iplog_test_ipaddr_idx: Pages 5621; Tuples 1706202. CPU
NOTICE: Index iplog_test_ipdate_idx: Pages 4681; Tuples 1706202. CPU

-> Seq Scan on iplog_test (cost=0.00..56111.54 rows=284 width=16)
query runs for ca 3.5 minutes.

-> Index Scan using iplog_test_ipdate_idx on iplog_test
(cost=0.00..100505.94 rows=284 width=16)
query runs for ca 2.2 minutes.

I cannot really see how 284 rows can have an estimated index cost of 100506 ?

512 MB RAM, with 15000 RPM Cheetah for the database, running

Perhaps I need to tune this machine's costs to prefer more
disk intensive operations over CPU intensive operations?

What is actually estimated wrong here seems to be the estimated
effective cache size, and thus the cache ratio of page fetches.
Most of your pages will be cached.

The tuning parameter is: effective_cache_size

With (an estimated) 50 % of 512 Mb for file caching that number would
need to be:
effective_cache_size = 32768 # 8k pages

Can you try this and tell us what happens ?

Andreas
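
The arithmetic behind the suggested setting is simple: effective_cache_size
is expressed in 8 kB pages, so (assuming the OS uses half of RAM for file
cache, as Andreas does) half of 512 MB works out as:

```python
# effective_cache_size in 8 kB pages, from 50% of 512 MB of RAM.
ram_bytes   = 512 * 1024 * 1024
cache_share = 0.5               # assumed share of RAM used as file cache
page_size   = 8192

effective_cache_size = int(ram_bytes * cache_share / page_size)
print(effective_cache_size)     # 32768, the value suggested above
```

The same arithmetic applied to BSD/OS's ~51 MB buffer cache gives roughly
6500 pages, close to the 6400 Daniel settles on below.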

#15 Daniel Kalchev
daniel@digsys.bg
In reply to: Zeugswetter Andreas SB SD (#14)
Re: again on index usage

"Zeugswetter Andreas SB SD" said:

What is actually estimated wrong here seems to be the estimated
effective cache size, and thus the cache ratio of page fetches.
Most of your pages will be cached.

The tuning parameter is: effective_cache_size

With (an estimated) 50 % of 512 Mb for file caching that number would
need to be:
effective_cache_size = 32768 # 8k pages

Can you try this and tell us what happens ?

I suspected this, but haven't really gotten around to testing it. On BSD/OS,
the buffer cache is 10% of the RAM; in my case

buffer cache = 53522432 (51.04 MB)

I guess effective_cache_size = 6400 will be ok.

Daniel

#16 Daniel Kalchev
daniel@digsys.bg
In reply to: Zeugswetter Andreas SB SD (#14)
Re: again on index usage

[with the new effective_cache_size = 6400]

explain
SELECT sum(input), sum(output) FROM iplog_gate200112
WHERE
'2001-12-01 00:00:00+02' <= ipdate AND ipdate < '2001-12-02 00:00:00+02' AND
'2001-12-01 00:00:00+02' <= ipdate AND ipdate < '2002-01-01 00:00:00+02' AND
ipaddr <<= '193.68.240.0/20' AND 'uni-gw' ~ router;

gives

Aggregate (cost=56111.97..56111.97 rows=1 width=16)
-> Seq Scan on iplog_gate200112 (cost=0.00..56110.54 rows=284 width=16)

takes 3 min to execute. (was 10 sec after fresh restart)

db=# set enable_seqscan to off;

Aggregate (cost=84980.10..84980.10 rows=1 width=16)
-> Index Scan using iplog_gate200112_ipdate_idx on iplog_gate200112
(cost=0.00..84978.68 rows=284 width=16)

takes 1.8 min to execute. (was 2 sec after fresh restart)

Still proves my point, but the fresh-restart performance is impressive. After
a few minutes the database takes its normal load, and in my opinion the
buffer cache is too cluttered with pages from other tables.

Which brings up another question: given how much RAM recent equipment has, it
may be a good idea to add specific instructions to INSTALL on tuning the
system as soon as it is installed. Most people will stop there, especially
after an upgrade (as I did).

Daniel

#17 Zeugswetter Andreas SB SD
ZeugswetterA@spardat.at
In reply to: Daniel Kalchev (#16)
Re: again on index usage

[with the new effective_cache_size = 6400]

This seems way too low for a 512 Mb machine. Why does your OS
only use so little for filecache ? Is the rest used for processes ?
For the above number you need to consider OS cache and shared_buffers.
You can approximately add them together, minus a few %.

With the data you gave, a calculated value for effective_cache_size
would be 29370, assuming the random_page_cost is actually 4 on your
machine. 29370 might be a slight overestimate, since your new table
will probably still be somewhat sorted by date within one IP.

Try to measure IO/s during the seq scan and during the index path
and calculate the ratio. This should be done during an average workload
on the machine.

Andreas

#18 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Daniel Kalchev (#13)
Re: again on index usage

Daniel Kalchev <daniel@digsys.bg> writes:

I agree, that with the 'wrong' clustering the index scan is not so
much faster than the sequential scan.

It would be interesting to check whether there is any correlation
between ipaddr and ipdate in your test data. Perhaps clustering
on ipaddr might not destroy the ordering on ipdate as much as you
thought. A more clearly random-order test would go:

select * into iplog_test from iplog_gate200112 order by random();
create index iplog_test_ipdate_idx on iplog_test(ipdate);
vacuum verbose analyze iplog_test;
<< run queries >>

Perhaps I need to tune this machine's costs to prefer more disk intensive
operations over CPU intensive operations?

Possibly. I'm not sure there's much point in tuning the cost estimates
until the underlying model is more nearly right (ie, knows something
about correlation). Do you care to try your dataset with 7.2 beta?

regards, tom lane

#19 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Zeugswetter Andreas SB SD (#14)
Re: again on index usage

"Zeugswetter Andreas SB SD" <ZeugswetterA@spardat.at> writes:

I cannot really see how 284 rows can have an estimated index cost of 100506 ?

The estimated number of indexscanned rows is more like 50k. The number
you are looking at includes the estimated selectivity of the
non-indexable WHERE clauses, too.

What is actually estimated wrong here seems to be the estimated
effective cache size, and thus the cache ratio of page fetches.

Good point, but I think the estimates are only marginally sensitive
to estimated cache size (if they're not, we have a problem, considering
how poorly we can estimate the kernel's disk buffer size). It would
be interesting for Daniel to try a few different settings of
effective_cache_size and see how much the EXPLAIN costs change.

regards, tom lane

#20 Zeugswetter Andreas SB SD
ZeugswetterA@spardat.at
In reply to: Tom Lane (#19)
Re: again on index usage

What is actually estimated wrong here seems to be the estimated
effective cache size, and thus the cache ratio of page fetches.

Good point, but I think the estimates are only marginally sensitive
to estimated cache size (if they're not, we have a problem, considering
how poorly we can estimate the kernel's disk buffer size). It would
be interesting for Daniel to try a few different settings of
effective_cache_size and see how much the EXPLAIN costs change.

Well, the number I told him (29370) should clearly prefer the index.
The estimate is very sensitive to this value :-(
With 29370 (=229 Mb) the index cost is 1.364 instead of 3.887 with the
default of 1000 pages ==> index scan.

229 Mb file cache with 512Mb Ram is a reasonable value, I have
a lot more here:
Memory   Real     Virtual
free       0 MB   218 MB
procs     95 MB   293 MB
files    159 MB
total    256 MB   512 MB

Andreas
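
The sensitivity Andreas points out can be illustrated with a simplified
per-page cost model (an assumption-laden sketch only, in the spirit of the
7.x planner's nonsequential-access costing; the real interpolation in
costsize.c differs in detail):

```python
# Simplified model: the assumed cost of one nonsequential page fetch
# slides between 1.0 (table fully cached) and random_page_cost
# (nothing cached), depending on effective_cache_size.

def nonsequential_page_cost(table_pages, effective_cache_size,
                            random_page_cost=4.0):
    # Fraction of the table assumed NOT to fit in the cache.
    uncached = max(0.0, 1.0 - effective_cache_size / table_pages)
    return 1.0 + (random_page_cost - 1.0) * uncached

table_pages = 17_722                               # from the VACUUM output
print(nonsequential_page_cost(table_pages, 1000))  # near-default cache
print(nonsequential_page_cost(table_pages, 29370)) # Andreas's value
```

With a small assumed cache nearly every index fetch is priced as a full
random read; once the assumed cache covers the table, the per-page price
collapses toward 1.0 and the index plan wins.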

#21 Daniel Kalchev
daniel@digsys.bg
In reply to: Zeugswetter Andreas SB SD (#17)
#22 Daniel Kalchev
daniel@digsys.bg
In reply to: Zeugswetter Andreas SB SD (#20)
#23 Zeugswetter Andreas SB SD
ZeugswetterA@spardat.at
In reply to: Daniel Kalchev (#22)
#24 Bruce Momjian
bruce@momjian.us
In reply to: Zeugswetter Andreas SB SD (#23)
#25 Daniel Kalchev
daniel@digsys.bg
In reply to: Zeugswetter Andreas SB SD (#23)
#26 Daniel Kalchev
daniel@digsys.bg
In reply to: Tom Lane (#18)
#27 Zeugswetter Andreas SB SD
ZeugswetterA@spardat.at
In reply to: Daniel Kalchev (#26)
#28 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Zeugswetter Andreas SB SD (#27)
#29 Don Baccus
dhogaza@pacifier.com
In reply to: Zeugswetter Andreas SB SD (#27)
#30 Bruce Momjian
bruce@momjian.us
In reply to: Don Baccus (#29)
#31 Don Baccus
dhogaza@pacifier.com
In reply to: Bruce Momjian (#30)
#32 Daniel Kalchev
daniel@digsys.bg
In reply to: Tom Lane (#28)
#33 Ross J. Reedstrom
reedstrm@rice.edu
In reply to: Bruce Momjian (#30)
#34 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Daniel Kalchev (#32)
#35 Don Baccus
dhogaza@pacifier.com
In reply to: Don Baccus (#29)
#36 Bruce Momjian
bruce@momjian.us
In reply to: Daniel Kalchev (#32)
#37 Bruce Momjian
bruce@momjian.us
In reply to: Ross J. Reedstrom (#33)
#38 mlw
markw@mohawksoft.com
In reply to: Zeugswetter Andreas SB SD (#27)
#39 Tom Lane
tgl@sss.pgh.pa.us
In reply to: mlw (#38)
#40 Don Baccus
dhogaza@pacifier.com
In reply to: Bruce Momjian (#36)
#41 Don Baccus
dhogaza@pacifier.com
In reply to: Bruce Momjian (#37)
#42 Daniel Kalchev
daniel@digsys.bg
In reply to: Tom Lane (#39)
#43 Hannu Krosing
hannu@tm.ee
In reply to: Zeugswetter Andreas SB SD (#27)
#44 Don Baccus
dhogaza@pacifier.com
In reply to: Zeugswetter Andreas SB SD (#27)
#45 Zeugswetter Andreas SB SD
ZeugswetterA@spardat.at
In reply to: Don Baccus (#44)
#46 Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: Daniel Kalchev (#13)
#47 Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: Hiroshi Inoue (#46)
#48 Hiroshi Inoue
Inoue@tpf.co.jp
In reply to: Hiroshi Inoue (#47)