Dissapearing indexes, what's that all about?

Started by Daniel Åkerudabout 25 years ago12 messagesgeneral
Jump to latest
#1Daniel Åkerud
zilch@home.se

Hey people,
have a very strange problem now...
It's that when my database with the script i'm using a
explain select * from acc where username='britta' yields an
index scan. Very nice. But after some time being up the
same thing yields a SEQUENTIAL scan. What's happening
folks? =) No more than update's select's and vacuums has
been invoked.

Thanks! *hug*

Daniel �kerud

#2Len Morgan
len-morgan@crcom.net
In reply to: Daniel Åkerud (#1)
Re: Dissapearing indexes, what's that all about?

Hey people,
have a very strange problem now...
It's that when my database with the script i'm using a
explain select * from acc where username='britta' yields an
index scan. Very nice. But after some time being up the
same thing yields a SEQUENTIAL scan. What's happening
folks? =) No more than update's select's and vacuums has
been invoked.

Are you doing vacuum or vacuum analyze? I believe only the 'vacuum analyze'
function cleans up the indexing statistics.

len

#3Daniel Åkerud
zilch@home.se
In reply to: Len Morgan (#2)
Re: Re: Dissapearing indexes, what's that all about?

Hey people,
have a very strange problem now...
It's that when my database with the script i'm using a
explain select * from acc where username='britta' yields

an

index scan. Very nice. But after some time being up the
same thing yields a SEQUENTIAL scan. What's happening
folks? =) No more than update's select's and vacuums has
been invoked.

Are you doing vacuum or vacuum analyze? I believe only

the 'vacuum analyze'

function cleans up the indexing statistics.

len

Please look at this:
I just ran an interesting little test here...
watch this:

<DROP DB><CREATE DB><RUN DB SCRIPT>

explain select * from acc where username='britta';

Index scan using acc_username_key on acc...

VACUUM;

explain select * from acc where username='britta';

Seq Scan on acc

<DROP DB><CREATE DB><RUN DB SCRIPT>

explain select * from acc where username='britta';

Index scan

VACUUM ANALYZE

explain select * from acc where username='britta';

Seq scan

What the wakk is goin on here people? =)

version: PostgreSQL 7.0.0 on i686-pc-linux-gnu, compiled by
gcc 2.95.2

Thanks

Daniel �kerud

#4Mike Mascari
mascarm@mascari.com
In reply to: Daniel Åkerud (#3)
RE: Re: Dissapearing indexes, what's that all about?

What are the costs associated with the EXPLAIN output? Perhaps a sequential scan is *faster* then an index scan.

Mike Mascari
mascarm@mascari.com

-----Original Message-----
From: Daniel ?erud [SMTP:zilch@home.se]
Sent: Sunday, April 01, 2001 12:31 PM
To: pgsql-general@postgresql.org
Subject: Re: Re: [GENERAL] Dissapearing indexes, what's that all about?

Hey people,
have a very strange problem now...
It's that when my database with the script i'm using a
explain select * from acc where username='britta' yields

an

index scan. Very nice. But after some time being up the
same thing yields a SEQUENTIAL scan. What's happening
folks? =) No more than update's select's and vacuums has
been invoked.

Are you doing vacuum or vacuum analyze? I believe only

the 'vacuum analyze'

function cleans up the indexing statistics.

len

Please look at this:
I just ran an interesting little test here...
watch this:

<DROP DB><CREATE DB><RUN DB SCRIPT>

explain select * from acc where username='britta';

Index scan using acc_username_key on acc...

VACUUM;

explain select * from acc where username='britta';

Seq Scan on acc

<DROP DB><CREATE DB><RUN DB SCRIPT>

explain select * from acc where username='britta';

Index scan

VACUUM ANALYZE

explain select * from acc where username='britta';

Seq scan

What the wakk is goin on here people? =)

version: PostgreSQL 7.0.0 on i686-pc-linux-gnu, compiled by
gcc 2.95.2

Thanks

Daniel Akerud

---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster

#5Daniel Åkerud
zilch@home.se
In reply to: Mike Mascari (#4)
Re: RE: Re: Dissapearing indexes, what's that all about?

after a refresh database the explain yields:
index scan using xXxX (cost=0.00..8.14 rows=10 width=147)
after a vacuum + vacuum analyze the explain yields:
seq scan on acc xXxX �(cost=0.00..1.23 rows=1 width=147)

humm, seems you are right here... but why is it choosing a
index scan in the first place then?

What are the costs associated with the EXPLAIN output?

Perhaps a sequential scan is *faster* then an index scan.

Mike Mascari
mascarm@mascari.com

-----Original Message-----
From: Daniel ?erud [SMTP:zilch@home.se]
Sent: Sunday, April 01, 2001 12:31 PM
To: pgsql-general@postgresql.org
Subject: Re: Re: [GENERAL] Dissapearing indexes,

what's that all about?

Hey people,
have a very strange problem now...
It's that when my database with the script i'm using a
explain select * from acc where username='britta'

yields

an

index scan. Very nice. But after some time being up the
same thing yields a SEQUENTIAL scan. What's happening
folks? =) No more than update's select's and vacuums

has

been invoked.

Are you doing vacuum or vacuum analyze? I believe only

the 'vacuum analyze'

function cleans up the indexing statistics.

len

Please look at this:
I just ran an interesting little test here...
watch this:

<DROP DB><CREATE DB><RUN DB SCRIPT>

explain select * from acc where username='britta';

Index scan using acc_username_key on acc...

VACUUM;

explain select * from acc where username='britta';

Seq Scan on acc

<DROP DB><CREATE DB><RUN DB SCRIPT>

explain select * from acc where username='britta';

Index scan

VACUUM ANALYZE

explain select * from acc where username='britta';

Seq scan

What the wakk is goin on here people? =)

version: PostgreSQL 7.0.0 on i686-pc-linux-gnu, compiled

by

gcc 2.95.2

Thanks

Daniel Akerud

---------------------------(end of broadcast)-------------

--------------

Show quoted text

TIP 4: Don't 'kill -9' the postmaster

#6Mike Mascari
mascarm@mascari.com
In reply to: Daniel Åkerud (#5)
RE: RE: Re: Dissapearing indexes, what's that all about?

If I recall correctly, when the optimizer was changed (greatly enhanced),
there was a debate about what the default behavior should be. The problem
was that a large number of users would populate they're database after
index creation and see sluggishness because the statistics had not yet been
updated vs. the much smaller number of users that would suffer at the hands
of an index scan against a table that would be better served with a
sequential scan. I *think* the result of assuming 0 rows in a newly created
table, until the next vacuum, would yield a significant increase in
mailing-list traffic complaints to the tune of:

"Why isn't PostgreSQL using my index?"

followed by the usual

"Did you run VACUUM ANALYZE?"

So an assumption of 1000 rows was made, with 10 rows matching your WHERE
clause.

Mike Mascari
mascarm@mascari.com

-----Original Message-----
From: Daniel ?erud [SMTP:zilch@home.se]
Sent: Sunday, April 01, 2001 12:43 PM
To: pgsql-general@postgresql.org
Subject: Re: RE: Re: [GENERAL] Dissapearing indexes, what's that all about?

after a refresh database the explain yields:
index scan using xXxX (cost=0.00..8.14 rows=10 width=147)
after a vacuum + vacuum analyze the explain yields:
seq scan on acc xXxX A(cost=0.00..1.23 rows=1 width=147)

humm, seems you are right here... but why is it choosing a
index scan in the first place then?

What are the costs associated with the EXPLAIN output?

Perhaps a sequential scan is *faster* then an index scan.

Show quoted text

Mike Mascari
mascarm@mascari.com

#7Daniel Åkerud
zilch@home.se
In reply to: Mike Mascari (#6)
Re: RE: RE: Re: Dissapearing indexes, what's that all about?

I appriciate all the help i've gotten here...
anyway,
creating a table:

CREATE TABLE index_with (
id SERIAL,
name text
);

CREATE INDEX name_index ON index_with(name);

and filling it with 10000 rows made out of

$pwgen 8 10000 > data [enter]

and then running VACUUM and VACUUM ANALYZE
still yields a sequential scan doing a
select * from index_with where name > 'm';
namely
seq scan on index_with (cost=0.00..189 rows 5170 width=16)

Sorry to bother �ou guys this much.

Daniel �kerud

If I recall correctly, when the optimizer was changed

(greatly enhanced),

there was a debate about what the default behavior should

be. The problem

was that a large number of users would populate they're

database after

index creation and see sluggishness because the

statistics had not yet been

updated vs. the much smaller number of users that would

suffer at the hands

of an index scan against a table that would be better

served with a

sequential scan. I *think* the result of assuming 0 rows

in a newly created

table, until the next vacuum, would yield a significant

increase in

mailing-list traffic complaints to the tune of:

"Why isn't PostgreSQL using my index?"

followed by the usual

"Did you run VACUUM ANALYZE?"

So an assumption of 1000 rows was made, with 10 rows

matching your WHERE

clause.

Mike Mascari
mascarm@mascari.com

-----Original Message-----
From: Daniel ?erud [SMTP:zilch@home.se]
Sent: Sunday, April 01, 2001 12:43 PM
To: pgsql-general@postgresql.org
Subject: Re: RE: Re: [GENERAL] Dissapearing indexes,

what's that all about?

after a refresh database the explain yields:
index scan using xXxX (cost=0.00..8.14 rows=10 width=147)
after a vacuum + vacuum analyze the explain yields:
seq scan on acc xXxX A(cost=0.00..1.23 rows=1 width=147)

humm, seems you are right here... but why is it choosing a
index scan in the first place then?

What are the costs associated with the EXPLAIN output?

Perhaps a sequential scan is *faster* then an index scan.

Mike Mascari
mascarm@mascari.com

---------------------------(end of broadcast)-------------

--------------

TIP 2: you can get off all lists at once with the

unregister command

(send "unregister YourEmailAddressHere" to

majordomo@postgresql.org)

Show quoted text
#8Tom Lane
tgl@sss.pgh.pa.us
In reply to: Mike Mascari (#6)
Re: Dissapearing indexes, what's that all about?

Mike Mascari <mascarm@mascari.com> writes:

I *think* the result of assuming 0 rows in a newly created
table, until the next vacuum, would yield a significant increase in
mailing-list traffic complaints to the tune of:
"Why isn't PostgreSQL using my index?"
followed by the usual
"Did you run VACUUM ANALYZE?"
So an assumption of 1000 rows was made, with 10 rows matching your WHERE
clause.

Yup, exactly. The initial default statistics are set (with malice
aforethought) to provoke an indexscan. After you VACUUM, the optimizer
knows how large the table really is (ie, tiny), and so it decides that
looking at the index is a waste of time, it might as well just scan the
table. Load up some more data, VACUUM again, and you'll probably see an
indexscan used.

after a refresh database the explain yields:
index scan using xXxX (cost=0.00..8.14 rows=10 width=147)
after a vacuum + vacuum analyze the explain yields:
seq scan on acc xXxX A(cost=0.00..1.23 rows=1 width=147)

BTW, comparing those two cost numbers is pretty pointless since they are
based on different information about the size of the table.

regards, tom lane

#9Daniel Åkerud
zilch@home.se
In reply to: Tom Lane (#8)
Re: Re: Dissapearing indexes, what's that all about?

Cool!!
Can you guess where the limit is?
ten thousand is not enought obviously, and putting 10000
more in there takes 10 minutes... even on a clean database.
That is another problem however, bevuase on a slower
machine it takes 13 seconds

Mike Mascari <mascarm@mascari.com> writes:

I *think* the result of assuming 0 rows in a newly

created

table, until the next vacuum, would yield a significant

increase in

mailing-list traffic complaints to the tune of:
"Why isn't PostgreSQL using my index?"
followed by the usual
"Did you run VACUUM ANALYZE?"
So an assumption of 1000 rows was made, with 10 rows

matching your WHERE

clause.

Yup, exactly. The initial default statistics are set

(with malice

aforethought) to provoke an indexscan. After you VACUUM,

the optimizer

knows how large the table really is (ie, tiny), and so it

decides that

looking at the index is a waste of time, it might as well

just scan the

table. Load up some more data, VACUUM again, and you'll

probably see an

indexscan used.

after a refresh database the explain yields:
index scan using xXxX (cost=0.00..8.14 rows=10

width=147)

after a vacuum + vacuum analyze the explain yields:
seq scan on acc xXxX A(cost=0.00..1.23 rows=1 width=147)

BTW, comparing those two cost numbers is pretty pointless

since they are

based on different information about the size of the

table.

Show quoted text

regards, tom lane

#10Tom Lane
tgl@sss.pgh.pa.us
In reply to: Daniel Åkerud (#7)
Re: Dissapearing indexes, what's that all about?

Daniel ?erud <zilch@home.se> writes:

and filling it with 10000 rows made out of
$pwgen 8 10000 > data [enter]
and then running VACUUM and VACUUM ANALYZE
still yields a sequential scan doing a
select * from index_with where name > 'm';
namely
seq scan on index_with (cost=0.00..189 rows 5170 width=16)

So? You're asking it to retrieve over half of the table (or at least
the planner estimates so, and I don't see any evidence here that its
estimate is wildly off). An indexscan would still be a loser in this
scenario.

If you want to see an indexscan with an inequality query, try giving
it a reasonably tight range. Probably

select * from index_with where name > 'm' and name < 'n';

would use the index in this example.

regards, tom lane

#11Daniel Åkerud
zilch@home.se
In reply to: Tom Lane (#10)
Re: Re: Dissapearing indexes, what's that all about?

Wohooo,
deluxe :-)

THANKS EVERYBODY!!

Can't see the logic behind that though
The jump in the b-tree must save about 5000 checks... half
the table??

Thanks!

Daniel �kerud

Daniel ?erud <zilch@home.se> writes:

and filling it with 10000 rows made out of
$pwgen 8 10000 > data [enter]
and then running VACUUM and VACUUM ANALYZE
still yields a sequential scan doing a
select * from index_with where name > 'm';
namely
seq scan on index_with (cost=0.00..189 rows 5170

width=16)

So? You're asking it to retrieve over half of the table

(or at least

the planner estimates so, and I don't see any evidence

here that its

estimate is wildly off). An indexscan would still be a

loser in this

scenario.

If you want to see an indexscan with an inequality query,

try giving

Show quoted text

it a reasonably tight range. Probably

select * from index_with where name > 'm' and name < 'n';

would use the index in this example.

regards, tom lane

#12Tom Lane
tgl@sss.pgh.pa.us
In reply to: Daniel Åkerud (#11)
Re: Dissapearing indexes, what's that all about?

Daniel ?erud <zilch@home.se> writes:

Can't see the logic behind that though
The jump in the b-tree must save about 5000 checks... half
the table??

CPUs are fast. Disks are slow. If you think about CPU time rather
than disk accesses, you will usually draw the wrong conclusions.

Even more to the point, disks do not like random access. A seqscan
can typically fetch four or more sequential blocks from disk in the
time it takes an indexscan to fetch one block on a random-access basis.

When you do the math it turns out seqscan wins unless you are fetching
just a small percentage of the rows.

regards, tom lane