select query takes 13 seconds to run with index

Started by mark · almost 18 years ago · 16 messages · general
#1 mark
markkicks@gmail.com

Hi, is there any way this can be made faster? id is the primary key,
and there is an index on uid.

thanks

EXPLAIN ANALYZE select * from pokes where uid = 578439028 order by id
DESC limit 6;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..9329.02 rows=6 width=135) (actual
time=13612.247..13612.247 rows=0 loops=1)
-> Index Scan Backward using pokes_pkey on pokes
(cost=0.00..5182270.69 rows=3333 width=135) (actual
time=13612.245..13612.245 rows=0 loops=1)
Filter: (uid = 578439028)
Total runtime: 13612.369 ms
(4 rows)

#2 Justin
justin@emproshunts.com
In reply to: mark (#1)
Re: select query takes 13 seconds to run with index

mark wrote:

Hi, is there anyway this can be made faster? id is the primary key,
and there is an index on uid..

thanks

EXPLAIN ANALYZE select * from pokes where uid = 578439028 order by id
DESC limit 6;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..9329.02 rows=6 width=135) (actual
time=13612.247..13612.247 rows=0 loops=1)
-> Index Scan Backward using pokes_pkey on pokes
(cost=0.00..5182270.69 rows=3333 width=135) (actual
time=13612.245..13612.245 rows=0 loops=1)
Filter: (uid = 578439028)
Total runtime: 13612.369 ms
(4 rows)

First, this should be posted on the performance list.

How many records are in this table? If uid 578,439,028 were an
auto-incremented key starting at 1, that would mean there are 578 million
records in the table.

The estimate is way off; when was the last time VACUUM was run on the table?

What version of PostgreSQL are you running?
Size of the Table
Table layout
Load on the database

#3 mark
markkicks@gmail.com
In reply to: Justin (#2)
Re: select query takes 13 seconds to run with index

On Mon, May 26, 2008 at 4:26 PM, Justin <justin@emproshunts.com> wrote:

mark wrote:

Hi, is there anyway this can be made faster? id is the primary key,
and there is an index on uid..
thanks
EXPLAIN ANALYZE select * from pokes where uid = 578439028 order by id
DESC limit 6;
QUERY
PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..9329.02 rows=6 width=135) (actual
time=13612.247..13612.247 rows=0 loops=1)
-> Index Scan Backward using pokes_pkey on pokes
(cost=0.00..5182270.69 rows=3333 width=135) (actual
time=13612.245..13612.245 rows=0 loops=1)
Filter: (uid = 578439028)
Total runtime: 13612.369 ms
(4 rows)

First this should be posted on performance list.

sorry about this.

how many records are in this table?

22,334,262 rows, about 22 million records.

The estimate is way off, when was the last time Vaccum was on the table?

About a week ago I ran VACUUM VERBOSE ANALYZE;
this table is never updated or deleted, rows are just inserted...

What verison of Postgresql are you running

8.3.1

Size of the Table

22 million rows approximately

Table layout

CREATE TABLE pokes
(
id serial NOT NULL,
uid integer,
action_id integer,
created timestamp without time zone DEFAULT now(),
friend_id integer,
message text,
pic text,
"name" text,
CONSTRAINT pokes_pkey PRIMARY KEY (id)
)
WITH (OIDS=FALSE);
ALTER TABLE pokes OWNER TO postgres;

-- Index: idx_action_idx

-- DROP INDEX idx_action_idx;

CREATE INDEX idx_action_idx
ON pokes
USING btree
(action_id);

-- Index: idx_friend_id

-- DROP INDEX idx_friend_id;

CREATE INDEX idx_friend_id
ON pokes
USING btree
(friend_id);

-- Index: idx_pokes_uid

-- DROP INDEX idx_pokes_uid;

CREATE INDEX idx_pokes_uid
ON pokes
USING btree
(uid);

Load on the database

How do I measure load on the database?

#4 Justin
justin@emproshunts.com
In reply to: mark (#3)
Re: select query takes 13 seconds to run with index

mark wrote:

On Mon, May 26, 2008 at 4:26 PM, Justin <justin@emproshunts.com> wrote:

mark wrote:

Hi, is there anyway this can be made faster? id is the primary key,
and there is an index on uid..
thanks
EXPLAIN ANALYZE select * from pokes where uid = 578439028 order by id
DESC limit 6;
QUERY
PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..9329.02 rows=6 width=135) (actual
time=13612.247..13612.247 rows=0 loops=1)
-> Index Scan Backward using pokes_pkey on pokes
(cost=0.00..5182270.69 rows=3333 width=135) (actual
time=13612.245..13612.245 rows=0 loops=1)
Filter: (uid = 578439028)
Total runtime: 13612.369 ms
(4 rows)

First this should be posted on performance list.

sorry about this.

how many records are in this table?

22334262, 22 million records.

The estimate is way off, when was the last time Vaccum was on the table?

about a week ago i ran this VACUUM VERBOSE ANALYZE;
this table is never updated or deleted, rows are just inserted...

What verison of Postgresql are you running

8.3.1

Size of the Table

22 million rows approximately

I have no experience with large datasets, so people with more experience
in this area are going to have to chime in.
My gut feeling is that 13 seconds for PostgreSQL to sort through an index
and table of that size is not bad.

You may need to take a look at hardware and postgresql.conf settings
to improve the performance of this query.

This query is very simple; rewriting it or adding indexes to get
massive improvements is not going to help in this case.

Table layout

CREATE TABLE pokes
(
id serial NOT NULL,
uid integer,
action_id integer,
created timestamp without time zone DEFAULT now(),
friend_id integer,
message text,
pic text,
"name" text,
CONSTRAINT pokes_pkey PRIMARY KEY (id)
)
WITH (OIDS=FALSE);
ALTER TABLE pokes OWNER TO postgres;

-- Index: idx_action_idx

-- DROP INDEX idx_action_idx;

CREATE INDEX idx_action_idx
ON pokes
USING btree
(action_id);

-- Index: idx_friend_id

-- DROP INDEX idx_friend_id;

CREATE INDEX idx_friend_id
ON pokes
USING btree
(friend_id);

-- Index: idx_pokes_uid

-- DROP INDEX idx_pokes_uid;

CREATE INDEX idx_pokes_uid
ON pokes
USING btree
(uid);

Load on the database

how do i measure load on database?

How many users are attached to the server at any given time; how many
inserts, deletes, and selects are being done on the server. It's the
number of TPS on the server.
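A concrete way to sample those numbers from the database itself is the
built-in statistics views. A sketch (the exact column set is assumed from
8.3-era catalogs; run against a live server):

```sql
-- How many backends are currently connected:
SELECT count(*) FROM pg_stat_activity;

-- Cumulative commit and tuple counters per database; run this twice a few
-- seconds apart and diff the numbers to estimate TPS and insert rates.
SELECT datname, xact_commit, tup_inserted, tup_updated, tup_deleted
FROM pg_stat_database;
```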

#5 mark
markkicks@gmail.com
In reply to: Justin (#4)
Re: select query takes 13 seconds to run with index

On Mon, May 26, 2008 at 4:49 PM, Justin <justin@emproshunts.com> wrote:

mark wrote:

On Mon, May 26, 2008 at 4:26 PM, Justin <justin@emproshunts.com> wrote:
mark wrote:
Hi, is there anyway this can be made faster? id is the primary key,
and there is an index on uid..
thanks
EXPLAIN ANALYZE select * from pokes where uid = 578439028 order by id
DESC limit 6;
QUERY
PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..9329.02 rows=6 width=135) (actual
time=13612.247..13612.247 rows=0 loops=1)
-> Index Scan Backward using pokes_pkey on pokes
(cost=0.00..5182270.69 rows=3333 width=135) (actual
time=13612.245..13612.245 rows=0 loops=1)
Filter: (uid = 578439028)
Total runtime: 13612.369 ms
(4 rows)

First this should be posted on performance list.

sorry about this.

how many records are in this table?

22334262, 22 million records.

The estimate is way off, when was the last time Vaccum was on the table?

about a week ago i ran this VACUUM VERBOSE ANALYZE;
this table is never updated or deleted, rows are just inserted...

What verison of Postgresql are you running

8.3.1

Size of the Table

22 million rows approximately

I have no experience on large datasets so people with more experience in
this area are going to have to chime in.
My gut feel is 13 seconds for Postgresql to sort through an index of that
size and table is not bad.

you may need to take a look at hardware and postgresql.config settings to
improve the performance for this query

This query is very simple where changing it around or adding index results
massive improvements is not going to help in this case.

The hardware is a dual quad-core E5405 machine with 16 GB of RAM, running
8.3.1 with default settings except max_connections increased...

#6 Gregory Williamson
Gregory.Williamson@digitalglobe.com
In reply to: mark (#1)
Re: select query takes 13 seconds to run with index

Justin wrote:

mark wrote:

On Mon, May 26, 2008 at 4:26 PM, Justin <justin@emproshunts.com> wrote:

mark wrote:

Hi, is there anyway this can be made faster? id is the primary key,
and there is an index on uid..
thanks
EXPLAIN ANALYZE select * from pokes where uid = 578439028 order by id
DESC limit 6;
QUERY
PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..9329.02 rows=6 width=135) (actual
time=13612.247..13612.247 rows=0 loops=1)
-> Index Scan Backward using pokes_pkey on pokes
(cost=0.00..5182270.69 rows=3333 width=135) (actual
time=13612.245..13612.245 rows=0 loops=1)
Filter: (uid = 578439028)
Total runtime: 13612.369 ms
(4 rows)

First this should be posted on performance list.

sorry about this.

how many records are in this table?

22334262, 22 million records.

The estimate is way off, when was the last time Vaccum was on the table?

about a week ago i ran this VACUUM VERBOSE ANALYZE;
this table is never updated or deleted, rows are just inserted...

What verison of Postgresql are you running

8.3.1

Size of the Table

22 million rows approximately

I have no experience on large datasets so people with more experience in this area are going to have to chime in.
My gut feel is 13 seconds for Postgresql to sort through an index of that size and table is not bad.

you may need to take a look at hardware and postgresql.config settings to improve the performance for this query

This query is very simple where changing it around or adding index results massive
improvements is not going to help in this case.

I just ran a test on not particularly impressive hardware (8.2.6) on a table with 58980741 rows:
billing=# explain analyze select * from stats_asset_use where date = '2006-03-12' order by tracking_id desc limit 6;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..5.45 rows=6 width=38) (actual time=0.028..0.037 rows=6 loops=1)
-> Index Scan Backward using stats_day_ndx on stats_asset_use (cost=0.00..61279.91 rows=67437 width=38) (actual time=0.026..0.032 rows=6 loops=1)
Index Cond: (date = '2006-03-12'::date)
Total runtime: 5.957 ms
(4 rows)

There is an index on date (only). A typical day might have anywhere from a few thousand entries to a few hundred thousand with the average in the low thousands. Inserts only, no deletes or updates.

This table gets analyzed daily (overkill) so the stats are up to date; I wonder if that's a problem in your case ?

Table layout

CREATE TABLE pokes
(
id serial NOT NULL,
uid integer,
action_id integer,
created timestamp without time zone DEFAULT now(),
friend_id integer,
message text,
pic text,
"name" text,
CONSTRAINT pokes_pkey PRIMARY KEY (id)
)
WITH (OIDS=FALSE);
ALTER TABLE pokes OWNER TO postgres;

-- Index: idx_action_idx

-- DROP INDEX idx_action_idx;

CREATE INDEX idx_action_idx
ON pokes
USING btree
(action_id);

-- Index: idx_friend_id

-- DROP INDEX idx_friend_id;

CREATE INDEX idx_friend_id
ON pokes
USING btree
(friend_id);

-- Index: idx_pokes_uid

-- DROP INDEX idx_pokes_uid;

CREATE INDEX idx_pokes_uid
ON pokes
USING btree
(uid);

Load on the database

how do i measure load on database?

How many users are attached to the server at any given time. how many inserts, deletes
selects are being done on the server. Its number TPS on the server.

On Windoze I don't know; on *NIX variants the utility "top" can show useful information on load and active processes; iostat or vmstat can give detailed looks over time (use period of between 1 and 5 seconds maybe and discard the first row as nonsense); they show disk i/o and context switching, etc.

HTH,

Greg Williamson
Senior DBA
DigitalGlobe


#7 Justin
justin@emproshunts.com
In reply to: mark (#5)
Re: select query takes 13 seconds to run with index

mark wrote:

Size of the Table

22 million rows approximately

I have no experience on large datasets so people with more experience in
this area are going to have to chime in.
My gut feel is 13 seconds for Postgresql to sort through an index of that
size and table is not bad.

you may need to take a look at hardware and postgresql.config settings to
improve the performance for this query

This query is very simple where changing it around or adding index results
massive improvements is not going to help in this case.

the hardware is e5405 dual quad core on a 16GB RAM machine, with 8.3.1
default settings except maximum connections increased...

That could be the problem; PostgreSQL's default settings are very conservative.

You need to read http://www.postgresqldocs.org/wiki/Performance_Optimization
and tune postgresql.conf settings.

What OS are you running?
What is the disk subsystem setup?
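For a dedicated 16 GB machine like the one described, a first pass at
postgresql.conf might look like this. The values are illustrative starting
points (assumptions, not from the thread), to be validated against a tuning
guide such as the one linked above:

```
# postgresql.conf sketch for a dedicated 16 GB server (illustrative values)
shared_buffers = 2GB            # 8.3 default is 32MB, far too small here
effective_cache_size = 12GB     # tell the planner how much OS cache exists
work_mem = 32MB                 # per-sort/hash memory; raise cautiously
checkpoint_segments = 16        # smooth out a heavy insert-only load
```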

#8 David Wilson
david.t.wilson@gmail.com
In reply to: Justin (#2)
Re: select query takes 13 seconds to run with index

On Mon, May 26, 2008 at 7:26 PM, Justin <justin@emproshunts.com> wrote:

The estimate is way off, when was the last time Vaccum was on the table?

I'm going to second this- run "ANALYZE pokes;" and then test the query
again; I'll bet you'll get much better results.

It's not the VACUUM that matters so much as the ANALYZE, and it
definitely needs to be done on occasion if you're adding a lot of
records. Do you have the autovacuum daemon running? (And if not, why
not?)

--
- David T. Wilson
david.t.wilson@gmail.com

#9 Justin
justin@emproshunts.com
In reply to: Gregory Williamson (#6)
Re: select query takes 13 seconds to run with index

How many users are attached to the server at any given time. how many
inserts, deletes selects are being done on the server. Its number TPS
on the server.

On Windoze I don't know; on *NIX variants the utility "top" can show
useful information on load and active processes; iostat or vmstat can
give detailed looks over time (use period of between 1 and 5 seconds
maybe and discard the first row as nonsense); they show disk i/o and
context switching, etc.

HTH,

Greg Williamson
Senior DBA
DigitalGlobe

There are several tools to do this: Process Explorer, which has to be
downloaded, and Performance Monitor. The problem with Performance Monitor
is that PostgreSQL keeps spawning new exes, which makes reading the
results a real pain.

#10 Tom Lane
tgl@sss.pgh.pa.us
In reply to: mark (#1)
Re: select query takes 13 seconds to run with index

mark <markkicks@gmail.com> writes:

EXPLAIN ANALYZE select * from pokes where uid = 578439028 order by id
DESC limit 6;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..9329.02 rows=6 width=135) (actual
time=13612.247..13612.247 rows=0 loops=1)
-> Index Scan Backward using pokes_pkey on pokes
(cost=0.00..5182270.69 rows=3333 width=135) (actual
time=13612.245..13612.245 rows=0 loops=1)
Filter: (uid = 578439028)
Total runtime: 13612.369 ms
(4 rows)

The problem is the vast disconnect between the estimated and actual
rowcounts for the indexscan (3333 vs 0). The planner thinks there
are three thousand rows matching uid = 578439028, and that encourages
it to try a plan that's only going to be fast if at least six such
rows show up fairly soon while scanning the index in reverse id order.
What you really want it to do here is scan on the uid index and then
sort the result by id ... but that will be slow in exactly the case
where this plan is fast, ie, when there are a lot of matching uids.

Bottom line: the planner cannot make the right choice between these
alternatives unless it's got decent statistics about the frequency
of uid values. "I analyzed the table about a week ago" is not good
enough maintenance policy --- you need current stats, and you might need
to bump up the statistics target to get enough data about less-common
values of uid.

(Since it's 8.3, the autovac daemon might have been analyzing for you,
if you didn't turn off autovacuum. In that case increasing the
statistics target is the first thing to try.)

regards, tom lane
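Raising the statistics target that Tom mentions is a one-liner per column.
A sketch (the value 500 matches what is tried later in the thread and is
illustrative rather than a tuned recommendation):

```sql
-- In 8.3 the default statistics target is 10, which samples too little to
-- notice that uid = 578439028 matches no rows at all. Raise it for uid only:
ALTER TABLE pokes ALTER COLUMN uid SET STATISTICS 500;

-- The new target takes effect the next time the table is analyzed:
ANALYZE pokes;
```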

#11 mark
markkicks@gmail.com
In reply to: Tom Lane (#10)
Re: select query takes 13 seconds to run with index

On Mon, May 26, 2008 at 5:36 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

mark <markkicks@gmail.com> writes:

EXPLAIN ANALYZE select * from pokes where uid = 578439028 order by id
DESC limit 6;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..9329.02 rows=6 width=135) (actual
time=13612.247..13612.247 rows=0 loops=1)
-> Index Scan Backward using pokes_pkey on pokes
(cost=0.00..5182270.69 rows=3333 width=135) (actual
time=13612.245..13612.245 rows=0 loops=1)
Filter: (uid = 578439028)
Total runtime: 13612.369 ms
(4 rows)

The problem is the vast disconnect between the estimated and actual
rowcounts for the indexscan (3333 vs 0). The planner thinks there
are three thousand rows matching uid = 578439028, and that encourages
it to try a plan that's only going to be fast if at least six such
rows show up fairly soon while scanning the index in reverse id order.
What you really want it to do here is scan on the uid index and then
sort the result by id ... but that will be slow in exactly the case
where this plan is fast, ie, when there are a lot of matching uids.

Bottom line: the planner cannot make the right choice between these
alternatives unless it's got decent statistics about the frequency
of uid values. "I analyzed the table about a week ago" is not good
enough maintenance policy --- you need current stats, and you might need
to bump up the statistics target to get enough data about less-common
values of uid.

How do I do this? Bump up the statistics target?

(Since it's 8.3, the autovac daemon might have been analyzing for you,
if you didn't turn off autovacuum. In that case increasing the
statistics target is the first thing to try.)

I did not turn it off, and my OS is Fedora 9.

I ran VACUUM VERBOSE ANALYZE on pokes and then ran the same query, and
there is no improvement..

EXPLAIN ANALYZE select * from pokes where uid = 578439028 order by id limit 6;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..8446.80 rows=6 width=130) (actual
time=12262.779..12262.779 rows=0 loops=1)
-> Index Scan using pokes_pkey on pokes (cost=0.00..5149730.49
rows=3658 width=130) (actual time=12262.777..12262.777 rows=0 loops=1)
Filter: (uid = 578439028)
Total runtime: 12262.817 ms

VACUUM VERBOSE ANALYZE pokes ;
INFO: vacuuming "public.pokes"
INFO: index "pokes_pkey" now contains 22341026 row versions in 61258 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.24s/0.06u sec elapsed 1.61 sec.
INFO: index "idx_action_idx" now contains 22341026 row versions in 61548 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.38s/0.09u sec elapsed 7.21 sec.
INFO: index "idx_friend_id" now contains 22341026 row versions in 60547 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.44s/0.11u sec elapsed 9.13 sec.
INFO: index "idx_pokes_uid" now contains 22341026 row versions in 62499 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.41s/0.09u sec elapsed 7.44 sec.
INFO: "pokes": found 0 removable, 22341026 nonremovable row versions
in 388144 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
1923 pages contain useful free space.
0 pages are entirely empty.
CPU 3.02s/2.38u sec elapsed 29.21 sec.
INFO: vacuuming "pg_toast.pg_toast_43415"
INFO: index "pg_toast_43415_index" now contains 12 row versions in 2 pages
DETAIL: 0 index row versions were removed.
0 index pages have been deleted, 0 are currently reusable.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: "pg_toast_43415": found 0 removable, 12 nonremovable row
versions in 2 pages
DETAIL: 0 dead row versions cannot be removed yet.
There were 0 unused item pointers.
2 pages contain useful free space.
0 pages are entirely empty.
CPU 0.00s/0.00u sec elapsed 0.00 sec.
INFO: analyzing "public.pokes"
INFO: "pokes": scanned 3000 of 388144 pages, containing 172933 live
rows and 0 dead rows; 3000 rows in sample, 22374302 estimated total

#12 mark
markkicks@gmail.com
In reply to: mark (#11)
Re: select query takes 13 seconds to run with index

On Mon, May 26, 2008 at 7:58 PM, mark <markkicks@gmail.com> wrote:

On Mon, May 26, 2008 at 5:36 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

mark <markkicks@gmail.com> writes:

EXPLAIN ANALYZE select * from pokes where uid = 578439028 order by id
DESC limit 6;
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..9329.02 rows=6 width=135) (actual
time=13612.247..13612.247 rows=0 loops=1)
-> Index Scan Backward using pokes_pkey on pokes
(cost=0.00..5182270.69 rows=3333 width=135) (actual
time=13612.245..13612.245 rows=0 loops=1)
Filter: (uid = 578439028)
Total runtime: 13612.369 ms
(4 rows)

The problem is the vast disconnect between the estimated and actual
rowcounts for the indexscan (3333 vs 0). The planner thinks there
are three thousand rows matching uid = 578439028, and that encourages
it to try a plan that's only going to be fast if at least six such
rows show up fairly soon while scanning the index in reverse id order.
What you really want it to do here is scan on the uid index and then
sort the result by id ... but that will be slow in exactly the case
where this plan is fast, ie, when there are a lot of matching uids.

Bottom line: the planner cannot make the right choice between these
alternatives unless it's got decent statistics about the frequency
of uid values. "I analyzed the table about a week ago" is not good
enough maintenance policy --- you need current stats, and you might need
to bump up the statistics target to get enough data about less-common
values of uid.

how do i do this? bump up the statistics target?

(Since it's 8.3, the autovac daemon might have been analyzing for you,
if you didn't turn off autovacuum. In that case increasing the
statistics target is the first thing to try.)

i did not turn it off..
and my OS is fedora 9

i ran vacuum verbose analyze pokes, and then ran the same query, and
there is no improvement..

EXPLAIN ANALYZE select * from pokes where uid = 578439028 order by id limit 6;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=0.00..8446.80 rows=6 width=130) (actual
time=12262.779..12262.779 rows=0 loops=1)
-> Index Scan using pokes_pkey on pokes (cost=0.00..5149730.49
rows=3658 width=130) (actual time=12262.777..12262.777 rows=0 loops=1)
Filter: (uid = 578439028)
Total runtime: 12262.817 ms

OK I did this

ALTER TABLE pokes ALTER uid set statistics 500;
ALTER TABLE

ANALYZE pokes;
ANALYZE

and then it became super fast!! Thanks a lot!!!
My questions:
-> is 500 too high? What all does this affect?
-> does increasing this number only matter when I run ANALYZE, or will
it slow down inserts and other operations?
EXPLAIN ANALYZE select * from pokes where uid = 578439028 order by id
desc limit 6;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=467.80..467.81 rows=6 width=134) (actual
time=0.016..0.016 rows=0 loops=1)
-> Sort (cost=467.80..468.09 rows=117 width=134) (actual
time=0.016..0.016 rows=0 loops=1)
Sort Key: id
Sort Method: quicksort Memory: 25kB
-> Index Scan using idx_pokes_uid on pokes
(cost=0.00..465.70 rows=117 width=134) (actual time=0.011..0.011
rows=0 loops=1)
Index Cond: (uid = 578439028)
Total runtime: 0.037 ms

#13 PFC
lists@peufeu.com
In reply to: mark (#3)
Re: [PERFORM] select query takes 13 seconds to run with index

The estimate is way off, when was the last time Vaccum was on the table?

about a week ago i ran this VACUUM VERBOSE ANALYZE;
this table is never updated or deleted, rows are just inserted...

You should analyze it more often, then... Postgres probably thinks the
table has the same data distribution as last week!
Analyze just takes a couple of seconds...

Load on the database

how do i measure load on database?

Just look at vmstat.

Also if you very often do SELECT .. WHERE x = ... ORDER BY id DESC you'll
benefit from an index on (x,id) instead of just (x).

#14 hubert depesz lubaczewski
depesz@depesz.com
In reply to: mark (#3)
Re: [PERFORM] select query takes 13 seconds to run with index

On Mon, May 26, 2008 at 04:32:50PM -0700, mark wrote:

EXPLAIN ANALYZE select * from pokes where uid = 578439028 order by id
DESC limit 6;

The estimate is way off, when was the last time Vaccum was on the table?

about a week ago i ran this VACUUM VERBOSE ANALYZE;
this table is never updated or deleted, rows are just inserted...

1. boost default_statistics_target
2. run analyze more often - daily job for example
3. create index q on pokes (uid, id); should help

depesz
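Depesz's suggestion 3 can be sketched as follows (the index name is
illustrative, not from the thread):

```sql
-- A composite index matching both the WHERE clause and the ORDER BY:
-- within one uid, entries are already ordered by id, so the planner can
-- read at most six index entries to satisfy LIMIT 6.
CREATE INDEX idx_pokes_uid_id ON pokes (uid, id);
```

With such an index the uid-statistics problem matters much less, because
the plan no longer depends on guessing how soon matching rows turn up in
primary-key order.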

#15 mark
markkicks@gmail.com
In reply to: hubert depesz lubaczewski (#14)
Re: [GENERAL] select query takes 13 seconds to run with index

On Tue, May 27, 2008 at 1:22 AM, hubert depesz lubaczewski
<depesz@depesz.com> wrote:

On Mon, May 26, 2008 at 04:32:50PM -0700, mark wrote:

EXPLAIN ANALYZE select * from pokes where uid = 578439028 order by id
DESC limit 6;

The estimate is way off, when was the last time Vaccum was on the table?

about a week ago i ran this VACUUM VERBOSE ANALYZE;
this table is never updated or deleted, rows are just inserted...

1. boost default_statistics_target
2. run analyze more often - daily job for example
3. create index q on pokes (uid, id); should help

OK I did this

ALTER TABLE pokes ALTER uid set statistics 500;
ALTER TABLE

ANALYZE pokes;
ANALYZE

and then it became super fast!! Thanks a lot!!!
My questions:
-> is 500 too high? What all does this affect?
-> does increasing this number only matter when I run ANALYZE, or will
it slow down inserts and other operations?
EXPLAIN ANALYZE select * from pokes where uid = 578439028 order by id
desc limit 6;
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=467.80..467.81 rows=6 width=134) (actual
time=0.016..0.016 rows=0 loops=1)
-> Sort (cost=467.80..468.09 rows=117 width=134) (actual
time=0.016..0.016 rows=0 loops=1)
Sort Key: id
Sort Method: quicksort Memory: 25kB
-> Index Scan using idx_pokes_uid on pokes
(cost=0.00..465.70 rows=117 width=134) (actual time=0.011..0.011
rows=0 loops=1)
Index Cond: (uid = 578439028)
Total runtime: 0.037 ms

#16 hubert depesz lubaczewski
depesz@depesz.com
In reply to: mark (#15)
Re: [GENERAL] select query takes 13 seconds to run with index

On Tue, May 27, 2008 at 07:46:05AM -0700, mark wrote:

and then it became super fast!! thanks a lot!!!
my question:
-> is 500 too high? what all does this affect?

I usually don't go over 100. It affects the number of elements kept in the
statistics for each field. You can see the stats in:
select * from pg_stats;

-> now increasing this number does it affect only when i am running
analyze commands, or will it slow down inserts and other operations?
EXPLAIN ANALYZE select * from pokes where uid = 578439028 order by id
desc limit 6;

It (theoretically) can slow down selects, due to the fact that the planner
now has to load more data to plan (i.e. it loads the statistics, and since
there are more values, the statistics are larger).

Generally, in most cases this shouldn't be an issue.

Additionally, I think that the 2-column index would work even better in
this particular case.

regards,

depesz