Index only select count(*)

Started by Toni Heleniusover 12 years ago7 messagesgeneral
Jump to latest
#1Toni Helenius
Toni.Helenius@syncrontech.com

Hello,

I have a table with 961 columns. 1 of the columns is included in a primary key. The table has 52 760 rows. The table size is 137 MB and indexes size is 1176 kB (just the primary key). When I try:
SELECT count(*)
on it, it uses Seq Scan instead of Index scan. The query takes ~3 seconds. I have almost 500 similar tables. No index scans, why?

I'm using PostgreSQL 9.2.4 server. And I've read https://wiki.postgresql.org/wiki/Index-only_scans , apparently without understanding it. This is quite a performance issue to us. I've manually vacuumed and recreated indexes and all. No help. The output of analyze:
"Aggregate (cost=18240.50..18240.51 rows=1 width=0)"
" -> Seq Scan on min1_009 (cost=0.00..18108.60 rows=52760 width=0)"

--
Toni Helenius

#2Alban Hertroys
haramrae@gmail.com
In reply to: Toni Helenius (#1)
Re: Index only select count(*)

On Oct 7, 2013, at 11:23, Toni Helenius <Toni.Helenius@syncrontech.com> wrote:

The output of analyze:
"Aggregate (cost=18240.50..18240.51 rows=1 width=0)"
" -> Seq Scan on min1_009 (cost=0.00..18108.60 rows=52760 width=0)"

That's the output of Explain, not of Explain Analyze. The latter has actual measurements to go with the estimated costs, which gives a lot more insight.

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3Toni Helenius
Toni.Helenius@syncrontech.com
In reply to: Alban Hertroys (#2)
Re: Index only select count(*)

Hi,

yes, I'm cheating by using GUI :) (PgAdmin)

Here:

"Aggregate (cost=18240.50..18240.51 rows=1 width=0) (actual time=2911.117..2911.119 rows=1 loops=1)"
" -> Seq Scan on min1_009 (cost=0.00..18108.60 rows=52760 width=0) (actual time=5.390..2816.274 rows=52760 loops=1)"
"Total runtime: 2912.211 ms"

-----Original Message-----
From: Alban Hertroys [mailto:haramrae@gmail.com]
Sent: 7. lokakuuta 2013 12:31
To: Toni Helenius
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Index only select count(*)

On Oct 7, 2013, at 11:23, Toni Helenius <Toni.Helenius@syncrontech.com> wrote:

The output of analyze:
"Aggregate (cost=18240.50..18240.51 rows=1 width=0)"
" -> Seq Scan on min1_009 (cost=0.00..18108.60 rows=52760 width=0)"

That's the output of Explain, not of Explain Analyze. The latter has actual measurements to go with the estimated costs, which gives a lot more insight.

Alban Hertroys
--
If you can't see the forest for the trees, cut the trees and you'll find there is no forest.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#4Alban Hertroys
haramrae@gmail.com
In reply to: Toni Helenius (#3)
Re: Index only select count(*)

On Oct 7, 2013, at 11:34, Toni Helenius <Toni.Helenius@syncrontech.com> wrote:

Hi,

yes, I'm cheating by using GUI :) (PgAdmin)

Please do not top-post.

Analyze is an entirely different command than Explain analyze. Analyze updates the statistics of tables, while Explain analyze tells how those statistics affect the query plan.

Here:

"Aggregate (cost=18240.50..18240.51 rows=1 width=0) (actual time=2911.117..2911.119 rows=1 loops=1)"
" -> Seq Scan on min1_009 (cost=0.00..18108.60 rows=52760 width=0) (actual time=5.390..2816.274 rows=52760 loops=1)"
"Total runtime: 2912.211 ms"

That article you referenced mentions pg_class.relallvisible - what value does that have for your table?
Is it possible that a relatively large amount of the data in that table is not visible to other sessions, or was that perhaps the case when you last (vacuum) analyzed the table?

What is the plan if you set enable_seqscan = off; in your session? Does that give any more insight?

-----Original Message-----
From: Alban Hertroys [mailto:haramrae@gmail.com]
Sent: 7. lokakuuta 2013 12:31
To: Toni Helenius
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Index only select count(*)

On Oct 7, 2013, at 11:23, Toni Helenius <Toni.Helenius@syncrontech.com> wrote:

The output of analyze:
"Aggregate (cost=18240.50..18240.51 rows=1 width=0)"
" -> Seq Scan on min1_009 (cost=0.00..18108.60 rows=52760 width=0)"

That's the output of Explain, not of Explain Analyze. The latter has actual measurements to go with the estimated costs, which gives a lot more insight.

Alban Hertroys
--
If you can't see the forest for the trees, cut the trees and you'll find there is no forest.

Alban Hertroys
--
If you can't see the forest for the trees,
cut the trees and you'll find there is no forest.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#5Toni Helenius
Toni.Helenius@syncrontech.com
In reply to: Alban Hertroys (#4)
Re: Index only select count(*)

On Oct 7, 2013, at 11:34, Toni Helenius <Toni.Helenius@syncrontech.com>
wrote:

Hi,

yes, I'm cheating by using GUI :) (PgAdmin)

Please do not top-post.

Analyze is an entirely different command than Explain analyze. Analyze
updates the statistics of tables, while Explain analyze tells how those
statistics affect the query plan.

Here:

"Aggregate (cost=18240.50..18240.51 rows=1 width=0) (actual

time=2911.117..2911.119 rows=1 loops=1)"

" -> Seq Scan on min1_009 (cost=0.00..18108.60 rows=52760 width=0)

(actual time=5.390..2816.274 rows=52760 loops=1)"

"Total runtime: 2912.211 ms"

That article you referenced mentions pg_class.relallvisible - what
value does that have for your table?
Is it possible that a relatively large amount of the data in that table
is not visible to other sessions, or was that perhaps the case when you
last (vacuum) analyzed the table?

Hmm, yes I didn't check these before.
SELECT pg_class.relname, pg_class.relallvisible, pg_class.relpages FROM pg_class where pg_class.relname = 'min1_009';
"min1_009";0;17581

So no pages are visible? How come? They should be? The schema I'm on has GRANT ALL ON SCHEMA "SERIES" TO public;

What is the plan if you set enable_seqscan = off; in your session? Does
that give any more insight?

-----Original Message-----
From: Alban Hertroys [mailto:haramrae@gmail.com]
Sent: 7. lokakuuta 2013 12:31
To: Toni Helenius
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Index only select count(*)

On Oct 7, 2013, at 11:23, Toni Helenius

<Toni.Helenius@syncrontech.com> wrote:

The output of analyze:
"Aggregate (cost=18240.50..18240.51 rows=1 width=0)"
" -> Seq Scan on min1_009 (cost=0.00..18108.60 rows=52760

width=0)"

That's the output of Explain, not of Explain Analyze. The latter has

actual measurements to go with the estimated costs, which gives a lot
more insight.

Alban Hertroys
--
If you can't see the forest for the trees, cut the trees and you'll

find there is no forest.

Alban Hertroys
--
If you can't see the forest for the trees, cut the trees and you'll
find there is no forest.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#6Toni Helenius
Toni.Helenius@syncrontech.com
In reply to: Toni Helenius (#5)
Re: Index only select count(*)

On Oct 7, 2013, at 11:34, Toni Helenius
<Toni.Helenius@syncrontech.com>
wrote:

Hi,

yes, I'm cheating by using GUI :) (PgAdmin)

Please do not top-post.

Analyze is an entirely different command than Explain analyze.

Analyze

updates the statistics of tables, while Explain analyze tells how
those statistics affect the query plan.

Here:

"Aggregate (cost=18240.50..18240.51 rows=1 width=0) (actual

time=2911.117..2911.119 rows=1 loops=1)"

" -> Seq Scan on min1_009 (cost=0.00..18108.60 rows=52760
width=0)

(actual time=5.390..2816.274 rows=52760 loops=1)"

"Total runtime: 2912.211 ms"

That article you referenced mentions pg_class.relallvisible - what
value does that have for your table?
Is it possible that a relatively large amount of the data in that
table is not visible to other sessions, or was that perhaps the case
when you last (vacuum) analyzed the table?

Hmm, yes I didn't check these before.
SELECT pg_class.relname, pg_class.relallvisible, pg_class.relpages FROM
pg_class where pg_class.relname = 'min1_009';
"min1_009";0;17581

So no pages are visible? How come? They should be? The schema I'm on
has GRANT ALL ON SCHEMA "SERIES" TO public;

Hmm, I was able to get an index-only scan and get all pages to be visible by doing the VACUUM in the same session. Running VACUUM from PgAdmin GUI didn't have any effect.

This did the trick:
VACUUM "SERIES".min1_009;
EXPLAIN ANALYZE SELECT count(*)
FROM "SERIES".min1_009;

But I'm still confused about this, do I need to log in on the database with the login I use and manually do the vacuum? Because auto-vacuum doesn't update the visibility maps for that user or session or something? Should auto-vacuum do this?

What is the plan if you set enable_seqscan = off; in your session?
Does that give any more insight?

-----Original Message-----
From: Alban Hertroys [mailto:haramrae@gmail.com]
Sent: 7. lokakuuta 2013 12:31
To: Toni Helenius
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Index only select count(*)

On Oct 7, 2013, at 11:23, Toni Helenius

<Toni.Helenius@syncrontech.com> wrote:

The output of analyze:
"Aggregate (cost=18240.50..18240.51 rows=1 width=0)"
" -> Seq Scan on min1_009 (cost=0.00..18108.60 rows=52760

width=0)"

That's the output of Explain, not of Explain Analyze. The latter

has

actual measurements to go with the estimated costs, which gives a lot
more insight.

Alban Hertroys
--
If you can't see the forest for the trees, cut the trees and you'll

find there is no forest.

Alban Hertroys
--
If you can't see the forest for the trees, cut the trees and you'll
find there is no forest.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To
make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#7Toni Helenius
Toni.Helenius@syncrontech.com
In reply to: Toni Helenius (#6)
Re: Index only select count(*)

On Oct 7, 2013, at 11:34, Toni Helenius
<Toni.Helenius@syncrontech.com>
wrote:

Hi,

yes, I'm cheating by using GUI :) (PgAdmin)

Please do not top-post.

Analyze is an entirely different command than Explain analyze.

Analyze

updates the statistics of tables, while Explain analyze tells how
those statistics affect the query plan.

Here:

"Aggregate (cost=18240.50..18240.51 rows=1 width=0) (actual

time=2911.117..2911.119 rows=1 loops=1)"

" -> Seq Scan on min1_009 (cost=0.00..18108.60 rows=52760
width=0)

(actual time=5.390..2816.274 rows=52760 loops=1)"

"Total runtime: 2912.211 ms"

That article you referenced mentions pg_class.relallvisible - what
value does that have for your table?
Is it possible that a relatively large amount of the data in that
table is not visible to other sessions, or was that perhaps the

case

when you last (vacuum) analyzed the table?

Hmm, yes I didn't check these before.
SELECT pg_class.relname, pg_class.relallvisible, pg_class.relpages
FROM pg_class where pg_class.relname = 'min1_009';
"min1_009";0;17581

So no pages are visible? How come? They should be? The schema I'm on
has GRANT ALL ON SCHEMA "SERIES" TO public;

Hmm, I was able to get an index-only scan and get all pages to be
visible by doing the VACUUM in the same session. Running VACUUM from
PgAdmin GUI didn't have any effect.

This did the trick:
VACUUM "SERIES".min1_009;
EXPLAIN ANALYZE SELECT count(*)
FROM "SERIES".min1_009;

But I'm still confused about this, do I need to log in on the database
with the login I use and manually do the vacuum? Because auto-vacuum
doesn't update the visibility maps for that user or session or
something? Should auto-vacuum do this?

Ok, from the GUI I checked FULL, FREEZE & ANALYZE. That combination doesn't seem to set the last vacuum date either. So it was wrong for me to use it like that. And auto-vacuum seems to update visibility maps as planned. My mistakes. Sorry about that. Everything seems to work.

What is the plan if you set enable_seqscan = off; in your session?
Does that give any more insight?

-----Original Message-----
From: Alban Hertroys [mailto:haramrae@gmail.com]
Sent: 7. lokakuuta 2013 12:31
To: Toni Helenius
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Index only select count(*)

On Oct 7, 2013, at 11:23, Toni Helenius

<Toni.Helenius@syncrontech.com> wrote:

The output of analyze:
"Aggregate (cost=18240.50..18240.51 rows=1 width=0)"
" -> Seq Scan on min1_009 (cost=0.00..18108.60 rows=52760

width=0)"

That's the output of Explain, not of Explain Analyze. The latter

has

actual measurements to go with the estimated costs, which gives a
lot more insight.

Alban Hertroys
--
If you can't see the forest for the trees, cut the trees and
you'll

find there is no forest.

Alban Hertroys
--
If you can't see the forest for the trees, cut the trees and you'll
find there is no forest.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To
make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general