Why would a scan take so long?

Started by Michal Politowskiover 15 years ago3 messagesgeneral
Jump to latest
#1Michal Politowski
mpol+pg@meep.pl

EXPLAIN SELECT count(DISTINCT catalog.id) FROM catalog WHERE flag=false;
QUERY PLAN
-----------------------------------------------------------------------------------------
Aggregate (cost=1615927.27..1615927.28 rows=1 width=8)
-> Seq Scan on catalog (cost=0.00..1603214.56 rows=5085084 width=8)
Filter: (NOT flag)

SELECT pg_size_pretty(pg_relation_size('catalog'));
pg_size_pretty
----------------
9380 MB

Nothing else is going on the system, during the query disk reads rise from
around 0 to > 100MB/s, so I would assume it should take a couple minutes
and it takes ten times longer:
Time: 1495549.716 ms

What am I missing?

--
Michal Politowski

#2Scott Marlowe
scott.marlowe@gmail.com
In reply to: Michal Politowski (#1)
Re: Why would a scan take so long?

On Fri, Oct 1, 2010 at 9:13 AM, Michal Politowski <mpol+pg@meep.pl> wrote:

EXPLAIN SELECT count(DISTINCT catalog.id) FROM catalog WHERE flag=false;

1: Try running explain analyze select ... here. It's far more informative.
2: select distinct is generally slower than using group by.

                                      QUERY PLAN
-----------------------------------------------------------------------------------------
 Aggregate  (cost=1615927.27..1615927.28 rows=1 width=8)
  ->  Seq Scan on catalog  (cost=0.00..1603214.56 rows=5085084 width=8)
        Filter: (NOT flag)

SELECT pg_size_pretty(pg_relation_size('catalog'));
 pg_size_pretty
----------------
 9380 MB

Nothing else is going on the system, during the query disk reads rise from
around 0 to > 100MB/s, so I would assume it should take a couple minutes
and it takes ten times longer:
Time: 1495549.716 ms

So that's about 9 Gigs read in 1495 seconds, or 6 Megs a second. Not real fast.

What am I missing?

Hard to say. Have a look at these two pages:
http://wiki.postgresql.org/wiki/SlowQueryQuestions
http://wiki.postgresql.org/wiki/Guide_to_reporting_problems

And see if they help.

I'd run explain analyze and use iostat, vmstat, htop etc to see what
the machine is doing while the query is running.

--
To understand recursion, one must first understand recursion.

#3Satoshi Nagayasu
satoshi.nagayasu@gmail.com
In reply to: Scott Marlowe (#2)
Re: Why would a scan take so long?

These three queries may help you to understand what's going on.

EXPLAIN ANALYZE SELECT count(catalog.id) FROM catalog;
EXPLAIN ANALYZE SELECT count(catalog.id) FROM catalog WHERE flag=false;
EXPLAIN ANALYZE SELECT count(DISTINCT catalog.id) FROM catalog WHERE flag=false;

Regards,
--
NAGAYASU Satoshi <satoshi.nagayasu@gmail.com>

------元のメッセージ------
送信者 : Michal Politowski
送信者: pgsql-general-owner@postgresql.org
To: pgsql-general@postgresql.org
件名: [GENERAL] Why would a scan take so long?
送信: 2010/10/2 12:13 AM

EXPLAIN SELECT count(DISTINCT catalog.id) FROM catalog WHERE flag=false;
QUERY PLAN
-----------------------------------------------------------------------------------------
Aggregate (cost=1615927.27..1615927.28 rows=1 width=8)
-> Seq Scan on catalog (cost=0.00..1603214.56 rows=5085084 width=8)
Filter: (NOT flag)

SELECT pg_size_pretty(pg_relation_size('catalog'));
pg_size_pretty
----------------
9380 MB

Nothing else is going on the system, during the query disk reads rise from
around 0 to > 100MB/s, so I would assume it should take a couple minutes
and it takes ten times longer:
Time: 1495549.716 ms

What am I missing?

--
Michal Politowski

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general