Very slow query in PostgreSQL 9.3.3
<html><body><span style="font-family:Verdana; color:#000000; font-size:10pt;"><div>PostgreSQL 9.3.3 RHEL 6.4<br><br>Total db Server memory 64GB<br><br><br># -----------------------------<br># PostgreSQL configuration file<br># -----------------------------<br>max_connections = 100<br>shared_buffers = 16GB<br>work_mem = 32MB <br>maintenance_work_mem = 1GB<br>seq_page_cost = 1.0 <br>random_page_cost = 2.0 <br>cpu_tuple_cost = 0.03 <br>#cpu_index_tuple_cost = 0.005 <br>#cpu_operator_cost = 0.0025 <br>effective_cache_size = 48MB<br>default_statistics_target = 100 <br>constraint_exclusion = partition <br><br>Partition table Setup<br>---------------------<br><br>CREATE TABLE measurement (<br> id bigint not null,<br> city_id bigint not null,<br> logdate date not null,<br> peaktemp bigint,<br> unitsales bigint,<br> type bigint,<br> uuid uuid,<br> geom geometry<br>);<br><br><br>CREATE TABLE measurement_y2006m02 (<br> CHECK ( logdate >= DATE '2006-02-01' AND logdate < DATE '2006-03-01' )<br>) INHERITS (measurement);<br>CREATE TABLE measurement_y2006m03 (<br> CHECK ( logdate >= DATE '2006-03-01' AND logdate < DATE '2006-04-01' )<br>) INHERITS (measurement);<br>...<br>CREATE TABLE measurement_y2007m11 (<br> CHECK ( logdate >= DATE '2007-11-01' AND logdate < DATE '2007-12-01' )<br>) INHERITS (measurement);<br>CREATE TABLE measurement_y2007m12 (<br> CHECK ( logdate >= DATE '2007-12-01' AND logdate < DATE '2008-01-01' )<br>) INHERITS (measurement);<br>CREATE TABLE measurement_y2008m01 (<br> CHECK ( logdate >= DATE '2008-01-01' AND logdate < DATE '2008-02-01' )<br>) INHERITS (measurement);<br><br>Partition measurement_y2007m12 contains 38,261,732 rows<br><br>Indexes on partition measurement_y2007m12:<br> "pkey_measurement_y2007m12" PRIMARY KEY, btree (id), tablespace "measurement_y2007"<br> "idx_measurement_uuid_y2003m12" btree (uuid), tablespace "measurement_y2007"<br> "idx_measurement_type_y2003m12" btree (type), tablespace "measurement_y2007"<br> "idx_measurement_city_y2003m12" btree (city_id), tablespace "measurement_y2007"<br> "idx_measurement_logdate_y2003m12" btree (logdate), tablespace "measurement_y2007"<br> "sidx_measurement_geom_y2003m12" gist (geom), tablespace "measurement_y2007"<br><br><b>*** Problem Query *** </b><br><br>explain (analyze on, buffers on) Select * from measurement this_ <br> where this_.logdate between '2007-12-19 23:38:41.22'::timestamp and '2007-12-20 08:01:04.22'::timestamp<br> and this_.city_id=25183 order by this_.logdate asc, this_.peaktemp asc, this_.unitsales asc limit 10000;<br><br> QUERY PLAN <br>-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------<br> Limit (cost=33849.98..33855.15 rows=2068 width=618) (actual time=51710.803..51714.266 rows=10000 loops=1)<br> Buffers: shared hit=25614 read=39417<br> -> Sort (cost=33849.98..33855.15 rows=2068 width=618) (actual time=51710.799..51712.924 rows=10000 loops=1)<br> Sort Key: this_.logdate, this_.unitsales<br> Sort Method: top-N heapsort Memory: 15938kB<br> Buffers: shared hit=25614 read=39417<br> -> Append (cost=0.00..33736.09 rows=2068 width=618) (actual time=50.210..50793.589 rows=312046 loops=1)<br> Buffers: shared hit=25608 read=39417<br> -> Seq Scan on measurement this_ (cost=0.00..0.00 rows=1 width=840) (actual time=0.002..0.002 rows=0 loops=1)<br> Filter: ((logdate >= '2007-12-19 23:38:41.22'::timestamp without time zone) AND (logdate <= '2007-12-20 08:01:04.22'::timestamp without time zone) AND (city_id = 25183))<br> -> Index Scan using idx_measurement_city_y2007m12 on measurement_y2007m12 this__1 (cost=0.56..33736.09 rows=2067 width=618) (actual time=50.206..50731.637 rows=312046 loops=1)<br> Index Cond: (city_id = 25183)<br> Filter: ((logdate >= '2007-12-19 23:38:41.22'::timestamp without time zone) AND (logdate <= '2007-12-20 08:01:04.22'::timestamp without time zone))<br> Buffers: shared hit=25608 read=39417<br><br> Total runtime: <b>51717.639 ms</b> <--- *** unacceptable ***<br><br>(15 rows) <br><br>Total Rows meeting query criteria<br>---------------------------------<br><br>Select count(*) from measurement this_ where this_.logdate between '2007-12-19 23:38:41.22'::timestamp and '2007-12-20 08:01:04.22'::timestamp and this_.city_id=25183;<br><br>count<br>------<br>312046</div><div><br>Total Rows in the partition table referenced<br>------------------------------------------<br><br>Select
count(*) from measurement_y2007m12;<br><br> count<br>---------<br>38261732</div><div><br></div><div><b>Does anyone know how to speed up this query? I removed the order by clause and that significantly reduced the run time to approx. 2000-3000 ms. This query is being recorded repeatedly <br>in our logs and executes very slowly for our UI users from 12000 ms thru 68000 ms<br><br>Any suggestions would be appreciated.</b><br><br>thanks<br mce_bogus="1"></div></span></body></html>
2014-03-13 20:26 GMT+01:00 <fburgess@radiantblue.com>:
PostgreSQL 9.3.3 RHEL 6.4
Total db Server memory 64GB
# -----------------------------
# PostgreSQL configuration file
# -----------------------------
max_connections = 100
shared_buffers = 16GB
work_mem = 32MB
maintenance_work_mem = 1GB
seq_page_cost = 1.0
random_page_cost = 2.0
cpu_tuple_cost = 0.03
#cpu_index_tuple_cost = 0.005
#cpu_operator_cost = 0.0025
effective_cache_size = 48MB
default_statistics_target = 100
constraint_exclusion = partitionPartition table Setup
---------------------CREATE TABLE measurement (
id bigint not null,
city_id bigint not null,
logdate date not null,
peaktemp bigint,
unitsales bigint,
type bigint,
uuid uuid,
geom geometry
);CREATE TABLE measurement_y2006m02 (
CHECK ( logdate >= DATE '2006-02-01' AND logdate < DATE '2006-03-01' )
) INHERITS (measurement);
CREATE TABLE measurement_y2006m03 (
CHECK ( logdate >= DATE '2006-03-01' AND logdate < DATE '2006-04-01' )
) INHERITS (measurement);
...
CREATE TABLE measurement_y2007m11 (
CHECK ( logdate >= DATE '2007-11-01' AND logdate < DATE '2007-12-01' )
) INHERITS (measurement);
CREATE TABLE measurement_y2007m12 (
CHECK ( logdate >= DATE '2007-12-01' AND logdate < DATE '2008-01-01' )
) INHERITS (measurement);
CREATE TABLE measurement_y2008m01 (
CHECK ( logdate >= DATE '2008-01-01' AND logdate < DATE '2008-02-01' )
) INHERITS (measurement);Partition measurement_y2007m12 contains 38,261,732 rows
Indexes on partition measurement_y2007m12:
"pkey_measurement_y2007m12" PRIMARY KEY, btree (id), tablespace
"measurement_y2007"
"idx_measurement_uuid_y2003m12" btree (uuid), tablespace
"measurement_y2007"
"idx_measurement_type_y2003m12" btree (type), tablespace
"measurement_y2007"
"idx_measurement_city_y2003m12" btree (city_id), tablespace
"measurement_y2007"
"idx_measurement_logdate_y2003m12" btree (logdate), tablespace
"measurement_y2007"
"sidx_measurement_geom_y2003m12" gist (geom), tablespace
"measurement_y2007"**** Problem Query *** *
explain (analyze on, buffers on) Select * from measurement this_
where this_.logdate between '2007-12-19
23:38:41.22'::timestamp and '2007-12-20 08:01:04.22'::timestamp
and this_.city_id=25183 order by
this_.logdate asc, this_.peaktemp asc, this_.unitsales asc limit 10000;QUERY
PLAN-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=33849.98..33855.15 rows=2068 width=618) (actual
time=51710.803..51714.266 rows=10000 loops=1)
Buffers: shared hit=25614 read=39417
-> Sort (cost=33849.98..33855.15 rows=2068 width=618) (actual
time=51710.799..51712.924 rows=10000 loops=1)
Sort Key: this_.logdate, this_.unitsales
Sort Method: top-N heapsort Memory: 15938kB
Buffers: shared hit=25614 read=39417
-> Append (cost=0.00..33736.09 rows=2068 width=618) (actual
time=50.210..50793.589 rows=312046 loops=1)
Buffers: shared hit=25608 read=39417
-> Seq Scan on measurement this_ (cost=0.00..0.00 rows=1
width=840) (actual time=0.002..0.002 rows=0 loops=1)
Filter: ((logdate >= '2007-12-19
23:38:41.22'::timestamp without time zone) AND (logdate <= '2007-12-20
08:01:04.22'::timestamp without time zone) AND (city_id = 25183))
-> Index Scan using idx_measurement_city_y2007m12 on
measurement_y2007m12 this__1 (cost=0.56..33736.09 rows=2067 width=618)
(actual time=50.206..50731.637 rows=312046 loops=1)
Index Cond: (city_id = 25183)
Filter: ((logdate >= '2007-12-19
23:38:41.22'::timestamp without time zone) AND (logdate <= '2007-12-20
08:01:04.22'::timestamp without time zone))
Buffers: shared hit=25608 read=39417Total runtime: *51717.639 ms* <--- *** unacceptable ***
(15 rows)
Total Rows meeting query criteria
---------------------------------Select count(*) from measurement this_ where this_.logdate between
'2007-12-19 23:38:41.22'::timestamp and '2007-12-20 08:01:04.22'::timestamp
and this_.city_id=25183;count
------
312046Total Rows in the partition table referenced
------------------------------------------Select count(*) from measurement_y2007m12;
count
---------
38261732*Does anyone know how to speed up this query? I removed the order by
clause and that significantly reduced the run time to approx. 2000-3000 ms.
This query is being recorded repeatedly in our logs and executes very
slowly for our UI users from 12000 ms thru 68000 msAny suggestions would be
appreciated.*
sort (ORDER BY clause) enforce a reading of complete partitions. And it is
slow - it is strange so reading 300K rows needs a 5K sec. Probably your IO
is overloaded.
Regards
Pavel Stehule
Show quoted text
thanks
On Thu, Mar 13, 2014 at 12:26 PM, <fburgess@radiantblue.com> wrote:
*** Problem Query ***
explain (analyze on, buffers on) Select * from measurement this_
where this_.logdate between '2007-12-19
23:38:41.22'::timestamp and '2007-12-20 08:01:04.22'::timestamp
and this_.city_id=25183 order by
this_.logdate asc, this_.peaktemp asc, this_.unitsales asc limit 10000;
[...]
Total runtime: 51717.639 ms <--- *** unacceptable ***
Try to create a multi-column index on the partition by (city_id,
logdate). Then run the original query and the query without peaktemp
and nitsales on the order by. Compare the results, and if the first
one will not be satisfying try to add these two columns to the end of
the column list of your multi-column index on the order as they appear
in your query. It should do the trick. If it wont, please, show the
plans.
--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA
http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray.ru@gmail.com
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
2014-03-14 4:26 GMT+09:00 <fburgess@radiantblue.com>:
PostgreSQL 9.3.3 RHEL 6.4
Total db Server memory 64GB
(...)
effective_cache_size = 48MB
I'm not sure if this will help directly, but is the value for
'effective_cache_size' intentional? 48 *GB* would be a more likely
setting.
Regards
Ian Barwick
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general