How to evaluate "explain analyze" correctly soon after "explain" for the same statement ?

Started by 高健over 13 years ago4 messagesgeneral
Jump to latest
#1高健
luckyjackgao@gmail.com

Hi all:

I have one question about the cache clearing.

If I use the following soon after database startup(or first time I use it):

postgres=# explain analyze select id,deptno from gaotab where id=200;

QUERY
PLAN

--------------------------------------------------------------------------------

-------------------------------------------

Index Only Scan using idx_id_dept on gaotab (cost=0.00..8.27 rows=1
width=8) (

actual time=30.912..30.915 rows=1 loops=1)

Index Cond: (id = 200)

Heap Fetches: 1

Total runtime: 47.390 ms

(4 rows)

postgres=#

The result is: the above explain analyze got a total runtime of 47 ms.

But If I restart the database again, and then execute the following:

postgres=# explain select id,deptno from gaotab where id=200;

QUERY
PLAN

-------------------------------------------------------------------------------

Index Only Scan using idx_id_dept on gaotab (cost=0.00..8.27 rows=1
width=8)

Index Cond: (id = 200)

(2 rows)

postgres=# explain analyze select id,deptno from gaotab where id=200;

QUERY
PLAN

--------------------------------------------------------------------------------

-----------------------------------------

Index Only Scan using idx_id_dept on gaotab (cost=0.00..8.27 rows=1
width=8) (

actual time=0.052..0.053 rows=1 loops=1)

Index Cond: (id = 200)

Heap Fetches: 1

Total runtime: 0.074 ms

(4 rows)

This time I got the total runtime of 0.074ms, obviously the explain
analyze benefit from the explain statement.

It might not be a big problem in a small system.

But when in a production environment, When I want to use explain and
then , soon use explain analyze for the same statement,

How can I avoid the influence of cache and get the right answer for
evaluating purpose?

It is not a good idea to restart the database again and again I think.

I wonder is there any method of clearing cache or even clear only a
particular part of the cache?

In my test environment, I can get the following:

postgres=# show seq_page_cost;

seq_page_cost

---------------

1

(1 row)

postgres=# show cpu_tuple_cost;

cpu_tuple_cost

----------------

0.01

(1 row)

postgres=# show cpu_operator_cost;

cpu_operator_cost

-------------------

0.0025

(1 row)

And my table is like that:

postgres=# analyze;
ANALYZE
postgres=# select a.relpages, a.reltuples,
a.relfilenode,a.reltype,b.typname from pg_class a, pg_type b where
a.relname like 'gaotab%' and a.reltype=b.oid;
relpages | reltuples | relfilenode | reltype | typname
----------+-----------+-------------+---------+---------
7 | 1000 | 16396 | 16386 | gaotab
(1 row)

Thanks in advance.

#2Laurenz Albe
laurenz.albe@cybertec.at
In reply to: 高健 (#1)
Re: How to evaluate "explain analyze" correctly soon after "explain" for the same statement ?

高健 wrote:

I have one question about the cache clearing.

If I use the following soon after database startup(or first time I use it):

postgres=# explain analyze select id,deptno from gaotab where id=200;

The result is: the above explain analyze got a total runtime of 47 ms.

But If I restart the database again, and then execute the following:

postgres=# explain select id,deptno from gaotab where id=200;

postgres=# explain analyze select id,deptno from gaotab where id=200;

This time I got the total runtime of 0.074ms, obviously the explain analyze benefit from the explain
statement.

The EXPLAIN will not have a noticable effect on the performance
of the EXPLAIN ANALYZE.

If you actually restarted the PostgreSQL server like you said,
then the difference must be that the file is cached in the file
system cache.

You can verify that be omitting the EXPLAIN in the second run.

It might not be a big problem in a small system.

But when in a production environment, When I want to use explain and then , soon use explain
analyze for the same statement,

How can I avoid the influence of cache and get the right answer for evaluating purpose?

It is not a good idea to restart the database again and again I think.

I wonder is there any method of clearing cache or even clear only a particular part of the cache?

The only way of clearing the cache in database shared memory is to
restart the server.

That's just a simple "pg_ctl restart -m fast -D <datadir>".
I think that's simple enough for tests.

You should also empty the file system cache.
On recent Linux systems that would be
sync; echo 3 > /proc/sys/vm/drop_caches
You'd have to consult the documentation for other OSs.

Yours,
Laurenz Albe

#3Tom Lane
tgl@sss.pgh.pa.us
In reply to: 高健 (#1)
Re: How to evaluate "explain analyze" correctly soon after "explain" for the same statement ?

=?UTF-8?B?6auY5YGl?= <luckyjackgao@gmail.com> writes:

It might not be a big problem in a small system.
But when in a production environment, When I want to use explain and
then , soon use explain analyze for the same statement,
How can I avoid the influence of cache and get the right answer for
evaluating purpose?

I think this question is based on a false premise. Why do you feel that
the behavior with cold caches is "the right answer", and not the behavior
with warm caches? A short-duration query like this one is not going to
be interesting at all for performance unless it's executed quite a lot,
and if it's executed quite a lot then the warm-cache result ought to be
the more representative one.

In general, trying to tune for cold-cache cases seems backwards to me.
It's much more productive to try to ensure that the caches are warm.

regards, tom lane

#4高健
luckyjackgao@gmail.com
In reply to: Tom Lane (#3)
Re: How to evaluate "explain analyze" correctly soon after "explain" for the same statement ?

Hi tom

At frist I have thought that the database parsed my explain statement,
so the pre-compiled execution plan will be re-used , which made the
statement's second run quick.

I think that what you said is right.

Thank you

2012/11/7 Tom Lane <tgl@sss.pgh.pa.us>

Show quoted text

=?UTF-8?B?6auY5YGl?= <luckyjackgao@gmail.com> writes:

It might not be a big problem in a small system.
But when in a production environment, When I want to use explain and
then , soon use explain analyze for the same statement,
How can I avoid the influence of cache and get the right answer for
evaluating purpose?

I think this question is based on a false premise. Why do you feel that
the behavior with cold caches is "the right answer", and not the behavior
with warm caches? A short-duration query like this one is not going to
be interesting at all for performance unless it's executed quite a lot,
and if it's executed quite a lot then the warm-cache result ought to be
the more representative one.

In general, trying to tune for cold-cache cases seems backwards to me.
It's much more productive to try to ensure that the caches are warm.

regards, tom lane