Thread: How to evaluate "explain analyze" correctly soon after "explain" for the same statement ?
How to evaluate "explain analyze" correctly soon after "explain" for the same statement ?
Hi all:
I have one question about the cache clearing.
If I use the following soon after database startup(or first time I use it):
postgres=# explain analyze select id,deptno from gaotab where id=200;
QUERY PLAN
--------------------------------------------------------------------------------
-------------------------------------------
Index Only Scan using idx_id_dept on gaotab (cost=0.00..8.27 rows=1 width=8) (
actual time=30.912..30.915 rows=1 loops=1)
Index Cond: (id = 200)
Heap Fetches: 1
Total runtime: 47.390 ms
(4 rows)
postgres=#
The result is: the above explain analyze got a total runtime of 47 ms.
But If I restart the database again, and then execute the following:
postgres=# explain select id,deptno from gaotab where id=200; | |||||||
QUERY PLAN | |||||||
------------------------------------------------------------------------------- | |||||||
Index Only Scan using idx_id_dept on gaotab (cost=0.00..8.27 rows=1 width=8) | |||||||
Index Cond: (id = 200) | |||||||
(2 rows) | |||||||
postgres=# explain analyze select id,deptno from gaotab where id=200; | |||||||
QUERY PLAN | |||||||
| |||||||
-------------------------------------------------------------------------------- | |||||||
----------------------------------------- | |||||||
Index Only Scan using idx_id_dept on gaotab (cost=0.00..8.27 rows=1 width=8) ( | |||||||
actual time=0.052..0.053 rows=1 loops=1) | |||||||
Index Cond: (id = 200) | |||||||
Heap Fetches: 1 | |||||||
Total runtime: 0.074 ms | |||||||
(4 rows) |
This time I got the total runtime of 0.074ms, obviously the explain analyze benefit from the explain statement.
It might not be a big problem in a small system.
But when in a production environment, When I want to use explain and then , soon use explain analyze for the same statement,
How can I avoid the influence of cache and get the right answer for evaluating purpose?
It is not a good idea to restart the database again and again I think.
I wonder is there any method of clearing cache or even clear only a particular part of the cache?
In my test environment, I can get the following:
postgres=# show seq_page_cost;
seq_page_cost
---------------
1
(1 row)
postgres=# show cpu_tuple_cost;
cpu_tuple_cost
----------------
0.01
(1 row)
postgres=# show cpu_operator_cost;
cpu_operator_cost
-------------------
0.0025
(1 row)
And my table is like that:
postgres=# analyze;
ANALYZE
postgres=# select a.relpages, a.reltuples, a.relfilenode,a.reltype,b.typname from pg_class a, pg_type b where a.relname like 'gaotab%' and a.reltype=b.oid; relpages | reltuples | relfilenode | reltype | typname
----------+-----------+-------------+---------+--------- 7 | 1000 | 16396 | 16386 | gaotab
(1 row)
Thanks in advance.
Re: How to evaluate "explain analyze" correctly soon after "explain" for the same statement ?
高健 wrote: > I have one question about the cache clearing. > > If I use the following soon after database startup(or first time I use it): > > postgres=# explain analyze select id,deptno from gaotab where id=200; > The result is: the above explain analyze got a total runtime of 47 ms. > > But If I restart the database again, and then execute the following: > > postgres=# explain select id,deptno from gaotab where id=200; > postgres=# explain analyze select id,deptno from gaotab where id=200; > This time I got the total runtime of 0.074ms, obviously the explain analyze benefit from the explain > statement. The EXPLAIN will not have a noticable effect on the performance of the EXPLAIN ANALYZE. If you actually restarted the PostgreSQL server like you said, then the difference must be that the file is cached in the file system cache. You can verify that be omitting the EXPLAIN in the second run. > It might not be a big problem in a small system. > > But when in a production environment, When I want to use explain and then , soon use explain > analyze for the same statement, > > How can I avoid the influence of cache and get the right answer for evaluating purpose? > > > > It is not a good idea to restart the database again and again I think. > > I wonder is there any method of clearing cache or even clear only a particular part of the cache? The only way of clearing the cache in database shared memory is to restart the server. That's just a simple "pg_ctl restart -m fast -D <datadir>". I think that's simple enough for tests. You should also empty the file system cache. On recent Linux systems that would be sync; echo 3 > /proc/sys/vm/drop_caches You'd have to consult the documentation for other OSs. Yours, Laurenz Albe
Re: How to evaluate "explain analyze" correctly soon after "explain" for the same statement ?
=?UTF-8?B?6auY5YGl?= <luckyjackgao@gmail.com> writes: > It might not be a big problem in a small system. > But when in a production environment, When I want to use explain and > then , soon use explain analyze for the same statement, > How can I avoid the influence of cache and get the right answer for > evaluating purpose? I think this question is based on a false premise. Why do you feel that the behavior with cold caches is "the right answer", and not the behavior with warm caches? A short-duration query like this one is not going to be interesting at all for performance unless it's executed quite a lot, and if it's executed quite a lot then the warm-cache result ought to be the more representative one. In general, trying to tune for cold-cache cases seems backwards to me. It's much more productive to try to ensure that the caches are warm. regards, tom lane
Re: How to evaluate "explain analyze" correctly soon after "explain" for the same statement ?
高健 <luckyjackgao@gmail.com> writes:I think this question is based on a false premise. Why do you feel that
> It might not be a big problem in a small system.
> But when in a production environment, When I want to use explain and
> then , soon use explain analyze for the same statement,
> How can I avoid the influence of cache and get the right answer for
> evaluating purpose?
the behavior with cold caches is "the right answer", and not the behavior
with warm caches? A short-duration query like this one is not going to
be interesting at all for performance unless it's executed quite a lot,
and if it's executed quite a lot then the warm-cache result ought to be
the more representative one.
In general, trying to tune for cold-cache cases seems backwards to me.
It's much more productive to try to ensure that the caches are warm.
regards, tom lane