Re: How to measure query time - with warm up and cached data - Mailing list pgsql-general

From Neto pr
Subject Re: How to measure query time - with warm up and cached data
Date
Msg-id CA+TZvYJpiLDZ88rd0gp6WviX_ti1NRcDjG3xw6OfL4+_DBS40A@mail.gmail.com
Whole thread Raw
In response to Re: How to measure query time - with warm up and cached data  ("Peter J. Holzer" <hjp-pgsql@hjp.at>)
List pgsql-general


2018-01-21 13:53 GMT-08:00 Peter J. Holzer <hjp-pgsql@hjp.at>:
On 2018-01-21 12:45:54 -0800, Neto pr wrote:
> I need to know the actual execution time of a query, but considering that the
> data is already cached. I also need to make sure that cached data from other
> queries is cleared.
> I believe that in order to know the real time of a query it will be necessary
> to "warm up" the data to be inserted in cache.
>
> Below are the steps suggested by a DBA for me:
>
> Step 1- run ANALYZE on all tables involved before the test;
> Step 2- restart the DBMS (to clear the DBMS cache);
> Step 3- erase the S.O. cache;

Did you mean "OS cache" (operating system cache)?


Yes, Operating System cache...  S.O. = Sistema Operacional in portuguese,  it was a translation error!!

To restart the DBMS and clear the cache of O.S. I execute this commands in linux Debian8.

/etc/init.d/pgsql stop
sync

echo "clear cache !!!!!!"

echo 3 > /proc/sys/vm/drop_caches
/etc/init.d/pgsql start
 

> Step 4- execute at least 5 times the same query.
>
> After the actual execution time of the query, it would have to take the time of
> the query that is in the "median" among all.

If you do this, clearing the caches before the tests will probably have little
effekt. The first query will fill the cache with the data needed for
your query (possibly evicting other data) and the next 4 will work on
the cached data.

Yes, I believe that the first execution can be discarded, because the data is accommodating in the cache ... the ideal is considered only the others after the first one.
 
Whether the cache was empty or full before the first
query will make little difference to the median, because the first query
will almost certainly be discarded as an outlier.

Flushing out caches is very useful if you want to measure performance
without caches (e.g. if you want to determine what the performance
impact of a server reboot is).


> Example:
>
> Execution 1: 07m 58s
> Execution 2: 14m 51s
> Execution 3: 17m 59s
> Execution 4: 17m 55s
> Execution 5: 17m 07s

Are these real measurements or did you make them up? They look weird.
Normally the first run is by far the slowest, then the others are very
similar, sometimes with a slight improvement (especially between the 2nd
and 3rd). But in your case it is just the opposite.


Yes, they are real information from TPC-H query 9.
I can not understand why in several tests I have done here, the first execution is executed faster, even without indexes, and theoretically without cache.

If someone wants to see the execution plans and other information the worksheet with results is at the following link:
https://sites.google.com/site/eletrolareshop/repository/Result_80gb-SSD-10_exec_v4.ods

I thought it was because my CPU was working with variance .. but I configured the BIOS it as " OS Control"  and in " Performance" CPU mode in Linux Debian8. See below:
-----------------------------------------------------------
user1@hp110deb8:~/Desktop$ cpufreq-info | grep 'current CPU fr' current CPU frequency is 2.80 GHz. current CPU frequency is 2.80 GHz. current CPU frequency is 2.80 GHz. current CPU frequency is 2.80 GHz.
--------------------------------------
Apparently the processor is not working variably now.
Any idea why the first execution can be faster in many cases?

Best Regards
Neto


 

 
> [cleardot]

Sending Webbugs to a mailinglist?

        hp

--
   _  | Peter J. Holzer    | we build much bigger, better disasters now
|_|_) |                    | because we have much more sophisticated
| |   | hjp@hjp.at         | management tools.
__/   | http://www.hjp.at/ | -- Ross Anderson <https://www.edge.org/>

pgsql-general by date:

Previous
From: "Peter J. Holzer"
Date:
Subject: Re: How to measure query time - with warm up and cached data
Next
From: Michael Paquier
Date:
Subject: Re: Best non-networked front end for postgresql