Thread: PostgreSQL's query caching behaviour

PostgreSQL's query caching behaviour

From
Jan-Peter.Seifert@gmx.de
Date:
Hello,

I've a question regarding caching of results in the PostgreSQL-Server.

As expected a:

SELECT COUNT(*) FROM <table>;

causes a sequential scan of the table. The table I tested this on has about 345 000 tuples. This and any following run
tookabout 50-60 seconds on a test system (pgAdminIII 1.8.4 + Windows xp sp3 + Pg v8.3.5). 

"Aggregate  (cost=179437.32..179437.33 rows=1 width=0) (actual time=44632.925..44632.927 rows=1 loops=1)"
"  ->  Seq Scan on table  (cost=0.00..178576.45 rows=344345 width=0) (actual time=13316.145..43815.699 rows=344345
loops=1)"
"Total runtime: 44633.150 ms"

On the other hand this just took 3-7 seconds:

SELECT COUNT(*) FROM <table> WHERE year = '2008';

The index used is on year and two other columns. No surprise here as well. However, every subsequent run and with
differentyears does not even take a second: 

Aggregate  (cost=71684.22..71684.23 rows=1 width=0) (actual time=13.071..13.074 rows=1 loops=1)
  ->  Bitmap Heap Scan on table  (cost=987.71..71618.62 rows=26239 width=0) (actual time=1.285..7.883 rows=2214
loops=1)
        Recheck Cond: ((year)::text = '2008'::text)
        ->  Bitmap Index Scan on absidx  (cost=0.00..981.15 rows=26239 width=0) (actual time=0.994..0.994 rows=2214
loops=1)
              Index Cond: ((year)::text = '2008'::text)
Total runtime: 13.308 ms

The server does remember even after a stop and restart as the query still doesn't need more than a second then.

I want to compare the performance of two different implementations of an application on the same database. So I want to
flushthe 'cache' responsible for this between the runs. 

This caching seems to come with increasing work_mem (from 1MB to 2MB). A different server not showing this behaviour
didso after increasing the work_mem as well (and max_stack_depth). 

Doing VACUUM ANALYZE did 'help' a bit it seems as the query took substantially longer again, but I want to make sure.

My next guess was the Statistics Collector:
http://www.postgresql.org/docs/8.3/interactive/monitoring-stats.html

Neither pg_stat_clear_snapshot() nor pg_stat_reset() did had any effect though.

So how do I reset the server / cache reliably?

Thank you very much for any hints in advance,

Peter Seifert
--
Psssst! Schon vom neuen GMX MultiMessenger gehört? Der kann`s mit allen: http://www.gmx.net/de/go/multimessenger

Re: PostgreSQL's query caching behaviour

From
Chander Ganesan
Date:
Jan-Peter.Seifert@gmx.de wrote:
> Hello,
>
> I've a question regarding caching of results in the PostgreSQL-Server.
>
If you restart the server it will clear out the shared buffer pool, and
subsequent queries would show the appropriate delay..  You can clear out
the OS disk cache (on linux) using the commands:

sync
echo 1 > /proc/sys/vm/drop_caches

That's likely what you are missing.

--
Chander Ganesan
Open Technology Group, Inc.
One Copley Parkway, Suite 210
Morrisville, NC  27560
919-463-0999/877-258-8987
http://www.otg-nc.com
Expert PostgreSQL, PostGIS, MapServer, Python, PHP, Django and other Open Source Training.