Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation) - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)
Date
Msg-id CAEYLb_XZVYksbNQ0Girs+Nnq_FpDKnYngbmS5LOLhQ=Hudoa6A@mail.gmail.com
In response to Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On 29 March 2012 21:05, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Barring objections I'll go fix this, and then this patch can be
> considered closed except for possible future tweaking of the
> sticky-entry decay rule.

The attached patch fixes a bug and tweaks sticky-entry decay.

The extant code bumps usage (though not call counts) in two hooks
(pgss_post_parse_analyze() and pgss_ExecutorEnd()), so prepared
queries will always have about half the usage of an equivalent simple
query, which is clearly not desirable. With the proposed patch,
"usage" should be similar to "calls" until the first call of
entry_dealloc(), rather than usually having a value about twice as
high. With the patch, a pgbench run with and without "-M prepared"
results in a usage of calls + 1 for each query from both runs.
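
To make the accounting concrete, this is roughly how the two hook
paths divide responsibility now (a condensed sketch rather than the
patch itself; locking, timing and the real pgssEntry layout are left
out, and the helper name is made up):

#include <stdint.h>

typedef struct Counters
{
    int64_t calls;              /* completed executions */
    double  usage;              /* weight consulted by entry_dealloc() */
} Counters;

/*
 * pgss_post_parse_analyze() only ensures that a ("sticky") entry
 * exists; it does not advance usage.  pgss_ExecutorEnd() is the one
 * place where usage moves in step with calls, so simple and prepared
 * queries accrue the same usage for the same number of executions.
 */
static void
record_execution(Counters *c)
{
    if (c->calls == 0)
        c->usage = 1.0;         /* "unstick" on the first completed execution */

    c->calls += 1;
    c->usage += 1.0;            /* hence usage == calls + 1 until decay kicks in */
}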

The approach I've taken with decay is to maintain a server-wide median
usage value (well, a convenient approximation), which is assigned to
sticky entries. That makes it hard to evict sticky entries within the
first couple of calls to entry_dealloc(). On the other hand, if there
really is contention for entries, it soon becomes easy to evict sticky
entries, because we use a much more aggressive multiplier of 0.5 for
their decay.
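
In code terms, the life of a sticky entry boils down to something like
the following (again a sketch, reusing the Counters struct from above;
the constant and helper names are illustrative rather than necessarily
what the patch uses):

#define USAGE_DECREASE_FACTOR   0.99    /* gentle decay for executed entries */
#define STICKY_DECREASE_FACTOR  0.50    /* much harsher decay for sticky entries */

/* at entry creation (cur_median_usage is maintained in entry_dealloc(),
 * shown further down): */
static void
init_usage(Counters *c, bool sticky)
{
    /* sticky entries start out at the tracked median, so the first
     * couple of entry_dealloc() passes cannot trivially push them out */
    c->usage = sticky ? cur_median_usage : 1.0;
}

/* at every entry_dealloc(): sticky entries, identified by calls == 0,
 * decay much more aggressively than entries that have executed */
static void
decay_usage(Counters *c)
{
    if (c->calls == 0)
        c->usage *= STICKY_DECREASE_FACTOR;
    else
        c->usage *= USAGE_DECREASE_FACTOR;
}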

Initially I assume, rather conservatively, that the median usage is 10,
which is a very low value considering the use of the multiplier trick.
In any case, in the real world it won't take long for entry_dealloc()
to be called and set the real median value, if in fact it actually
matters.
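
The bookkeeping needed for that is minimal; roughly (illustrative
names again):

#define ASSUMED_MEDIAN_INIT     10.0    /* deliberately low starting guess */

static double cur_median_usage = ASSUMED_MEDIAN_INIT;

/* in entry_dealloc(), once the entries array is sorted by ascending usage: */
if (nentries > 0)
    cur_median_usage = entries[nentries / 2]->counters.usage;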

You described entries as precious. That isn't quite the full picture.
You might think that pg_stat_statements will Malthusianistically burn
through pretty much as many entries as you care to give it, but I
believe that in the real world the rate at which the module burns
through them would frequently look logarithmic. In other words, after
an entry_dealloc() call the hashtable is 95% full, but it might take
rather a long time to reach 100% again - the first few freed slots are
consumed dramatically faster than the last few. The user might not
actually care if a sticky entry occupies one of their slots for a few
hours while they run an epic reporting query, even though the
hashtable is over 95% full.

The idea is to avoid evicting a sticky entry just because an
infrequent entry_dealloc() happened to occur at the wrong time, while
the least marginal of the most marginal 5% of non-sticky entries (that
is, the 5% up for eviction) happened to have a call count/usage higher
than the magic value of 3 - a scenario I find quite plausible.
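
For context, the eviction pass itself is simple: entry_dealloc() sorts
the entries by their (decayed) usage and throws away roughly the
least-used 5%, along these lines (simplified, locking and the decay
step shown earlier omitted):

qsort(entries, nentries, sizeof(pgssEntry *), entry_cmp);   /* ascending usage */

nvictims = Max(10, nentries * USAGE_DEALLOC_PERCENT / 100); /* ~5% of entries */
nvictims = Min(nvictims, nentries);

for (i = 0; i < nvictims; i++)
    hash_search(pgss_hash, &entries[i]->key, HASH_REMOVE, NULL);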

If I apply your test for dead sticky entries after the regression
tests (serial schedule) were run, my approach compares very favourably
(granted, presumably usage values were double-counted for your test,
making our results less than completely comparable).

For the purposes of this experiment, I've just commented out "if
(calls == 0) continue;" within the pg_stat_statements() function,
obviously:

postgres=# select calls = 0, count(*) from pg_stat_statements() group
by calls = 0;
-[ RECORD 1 ]-
?column? | f
count    | 959
-[ RECORD 2 ]-
?column? | t
count    | 3  <--- this includes the above query itself

postgres=# select calls = 0, count(*) from pg_stat_statements() group
by calls = 0;
-[ RECORD 1 ]-
?column? | f
count    | 960   <----now it's counted here...
-[ RECORD 2 ]-
?column? | t
count    | 2       <---- ...not here

I've also attached some elogs, in their original chronological order,
that trace the median usage as recorded at entry_dealloc() during the
regression tests. As you'd expect, given that this is the regression
tests, the median is very low, consistently between 1.9 and 2.5. An
additional factor that makes this work well is that the standard
deviation is low, so it is much easier to evict sticky entries, which
is what you want here.
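
(For what it's worth, a one-line elog in entry_dealloc(), right after
the median is recomputed, is enough to reproduce this sort of trace;
something like:

elog(LOG, "entry_dealloc(): cur_median_usage is now %f", cur_median_usage);
)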

--
Peter Geoghegan       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services

