Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation) - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)
Date
Msg-id 3482.1332976483@sss.pgh.pa.us
In response to Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)  (Peter Geoghegan <peter@2ndquadrant.com>)
Responses Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)
List pgsql-hackers
Peter Geoghegan <peter@2ndquadrant.com> writes:
> On 28 March 2012 15:57, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Is there any actual benefit in providing the
>> "pg_stat_statements.string_key" GUC?  It looks to me more like something
>> that was thrown in because it was easy than because anybody would want
>> it.  I'd just as soon leave it out and avoid the incremental API
>> complexity increase.  (While on that subject, I see no documentation
>> updates in the patch...)

> Personally, I don't care for it, and I'm sure most users wouldn't
> either, but I thought that someone somewhere might be relying on the
> existing behaviour.

Hearing no squawks, I will remove it from the committed patch; one
less thing to document.  Easy enough to put back later, if someone
makes a case for it.

>> Also, I'm not terribly happy with the "sticky entries" hack.

> I was troubled by that too, and had considered various ways of at
> least polishing the kludge. Maybe a better approach would be to start
> with a usage of 1e10 (or something rather high, anyway), but apply a
> much more aggressive multiplier than USAGE_DECREASE_FACTOR for sticky
> entries only? That way, in earlier calls of entry_dealloc() the sticky
> entries, easily identifiable as having 0 calls, are almost impossible
> to evict, but after a relatively small number of calls they soon
> become more readily evictable.

I did some simple experiments with the regression tests.  Now, those
tests are by far a worst case for this sort of thing, since (a) they
probably generate many more unique queries than a typical production
application, and (b) they almost certainly provoke many more errors
and hence more dead sticky entries than a typical production app.
Nonetheless, the results look pretty bad.  Using various values of
USAGE_NON_EXEC_STICK, the numbers of useful and dead entries in the hash
table after completing one round of regression tests were:
STICK        live entries    dead sticky entries

10.0         780             190
5.0          858             112
4.0          874             96
3.0          911             62
2.0          918             43

I did not bother measuring 1e10 ;-).  It's clear that sticky entries
are forcing useful entries out of the table in this scenario.
I think wasting more than about 10% of the table in this way is not
acceptable.
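
As a rough check on the 10% figure, the wasted share can be worked out from the measurements above. This is only a sketch: it assumes "wastage" means dead / (live + dead), which the message does not spell out, and the function name wasted_fraction is hypothetical.

```python
# (STICK, live entries, dead sticky entries) as reported in this message.
RESULTS = [
    (10.0, 780, 190),
    (5.0, 858, 112),
    (4.0, 874, 96),
    (3.0, 911, 62),
    (2.0, 918, 43),
]


def wasted_fraction(live, dead):
    """Share of occupied hash table entries that are dead sticky entries."""
    # Assumption: the denominator is the number of occupied entries,
    # not the configured table size.
    return dead / (live + dead)


if __name__ == "__main__":
    for stick, live, dead in RESULTS:
        print(f"{stick:5.1f}  {wasted_fraction(live, dead):6.1%}")
```

Under that assumption the rows range from about 4.5% (STICK 2.0) to about 19.6% (STICK 10.0).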

I'm planning to commit the patch with a USAGE_NON_EXEC_STICK value
of 3.0, which is the largest value that stays below 10% wastage.
We can twiddle that logic later, so if you want to experiment with an
alternate decay rule, feel free.
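
For anyone who wants to try an alternate decay rule, here is a toy Python model of the scheme suggested in the quoted text: sticky entries (those with 0 calls) start at a very high usage and are decayed by a much more aggressive multiplier than ordinary entries. It is a sketch only, not the pg_stat_statements C code. STICKY_DECREASE_FACTOR, the 0.5 and 0.99 values, and the eviction threshold of 1.0 are all assumptions made for illustration.

```python
# Toy model of the proposed decay rule; not the actual entry_dealloc() code.
USAGE_DECREASE_FACTOR = 0.99    # name from the thread; value assumed here
STICKY_DECREASE_FACTOR = 0.5    # hypothetical "much more aggressive" multiplier
STICKY_INITIAL_USAGE = 1e10     # starting usage floated in the quoted proposal


def decay(usage, calls):
    """Apply one entry_dealloc()-style decay pass to a single entry."""
    # Entries with 0 calls are the sticky ones; they decay much faster.
    factor = STICKY_DECREASE_FACTOR if calls == 0 else USAGE_DECREASE_FACTOR
    return usage * factor


def passes_until_below(threshold, usage=STICKY_INITIAL_USAGE, calls=0):
    """Count decay passes until usage falls below an assumed threshold."""
    passes = 0
    while usage >= threshold:
        usage = decay(usage, calls)
        passes += 1
    return passes


if __name__ == "__main__":
    print(passes_until_below(1.0))           # sticky entry, fast decay
    print(passes_until_below(1.0, calls=1))  # same start, ordinary decay
```

With these assumed numbers a never-executed entry falls below 1.0 after 34 passes, while the same starting usage under the ordinary multiplier takes far longer, which is the "hard to evict at first, easy soon after" behaviour the proposal is after.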
        regards, tom lane

