Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation) - Mailing list pgsql-hackers
From: Peter Geoghegan
Subject: Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)
Date:
Msg-id: CAEYLb_WuSdWferKgx3WZWbPLLTvpKkH315+nA7UnyP3PtF4gSQ@mail.gmail.com
In response to: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation) (Peter Geoghegan <peter@2ndquadrant.com>)
Responses: Re: Re: pg_stat_statements normalisation without invasive changes to the parser (was: Next steps on pg_stat_statements normalisation)
List: pgsql-hackers
On 28 March 2012 15:57, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Is there any actual benefit in providing the
> "pg_stat_statements.string_key" GUC?  It looks to me more like something
> that was thrown in because it was easy than because anybody would want
> it.  I'd just as soon leave it out and avoid the incremental API
> complexity increase.  (While on that subject, I see no documentation
> updates in the patch...)

Personally, I don't care for it, and I'm sure most users wouldn't
either, but I thought that someone somewhere might be relying on the
existing behaviour.

I will produce a doc-patch. It would have been premature to do so
until quite recently.

> Also, I'm not terribly happy with the "sticky entries" hack.  The way
> you had it set up with a 1e10 bias for a sticky entry was completely
> unreasonable IMO, because if the later pgss_store call never happens
> (which is quite possible if the statement contains an error detected
> during planning or execution), that entry is basically never going to
> age out, and will just uselessly consume a precious table slot for a
> long time.  In the extreme, somebody could effectively disable query
> tracking by filling the hashtable with variants of "SELECT 1/0".
> The best quick answer I can think of is to reduce USAGE_NON_EXEC_STICK
> to maybe 10 or so, but I wonder whether there's some less klugy way to
> get the result in the first place.  I thought about keeping the
> canonicalized query string around locally in the backend rather than
> having the early pgss_store call at all, but am not sure it's worth
> the complexity of an additional local hashtable or some such to hold
> such pending entries.

I was troubled by that too, and had considered various ways of at
least polishing the kludge. Maybe a better approach would be to start
with a usage of 1e10 (or something rather high, anyway), but apply a
much more aggressive multiplier than USAGE_DECREASE_FACTOR for sticky
entries only?
That way, in earlier calls of entry_dealloc() the sticky entries,
easily identifiable as having 0 calls, are almost impossible to evict,
but after a relatively small number of calls they soon become more
readily evictable. This seems better than simply having some much
lower usage that is only a few times the value of USAGE_INIT.

Let's suppose we set sticky entries to have a usage value of 10. If
all other queries have more than 10 calls, which is not unlikely
(under the current usage format, 1.0 usage = 1 call, at least until
entry_dealloc() dampens usage), then when we entry_dealloc(), the
sticky entry might as well have a usage of 1, and has no way of
increasing its usage short of becoming a "real" entry.

On the other hand, with the multiplier trick, how close the sticky
entry is to eviction is, importantly, far more strongly influenced by
the number of entry_dealloc() calls, which in turn is influenced by
churn in the system, rather than being largely influenced by how the
magic sticky usage value happens to compare to the usage values of
some random set of "real" entries on some random database. If entries
really are precious, then the sticky entry is freed much sooner. If
not, then why not allow the sticky entry to stick around pending its
execution/promotion to a "real" entry?

It would probably be pretty inexpensive to maintain what is currently
the largest usage value in the hash table at entry_dealloc() time -
that would likely be far more suitable than 1e10, and might even work
well. We could perhaps cut that in half every entry_dealloc().

--
Peter Geoghegan
http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training and Services