explain plans with information about (modified) gucs - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | explain plans with information about (modified) gucs |
Date | |
Msg-id | e1791b4c-df9c-be02-edc5-7c8874944be0@2ndquadrant.com Whole thread Raw |
Responses |
Re: explain plans with information about (modified) gucs
Re: explain plans with information about (modified) gucs Re: explain plans with information about (modified) gucs Re: explain plans with information about (modified) gucs Re: explain plans with information about (modified) gucs |
List | pgsql-hackers |
Hi, every now and then I have to investigate an execution plan that is strange in some way and I can't reproduce the same behavior. Usually it's simply due to data distribution changing since the problem was observed (say, after a nightly batch load/update). In many cases it however may be due to some local GUC tweaks, usually addressing some query specific issues (say, disabling nested loops or lowering join_collapse_limit). I've repeatedly ran into cases where the GUC was not properly reset to the "regular" value, and it's rather difficult to identify this is what's happening. Or cases with different per-user settings and connection pooling (SET SESSION AUTHORIZATION / ROLE etc.). So I propose to extend EXPLAIN output with an additional option, which would include information about modified GUCs in the execution plan (disabled by default, of course): test=# explain (gucs) select * from t; QUERY PLAN -------------------------------------------------------------------- Seq Scan on t (cost=0.00..35.50 rows=2550 width=4) GUCs: application_name = 'x', client_encoding = 'UTF8', cpu_tuple_cost = '0.01' (2 rows) Of course, this directly applies to auto_explain too, which gets a new option log_gucs. The patch is quite trivial, but there are about three open questions: 1) names of the options I'm not particularly happy with calling the option "gucs" - it's an acronym and many users have little idea what GUC stands for. So I think a better name would be desirable, but I'm not sure what would that be. Options? Parameters? 2) format of output At this point the names/values are simply formatted into a one-line string. That's not particularly readable, and it's not very useful for the YAML/JSON formats I guess. So adding each modified GUC as an extra text property would be better. 3) identifying modified (and interesting) GUCs We certainly don't want to include all GUCs, so the question is how to decide which GUCs are interesting. The simplest approach would be to look for GUCs that changed in the session (source == PGC_S_SESSION), but that does not help with SET SESSION AUTHORIZATION / ROLE cases. So we probably want (source > PGC_S_ARGV), but probably not PGC_S_OVERRIDE because that includes irrelevant options like wal_buffers etc. For now I've used /* return only options that were modified (not as in config file) */ if ((conf->source <= PGC_S_ARGV) || (conf->source == PGC_S_OVERRIDE)) continue; which generally does the right thing, although it also includes stuff like application_name or client_encoding. But perhaps it'd be better to whitelist the GUCs in some way, because some of the user-defined GUCs may be sensitive and should not be included in plans. Opinions? regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
pgsql-hackers by date: