Re: proposal - log_full_scan - Mailing list pgsql-hackers

From Daniel Gustafsson
Subject Re: proposal - log_full_scan
Date
Msg-id 3E8BD81C-07C9-46F2-98D3-2E4EEEFB82FD@yesql.se
Whole thread Raw
In response to Re: proposal - log_full_scan  (Pavel Stehule <pavel.stehule@gmail.com>)
Responses Re: proposal - log_full_scan  (Pavel Stehule <pavel.stehule@gmail.com>)
List pgsql-hackers
Looking at this I like the idea in principle, but I'm not convinced that
auto_explain is the right tool for this.  auto_explain is for identifying slow
queries, and what you are proposing is to identify queries with a certain
"shape" (for lack of a better term) even if they aren't slow as per the
log_min_duration setting.  If log_min_duration is deemed to crude due to query
volume then sample_rate is the tool.  If sample_rate is also discarded, then
pg_stat_statements seems a better option.

Also, why just sequential scans (apart from it being this specific usecase)?
If the idea is to track aspects of execution which are deemed slow, then
tracking for example spills etc would be just as valid.  Do you have thoughts
on that?

That being said, a few comments on the patch:

-    (auto_explain_log_min_duration >= 0 && \
+    ((auto_explain_log_min_duration >= 0 || auto_explain_log_seqscan != -1) && \
Is there a reason to not follow the existing code and check for >= 0?

+    DefineCustomIntVariable("auto_explain.log_seqscan",
It's only a matter of time before another node is proposed for logging, and
then we'll be stuck adding log_XXXnode GUCs.  Is there a more future-proof way
to do this?

+    "Sets the minimum tuples produced by sequantial scans which plans will be logged",
s/sequantial/sequential/

-    es->analyze = (queryDesc->instrument_options && auto_explain_log_analyze);
+    es->analyze = (queryDesc->instrument_options && (auto_explain_log_analyze || auto_explain_log_seqscan != -1));
Turning on ANALYZE when log_analyze isn't set to True is a no-no IMO.

+ * Colllect relations where log_seqscan limit was exceeded
s/Colllect/Collect/

+    if (*relnames.data != '\0')
+        appendStringInfoString(&relnames, ",");
This should use appendStringInfoChar instead.

+    (errmsg("duration: %.3f ms, over limit seqscans: %s, plan:\n%s",
The "over limit" part is superfluous since it otherwise wouldn't be logged.  If
we're prefixing something wouldn't it be more helpful to include the limit,
like: "seqscans >= %d tuples returned:".  I'm not a fan of "seqscans" but
spelling it out is also quite verbose and this is grep-able.

Documentation and tests are also missing

--
Daniel Gustafsson        https://vmware.com/




pgsql-hackers by date:

Previous
From: Dilip Kumar
Date:
Subject: Re: [PATCH] Use optimized single-datum tuplesort in ExecSort
Next
From: Dipesh Pandit
Date:
Subject: Re: .ready and .done files considered harmful