Home > mailing lists

Re: proposal - log_full_scan - Mailing list pgsql-hackers

From	Daniel Gustafsson
Subject	Re: proposal - log_full_scan
Date	July 6, 2021 14:07:33
Msg-id	3E8BD81C-07C9-46F2-98D3-2E4EEEFB82FD@yesql.se Whole thread Raw
In response to	Re: proposal - log_full_scan (Pavel Stehule <pavel.stehule@gmail.com>)
Responses	Re: proposal - log_full_scan
List	pgsql-hackers

Tree view

Looking at this I like the idea in principle, but I'm not convinced that
auto_explain is the right tool for this.  auto_explain is for identifying slow
queries, and what you are proposing is to identify queries with a certain
"shape" (for lack of a better term) even if they aren't slow as per the
log_min_duration setting.  If log_min_duration is deemed to crude due to query
volume then sample_rate is the tool.  If sample_rate is also discarded, then
pg_stat_statements seems a better option.

Also, why just sequential scans (apart from it being this specific usecase)?
If the idea is to track aspects of execution which are deemed slow, then
tracking for example spills etc would be just as valid.  Do you have thoughts
on that?

That being said, a few comments on the patch:

-    (auto_explain_log_min_duration >= 0 && \
+    ((auto_explain_log_min_duration >= 0 || auto_explain_log_seqscan != -1) && \
Is there a reason to not follow the existing code and check for >= 0?

+    DefineCustomIntVariable("auto_explain.log_seqscan",
It's only a matter of time before another node is proposed for logging, and
then we'll be stuck adding log_XXXnode GUCs.  Is there a more future-proof way
to do this?

+    "Sets the minimum tuples produced by sequantial scans which plans will be logged",
s/sequantial/sequential/

-    es->analyze = (queryDesc->instrument_options && auto_explain_log_analyze);
+    es->analyze = (queryDesc->instrument_options && (auto_explain_log_analyze || auto_explain_log_seqscan != -1));
Turning on ANALYZE when log_analyze isn't set to True is a no-no IMO.

+ * Colllect relations where log_seqscan limit was exceeded
s/Colllect/Collect/

+    if (*relnames.data != '\0')
+        appendStringInfoString(&relnames, ",");
This should use appendStringInfoChar instead.

+    (errmsg("duration: %.3f ms, over limit seqscans: %s, plan:\n%s",
The "over limit" part is superfluous since it otherwise wouldn't be logged.  If
we're prefixing something wouldn't it be more helpful to include the limit,
like: "seqscans >= %d tuples returned:".  I'm not a fan of "seqscans" but
spelling it out is also quite verbose and this is grep-able.

Documentation and tests are also missing

--
Daniel Gustafsson        https://vmware.com/

pgsql-hackers by date:

From: Dilip Kumar
Date: 06 July 2021, 13:39:38
Subject: Re: [PATCH] Use optimized single-datum tuplesort in ExecSort

From: Dipesh Pandit
Date: 06 July 2021, 14:38:42
Subject: Re: .ready and .done files considered harmful

Re: proposal - log_full_scan - Mailing list pgsql-hackers

Previous

Next