Re: Proposal: Progressive explain - Mailing list pgsql-hackers
From | Robert Haas |
---|---|
Subject | Re: Proposal: Progressive explain |
Date | |
Msg-id | CA+Tgmobjnj0FJj-0GUSXCga3-mUe819Wj=EeZqY=QydMWGOguA@mail.gmail.com Whole thread Raw |
In response to | Re: Proposal: Progressive explain (Rafael Thofehrn Castro <rafaelthca@gmail.com>) |
Responses |
Re: Proposal: Progressive explain
|
List | pgsql-hackers |
On Fri, Mar 28, 2025 at 4:12 PM Rafael Thofehrn Castro <rafaelthca@gmail.com> wrote: > > That is correct. Interval currently is only used when instrumentation > > is enabled through progressive_explain = analyze. > > I guess there would still be a point in printing the plan on a regular interval > when instrumentation is disabled. In that case the only thing we would see > changing in the plan is the node currently being executed. But with that we > add more overhead to progressive_explain = 'explain', that in that case will > also require a custom execProcNode wrapper. I don't think that's at all worth it. I had even considered asking for the current node stuff to be removed altogether. Doing regular updates to only change that seems like a real waste. It's barely ever going to be useful to see the node being executed right this exact instant - what you care about is which nodes suck up resources over time. As far as option naming is concerned, I'm open to input from others, but personally I feel like ANALYZE is a bit opaque as an EXPLAIN option in general. We've had previous discussions about the fact that we might have wanted to name it EXECUTE if EXPLAIN EXECUTE didn't already mean something else. I think for most people, what ANALYZE means - somewhat counterintuitively - is that we're actually going to run the query. But here we're always going to run the query, so my brain just goes into a tailspin trying to figure out what ANALYZE means. So, personally, I would like to see progressive_explain = off | explain | analyze go away in favor of progressive_explain = off | on. I think we have two ideas on how to get there so far: (1) Just get rid of what you call progressive_explain = explain. After all, that amounts to just serializing the plan once, and maybe that's not actually such a great feature. With that change, then we only have two remaining values for the progressive_explain, and we can call them "off" and "on". (2) Merge progressive_explain = explain and progressive_explain = analyze into a single value, progressive_explain = on. To tell which is intended, just check progresive_explain_interval. If it's zero, then there's no repeat, so we mean what you currently call progressive_explain = explain i.e. serialize once. Otherwise we mean progressive_explain = analyze. I'm open to idea #3, but I'm pretty resistant to the idea of progressive_explain remaining three-valued as it is today. If 27 people show up to say that they understand what that means perfectly and Robert's a big fat idiot, I shall of course defer to the consensus, but I kind of think I might not be alone in finding this naming confusing. For what it's worth, I have trouble seeing how anyone gets confused if we do what I propose as #2. I mean, I agree with your point that they could misunderstand how much overhead different settings will cause, so we would need to document that carefully. But if you're just wondering about the behavior, then I feel like people have a pretty good chance of guessing that progressive_explain=on, progressive_explain_interval=0 means do it once. You could think that means do it at maximum speed, but that seems like a somewhat naive interpretation. You could also think it disables the feature, effectively negating progressive_explain=on, but then you should start to wonder why there are two GUCs. If you don't think either of those things, then I think "do it once" is the only other reasonable interpretation. Of course, sometimes what I think turns out to be completely wrong! -- Robert Haas EDB: http://www.enterprisedb.com
pgsql-hackers by date: