Re: [PATCH] Incremental sort (was: PoC: Partial sort) - Mailing list pgsql-hackers

From James Coleman
Subject Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Date
Msg-id CAAaqYe-BV=kuMmwHCnOfdAG_mKjCo=41MeFrdtBjB+iXd6-FJg@mail.gmail.com
Whole thread Raw
In response to Re: [PATCH] Incremental sort (was: PoC: Partial sort)  (James Coleman <jtc331@gmail.com>)
Responses Re: [PATCH] Incremental sort (was: PoC: Partial sort)  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List pgsql-hackers
On Sat, Mar 7, 2020 at 5:47 PM James Coleman <jtc331@gmail.com> wrote:
On Tue, Jan 21, 2020 at 9:37 AM James Coleman <jtc331@gmail.com> wrote:
That being said, the patch also needs some more work on improving
EXPLAIN ANALYZE output (perhaps min/max/mean or median of
memory usage number of groups in each sort mode), and I think it's far
more feasible that I can tackle that piecemeal before the next CF.

James

I'm attaching a rebased patch revision + a new commit that reworks EXPLAIN output. I left that patch as separate for now so that it was easy enough to see the difference, and so that as Tomas is working on stuff in parallel I don't unnecessarily cause merge conflicts for now, but next patch revision (assuming the EXPLAIN change looks good) can just incorporate it into the base patch.

Here's what I've changed:

- The stats necessary for ANALYZE are now only kept if the PlanState has a non-null instrument field (thanks to Tom for pointing out this as the correct way to check that ANALYZE is in flight). I did leave lines like `node->incsort_info.fullsortGroupInfo.groupCount++;` unguarded by that `if` since it seems like practically zero overhead (and almost equal to check the condition), but if anyone disagrees, I'm happy to change it. Additionally those lines (if ANALYZE is not in flight) are technically operating on variables that haven't explicitly been initialized in the Init function; please tell me if that's actually an issue given the are counters and we won't be using them in that case.

And...I discovered that I need to do this anyway. Basically, the originally patch stored per-worker instrumentation information on every tuple fetch, which is unnecessary, and in my haste refactoring I'd replaced that spot with my code to better record stats. The original patch just looked at the last tuple sort state, as I'd mentioned previously, so didn't have any special instrumentation in non-parallel workers other than incrementing group counters.

But obviously we don't want to record stats from every tuple, we want to record sort info every time we finalize a sort. And so I've replaced the group counter increment lines with calls to a newly broken out function to record stats for the appropriate fullsort/prefixsort group info.

I came across while adding tests for EXPLAIN ANALYZE and saw a result with the reported average memory usage higher than the max--this happened since I was adding the memory used each time through the loop rather than once when finalizing the sort.
 
- A good bit of cleanup on how parallel workers are output (I believe there was some duplicative group opening and also inconsistent text output with other multi-worker explain nodes). I haven't had a chance to test this yet, thought, so there could be bugs.

Note: I still haven't had time to test parallel plans with the updated EXPLAIN, so there aren't tests for that either.

- I also left a TODO wondering if we should break out the instrumentation into a separate function; it seems like a decent sized chunk of cleanly extractable code; I suppose that's always a bit of personal preference, so anyone who wants to weigh in gets a vote :)

I ended up having to do this anyway, for reasons described above.

See new version attached (still with EXPLAIN changes as a separate patch file).

James
Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: range_agg
Next
From: James Coleman
Date:
Subject: Re: Allow to_date() and to_timestamp() to accept localized names