Re: [PATCH] Incremental sort (was: PoC: Partial sort) - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: [PATCH] Incremental sort (was: PoC: Partial sort) |
Date | |
Msg-id | 20200331011407.vwz3ektatgqyi6vy@development Whole thread Raw |
In response to | Re: [PATCH] Incremental sort (was: PoC: Partial sort) (James Coleman <jtc331@gmail.com>) |
Responses |
Re: [PATCH] Incremental sort (was: PoC: Partial sort)
|
List | pgsql-hackers |
On Mon, Mar 30, 2020 at 06:53:47PM -0400, James Coleman wrote: >On Mon, Mar 30, 2020 at 8:24 AM Tomas Vondra ><tomas.vondra@2ndquadrant.com> wrote: >> >> On Sun, Mar 29, 2020 at 10:16:53PM -0400, James Coleman wrote: >> >On Sun, Mar 29, 2020 at 9:44 PM Tomas Vondra >> ><tomas.vondra@2ndquadrant.com> wrote: >> >> >> >> Hi, >> >> >> >> Attached is a slightly reorganized patch series. I've merged the fixes >> >> into the appropriate matches, and I've also combined the two patches >> >> adding incremental sort paths to additional places in planner. >> >> >> >> A couple more comments: >> >> >> >> >> >> 1) I think the GUC documentation in src/sgml/config.sgml is a bit too >> >> detailed, compared to the other enable_* GUCs. I wonder if there's a >> >> better place where to move the details. What about adding some examples >> >> and explanation to perform.sgml? >> > >> >I'll take a look at that and include in a patch series tomorrow. > >Attached. > >> >> 2) Looking at the explain output, the verbose mode looks like this: >> >> >> >> test=# explain (verbose, analyze) select a from t order by a, b, c; >> >> QUERY PLAN >> >> -------------------------------------------------------------------------------------------------------------------------------------------------------------- >> >> Gather Merge (cost=66.31..816072.71 rows=8333226 width=24) (actual time=4.787..20092.555 rows=10000000 loops=1) >> >> Output: a, b, c >> >> Workers Planned: 2 >> >> Workers Launched: 2 >> >> -> Incremental Sort (cost=66.28..729200.36 rows=4166613 width=24) (actual time=1.308..14021.575 rows=3333333loops=3) >> >> Output: a, b, c >> >> Sort Key: t.a, t.b, t.c >> >> Presorted Key: t.a, t.b >> >> Full-sort Groups: 4169 Sort Method: quicksort Memory: avg=30kB peak=30kB >> >> Presorted Groups: 4144 Sort Method: quicksort Memory: avg=128kB peak=138kB >> >> Worker 0: actual time=0.766..16122.368 rows=3841573 loops=1 >> >> Full-sort Groups: 6871 Sort Method: quicksort Memory: avg=30kB peak=30kB >> >> Presorted Groups: 6823 Sort Method: quicksort Memory: avg=132kB peak=141kB >> >> Worker 1: actual time=1.986..16189.831 rows=3845490 loops=1 >> >> Full-sort Groups: 6874 Sort Method: quicksort Memory: avg=30kB peak=30kB >> >> Presorted Groups: 6847 Sort Method: quicksort Memory: avg=130kB peak=139kB >> >> -> Parallel Index Scan using t_a_b_idx on public.t (cost=0.43..382365.92 rows=4166613 width=24) (actualtime=0.040..9808.449 rows=3333333 loops=3) >> >> Output: a, b, c >> >> Worker 0: actual time=0.048..11275.178 rows=3841573 loops=1 >> >> Worker 1: actual time=0.041..11314.133 rows=3845490 loops=1 >> >> Planning Time: 0.166 ms >> >> Execution Time: 25135.029 ms >> >> (22 rows) >> >> >> >> There seems to be missing indentation for the first line of worker info. >> > >> >Working on that too. > >See attached. I've folded in the original "explain fixes" patch into >the main series, and the "explain fixes" patch in this series contains >only the changes for the above. > Thanks. I'll take a look at those changes tomorrow. >> >> I'm still not quite convinced we should be printing two lines - I know >> >> you mentioned the lines might be too long, but see how long the other >> >> lines may get ... >> > >> >All right, I give in :) >> > >> >Do you think non-workers (both the leader and non-parallel plans) >> >should also move to one line? >> > >> >> I think we should use the same formatting for both cases, so yes. >> >> FWIW I forgot to mention I tweaked the INSTRUMENT_SORT_GROUP macro a >> bit, by moving the if condition in it. That makes the calls easier. > >Ah, that actually fixed some of the compile warnings. The other is >fixed in my explain fixes patch. > >> >> 3) I see the new nodes (plan state, ...) have "presortedCols" which does >> >> not indicate it's a "number of". I think we usually prefix names of such >> >> fields "n" or "num". What about "nPresortedCols"? (Nitpicking, I know.) >> > >> >I can fix this too. > >Changed everywhere we used this var name. I'm tempted to change to >nPresortedKeys, but a cursory glance suggests some cases might >actually be consistent with other var names reference columns, so I'm >not sure if we want to go down that path (and change more than just >this). > Not sure. We use "sort keys" and "path keys" for this, but I think "columns" is good enough. The main thing I've been working on today is benchmarking how this affects planning. And I'm seeing a regression that worries me a bit, unfortunately. The test I'm doing is pretty simple - build a small table with a bunch of columns: create table t (a int, b int, c int, d int, e int, f int, g int); insert into t select 100*random(), 100*random(), 100*random(), 100*random(), 100*random(), 100*random(), 100*random() from generate_series(1,100000) s(i); and then a number of indexes on subsets of up to 3 columns, as generated using the attached build-indexes.py script. And then run a bunch of explains (so no actual execution) sorting the data by at least 4 columns (to trigger incremental sort paths), measuring timing of the script. I did a bunch of runs on current master and v46 with incremental sort disabled and enabled, and the results look like this: master off on -------------------------- 34.609 37.463 37.729 which means about 8-9% regression with incremental sort. Of course, this is only for planning time, for execution the impact is going to be much smaller. But it's still a bit annoying. I've suspected this might be either due to the add_partial_path changes or the patch adding incremental sort to additional places, so I tested those parts individually and the answer is no - add_partial_path changes have very small impact (~1%, which might be noise). The regression comes mostly from the 0002 part that adds incremental sort. At least in this particular test - different tests might behave differently, of course. The annoying bit is that the overhead does not disappear after disabling incremental sort. That suggests this is not merely due to considering and creating higher number of paths, but due to something that happens before we even look at the GUC ... I think I've found one such place - if you look at compare_pathkeys, it has this check right before the forboth() loop: if (keys1 == keys2) return PATHKEYS_EQUAL; But with incremental sort we don't call pathkeys_contained_in, we call pathkeys_common_contained_in instead. And that does not call compare_pathkeys and does not have the simple equality check. Adding the following check seems to cut the overhead in half, which is nice: if (keys1 == keys2) { *n_common = list_length(keys1); return true; } Not sure where the rest of the regression comes from yet. Also, while looking at pathkeys_common_contained_in(), I've been a bit puzzled why does is this correct: return (key1 == NULL); It's easy to not notice it's key1 and not keys1, so I suggest we add a comment 'key1 == NULL' means we've processed whole keys1 list. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
pgsql-hackers by date: