Re: [PATCH] Incremental sort (was: PoC: Partial sort) - Mailing list pgsql-hackers
From | James Coleman |
---|---|
Subject | Re: [PATCH] Incremental sort (was: PoC: Partial sort) |
Date | |
Msg-id | CAAaqYe9ZCNrF8Q-Ouy1Bg4DPbb2+7bBLH=OAdOQYAkgkQi0+=A@mail.gmail.com Whole thread Raw |
In response to | Re: [PATCH] Incremental sort (was: PoC: Partial sort) (James Coleman <jtc331@gmail.com>) |
Responses |
Re: [PATCH] Incremental sort (was: PoC: Partial sort)
|
List | pgsql-hackers |
On Mon, Apr 6, 2020 at 7:09 PM James Coleman <jtc331@gmail.com> wrote: > > On Mon, Apr 6, 2020 at 6:13 PM Tomas Vondra > <tomas.vondra@2ndquadrant.com> wrote: > > > > On Mon, Apr 06, 2020 at 05:47:48PM -0400, James Coleman wrote: > > >On Mon, Apr 6, 2020 at 5:40 PM Tomas Vondra > > ><tomas.vondra@2ndquadrant.com> wrote: > > >> > > >> On Mon, Apr 06, 2020 at 11:12:32PM +0200, Tomas Vondra wrote: > > >> >On Mon, Apr 06, 2020 at 04:54:38PM -0400, Alvaro Herrera wrote: > > >> >>On 2020-Apr-06, Tom Lane wrote: > > >> >> > > >> >>>Locally, things pass without force_parallel_mode, but turning it on > > >> >>>produces failures that look similar to rhinoceros's (didn't examine > > >> >>>other BF members). > > >> >> > > >> >>FWIW I looked at the eight failures there were about fifteen minutes ago > > >> >>and they were all identical. I can confirm that, in my laptop, the > > >> >>tests work without that GUC, and fail in exactly that way with it. > > >> >> > > >> > > > >> >Yes, there's a thinko in show_incremental_sort_info() and it returns too > > >> >soon. I'll push a fix in a minute. > > >> > > > >> > > >> OK, I've pushed a fix - this should make the buildfarm happy again. > > >> > > >> It however seems to me a bit more needs to be done. The fix makes > > >> show_incremental_sort_info closer to show_sort_info, but not entirely > > >> because IncrementalSortState does not have sort_Done flag so it still > > >> depends on (fullsortGroupInfo->groupCount > 0). I haven't noticed that > > >> before, but not having that flag seems a bit weird to me. > > >> > > >> It also seems possibly incorrect - we may end up with > > >> > > >> fullsortGroupInfo->groupCount == 0 > > >> prefixsortGroupInfo->groupCount > 0 > > >> > > >> but we won't print anything. > > > > > >This shouldn't ever be possible, because the only way we get any > > >prefix groups at all is if we've already sorted a full sort group > > >during the mode transition. > > > > > >> James, any opinion on this? I'd say we should restore the sort_Done flag > > >> and make it work as in plain Sort. Or some comment explaining why > > >> depending on the counts is OK (assuming it is). > > > > > >There's previous email traffic on this thread about that (I can look > > >it up later this evening), but the short of it is that I believe that > > >relying on the group count is actually more correct than a sort_Done > > >flag in the case of incremental sort (in contrast to regular sort). > > > > > > > OK. Maybe we should add a comment to explain.c saying it's OK. > > > > I've pushed a fix for failures due to different planned workers (in the > > test I added to show changes due to add_partial_path tweaks). > > > > It seems we're not out of the woods yet, though. rhinoceros and > > sidewinder failed with something like this: > > > > Sort Method: quicksort Memory: NNkB > > + Sort Method: unknown Disk: NNkB > > > > Would you mind investigating at it? > > I assume that means those build farm members run with very low > work_mem? Is it an acceptable fix to adjust work_mem up a bit just for > these tests? Or is that bad practice and these are to expose issues > with changing into disk sort mode? On rhinoceros I see: ================== pgsql.build/src/test/regress/regression.diffs =================== diff -U3 /opt/src/pgsql-git/build-farm-root/HEAD/pgsql.build/src/test/regress/expected/subselect.out /opt/src/pgsql-git/build-farm-root/HEAD/pgsql.build/src/test/regress/results/subselect.out --- /opt/src/pgsql-git/build-farm-root/HEAD/pgsql.build/src/test/regress/expected/subselect.out 2020-03-14 10:37:49.156761104 -0700 +++ /opt/src/pgsql-git/build-farm-root/HEAD/pgsql.build/src/test/regress/results/subselect.out 2020-04-06 16:01:13.766798059 -0700 @@ -1328,8 +1328,9 @@ -> Sort (actual rows=3 loops=1) Sort Key: sq_limit.c1, sq_limit.pk Sort Method: top-N heapsort Memory: xxx + Sort Method: unknown Disk: 0kB -> Seq Scan on sq_limit (actual rows=8 loops=1) -(6 rows) +(7 rows) Same on sidewinder. Given the 0kB I'm not sure this is *just* a work_mem thing, though that's still something I'm curious to know about, and it's still part of the "problem" here. I'm investigating further. James
pgsql-hackers by date: