Re: [PATCH] Incremental sort (was: PoC: Partial sort) - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: [PATCH] Incremental sort (was: PoC: Partial sort) |
Date | |
Msg-id | 20200406233143.kmvjdrzne3oqigdu@development Whole thread Raw |
In response to | Re: [PATCH] Incremental sort (was: PoC: Partial sort) (James Coleman <jtc331@gmail.com>) |
Responses |
Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Re: [PATCH] Incremental sort (was: PoC: Partial sort) |
List | pgsql-hackers |
On Mon, Apr 06, 2020 at 07:09:11PM -0400, James Coleman wrote: >On Mon, Apr 6, 2020 at 6:13 PM Tomas Vondra ><tomas.vondra@2ndquadrant.com> wrote: >> >> On Mon, Apr 06, 2020 at 05:47:48PM -0400, James Coleman wrote: >> >On Mon, Apr 6, 2020 at 5:40 PM Tomas Vondra >> ><tomas.vondra@2ndquadrant.com> wrote: >> >> >> >> On Mon, Apr 06, 2020 at 11:12:32PM +0200, Tomas Vondra wrote: >> >> >On Mon, Apr 06, 2020 at 04:54:38PM -0400, Alvaro Herrera wrote: >> >> >>On 2020-Apr-06, Tom Lane wrote: >> >> >> >> >> >>>Locally, things pass without force_parallel_mode, but turning it on >> >> >>>produces failures that look similar to rhinoceros's (didn't examine >> >> >>>other BF members). >> >> >> >> >> >>FWIW I looked at the eight failures there were about fifteen minutes ago >> >> >>and they were all identical. I can confirm that, in my laptop, the >> >> >>tests work without that GUC, and fail in exactly that way with it. >> >> >> >> >> > >> >> >Yes, there's a thinko in show_incremental_sort_info() and it returns too >> >> >soon. I'll push a fix in a minute. >> >> > >> >> >> >> OK, I've pushed a fix - this should make the buildfarm happy again. >> >> >> >> It however seems to me a bit more needs to be done. The fix makes >> >> show_incremental_sort_info closer to show_sort_info, but not entirely >> >> because IncrementalSortState does not have sort_Done flag so it still >> >> depends on (fullsortGroupInfo->groupCount > 0). I haven't noticed that >> >> before, but not having that flag seems a bit weird to me. >> >> >> >> It also seems possibly incorrect - we may end up with >> >> >> >> fullsortGroupInfo->groupCount == 0 >> >> prefixsortGroupInfo->groupCount > 0 >> >> >> >> but we won't print anything. >> > >> >This shouldn't ever be possible, because the only way we get any >> >prefix groups at all is if we've already sorted a full sort group >> >during the mode transition. >> > >> >> James, any opinion on this? I'd say we should restore the sort_Done flag >> >> and make it work as in plain Sort. Or some comment explaining why >> >> depending on the counts is OK (assuming it is). >> > >> >There's previous email traffic on this thread about that (I can look >> >it up later this evening), but the short of it is that I believe that >> >relying on the group count is actually more correct than a sort_Done >> >flag in the case of incremental sort (in contrast to regular sort). >> > >> >> OK. Maybe we should add a comment to explain.c saying it's OK. >> >> I've pushed a fix for failures due to different planned workers (in the >> test I added to show changes due to add_partial_path tweaks). >> >> It seems we're not out of the woods yet, though. rhinoceros and >> sidewinder failed with something like this: >> >> Sort Method: quicksort Memory: NNkB >> + Sort Method: unknown Disk: NNkB >> >> Would you mind investigating at it? > >I assume that means those build farm members run with very low >work_mem? Is it an acceptable fix to adjust work_mem up a bit just for >these tests? Or is that bad practice and these are to expose issues >with changing into disk sort mode? > I don't think so - I don't see any work_mem changes in the config - see the extra_config at the beginning of the page with details: https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=rhinoceros&dt=2020-04-06%2023%3A00%3A16 Moreover, this seems to be in regular Sort, not Incremental Sort and it very much seems like it gets confused to print a worker info because the only way for Sort to print two "Sort Method" lines seems to be to enter either both if (sortstate->sort_Done && sortstate->tuplesortstate != NULL) { ... print leader info ... } and if (sortstate->shared_info != NULL) { for (n = 0; n < sortstate->shared_info->num_workers; n++) { ... print worker info ... } } or maybe there are two workers? It's strange ... It doesn't seem to be particularly platform-specific, but I've been unable to reproduce it so far. It seems on older gcc versions, though. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pgsql-hackers by date: