Re: Parallel tuplesort (for parallel B-Tree index creation) - Mailing list pgsql-hackers

From Peter Geoghegan
Subject Re: Parallel tuplesort (for parallel B-Tree index creation)
Date
Msg-id CAM3SWZR6PPcuh8jnpY-7dyF7Py0oxOrZG1YawGPM_Oencg231A@mail.gmail.com
Whole thread Raw
In response to Re: Parallel tuplesort (for parallel B-Tree index creation)  (Claudio Freire <klaussfreire@gmail.com>)
Responses Re: Parallel tuplesort (for parallel B-Tree index creation)  (Claudio Freire <klaussfreire@gmail.com>)
Re: Parallel tuplesort (for parallel B-Tree index creation)  (Claudio Freire <klaussfreire@gmail.com>)
List pgsql-hackers
On Tue, Sep 6, 2016 at 12:57 PM, Claudio Freire <klaussfreire@gmail.com> wrote:
> Patch lacks any new tests, but the changed code paths seem covered
> sufficiently by existing tests. A little bit of fuzzing on the patch
> itself, like reverting some key changes, or flipping some key
> comparisons, induces test failures as it should, mostly in cluster.
>
> The logic in tuplesort_heap_root_displace seems sound, except:
>
> +                */
> +               memtuples[i] = memtuples[imin];
> +               i = imin;
> +       }
> +
> +       Assert(state->memtupcount > 1 || imin == 0);
> +       memtuples[imin] = *newtup;
> +}
>
> Why that assert? Wouldn't it make more sense to Assert(imin < n) ?

There might only be one or two elements in the heap. Note that the
heap size is indicated by state->memtupcount at this point in the
sort, which is a little confusing (that differs from how memtupcount
is used elsewhere, where we don't partition memtuples into a heap
portion and a preread tuples portion, as we do here).

> In the meanwhile, I'll go and do some perf testing.
>
> Assuming the speedup is realized during testing, LGTM.

Thanks. I suggest spending at least as much time on unsympathetic
cases (e.g., only 2 or 3 tapes must be merged). At the same time, I
suggest focusing on a type that has relatively expensive comparisons,
such as collated text, to make differences clearer.

-- 
Peter Geoghegan



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: improved DefElem list processing
Next
From: Michael Paquier
Date:
Subject: Re: Let file_fdw access COPY FROM PROGRAM