On 2017-05-05 14:40:43 +1200, David Rowley wrote:
> On 5 May 2017 at 14:36, Andres Freund <andres@anarazel.de> wrote:
> > I wonder how much doing the atomic ops approach alone can help, that
> > doesn't have the issue that the work might be unevenly distributed
> > between pages.
>
> I wondered that too, since I thought the barrier for making this change
> would be lower by doing it that way.
>
> I didn't manage to think of a way to get around wrapping the
> position back to 0 when synch-scans are involved.
>
> i.e.
> parallel_scan->phs_cblock++;
> if (parallel_scan->phs_cblock >= scan->rs_nblocks)
>     parallel_scan->phs_cblock = 0;
Increment phs_cblock without checking rs_nblocks, but outside of the
lock do a % scan->rs_nblocks, to get the "actual" position. Finish if
(phs_cblock - phs_startblock) / scan->rs_nblocks >= 1.
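Roughly, as a standalone sketch rather than a patch: the struct and function names below are made up for illustration, C11 <stdatomic.h> stands in for whatever atomics the real code would use, and I'm assuming the counter is 64 bits wide and starts out at the start block, so it never has to wrap.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the shared parallel scan state. */
typedef struct SketchParallelScan
{
    uint32_t             nblocks;    /* blocks in the relation, fixed at scan start */
    uint32_t             startblock; /* block the scan begins at (synch-scan position) */
    atomic_uint_fast64_t cblock;     /* ever-increasing position, never reset to 0 */
} SketchParallelScan;

/* Assumes the start block is already known; the counter begins there. */
static void
sketch_scan_init(SketchParallelScan *ps, uint32_t nblocks, uint32_t startblock)
{
    assert(nblocks > 0 && startblock < nblocks);
    ps->nblocks = nblocks;
    ps->startblock = startblock;
    atomic_init(&ps->cblock, startblock);
}

/*
 * Hand out the next block. Returns false once every block has been
 * handed out exactly once. No lock is taken: the only shared write is
 * the atomic increment, and the wraparound is done locally with %.
 */
static bool
sketch_next_block(SketchParallelScan *ps, uint32_t *blkno)
{
    uint64_t pos = atomic_fetch_add(&ps->cblock, 1);

    /* Finish if (cblock - startblock) / nblocks >= 1. */
    if ((pos - ps->startblock) / ps->nblocks >= 1)
        return false;

    /* The "actual" position is the raw counter modulo the relation size. */
    *blkno = (uint32_t) (pos % ps->nblocks);
    return true;
}
```

With nblocks = 5 and startblock = 3, successive calls hand out 3, 4, 0, 1, 2 and then report that the scan is finished. Workers that keep calling after that only bump the counter further, which is harmless with a 64-bit counter.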
The difficult part seems to be the parallel_scan->phs_startblock
computation, but that we probably can do via a read barrier & unlocked
check, and then a spinlock & recheck if still uninitialized.
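Something along these lines, again only a sketch under assumptions: the names are hypothetical, an atomic_flag stands in for the spinlock, an acquire load plays the role of the read barrier, and the caller passes in the start position it would pick if nobody has chosen one yet.

```c
#include <stdatomic.h>
#include <stdint.h>

#define SKETCH_INVALID_BLOCK UINT32_MAX  /* marks "not yet initialized" */

/* Hypothetical shared state holding the lazily chosen start block. */
typedef struct SketchStartState
{
    atomic_flag      mutex;      /* stand-in for a spinlock */
    _Atomic uint32_t startblock; /* SKETCH_INVALID_BLOCK until chosen */
} SketchStartState;

static void
sketch_start_init(SketchStartState *st)
{
    atomic_flag_clear(&st->mutex);
    atomic_init(&st->startblock, SKETCH_INVALID_BLOCK);
}

/*
 * Return the shared start block, initializing it to 'candidate' if no
 * other worker has done so yet. The common case takes no lock at all.
 */
static uint32_t
sketch_get_startblock(SketchStartState *st, uint32_t candidate)
{
    /* Unlocked check; the acquire load acts as the read barrier. */
    uint32_t start = atomic_load_explicit(&st->startblock,
                                          memory_order_acquire);

    if (start != SKETCH_INVALID_BLOCK)
        return start;

    /* Still uninitialized: take the spinlock and recheck. */
    while (atomic_flag_test_and_set_explicit(&st->mutex,
                                             memory_order_acquire))
        ;                       /* spin */

    start = atomic_load_explicit(&st->startblock, memory_order_relaxed);
    if (start == SKETCH_INVALID_BLOCK)
    {
        /* We won the race; publish our choice for everyone else. */
        start = candidate;
        atomic_store_explicit(&st->startblock, start,
                              memory_order_release);
    }

    atomic_flag_clear_explicit(&st->mutex, memory_order_release);
    return start;
}
```

Only the first worker or two ever take the lock; once the start block is published, everybody else returns from the unlocked check.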
- Andres