Re: updated SORT/LIMIT patch - Mailing list pgsql-patches

From Gregory Stark
Subject Re: updated SORT/LIMIT patch
Date
Msg-id 878xbo3n1v.fsf@oxford.xeocode.com
Whole thread Raw
In response to Re: updated SORT/LIMIT patch  (Alvaro Herrera <alvherre@commandprompt.com>)
List pgsql-patches
"Alvaro Herrera" <alvherre@commandprompt.com> writes:

> Gregory Stark wrote:
>
>> Attached is a small patch which fixes this case. It also makes the check
>> slightly more liberal -- we don't need to resort if the previous sort was
>> unbounded or the bound was greater than or equal to the new bound.
>
> Huh, can you clarify this comment:
>
> +        * XXX It would be nice to check tuplesortstate->boundUsed too but that
> +        * seems like an abstraction violation. And for that matter to check
> +        * the tuplesort to see if randomaccess is possible even if it wasn't
> +        * requested so we don't resort input when the parameters haven't
> +        * changed if it was sorted in memory.
>
> I'm having serious trouble parsing it.

Well in the executor currently we check node->boundedDone and node->boundDone
to see whether the previous execution of this node had a bound. If it did and
we now don't or if it did but now our bound is larger then we have to
re-execute.

However the tuplesort may have actually done an unbounded sort either because
the bound (actually bound*2) may not have been reached or because it spilled
to disk and we did a disk sort. It would be nice to check that and not have to
reexecute unnecessarily.

But that means peeking into tuplesort's state. I started doing a patch
yesterday to do this by exporting a new method on tuplesort:

bool
tuplesort_random_access(Tuplesortstate *state, bool bounded, unsigned tuples_needed)
{
    switch (state->status)
    {
        case TSS_FINALMERGE:
            return false;
        case TSS_SORTEDONTAPE:
            return state->randomAccess;
        case TSS_SORTEDINMEM:
            return (!state->boundUsed || (bounded && state->bound > tuples_needed));
        default:
            return false;
    }
}

That solves the abstraction barrier issue but I'm not sure if it's quite
detailed enough. Really Sort only needs to rewind the tuplesort which it can
do even if state->randomAccess is false. We may need two separate functions,
one which tests if rewind is supported and another which tests if random
access is supported. Also, I haven't looked at how the final merge case is
handled. It might be possible to support rescan (but not random access) in
that state as well.

--
  Gregory Stark
  EnterpriseDB          http://www.enterprisedb.com


pgsql-patches by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: updated SORT/LIMIT patch
Next
From: Heikki Linnakangas
Date:
Subject: Re: [DOCS] Autovacuum and XID wraparound