Re: enable_incremental_sort changes query behavior - Mailing list pgsql-hackers

From James Coleman
Subject Re: enable_incremental_sort changes query behavior
Date
Msg-id CAAaqYe_RaBHV05cExUjCYzidyLZzT+3fVVfVNLUkw0bQpVHnug@mail.gmail.com
Whole thread Raw
In response to Re: enable_incremental_sort changes query behavior  (James Coleman <jtc331@gmail.com>)
List pgsql-hackers
On Wed, Nov 25, 2020 at 2:53 PM James Coleman <jtc331@gmail.com> wrote:
>
> On Tue, Nov 24, 2020 at 2:31 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> >
> > James Coleman <jtc331@gmail.com> writes:
> > > On Mon, Nov 23, 2020 at 2:24 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> > >> 1. James was wondering, far upthread, why we would do projections
> > >> pre-sort or post-sort.  I think the answers can be found by studying
> > >> planner.c's make_sort_input_target(), which separates out what we want
> > >> to do pre-sort and post-sort.
> >
> > > Does it imply we need to intentionally avoid SRFs also?
> >
> > It's sort of a wart that we allow SRFs in ORDER BY at all, but my
> > expectation is that make_sort_input_target would prevent lower levels
> > of the planner from needing to think about that.  We don't allow SRFs
> > in index expressions, nor in WHERE clauses (so they'll never come up
> > as mergejoin sort keys).  So really there's no way that scan/join
> > processing should need to consider such cases.  Such a sort would
> > only ever be implemented via a final Sort atop ProjectSet.
>
> In this case though we're sorting based on "interesting" pathkeys,
> which means we don't don't necessarily have index expressions or
> mergejoin sort keys. For example, even with the bugfix patch (from the
> parallel fix thread I've linked to previously) applied, I'm able to
> generate this (with of course some GUCs "tweaked"):
>
> select unique1 from tenk1 order by unnest('{1}'::int[]);
>
>  Gather Merge
>    Workers Planned: 2
>    ->  Sort
>          Sort Key: (unnest('{1}'::integer[]))
>          ->  ProjectSet
>                ->  Parallel Index Only Scan using tenk1_unique1 on tenk1
>
> If I understand correctly, that's a violation of the spirit of what
> the comments above make_sort_input_target(). If that's true, then it
> should be pretty easy to disallow them from being considered. Given
> the existing restrictions on where SRFs can be placed in a SELECT
> (e.g., no CASE/COALESCE) and my assumption (without having thoroughly
> verified this) that SRFs aren't allowed as arguments to functions or
> as arguments to any other expression (I assume only scalars are
> allowed), would it be sufficient to check the pathkey expression
> (without recursion) to see if it's a FuncExpr that returns a set?

Here's the plan with a change to restrict SRFs here:

 Sort
   Sort Key: (unnest('{1,2}'::integer[]))
   ->  Gather
         Workers Planned: 2
         ->  ProjectSet
               ->  Parallel Index Only Scan using tenk1_unique1 on tenk1

I'm a bit surprised the ProjectSet is above the Index Scan rather than
above the Gather, but maybe this is still more correct?

James



pgsql-hackers by date:

Previous
From: David Zhang
Date:
Subject: Re: Add table access method as an option to pgbench
Next
From: Victor Yegorov
Date:
Subject: Re: Deleting older versions in unique indexes to avoid page splits