Re: [PATCH] Incremental sort (was: PoC: Partial sort) - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Date
Msg-id 20190720152512.jpdi2g3notbjcxsd@development
Whole thread Raw
In response to Re: [PATCH] Incremental sort (was: PoC: Partial sort)  (James Coleman <jtc331@gmail.com>)
Responses Re: [PATCH] Incremental sort (was: PoC: Partial sort)  (James Coleman <jtc331@gmail.com>)
Re: [PATCH] Incremental sort (was: PoC: Partial sort)  (James Coleman <jtc331@gmail.com>)
List pgsql-hackers
On Sat, Jul 20, 2019 at 10:33:02AM -0400, James Coleman wrote:
>On Sat, Jul 20, 2019 at 9:22 AM Tomas Vondra
><tomas.vondra@2ndquadrant.com> wrote:
>>
>> On Fri, Jul 19, 2019 at 04:59:21PM -0400, James Coleman wrote:
>> >On Mon, Jul 8, 2019 at 9:37 PM Tomas Vondra
>> ><tomas.vondra@2ndquadrant.com> wrote:
>> >> Now, consider this example:
>> >>
>> >>   create table t (a int, b int, c int);
>> >>   insert into t select mod(i,100),mod(i,100),i from generate_series(1,10000000) s(i);
>> >>   create index on t (a);
>> >>   analyze t;
>> >>   explain select a,b,sum(c) from t group by 1,2 order by 1,2,3 limit 1;
>> >>
>> >> With 0001+0002+0003 pathes, I get a plan like this:
>> >>
>> >>                                                      QUERY PLAN
>> >>
--------------------------------------------------------------------------------------------------------------------
>> >>  Limit  (cost=10375.39..10594.72 rows=1 width=16)
>> >>    ->  Incremental Sort  (cost=10375.39..2203675.71 rows=10000 width=16)
>> >>          Sort Key: a, b, (sum(c))
>> >>          Presorted Key: a, b
>> >>          ->  GroupAggregate  (cost=10156.07..2203225.71 rows=10000 width=16)
>> >>                Group Key: a, b
>> >>                ->  Gather Merge  (cost=10156.07..2128124.39 rows=10000175 width=12)
>> >>                      Workers Planned: 2
>> >>                      ->  Incremental Sort  (cost=9156.04..972856.05 rows=4166740 width=12)
>> >>                            Sort Key: a, b
>> >>                            Presorted Key: a
>> >>                            ->  Parallel Index Scan using t_a_idx on t  (cost=0.43..417690.30 rows=4166740
width=12)
>> >> (12 rows)
>> >>
>> >>
>> >> and with 0004, I get this:
>> >>
>> >>                                               QUERY PLAN
>> >> ------------------------------------------------------------------------------------------------------
>> >>  Limit  (cost=20443.84..20665.32 rows=1 width=16)
>> >>    ->  Incremental Sort  (cost=20443.84..2235250.05 rows=10000 width=16)
>> >>          Sort Key: a, b, (sum(c))
>> >>          Presorted Key: a, b
>> >>          ->  GroupAggregate  (cost=20222.37..2234800.05 rows=10000 width=16)
>> >>                Group Key: a, b
>> >>                ->  Incremental Sort  (cost=20222.37..2159698.74 rows=10000175 width=12)
>> >>                      Sort Key: a, b
>> >>                      Presorted Key: a
>> >>                      ->  Index Scan using t_a_idx on t  (cost=0.43..476024.65 rows=10000175 width=12)
>> >> (10 rows)
>> >>
>> >> Notice that cost of the second plan is almost double the first one. That
>> >> means 0004 does not even generate the first plan, i.e. there are cases
>> >> where we don't try to add the explicit sort before passing the path to
>> >> generate_gather_paths().
>> >>
>> >> And I think I know why is that - while gather_grouping_paths() tries to
>> >> add explicit sort below the gather merge, there are other places that
>> >> call generate_gather_paths() that don't do that. In this case it's
>> >> probably apply_scanjoin_target_to_paths() which simply builds
>> >>
>> >>    parallel (seq|index) scan + gather merge
>> >>
>> >> and that's it. The problem is likely the same - the code does not know
>> >> which pathkeys are "interesting" at that point. We probably need to
>> >> teach planner to do this.
>> >
>> >I've been working on figuring out sample queries for each of the
>> >places we're looking at adding create_increment_sort() (starting with
>> >the cases enabled by gather-merge nodes). The
>> >generate_useful_gather_paths() call in
>> >apply_scanjoin_target_to_paths() is required to generate the above
>> >preferred plan.
>
>As I continue this, I've added a couple of test cases (notably for
>generate_useful_gather_paths() in both standard_join_search() and
>apply_scanjoin_target_to_paths()). Those, plus the current WIP state
>of my hacking on your patch adding generate_useful_gather_paths() is
>attached as 0001-parallel-and-more-paths.patch.
>
>My current line of investigation is whether we need to do anything in
>the parallel portion of create_ordered_paths(). I noticed that the
>first-pass patch adding generate_useful_gather_paths() modified that
>section but wasn't actually adding any new gather-merge paths (just
>bare incremental sort paths). That seems pretty clearly just a
>prototype miss, so I modified the prototype to build gather-merge
>paths instead (as a side note that change seems to fix an oddity I was
>seeing where plans would include a parallel index scan node even
>though they weren't parallel plans). While the resulting plan for
>something like:
>

Yes, that seems to be a bug. The block above it clealy has a gather
merge nodes, so this one should too.

>explain analyze select * from t where t.a in (1,2,3,4,5,6) order by
>t.a, t.b limit 50;
>
>changes cost (to be cheaper) ever so slightly with the gather-merge
>addition to create_ordered_paths(), the plan itself is otherwise
>identical (including row estimates):
>
>Limit
>  -> Gather Merge
>       -> Incremental Sort
>          -> Parallel Index Scan
>
>(Note: I'm forcing parallel plans here with: set
>max_parallel_workers_per_gather=4; set min_parallel_table_scan_size=0;
>set parallel_tuple_cost=0; set parallel_setup_cost=0; set
>min_parallel_index_scan_size=0;)
>
>I can't seem to come up with a case where adding these gather-merge
>paths in create_ordered_paths() isn't entirely duplicative of paths
>already created by generate_useful_gather_paths() as called from
>apply_scanjoin_target_to_paths() -- which I _think_ makes sense given
>that both apply_scanjoin_target_to_paths() and create_ordered_paths()
>are called by grouping_planner().
>
>Can you think of a case I'm missing here that would make it valuable
>to generate new parallel plans in create_ordered_paths()?
>

Good question. Not sure. I think such path would need to do something on
a relation that is neither a join nor a scan - in which case the path
should not be created by apply_scanjoin_target_to_paths().

So for example a query like this:

  SELECT
      a, b, sum(expensive_function(c))
  FROM
      t
  GROUP BY a,b
  ORDER BY a,sum(...) LIMIT 10;

should be able to produce a plan like this:

  -> Limit
      -> Gather Merge
        -> Incremental Sort (pathkeys: a, sum)
          -> Group Aggregate
             a, b, sum(expensive_function(c))
            -> Index Scan (pathkeys: a, b)

or something like that, maybe. I haven't tried this, though. The other
question is whether such queries are useful in practice ...

>> ...
>>
>> I think this may be a thinko, as this plan demonstrates - but I'm not
>> sure about it. I wonder if this might be penalizing some other types of
>> plans (essentially anything with limit + gather).
>>
>> Attached is a WIP patch fixing this by considering both startup and
>> total cost (by calling compare_path_costs_fuzzily).
>
>It seems to me that this is likely a bug, and not just a changed
>needed for this. Do you think it's better addressed in a separate
>thread? Or retain it as part of this patch for now (and possibly break
>it out later)? On the other hand, it's entirely possible that someone
>more familiar with parallel plan limitations could explain why the
>above comment holds true. That makes me lean towards asking in a new
>thread.
>

Maybe. I think creating a separate thread would be useful, provided we
manage to demonstrate the issue without an incremental sort.

>I've also attached a new base patch (incremental-sort-30.patch) which
>includes some of the other obvious fixes (costing, etc.) that you'd
>previously proposed.
>

Thanks!

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services 



pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: [sqlsmith] Crash in mcv_get_match_bitmap
Next
From: James Coleman
Date:
Subject: Re: [PATCH] Incremental sort (was: PoC: Partial sort)