Re: [PATCH] Incremental sort (was: PoC: Partial sort) - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Date
Msg-id 20190625232205.ssecrwplyb7faxbb@development
In response to Re: [PATCH] Incremental sort (was: PoC: Partial sort)  (James Coleman <jtc331@gmail.com>)
Responses Re: [PATCH] Incremental sort (was: PoC: Partial sort)
List pgsql-hackers
On Tue, Jun 25, 2019 at 04:53:40PM -0400, James Coleman wrote:
>
>Unrelated: if you or someone you know is more familiar with the
>parallel code, I'd be interested in their looking at the patch at
>some point, because I have a suspicion it might never operate in
>parallel (either that or I don't know how to trigger it), but I'm
>not really familiar with that stuff at all currently. :)
>

That's an interesting question. I don't think plans like this would be
very interesting:

    Limit
      -> Incremental Sort
         -> Gather Merge
            -> Index Scan

because most of the extra cost would be paid in the leader anyway. So
I'm not all that surprised those paths are not generated (I might be
wrong and those plans would be interesting, though).

But I think something like this would be quite beneficial:

    Limit
     -> Gather Merge
        -> Incremental Sort
           -> Index Scan

So I've looked into that, and the reason seems fairly simple - when
generating the Gather Merge paths, we only look at paths that are in
partial_pathlist. See generate_gather_paths().

And we only have sequential + index paths in partial_pathlist, not
incremental sort paths.

IMHO we can do two things:

1) modify generate_gather_paths to also consider incremental sort for
each sorted path, similarly to what create_ordered_paths does

2) modify build_index_paths to also generate an incremental sort path
for each index path

IMHO (1) is the right choice here, because it automatically does the
trick for all other types of ordered paths, not just index scans. So,
something like the attached patch, which gives me plans like this:


                                                                            QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------------
     Limit  (cost=0.86..2.85 rows=100000 width=12) (actual time=3.726..233.249 rows=100000 loops=1)
       ->  Gather Merge  (cost=0.86..120.00 rows=5999991 width=12) (actual time=3.724..156.802 rows=100000 loops=1)
             Workers Planned: 2
             Workers Launched: 2
             ->  Incremental Sort  (cost=0.84..100.00 rows=2499996 width=12) (actual time=0.563..164.438 rows=33910 loops=3)
                   Sort Key: a, b
                   Presorted Key: a
                   Sort Method: quicksort  Memory: 27kB
                   Sort Groups: 389
                   Worker 0:  Sort Method: quicksort  Memory: 27kB  Groups: 1295
                   Worker 1:  Sort Method: quicksort  Memory: 27kB  Groups: 1241
                   ->  Parallel Index Scan using t_a_idx on t  (cost=0.43..250612.29 rows=2499996 width=12) (actual time=0.027..128.518 rows=33926 loops=3)
     Planning Time: 68559.695 ms
     Execution Time: 285.245 ms
    (14 rows)
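
For anyone reading along without the attachment, the gist of change (1)
is roughly the following pseudocode (illustrative only, not the actual
patch):

```
generate_gather_paths(root, rel, ...):
    for each subpath in rel->partial_pathlist:
        add Gather / Gather Merge over subpath        -- existing behavior
        if subpath is sorted on a useful prefix of
           the pathkeys we want:
            isort = incremental sort over subpath     -- new: finish the sort
            add Gather Merge over isort               -- in each worker, merge
                                                      -- in the leader
```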


This is not the whole story, though - there seems to be some costing
issue, because even with the parallel costs set to 0, I only get such
plans after I tweak the cost in the patch like this:

    subpath->total_cost = 100.0;
    path->path.total_cost = 120.0;

When I don't do that, the Gather Merge path ends up with total cost
1037485, and it gets beaten by this plan:

                                                                    QUERY PLAN
------------------------------------------------------------------------------------------------------------------------------------------
     Limit  (cost=1.05..152.75 rows=100000 width=12) (actual time=0.234..374.492 rows=100000 loops=1)
       ->  Incremental Sort  (cost=1.05..9103.09 rows=5999991 width=12) (actual time=0.232..316.210 rows=100000 loops=1)
             Sort Key: a, b
             Presorted Key: a
             Sort Method: quicksort  Memory: 27kB
             Sort Groups: 2863
             ->  Index Scan using t_a_idx on t  (cost=0.43..285612.24 rows=5999991 width=12) (actual time=0.063..240.248 rows=100003 loops=1)
     Planning Time: 53743.858 ms
     Execution Time: 403.379 ms
    (9 rows)

I suspect it's related to the fact that for the Gather Merge plan we
don't have the information about the number of rows, while for the
incremental sort we have it.

But clearly 9103.09 is not the total cost for all 6M rows the
incremental sort is expected to produce (that would have to be higher
than 285612, the cost of the index scan below it). So the real total
cost of the incremental sort seems to be ~546000, because

    (100000 / 6000000) * 546000 = 9100

which is close to the 9103.09 shown in the plan. But then

    100000.0 / 6000000 * 9103 = 152

so it seems we actually do the linear approximation twice. That seems
pretty bogus, IMO. And indeed, if I remove this part from
cost_incremental_sort:

    if (limit_tuples > 0 && limit_tuples < input_tuples)
    {
        output_tuples = limit_tuples;
        output_groups = floor(output_tuples / group_tuples) + 1;
    }

then it behaves kinda reasonable:

    explain  select * from t order by a, b limit 100000;

                                           QUERY PLAN
    -----------------------------------------------------------------------------------------
     Limit  (cost=1.05..9103.12 rows=100000 width=12)
       ->  Incremental Sort  (cost=1.05..546124.41 rows=5999991 width=12)
             Sort Key: a, b
             Presorted Key: a
             ->  Index Scan using t_a_idx on t  (cost=0.43..285612.24 rows=5999991 width=12)
    (5 rows)

    set parallel_tuple_cost = 0;
    set parallel_setup_cost = 0;

    explain  select * from t order by a, b limit 100000;

                                                   QUERY PLAN
    --------------------------------------------------------------------------------------------------------
     Limit  (cost=0.86..6775.63 rows=100000 width=12)
       ->  Gather Merge  (cost=0.86..406486.25 rows=5999991 width=12)
             Workers Planned: 2
             ->  Incremental Sort  (cost=0.84..343937.44 rows=2499996 width=12)
                   Sort Key: a, b
                   Presorted Key: a
                   ->  Parallel Index Scan using t_a_idx on t  (cost=0.43..250612.29 rows=2499996 width=12)
    (7 rows)

But I'm not going to claim those are complete fixes; it's just the
minimum I needed to do to make this particular type of plan work.


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

