Re: [PATCH] Incremental sort (was: PoC: Partial sort) - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: [PATCH] Incremental sort (was: PoC: Partial sort)
Date
Msg-id 20200328225922.ljnw54bfg44vi6ib@development
In response to Re: [PATCH] Incremental sort (was: PoC: Partial sort)  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: [PATCH] Incremental sort (was: PoC: Partial sort)  (James Coleman <jtc331@gmail.com>)
List pgsql-hackers
Hi,

Attached is my take on simplifying the useful pathkeys logic. It
keeps the function, but it truncates query_pathkeys to only members with
EC members in the relation. I think that's essentially the optimization
you've proposed.
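
To make the truncation concrete, here is a toy model in Python. It is not the attached patch (which is C code in the planner), only a sketch under one assumption: that "truncating" means keeping the leading run of query_pathkeys whose equivalence class has a member in the relation, since a sort order is only usable as a prefix. `PathKey`, `ec_rels` and `truncate_pathkeys` are hypothetical names made up for this illustration.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class PathKey:
    """Toy stand-in for a planner pathkey (hypothetical, for illustration)."""
    column: str
    # Names of relations that have a member in this key's equivalence class.
    ec_rels: FrozenSet[str]


def truncate_pathkeys(query_pathkeys: List[PathKey], rel: str) -> List[PathKey]:
    """Keep the leading pathkeys whose equivalence class has a member in rel."""
    useful = []
    for pk in query_pathkeys:
        if rel not in pk.ec_rels:
            # Assumption: once one key has no member in rel, later keys
            # cannot extend the usable sort prefix, so stop here.
            break
        useful.append(pk)
    return useful


# ORDER BY t.a, t.b, u.c: only the first two keys can be evaluated on "t".
keys = [
    PathKey("a", frozenset({"t"})),
    PathKey("b", frozenset({"t", "u"})),
    PathKey("c", frozenset({"u"})),
]
print([pk.column for pk in truncate_pathkeys(keys, "t")])
```

For relation "t" the sketch keeps the keys on a and b and drops c; for a relation that has no member for the first key it returns an empty list.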

I've also noticed an issue in explain output. EXPLAIN ANALYZE on a simple
query gives me this:

                                                                       QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------------------------
  Gather Merge  (cost=66.30..816060.48 rows=8333226 width=24) (actual time=6.464..19091.006 rows=10000000 loops=1)
    Workers Planned: 2
    Workers Launched: 2
    ->  Incremental Sort  (cost=66.28..729188.13 rows=4166613 width=24) (actual time=1.836..13401.109 rows=3333333 loops=3)
          Sort Key: a, b, c
          Presorted Key: a, b
          Full-sort Groups: 4156 (Methods: quicksort) Memory: 30kB (avg), 30kB (max)
          Presorted Groups: 4137 (Methods: quicksort) Memory: 108kB (avg), 111kB (max)
          Full-sort Groups: 6888 (Methods: ) Memory: 30kB (avg), 30kB (max)
          Presorted Groups: 6849 (Methods: ) Memory: 121kB (avg), 131kB (max)
          Full-sort Groups: 6869 (Methods: ) Memory: 30kB (avg), 30kB (max)
          Presorted Groups: 6816 (Methods: ) Memory: 128kB (avg), 132kB (max)
          ->  Parallel Index Scan using t_a_b_idx on t  (cost=0.43..382353.69 rows=4166613 width=24) (actual time=0.033..9346.679 rows=3333333 loops=3)
  Planning Time: 0.133 ms
  Execution Time: 23998.669 ms
(15 rows)

while with incremental sort disabled it looks like this:

                                                              QUERY PLAN
-------------------------------------------------------------------------------------------------------------------------------------
  Gather Merge  (cost=734387.50..831676.35 rows=8333226 width=24) (actual time=5597.978..14967.246 rows=10000000 loops=1)
    Workers Planned: 2
    Workers Launched: 2
    ->  Sort  (cost=734387.47..744804.00 rows=4166613 width=24) (actual time=5584.616..7645.711 rows=3333333 loops=3)
          Sort Key: a, b, c
          Sort Method: external merge  Disk: 111216kB
          Worker 0:  Sort Method: external merge  Disk: 109552kB
          Worker 1:  Sort Method: external merge  Disk: 112112kB
          ->  Parallel Seq Scan on t  (cost=0.00..105361.13 rows=4166613 width=24) (actual time=0.011..1753.128 rows=3333333 loops=3)
  Planning Time: 0.048 ms
  Execution Time: 19682.582 ms
(11 rows)

So I think there are several issues with the incremental sort output:

1) Missing worker identification (Worker #).

2) Missing method for workers (we have it for the leader, though).

3) I'm not sure why the label is "Methods" instead of "Sort Method", and
why it's in parentheses.

4) Not sure having two lines for each worker is a great idea.

5) I'd probably prefer separate labels for the avg/max memory values,
instead of the (avg) and (max) notes. Also, I think we use "peak" in this
context instead of "max".
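
To illustrate points 1, 2, 3 and 5, here is a small Python sketch of what one reworked line could look like. The layout is only an assumption for discussion, not what the patch prints: the "Worker N:" prefix mirrors the plain Sort node above, and "Average Memory" / "Peak Memory" are made-up label names. `format_group_line` is a hypothetical helper. Point 4 (one line per worker or two) is left open.

```python
from typing import List, Optional


def format_group_line(label: str, groups: int, methods: List[str],
                      avg_kb: int, peak_kb: int,
                      worker: Optional[int] = None) -> str:
    """Render one group-statistics line in a hypothetical revised layout."""
    if not methods:
        # Point 2: every process, worker or leader, should report its method.
        raise ValueError("at least one sort method is required")
    # Point 1: identify the worker the same way the plain Sort node does.
    prefix = "" if worker is None else f"Worker {worker}:  "
    # Point 3: a "Sort Method" label without parentheses.
    method_label = "Sort Method" if len(methods) == 1 else "Sort Methods"
    # Point 5: separate labels for the memory values, "Peak" instead of "max".
    return (f"{prefix}{label} Groups: {groups}  "
            f"{method_label}: {', '.join(methods)}  "
            f"Average Memory: {avg_kb}kB  Peak Memory: {peak_kb}kB")


# Leader values taken from the plan above.
print(format_group_line("Full-sort", 4156, ["quicksort"], 30, 30))
# The worker number and the method are assumed here, because the output
# above shows neither for the worker lines.
print(format_group_line("Full-sort", 6888, ["quicksort"], 30, 30, worker=0))
```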


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

