Re: [PATCH] Incremental sort (was: PoC: Partial sort) - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: [PATCH] Incremental sort (was: PoC: Partial sort) |
Date | |
Msg-id | 20200328225922.ljnw54bfg44vi6ib@development Whole thread Raw |
In response to | Re: [PATCH] Incremental sort (was: PoC: Partial sort) (Tomas Vondra <tomas.vondra@2ndquadrant.com>) |
Responses |
Re: [PATCH] Incremental sort (was: PoC: Partial sort)
|
List | pgsql-hackers |
Hi, Attached is my take on simplification of the useful pathkeyes thing. It keeps the function, but it truncates query_pathkeys to only members with EC members in the relation. I think that's essentially the optimization you've proposed. I've also noticed an issue in explain output. EXPLAIN ANALYZE on a simple query gives me this: QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------- Gather Merge (cost=66.30..816060.48 rows=8333226 width=24) (actual time=6.464..19091.006 rows=10000000 loops=1) Workers Planned: 2 Workers Launched: 2 -> Incremental Sort (cost=66.28..729188.13 rows=4166613 width=24) (actual time=1.836..13401.109 rows=3333333 loops=3) Sort Key: a, b, c Presorted Key: a, b Full-sort Groups: 4156 (Methods: quicksort) Memory: 30kB (avg), 30kB (max) Presorted Groups: 4137 (Methods: quicksort) Memory: 108kB (avg), 111kB (max) Full-sort Groups: 6888 (Methods: ) Memory: 30kB (avg), 30kB (max) Presorted Groups: 6849 (Methods: ) Memory: 121kB (avg), 131kB (max) Full-sort Groups: 6869 (Methods: ) Memory: 30kB (avg), 30kB (max) Presorted Groups: 6816 (Methods: ) Memory: 128kB (avg), 132kB (max) -> Parallel Index Scan using t_a_b_idx on t (cost=0.43..382353.69 rows=4166613 width=24) (actual time=0.033..9346.679rows=3333333 loops=3) Planning Time: 0.133 ms Execution Time: 23998.669 ms (15 rows) while with incremental sort disabled it looks like this: QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------- Gather Merge (cost=734387.50..831676.35 rows=8333226 width=24) (actual time=5597.978..14967.246 rows=10000000 loops=1) Workers Planned: 2 Workers Launched: 2 -> Sort (cost=734387.47..744804.00 rows=4166613 width=24) (actual time=5584.616..7645.711 rows=3333333 loops=3) Sort Key: a, b, c Sort Method: external merge Disk: 111216kB Worker 0: Sort Method: external merge Disk: 109552kB Worker 1: Sort Method: external merge Disk: 112112kB -> Parallel Seq Scan on t (cost=0.00..105361.13 rows=4166613 width=24) (actual time=0.011..1753.128 rows=3333333loops=3) Planning Time: 0.048 ms Execution Time: 19682.582 ms (11 rows) So I think there's a couple of issues: 1) Missing worker identification (Worker #). 2) Missing method for workers (we have it for the leader, though). 3) I'm not sure why the lable is "Methods" instead of "Sort Method", and why it's in parenthesis. 4) Not sure having two lines for each worker is a great idea. 5) I'd probably prefer having multiple labels for avg/max memory values, instead of (avg) and (max) notes. Also, I think we use "peak" in this context instead of "max". regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Attachment
pgsql-hackers by date: