Re: incremental sort vs. gather paths - Mailing list pgsql-hackers
From: Tomas Vondra
Subject: Re: incremental sort vs. gather paths
Msg-id: 2d00f31d-08e7-05b8-ccf3-b0645943d80a@enterprisedb.com
In response to: incremental sort vs. gather paths (Robert Haas <robertmhaas@gmail.com>)
Responses: Re: incremental sort vs. gather paths
List: pgsql-hackers
On 12/16/21 17:48, Robert Haas wrote:
> This code introduced in ba3e76cc571eba3dea19c9465ff15ac3ac186576 looks
> wrong to me:
>
>     total_groups = cheapest_partial_path->rows *
>         cheapest_partial_path->parallel_workers;
>     path = (Path *)
>         create_gather_merge_path(root, ordered_rel,
>                                  path,
>                                  path->pathtarget,
>                                  root->sort_pathkeys, NULL,
>                                  &total_groups);
>
> This too:
>
>     total_groups = input_path->rows *
>         input_path->parallel_workers;
>
> This came to my attention because Dave Page sent me a query plan that
> looks like this:
>
>  Gather Merge  (cost=22617.94..22703.35 rows=732 width=97) (actual
> time=2561.476..2561.856 rows=879 loops=1)
>    Workers Planned: 2
>    Workers Launched: 2
>    ->  Sort  (cost=21617.92..21618.83 rows=366 width=97) (actual
> time=928.329..928.378 rows=293 loops=3)
>          Sort Key: aid
>          Sort Method: quicksort  Memory: 155kB
>          Worker 0:  Sort Method: quicksort  Memory: 25kB
>          Worker 1:  Sort Method: quicksort  Memory: 25kB
>          ->  Parallel Seq Scan on accounts_1m  (cost=0.00..21602.33
> rows=366 width=97) (actual time=74.391..74.518 rows=293 loops=3)
>                Filter: (aid < 10000000)
>                Rows Removed by Filter: 333040
>
> If you look at the actual row count estimates, you see that the Gather
> Merge produces 3 times the number of rows that the Parallel Seq Scan
> produces, which is completely correct, because the raw number is 879
> in both cases, but EXPLAIN unhelpfully divides the displayed row count
> by the loop count, which in this case is 3. If you look at the
> estimated row count, you see that the Gather Merge is estimated to
> produce exactly 2 times the number of rows that the nodes under it
> would produce. That's not a very good estimate, unless
> parallel_leader_participation=off, which in this case it isn't.
>
> What's happening here is that the actual number of rows produced by
> accounts_1m is actually 879 and is estimated as 879.
> get_parallel_divisor() decides to divide that number by 2.4, and gets
> 366, because it thinks the leader will do less work than the other
> workers. Then the code above fires and, instead of letting the
> original row count estimate for accounts_1m apply to the Gather Merge
> path, it overrides it with 2 * 366 = 732. This cannot be right.
> Really, I don't think it should be overriding the row count estimate
> at all. But if it is, multiplying by the number of workers can't be
> right, because the leader can also do stuff.
>

Maybe, but other places (predating incremental sort) creating Gather Merge
do the same thing, and commit ba3e76cc57 merely copied this. For example
generate_gather_paths() does this:

    foreach(lc, rel->partial_pathlist)
    {
        Path       *subpath = (Path *) lfirst(lc);
        GatherMergePath *path;

        if (subpath->pathkeys == NIL)
            continue;

        rows = subpath->rows * subpath->parallel_workers;
        path = create_gather_merge_path(root, rel, subpath,
                                        rel->reltarget, subpath->pathkeys,
                                        NULL, rowsp);
        add_path(rel, &path->path);
    }

i.e. it's doing the same (rows * parallel_workers) calculation. It may not
be the right way to estimate this, of course. But I'd say if ba3e76cc57 is
doing it wrong, so are the older places.

regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company