Re: WIP: Upper planner pathification - Mailing list pgsql-hackers
From | Andres Freund |
---|---|
Subject | Re: WIP: Upper planner pathification |
Date | |
Msg-id | 20160313010354.v7gygdse34vylp5a@alap3.anarazel.de Whole thread Raw |
In response to | Re: WIP: Upper planner pathification (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: WIP: Upper planner pathification
|
List | pgsql-hackers |
On 2016-03-12 12:22:01 -0500, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: > > On 2016-03-10 23:38:14 -0500, Tom Lane wrote: > >> I'll do it ... just send me the list. > > > After exporting make_agg, make_limit, make_sort_from_sortclauses and > > making some trivial adjustments due to the pull_var_clause changes > > change, Citus' tests pass on 9.6, bar some noise. > > OK, done. Thanks. > > Pathification made > > some plans switch from hash-agg to sort-agg, and the other way round; > > but that's obviously ok. > > I wonder whether that's pathification per se. If you're interested enough, I've uploaded a dump of the schema relevant table to http://anarazel.de/t/lineitem_95_96_plan.dump.gz the affected query is (after ANALYZE lineitem_102009) EXPLAIN SELECT l_quantity, count(*) AS count, sum(l_extendedprice) AS avg, count(l_extendedprice) AS avg, array_agg(l_orderkey)AS array_agg FROM lineitem_102009 lineitem WHERE ((l_quantity < '5'::numeric) AND (l_orderkey > 5500) AND (l_orderkey < 9500)) GROUP BY l_quantity; =# SELECT version(); ┌──────────────────────────────────────────────────────────────────────────────────────────────────────────┐ │ version │ ├──────────────────────────────────────────────────────────────────────────────────────────────────────────┤ │ PostgreSQL 9.5.1 on x86_64-pc-linux-gnu, compiled by gcc-5.real (Debian 5.3.1-10) 5.3.1 20160224, 64-bit │ └──────────────────────────────────────────────────────────────────────────────────────────────────────────┘ (1 row) =# SELECT l_quantity, count(*) AS count, sum(l_extendedprice) AS avg, count(l_extendedprice) AS avg, array_agg(l_orderkey)AS array_agg FROM lineitem_102009 lineitem WHERE ((l_quantity < '5'::numeric) AND (l_orderkey > 5500)AND (l_orderkey < 9500)) GROUP BY l_quantity; ┌────────────┬───────┬──────────┬─────┬────────────────────────────────────────────────┐ │ l_quantity │ count │ avg │ avg │ array_agg │ ├────────────┼───────┼──────────┼─────┼────────────────────────────────────────────────┤ │ 1.00 │ 9 │ 13044.06 │ 9 │ {8997,9026,9158,9184,9220,9222,9348,9383,9476} │ │ 4.00 │ 7 │ 40868.84 │ 7 │ {9091,9120,9281,9347,9382,9440,9473} │ │ 2.00 │ 8 │ 26072.02 │ 8 │ {9030,9058,9123,9124,9188,9344,9441,9476} │ │ 3.00 │ 9 │ 39925.32 │ 9 │ {9124,9157,9184,9223,9254,9349,9414,9475,9477} │ └────────────┴───────┴──────────┴─────┴────────────────────────────────────────────────┘ (4 rows) Time: 0.906 ms =# EXPLAIN SELECT l_quantity, count(*) AS count, sum(l_extendedprice) AS avg, count(l_extendedprice) AS avg, array_agg(l_orderkey)AS array_agg FROM lineitem_102009 lineitem WHERE ((l_quantity < '5'::numeric) AND (l_orderkey > 5500)AND (l_orderkey < 9500)) GROUP BY l_quantity; ┌────────────────────────────────────────────────────────────────────────────────────────────┐ │ QUERY PLAN │ ├────────────────────────────────────────────────────────────────────────────────────────────┤ │ HashAggregate (cost=137.91..137.93 rows=1 width=21) │ │ Group Key: l_quantity │ │ -> Bitmap Heap Scan on lineitem_102009 lineitem (cost=13.07..137.44 rows=38 width=21) │ │ Recheck Cond: ((l_orderkey > 5500) AND (l_orderkey < 9500)) │ │ Filter: (l_quantity < '5'::numeric) │ │ -> Bitmap Index Scan on lineitem_pkey_102009 (cost=0.00..13.06 rows=478 width=0) │ │ Index Cond: ((l_orderkey > 5500) AND (l_orderkey < 9500)) │ └──────────────────────────────────────────────────────────────────────────────────────── vs. =# SELECT version(); ┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── │ version ├─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── │ PostgreSQL 9.6devel on x86_64-pc-linux-gnu, compiled by gcc-6.real (Debian 6-20160228-1) 6.0.0 20160228 (experimental)[trunk revision └─────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────── (1 row) =# SELECT l_quantity, count(*) AS count, sum(l_extendedprice) AS avg, count(l_extendedprice) AS avg, array_agg(l_orderkey)AS array_agg FROM lineitem_102009 lineitem WHERE ((l_quantity < '5'::numeric) AND (l_orderkey > 5500)AND (l_orderkey < 9500)) GROUP BY l_quantity; ┌────────────┬───────┬──────────┬─────┬────────────────────────────────────────────────┐ │ l_quantity │ count │ avg │ avg │ array_agg │ ├────────────┼───────┼──────────┼─────┼────────────────────────────────────────────────┤ │ 1.00 │ 9 │ 13044.06 │ 9 │ {9476,9158,9184,9383,9026,9220,9222,8997,9348} │ │ 2.00 │ 8 │ 26072.02 │ 8 │ {9124,9344,9441,9123,9476,9030,9058,9188} │ │ 3.00 │ 9 │ 39925.32 │ 9 │ {9477,9124,9157,9184,9223,9254,9349,9414,9475} │ │ 4.00 │ 7 │ 40868.84 │ 7 │ {9440,9347,9473,9091,9281,9382,9120} │ └────────────┴───────┴──────────┴─────┴────────────────────────────────────────────────┘ (4 rows) =# EXPLAIN SELECT l_quantity, count(*) AS count, sum(l_extendedprice) AS avg, count(l_extendedprice) AS avg, array_agg(l_orderkey)AS array_agg FROM lineitem_102009 lineitem WHERE ((l_quantity < '5'::numeric) AND (l_orderkey > 5500)AND (l_orderkey < 9500)) GROUP BY l_quantity; ┌──────────────────────────────────────────────────────────────────────────────────────────────────┐ │ QUERY PLAN │ ├──────────────────────────────────────────────────────────────────────────────────────────────────┤ │ GroupAggregate (cost=138.43..139.02 rows=1 width=85) │ │ Group Key: l_quantity │ │ -> Sort (cost=138.43..138.53 rows=38 width=21) │ │ Sort Key: l_quantity │ │ -> Bitmap Heap Scan on lineitem_102009 lineitem (cost=13.07..137.44 rows=38 width=21) │ │ Recheck Cond: ((l_orderkey > 5500) AND (l_orderkey < 9500)) │ │ Filter: (l_quantity < '5'::numeric) │ │ -> Bitmap Index Scan on lineitem_pkey_102009 (cost=0.00..13.06 rows=478 width=0) │ │ Index Cond: ((l_orderkey > 5500) AND (l_orderkey < 9500)) │ └──────────────────────────────────────────────────────────────────────────────────────────────────┘ (9 rows) As you probably can guess, what made me notice this wsa the difference in the array_agg output. > Of the three core regression test EXPLAINs that changed in the > pathification commit, two actually were a case of finding better > plans. The other one was a random-looking swap between two plans with > near-identical costs. When I looked into it, I found that the reason > the planner liked the new plan better was that it was parallel-safe; > add_path() saw the costs as fuzzily equal and allowed parallel-safe to > be the determining factor in the choice. The old code hadn't done > that because the hard-wired cost comparisons in grouping_planner() > never took parallel-safety into account. But I'd call that a > parallelism change, not a pathification change; it would certainly > have appeared to be that if the patches had gone in in the opposite > order. I've not yet looked deep enough to determine the root cause; I did however notice that set enable_sort = false; yields a cheaper plan than the default one, within the fuzz range (137.91..137.93 vs 138.43..139.02). Greetings, Andres Freund
pgsql-hackers by date: