Re: [HACKERS] TPC-H Q20 from 1 hour to 19 hours! - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: [HACKERS] TPC-H Q20 from 1 hour to 19 hours! |
Date | |
Msg-id | 4de4bd67-6bfe-2b44-9769-db750e169a5f@2ndquadrant.com Whole thread Raw |
In response to | [HACKERS] TPC-H Q20 from 1 hour to 19 hours! (Rafia Sabih <rafia.sabih@enterprisedb.com>) |
Responses |
Re: [HACKERS] TPC-H Q20 from 1 hour to 19 hours!
(Robert Haas <robertmhaas@gmail.com>)
Re: [HACKERS] TPC-H Q20 from 1 hour to 19 hours! (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
Hi, I've been looking at this issue today, and so far I don't think it's a bug in the foreign key estimation. It seems mostly that the 9.5 estimates were hopelessly bad, and the join estimation changes simply pushed it a tiny bit the wrong direction. Although maybe there is a bug (or at least a change of behavior) in one case, but I'll get to that. I've managed to extract a small part of Q20 that demonstrates the differences between versions quite nicely, I think. The part causing the trouble looks like this: explain select ps_suppkey from partsupp, ( select l_partkey agg_partkey, l_suppkey agg_suppkey from lineitem where l_shipdate >= date '1997-01-01' and l_shipdate < date '1997-01-01' + interval '1' year group by l_partkey, l_suppkey ) agg_lineitem where agg_partkey = ps_partkey and agg_suppkey = ps_suppkey and ps_partkey in ( select p_partkey from part where p_name like 'hot%' ); i.e. it aggregates the "lineitem" table, and then joins "partsupp" and "part" tables to it. "aggregated lineitem" <-> partsupp <-> part I've collected estimates from four different variants of the query (see the attached exlain.sql): 1) SIMPLE - join directly to lineitem (without the aggregation) - remove the p_name LIKE pattern matching 2) SIMPLE+LIKE - like SIMPLE, but keep the LIKE condition 3) GROUPING - join to the aggregated lineitem table - remove the p_name LIKE pattern matching 4) GROUPING+LIKE - like GROUPING, but keep the LIKE condition I've collected estimates on a 20GB data set, both from 9.5 (so without any of the FK estimation changes) and on master with different foreign keys between the tables. no-keys - no foreign keys between the three tables lineitem - lineitem references partsupp partsupp - partsupp references part both - both foreign keys And the results look like this (actual row counts were collected on 9.5, but that should not matter - the results should be the same on all versions): branch SIMPLE SIMPLE+LIKE GROUPING GROUPING+LIKE -------------------------------------------------------------------- actual 119994608 1311974 10897186 119238 9.5 2863 35 160 160 no-keys 2340 24 868 868 lineitem 119994848 1229750 868 868 partsupp 2340 24 1737 18 both-keys 119994848 1212065 1737 18 This seems mostly sane, I guess, but let's look at various cases. In the SIMPLE cases, the foreign key "lineitem->partsupp" makes a huge difference - the estimates are pretty exact, both with and without the LIKE condition. The "partsupp->part" key makes almost no difference, though - the minor differences (35/24 and 1229750/1212065) seem to be mostly due to minor differences in stats built by ANALYZE, particularly in histograms used by patternsel(). In the GROUPING cases, the situation is obviously much worse. The grouping makes it impossible to use the "lineitem->partsupp" foreign key, resulting in severe underestimates. The "partsupp->part" is used, but the difference is pretty negligible as it's a simple (one column) foreign key. The change from 160 -> 868 is merely due to 84f9a35e3 changing how we estimate number of groups in a GROUP BY clause. In 9.5 we get this: -> HashAggregate (rows=1836028) (actual rows=10897186) while since 9.6 we get this -> GroupAggregate (rows=9674242) Not only is that much closer to the actual value than the 9.5 estimate, but it's almost exactly the factor between 160 and 868: 9674242 / 1836028 = 5.27 160 * 5.26 = 843 So I'd say the 160 vs. 868 is expected, although the result is still way off, of course. Which brings me to the slightly suspicious bit. On 9.5, there's no difference between GROUP and GROUP+LIKE cases - the estimates are exactly the same in both cases. This is true too, but only without the foreign key between "partsupp" and "part", i.e. the two non-grouped relations in the join. And what's more, the difference (1737 vs. 16) is pretty much exactly 100x, which is the estimate for the LIKE condition. So it kinda seems 9.5 does not apply this condition for semi-joins, while >=9.6 does that. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services -- Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-hackers
Attachment
pgsql-hackers by date: