Thread: Poor plan when joining against a union containing a join
Hi all, I'm encountering very poor query plans when joining against a union, where one branch of the union is itself a join: specifically, Postgres does the entire inside join and then filters the result, rather than pushing the filters down to the joined tables. I've provided a standalone test case below, including some variations that don't show the problem for comparison. The fourth query is the problematic one - I would have expected the Append portion of the plan to be essentially the same as the one in the second query. I'm running Postgres 9.2.3 on Oracle Enterprise Linux 5 (some machines 5.3, some 5.6) x86_64, installed using the RPMs from http://yum.pgrpms.org/. The problem occurs with identical results (other than raw speed) on both a rather underpowered test server with default Postgres settings and a much better live server which I've attempted to tune somewhat sensibly. Thanks for any advice. START TRANSACTION; CREATE TABLE bundle_contents( item_id INTEGER PRIMARY KEY, bundle_type INTEGER NOT NULL ); INSERT INTO bundle_contents(item_id, bundle_type) SELECT i * 10 + j, i FROM GENERATE_SERIES(1, 100) i, GENERATE_SERIES(1, 10) j; CREATE INDEX ON bundle_contents(bundle_type, item_id); ANALYSE bundle_contents; CREATE TABLE bundle( bundle_id INTEGER PRIMARY KEY, bundle_type INTEGER NOT NULL ); INSERT INTO bundle(bundle_id, bundle_type) SELECT i * 1000 + j, i FROM GENERATE_SERIES(1, 100) i, GENERATE_SERIES(1, 1000) j; CREATE INDEX ON bundle(bundle_type, bundle_id); ANALYSE bundle; CREATE VIEW bundled_item AS SELECT bundle_id AS item_id_a, item_id AS item_id_b FROM bundle NATURAL JOIN bundle_contents; CREATE TABLE unbundled_item( item_id_a INTEGER, item_id_b INTEGER, PRIMARY KEY (item_id_a, item_id_b) ); INSERT INTO unbundled_item(item_id_a, item_id_b) SELECT i, 1 FROM GENERATE_SERIES(1001, 100000, 10) i; INSERT INTO unbundled_item(item_id_a, item_id_b) SELECT i, 2 FROM GENERATE_SERIES(1001, 100000, 20) i; INSERT INTO unbundled_item(item_id_a, item_id_b) SELECT i, 3 FROM GENERATE_SERIES(1001, 100000, 23) i; INSERT INTO unbundled_item(item_id_a, item_id_b) SELECT i, 4 FROM GENERATE_SERIES(1001, 100000, 41) i; CREATE INDEX ON unbundled_item(item_id_b); ANALYSE unbundled_item; CREATE VIEW item AS SELECT item_id_a, item_id_b FROM bundled_item UNION ALL SELECT item_id_a, item_id_b FROM unbundled_item; CREATE TABLE item_reference( reference_id INTEGER PRIMARY KEY, item_id_a INTEGER NOT NULL, item_id_b INTEGER NOT NULL ); INSERT INTO item_reference(reference_id, item_id_a, item_id_b) VALUES(1, 1472, 16), (2, 1299, 3); CREATE INDEX ON item_reference(item_id_a, item_id_b); CREATE INDEX ON item_reference(item_id_b); ANALYSE item_reference; -- no union, nice and fast EXPLAIN (ANALYSE, BUFFERS) SELECT * FROM bundled_item WHERE (item_id_a, item_id_b) = (1472, 16); /* http://explain.depesz.com/s/1ye QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=0.00..16.56 rows=1 width=8) (actual time=0.040..0.041 rows=1 loops=1) Join Filter: (bundle.bundle_type = bundle_contents.bundle_type) Buffers: shared hit=8 -> Index Scan using bundle_pkey on bundle (cost=0.00..8.28 rows=1 width=8) (actual time=0.017..0.017 rows=1 loops=1) Index Cond: (bundle_id = 1472) Buffers: shared hit=4 -> Index Scan using bundle_contents_pkey on bundle_contents (cost=0.00..8.27 rows=1 width=8) (actual time=0.012..0.013 rows=1 loops=1) Index Cond: (item_id = 16) Buffers: shared hit=4 Total runtime: 0.085 ms (10 rows) */ -- using the union, but querying it directly rather -- than joining to it, still fast EXPLAIN (ANALYSE, BUFFERS) SELECT * FROM item WHERE (item_id_a, item_id_b) = (1472, 16); /* http://explain.depesz.com/s/rwA QUERY PLAN --------------------------------------------------------------------------------------------------------------------------------------------------------- Result (cost=0.00..24.84 rows=2 width=8) (actual time=0.017..0.068 rows=1 loops=1) Buffers: shared hit=6 read=3 I/O Timings: read=0.032 -> Append (cost=0.00..24.84 rows=2 width=8) (actual time=0.015..0.066 rows=1 loops=1) Buffers: shared hit=6 read=3 I/O Timings: read=0.032 -> Nested Loop (cost=0.00..16.56 rows=1 width=8) (actual time=0.015..0.017 rows=1 loops=1) Join Filter: (bundle.bundle_type = bundle_contents.bundle_type) Buffers: shared hit=6 -> Index Scan using bundle_pkey on bundle (cost=0.00..8.28 rows=1 width=8) (actual time=0.006..0.007 rows=1 loops=1) Index Cond: (bundle_id = 1472) Buffers: shared hit=3 -> Index Scan using bundle_contents_pkey on bundle_contents (cost=0.00..8.27 rows=1 width=8) (actual time=0.004..0.005 rows=1 loops=1) Index Cond: (item_id = 16) Buffers: shared hit=3 -> Index Scan using unbundled_item_item_id_b_idx on unbundled_item (cost=0.00..8.27 rows=1 width=8) (actual time=0.049..0.049 rows=0 loops=1) Index Cond: (item_id_b = 16) Filter: (item_id_a = 1472) Buffers: shared read=3 I/O Timings: read=0.032 Total runtime: 0.116 ms (21 rows) */ -- join but no union, still fast EXPLAIN (ANALYSE, BUFFERS) SELECT * FROM bundled_item NATURAL JOIN item_reference WHERE reference_id = 1; /* http://explain.depesz.com/s/oOy QUERY PLAN ------------------------------------------------------------------------------------------------------------------------------------------------------ Nested Loop (cost=1.04..4.50 rows=1 width=12) (actual time=0.092..0.096 rows=1 loops=1) Buffers: shared hit=5 read=3 I/O Timings: read=0.038 -> Merge Join (cost=1.04..1.33 rows=1 width=16) (actual time=0.032..0.035 rows=1 loops=1) Merge Cond: (bundle_contents.item_id = item_reference.item_id_b) Buffers: shared hit=4 -> Index Scan using bundle_contents_pkey on bundle_contents (cost=0.00..43.25 rows=1000 width=8) (actual time=0.010..0.013 rows=7 loops=1) Buffers: shared hit=3 -> Sort (cost=1.03..1.04 rows=1 width=12) (actual time=0.012..0.012 rows=1 loops=1) Sort Key: item_reference.item_id_b Sort Method: quicksort Memory: 25kB Buffers: shared hit=1 -> Seq Scan on item_reference (cost=0.00..1.02 rows=1 width=12) (actual time=0.005..0.007 rows=1 loops=1) Filter: (reference_id = 1) Rows Removed by Filter: 1 Buffers: shared hit=1 -> Index Only Scan using bundle_bundle_type_bundle_id_idx on bundle (cost=0.00..3.16 rows=1 width=8) (actual time=0.056..0.057 rows=1 loops=1) Index Cond: ((bundle_type = bundle_contents.bundle_type) AND (bundle_id = item_reference.item_id_a)) Heap Fetches: 1 Buffers: shared hit=1 read=3 I/O Timings: read=0.038 Total runtime: 0.139 ms (22 rows) */ -- join to the union, slow. the conditions are pushed into the -- index scan in the non-join branch of the union, but in the join -- branch they're applied after the entire join is built EXPLAIN (ANALYSE, BUFFERS) SELECT * FROM item NATURAL JOIN item_reference WHERE reference_id = 1; /* http://explain.depesz.com/s/M73 QUERY PLAN ----------------------------------------------------------------------------------------------------------------------------------------------------- Nested Loop (cost=27.50..27979.82 rows=26 width=12) (actual time=3.230..554.473 rows=1 loops=1) Buffers: shared hit=452 -> Seq Scan on item_reference (cost=0.00..1.02 rows=1 width=12) (actual time=0.013..0.016 rows=1 loops=1) Filter: (reference_id = 1) Rows Removed by Filter: 1 Buffers: shared hit=1 -> Append (cost=27.50..27978.78 rows=2 width=8) (actual time=3.214..554.454 rows=1 loops=1) Buffers: shared hit=451 -> Subquery Scan on "*SELECT* 1" (cost=27.50..27970.50 rows=1 width=8) (actual time=3.212..554.422 rows=1 loops=1) Filter: ((item_reference.item_id_a = "*SELECT* 1".item_id_a) AND (item_reference.item_id_b = "*SELECT* 1".item_id_b)) Rows Removed by Filter: 999999 Buffers: shared hit=448 -> Hash Join (cost=27.50..12970.50 rows=1000000 width=8) (actual time=0.617..344.127 rows=1000000 loops=1) Hash Cond: (bundle.bundle_type = bundle_contents.bundle_type) Buffers: shared hit=448 -> Seq Scan on bundle (cost=0.00..1443.00 rows=100000 width=8) (actual time=0.009..22.066 rows=100000 loops=1) Buffers: shared hit=443 -> Hash (cost=15.00..15.00 rows=1000 width=8) (actual time=0.594..0.594 rows=1000 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 40kB Buffers: shared hit=5 -> Seq Scan on bundle_contents (cost=0.00..15.00 rows=1000 width=8) (actual time=0.008..0.207 rows=1000 loops=1) Buffers: shared hit=5 -> Index Only Scan using unbundled_item_pkey on unbundled_item (cost=0.00..8.28 rows=1 width=8) (actual time=0.021..0.021 rows=0 loops=1) Index Cond: ((item_id_a = item_reference.item_id_a) AND (item_id_b = item_reference.item_id_b)) Heap Fetches: 0 Buffers: shared hit=3 Total runtime: 554.533 ms (27 rows) */ ROLLBACK;
On 03/06/2013 06:54 AM, David Leverton wrote: > Hi all, > > I'm encountering very poor query plans when joining against a union, > where one branch of the union is itself a join: specifically, Postgres > does the entire inside join and then filters the result, rather than > pushing the filters down to the joined tables. I've provided a > standalone test case below, including some variations that don't show > the problem for comparison. The fourth query is the problematic one - > I would have expected the Append portion of the plan to be essentially > the same as the one in the second query. Thanks for the test case! Actually, in case #4, Postgres *is* pushing down the join qual into the segments of the Union. It's just that that's not helping performance any; it's causing a really slow join on bundle, which is actually where you're spending most of your time: -> Hash Join (cost=27.50..12970.50 rows=1000000 width=8) (actual time=0.617..344.127 rows=1000000 loops=1) Hash Cond: (bundle.bundle_type = bundle_contents.bundle_type) Buffers: shared hit=448 -> Seq Scan on bundle (cost=0.00..1443.00 rows=100000 width=8) (actual time=0.009..22.066 rows=100000 loops=1) Buffers: shared hit=443 -> Hash (cost=15.00..15.00 rows=1000 width=8) (actual time=0.594..0.594 rows=1000 loops=1) Buckets: 1024 Batches: 1 Memory Usage: 40kB Buffers: shared hit=5 -> Seq Scan on bundle_contents (cost=0.00..15.00 rows=1000 width=8) (actual time=0.008..0.207 rows=1000 loops=1) Buffers: shared hit=5 Clearly this is the wrong strategy; Postgres should be letting the filter on item_reference be the driver instead of hashing the whole bundle + bundle_contents join. I suspect that the qual pushdown into the union is hitting an inability to transverse multiple joins, which wouldn't surprise me; in fact, I'd be surprised if it could do so. On a pragmatic basis, joining against complex UNION expressions is liable to be a bit of a minefield for the next few generations of the Postgres planner; it's just really hard to optimize. You might think of using outer joins instead of a UNION. -- Josh Berkus PostgreSQL Experts Inc. http://pgexperts.com
Josh Berkus <josh@agliodbs.com> writes: > On 03/06/2013 06:54 AM, David Leverton wrote: >> I'm encountering very poor query plans when joining against a union, > Actually, in case #4, Postgres *is* pushing down the join qual into the > segments of the Union. Yeah, but not further. I believe the core issue here (as of 9.2) is that we're not willing to generate parameterized paths for subquery relations. We could do that without a huge amount of new code, I think, but the scary thing is how much time it might take to generate (and then discard most of the) plans for assorted parameterizations of complicated subqueries. regards, tom lane
On 7 March 2013 05:52, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Josh Berkus <josh@agliodbs.com> writes: >> On 03/06/2013 06:54 AM, David Leverton wrote: >>> I'm encountering very poor query plans when joining against a union, > >> Actually, in case #4, Postgres *is* pushing down the join qual into the >> segments of the Union. > > Yeah, but not further. I believe the core issue here (as of 9.2) is > that we're not willing to generate parameterized paths for subquery > relations. We could do that without a huge amount of new code, > I think, but the scary thing is how much time it might take to generate > (and then discard most of the) plans for assorted parameterizations of > complicated subqueries. Thanks for looking at this, both of you. Does "as of 9.2" mean it's better in 9.3? I do intend to upgrade once it's released, so if it can handle this better (or if there's anything that can be done to improve it between now and then without making other things worse) that would be great. Otherwise, I'm wondering if the addition of LATERAL will help persuade the planner to do what I want, something like this, perhaps? (please excuse any syntax misunderstandings): SELECT * FROM item_reference, LATERAL ( SELECT * FROM item WHERE (item.item_id_a, item.item_id_b) = (item_reference.item_id_a, item_reference.item_id_b) ) item WHERE reference_id = 1; I'm hoping this might help as the query in the test case where the desired item_id_a and item_id_b were supplied literally rather than from a join was fast, and this version has a similar structure, although naturally it'll only work if the planner doesn't notice that it's really equivalent to the slow version and treat it the same way. If not though, and in the meantime in any case, I suppose I'm looking for a workaround. In the real application the queries involved are generated by code rather than hand-written, so it's not a disaster if they have to be uglified a bit more than they are already. I'll see if I can figure something out, but if anyone has any suggestions they would be much appreciated. I'm afraid I don't really see how Josh's outer join suggestion would help here, though, unless it was more of a general principle than something specific to this case. The two branches of the union don't have any tables in common, so I don't see what I could be joining to. Ideally any alternative would keep the semantics the same as the existing version, or at least as similar as possible, as the application does need (or at least very much wants) to be able to work with items, including using them in further joins, without caring whether they're loose or part of a bundle. (And yes, it is a rather scary design in places, but it's the best thing I could come up with to achieve the requirements. Not sure if that says more about the requirements or me....)
David Leverton <levertond@googlemail.com> writes: > On 7 March 2013 05:52, Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Josh Berkus <josh@agliodbs.com> writes: >>> Actually, in case #4, Postgres *is* pushing down the join qual into the >>> segments of the Union. >> Yeah, but not further. I believe the core issue here (as of 9.2) is >> that we're not willing to generate parameterized paths for subquery >> relations. We could do that without a huge amount of new code, >> I think, but the scary thing is how much time it might take to generate >> (and then discard most of the) plans for assorted parameterizations of >> complicated subqueries. > Thanks for looking at this, both of you. > Does "as of 9.2" mean it's better in 9.3? No, I meant it was worse before 9.2 --- previous versions weren't even theoretically capable of generating the plan shape you want. What you're after is for the sub-join to be treated as a parameterized sub-plan, and we did not have any ability to do that for anything more complicated than a single-relation scan. > I do intend to upgrade once > it's released, so if it can handle this better (or if there's anything > that can be done to improve it between now and then without making > other things worse) that would be great. Otherwise, I'm wondering if > the addition of LATERAL will help persuade the planner to do what I > want, something like this, perhaps? Good idea, but no such luck in that form: it's still not going to try to push the parameterization down into the sub-query. I think you'd have to write out the query with the views expanded and manually put the WHERE restrictions into the lowest join level. [ experiments... ] Looks like only the UNION view has to be manually expanded to get a good plan with HEAD: regression=# explain SELECT * FROM item_reference, LATERAL ( SELECT item_id_a, item_id_b FROM bundled_item WHERE (item_id_a, item_id_b) = (item_reference.item_id_a, item_reference.item_id_b) UNION ALL SELECT item_id_a, item_id_b FROM unbundled_item WHERE (item_id_a, item_id_b) = (item_reference.item_id_a, item_reference.item_id_b) ) item WHERE reference_id = 1; QUERY PLAN --------------------------------------------------------------------------------------------------------------- Nested Loop (cost=0.57..25.99 rows=2 width=20) -> Seq Scan on item_reference (cost=0.00..1.02 rows=1 width=12) Filter: (reference_id = 1) -> Append (cost=0.57..24.94 rows=2 width=8) -> Nested Loop (cost=0.57..16.61 rows=1 width=8) Join Filter: (bundle.bundle_type = bundle_contents.bundle_type) -> Index Scan using bundle_pkey on bundle (cost=0.29..8.31 rows=1 width=8) Index Cond: (bundle_id = item_reference.item_id_a) -> Index Scan using bundle_contents_pkey on bundle_contents (cost=0.28..8.29 rows=1 width=8) Index Cond: (item_id = item_reference.item_id_b) -> Index Only Scan using unbundled_item_pkey on unbundled_item (cost=0.29..8.31 rows=1 width=8) Index Cond: ((item_id_a = item_reference.item_id_a) AND (item_id_b = item_reference.item_id_b)) (12 rows) You might be able to accomplish something similar without LATERAL, if you're willing to give up the notational convenience of the views. Don't have time right now to experiment further though. regards, tom lane
On 7 March 2013 18:47, Tom Lane <tgl@sss.pgh.pa.us> wrote: > Good idea, but no such luck in that form: it's still not going to try to > push the parameterization down into the sub-query. I think you'd have > to write out the query with the views expanded and manually put the > WHERE restrictions into the lowest join level. [ experiments... ] > Looks like only the UNION view has to be manually expanded to get a > good plan with HEAD: Thanks for checking that, good to know that there's a way forward with 9.3 even in the worst case of not finding anything for 9.2. > You might be able to accomplish something similar without LATERAL, if > you're willing to give up the notational convenience of the views. > Don't have time right now to experiment further though. No problem, I don't expect you do to everything for me ;-) (although if you do find the time to come up with something before I do, that would of course be very welcome) and there's no desperate rush in any case. Pondering your earlier comment some more: On 7 March 2013 05:52, Tom Lane <tgl@sss.pgh.pa.us> wrote: > I believe the core issue here (as of 9.2) is > that we're not willing to generate parameterized paths for subquery > relations. We could do that without a huge amount of new code, > I think, but the scary thing is how much time it might take to generate > (and then discard most of the) plans for assorted parameterizations of > complicated subqueries. Would it be reasonable to support this with some sort of configurable complexity threshold for the subquery, above which the planner won't bother? Probably not the most elegant solution, but maybe something to consider. It seems similar in spirit to from_collapse_limit and join_collapse_limit, in the sense of controlling how much effort to put in for complex queries.