I've encoutered similar issues myself (with UNION so far), so I tried
to build a simple test case, which may or may not cover Clemens's
case.
Test case 1 and 2 illustrates the issue, and case 3-9 are variations.
My observation: Looks like the optimizer cannot be close friends with
both UNION and IN/JOIN at the same time.
Actually - it looks like the UNION SELECT kids don't wanna share the
IN/JOIN toy we gave them, but are happy when they get their own toys
to play with ;)
DROP TABLE IF EXISTS table1;
CREATE TABLE table1 AS SELECT i AS id FROM generate_series(1, 300000)
S(i);
CREATE INDEX ON table1(id);
ANALYZE table1;
-- Test 1: Slow. IN()
SELECT * FROM (
SELECT * FROM table1
UNION
SELECT * FROM table1
) Q WHERE id IN (SELECT id FROM table1 LIMIT 10);
-- Test 2: Fast. ANY(ARRAY())
SELECT * FROM (
SELECT * FROM table1
UNION
SELECT * FROM table1
) Q WHERE id = ANY(ARRAY(SELECT id FROM table1 LIMIT 10));
-- Test 3: Fast. Duplicate IN. Symptom fix? Or would you call it a
"better" query in terms of sql? -except for the unnecessary subquery,
which I kept for readability.
SELECT * FROM (
SELECT * FROM table1
WHERE id IN (SELECT id FROM table1 LIMIT 10)
UNION
SELECT * FROM table1
WHERE id IN (SELECT id FROM table1 LIMIT 10)
) Q;
-- Test 4: Fast. Duplicate JOIN CTE.
WITH id_list AS (SELECT id FROM table1 LIMIT 10)
SELECT * FROM (
SELECT * FROM table1 JOIN id_list USING(id)
UNION
SELECT * FROM table1 JOIN id_list USING(id)
) Q;
-- Test 5: Slow. IN(CTE)
WITH id_list AS (SELECT id FROM table1 LIMIT 10)
SELECT * FROM (
SELECT * FROM table1
UNION
SELECT * FROM table1
) Q WHERE id IN (SELECT * FROM id_list);
-- Test 6: Slow. IN(explicit id list)
SELECT * FROM (
SELECT * FROM table1
UNION
SELECT * FROM table1
) Q WHERE id IN (SELECT
UNNEST('{100001,100002,100003,100004,100005,100006,100007,100008,100009,10010}'::BIGINT[] )
AS id);
-- Test 7: Slow. IN(UNNEST(ARRAY())
SELECT * FROM (
SELECT * FROM table1
UNION
SELECT * FROM table1
) Q WHERE id IN (SELECT UNNEST(ARRAY(SELECT id FROM table1 LIMIT 10))
AS id);
-- Test 8: Slow. JOIN CTE
WITH id_list AS (SELECT id FROM table1 LIMIT 10)
SELECT * FROM (
SELECT * FROM table1
UNION
SELECT * FROM table1
) Q JOIN id_list USING(id);
-- Test 9: Fast. JOIN CTE + UNION ALL/DISTINCT (not quite the same
query)
WITH id_list AS (SELECT id FROM table1 LIMIT 10)
SELECT DISTINCT * FROM (
SELECT * FROM table1
UNION ALL
SELECT * FROM table1
) Q JOIN id_list USING(id);
--
Geir Bostad
9.1.3(x64,win)