I was glad to find in 8.3.7 that pg was now smart enough to use an index
for a simple UNION ALL. But unfortunately, it's not quite there yet for
our use case.
Consider the following dummy tables:
create table foo (id serial primary key, val int not null);
create table bar (id serial primary key, val int not null);
create table baz (id1 serial primary key, id2 int not null);
insert into foo (val) select x from generate_series(0,10000) as x;
insert into bar (val) select x from generate_series(0,10000) as x;
insert into baz (id2) select x from generate_series(0,10000) as x;
This query correctly uses the primary key indexes on foo and bar:
explain analyze select * from baz join (
select id, val from foo
union all
select id, val from bar
) as foobar on(baz.id2=foobar.id) where baz.id1=42;
But if I add a constant-valued column to indicate which branch of the
union each result came from:
explain analyze select * from baz join (
select id, val, 'foo'::text as source from foo
union all
select id, val, 'bar'::text as source from bar
) as foobar on(baz.id2=foobar.id) where baz.id1=42;
All of a sudden it insists on a sequential scan (and takes 800 times as
long to run) even when enable_seqscan is set false. Is there a good
reason for this, or is it just a missed opportunity in the optimizer?