Thread: UNION ALL and sequential scans

From:
"Brad Jorsch"
Date:

 I was glad to find in 8.3.7 that pg was now smart enough to use an index
 for a simple UNION ALL. But unfortunately, it's not quite there yet for
 our use case.

 Consider the following dummy tables:

 create table foo (id serial primary key, val int not null);
 create table bar (id serial primary key, val int not null);
 create table baz (id1 serial primary key, id2 int not null);

 insert into foo (val) select x from generate_series(0,10000) as x;
 insert into bar (val) select x from generate_series(0,10000) as x;
 insert into baz (id2) select x from generate_series(0,10000) as x;

 This query correctly uses the primary key indexes on foo and bar:

 explain analyze select * from baz join (
 select id, val from foo
 union all
 select id, val from bar
 ) as foobar on(baz.id2=foobar.id) where baz.id1=42;

 But if I add a constant-valued column to indicate which branch of the
 union each result came from:

 explain analyze select * from baz join (
 select id, val, 'foo'::text as source from foo
 union all
 select id, val, 'bar'::text as source from bar
 ) as foobar on(baz.id2=foobar.id) where baz.id1=42;

 All of a sudden it insists on a sequential scan (and takes 800 times as
 long to run) even when enable_seqscan is set false. Is there a good
 reason for this, or is it just a missed opportunity in the optimizer?

From:
Tom Lane
Date:

"Brad Jorsch" <> writes:
>  But if I add a constant-valued column to indicate which branch of the
>  union each result came from:

>  explain analyze select * from baz join (
>  select id, val, 'foo'::text as source from foo
>  union all
>  select id, val, 'bar'::text as source from bar
>  ) as foobar on(baz.id2=foobar.id) where baz.id1=42;

>  All of a sudden it insists on a sequential scan (and takes 800 times as
>  long to run) even when enable_seqscan is set false. Is there a good
>  reason for this, or is it just a missed opportunity in the optimizer?

It's an ancient and fundamental limitation that is fixed in 8.4.
Do not expect to see it fixed in 8.3.x.

            regards, tom lane

From:
Mathieu De Zutter
Date:

On Thu, May 14, 2009 at 4:52 PM, Tom Lane <> wrote:
> "Brad Jorsch" <> writes:
>>  But if I add a constant-valued column to indicate which branch of the
>>  union each result came from:
>
>>  explain analyze select * from baz join (
>>  select id, val, 'foo'::text as source from foo
>>  union all
>>  select id, val, 'bar'::text as source from bar
>>  ) as foobar on(baz.id2=foobar.id) where baz.id1=42;
>
>>  All of a sudden it insists on a sequential scan (and takes 800 times as
>>  long to run) even when enable_seqscan is set false. Is there a good
>>  reason for this, or is it just a missed opportunity in the optimizer?
>
> It's an ancient and fundamental limitation that is fixed in 8.4.
> Do not expect to see it fixed in 8.3.x.

Does this also apply to the case of a join on an inherited table ?

example: http://archives.postgresql.org/pgsql-performance/2003-10/msg00018.php

Kind regards,

Mathieu

From:
Tom Lane
Date:

Mathieu De Zutter <> writes:
> On Thu, May 14, 2009 at 4:52 PM, Tom Lane <> wrote:
>> It's an ancient and fundamental limitation that is fixed in 8.4.
>> Do not expect to see it fixed in 8.3.x.

> Does this also apply to the case of a join on an inherited table ?

> example: http://archives.postgresql.org/pgsql-performance/2003-10/msg00018.php

Well, the particular issue described in that message is long gone.
What Brad is complaining about is non-strict expressions in the
outputs of append-relation members.  An inheritance tree also forms
an append-relation, but AFAIK there is no way to have anything but
simple Vars in its outputs so the case wouldn't arise.  Do you have
a specific problem example in mind?

            regards, tom lane