Home > mailing lists

2 left joins causes seqscan - Mailing list pgsql-general

From	Willy-Bas Loos
Subject	2 left joins causes seqscan
Date	September 12, 2014 19:19:41
Msg-id	CAHnozTi3jjaYKk5+u6uYm7w_QTbPWhEHdYEU2=iobkTkQM7JyQ@mail.gmail.com Whole thread Raw
Responses	Re: 2 left joins causes seqscan
List	pgsql-general

Tree view

Hi,

Today i ran into a situation where a second left join on an indexed field would prevent the index from being used, even though the index is clearly more efficient.

Removing either of the 2 joins would cause that the planner will use the index again.

I tested this on postgres 9.1 and 9.3 on my ubuntu (amd64) laptop.

--Here's the test data:

create table a (id serial primary key, field1 text);
create table b (id integer, title text, lang integer);
create index b_title_lowerto on b using btree (lower(title) text_pattern_ops);
vacuum analyze;

with x as (
insert into a
select generate_series(1,40000) as id
returning id
)
insert into b
select id, translate((random()*100*id)::text, '1234567890.', 'abcdefghij'), 1
from x;
update a set field1=translate(id::text, '1234567890.', 'abcdefghij');
insert into b
select b2.id, translate((random()*100*b2.id)::text, '1234567890.', 'abcdefghij'), 2
from b b2;

--Here's the query that doesn't use the index on "b":

select a.field1, b1.title , b2.title
from a
left join b b1 on b1.id = a.id and b1.lang=1
left join b b2 on b2.id = a.id and b2.lang=2
where (lower(b1.title) like'abcd%' or lower(b2.title) like 'abcd%')

--plan:
Hash Right Join (cost=4298.60..7214.76 rows=8 width=35)
Hash Cond: (b1.id = a.id)
Filter: ((lower(b1.title) ~~ 'abcd%'::text) OR (lower(b2.title) ~~ 'abcd%'::text))
-> Seq Scan on b b1 (cost=0.00..1510.00 rows=40176 width=19)
        Filter: (lang = 1)
-> Hash (cost=3798.60..3798.60 rows=40000 width=24)
        -> Hash Right Join (cost=1293.00..3798.60 rows=40000 width=24)
              Hash Cond: (b2.id = a.id)
              -> Seq Scan on b b2 (cost=0.00..1510.00 rows=39824 width=19)
                    Filter: (lang = 2)
              -> Hash (cost=793.00..793.00 rows=40000 width=9)
                    -> Seq Scan on a (cost=0.00..793.00 rows=40000 width=9)

--Here's the query that does use the index on "b":

select a.field1, b1.title
from a
left join b b1 on b1.id = a.id and b1.lang=1
where lower(b1.title) like 'abcd%'
union
select a.field1, b2.title
from a
left join b b2 on b2.id = a.id and b2.lang=2
where lower(b2.title) like 'abcd%'

--plan:
HashAggregate (cost=98.31..98.39 rows=8 width=20)
-> Append (cost=4.74..98.27 rows=8 width=20)
        -> Nested Loop (cost=4.74..49.10 rows=4 width=20)
              -> Bitmap Heap Scan on b b1 (cost=4.45..15.82 rows=4 width=19)
                    Filter: ((lang = 1) AND (lower(title) ~~ 'abcd%'::text))
                    -> Bitmap Index Scan on b_title_lowerto (cost=0.00..4.45 rows=3 width=0)
                          Index Cond: ((lower(title) ~>=~ 'abcd'::text) AND (lower(title) ~<~ 'abce'::text))
              -> Index Scan using a_pkey on a (cost=0.29..8.31 rows=1 width=9)
                    Index Cond: (id = b1.id)
        -> Nested Loop (cost=4.74..49.10 rows=4 width=20)
              -> Bitmap Heap Scan on b b2 (cost=4.45..15.82 rows=4 width=19)
                    Filter: ((lang = 2) AND (lower(title) ~~ 'abcd%'::text))
                    -> Bitmap Index Scan on b_title_lowerto (cost=0.00..4.45 rows=3 width=0)
                          Index Cond: ((lower(title) ~>=~ 'abcd'::text) AND (lower(title) ~<~ 'abce'::text))
              -> Index Scan using a_pkey on a a_1 (cost=0.29..8.31 rows=1 width=9)
                    Index Cond: (id = b2.id)

As you can see, the second query is far more efficient, even though it scans both tables twice to combine the results.

Is this some glitch in the query planner?

Cheers,

Willy-Bas Loos

pgsql-general by date:

From: Adrian Klaver
Date: 12 September 2014, 17:03:55
Subject: Re: permission denied for schema topology

From: Kevin Grittner
Date: 13 September 2014, 00:12:09
Subject: Re: 2 left joins causes seqscan

2 left joins causes seqscan - Mailing list pgsql-general

Previous

Next