Re: How to optimize WHERE column_a IS NOT NULL OR column_b = 'value' - Mailing list pgsql-sql

From Igor Neyman
Subject Re: How to optimize WHERE column_a IS NOT NULL OR column_b = 'value'
Date
Msg-id A76B25F2823E954C9E45E32FA49D70EC919DE754@mail.corp.perceptron.com
Whole thread Raw
In response to How to optimize WHERE column_a IS NOT NULL OR column_b = 'value'  (Andreas Joseph Krogh <andreas@visena.com>)
List pgsql-sql

 

 

From: pgsql-sql-owner@postgresql.org [mailto:pgsql-sql-owner@postgresql.org] On Behalf Of Andreas Joseph Krogh
Sent: Monday, August 11, 2014 12:01 PM
To: pgsql-sql@postgresql.org
Subject: [SQL] How to optimize WHERE column_a IS NOT NULL OR column_b = 'value'

 

Hi folks,

 

I have the following schema (simplified for this example).

 

create table folder(id integer primary key, name varchar not null);

 

create table document(id serial primary key, name varchar not null, owner_id integer not null, folder_id integer references folder(id));

create index document_owner_idx ON document(owner_id);

create index document_folder_idx ON document(folder_id);

 

insert into folder(id, name) values(1, 'Folder A');

insert into folder(id, name) values(2, 'Folder B');

insert into document(name, owner_id, folder_id) values('Document A',  1, 1);

insert into document(name, owner_id, folder_id) values('Document B',  1, NULL);

insert into document(name, owner_id, folder_id) values('Document C',  2, 2);

insert into document(name, owner_id, folder_id) values('Document D',  2, NULL);

 

select f.id, f.name, doc.id, doc.owner_id, doc.name

FROM document doc left outer join folder f ON doc.folder_id = f.id

WHERE doc.folder_id is not null OR doc.owner_id = 1;

 

                                                         QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------------
 Nested Loop Left Join  (cost=0.15..13.77 rows=4 width=76) (actual time=0.031..0.045 rows=3 loops=1)
   ->  Seq Scan on document doc  (cost=0.00..1.05 rows=4 width=44) (actual time=0.012..0.018 rows=3 loops=1)
         Filter: ((folder_id IS NOT NULL) OR (owner_id = 1))
         Rows Removed by Filter: 1
   ->  Index Scan using folder_pkey on folder f  (cost=0.15..3.17 rows=1 width=36) (actual time=0.005..0.006 rows=1 loops=3)
         Index Cond: (doc.folder_id = id)
 Planning time: 0.267 ms
 Execution time: 0.094 ms
(8 rows)

 

 

Is the a way to write a query which uses an index efficiently for such a schema?

 

I'd like to eliminate the Filter: ((folder_id IS NOT NULL) OR (owner_id = 1)) and rather have "index cond" insted, is that possible?

 

--

Andreas Joseph Krogh

CTO / Partner - Visena AS

Mobile: +47 909 56 963

 

 

You could try partial index on (column_b) where column_a IS NULL.

 

Regards,

Igor Neyman

 

Attachment

pgsql-sql by date:

Previous
From: David G Johnston
Date:
Subject: Re: function call
Next
From: Adrian Klaver
Date:
Subject: Re: function call