Home > mailing lists

Efficiency of EXISTS? - Mailing list pgsql-performance

From	Kenneth Tilton
Subject	Efficiency of EXISTS?
Date	July 23, 2012 18:13:02
Msg-id	CAECCA8Zj3d7_tpXGd4N+=9fZxSKO+FrHBbzxhEB8Q=e33G1p9Q@mail.gmail.com Whole thread
Responses	Re: Efficiency of EXISTS?
List	pgsql-performance

Tree view

My mental model of the EXISTS clause must be off. This snippet appears at the end of a series of WITH clauses I suspect are irrelevant:

with etc etc ... , cids as
(select distinct c.id from ddr2 c
join claim_entries ce on ce.claim_id = c.id
where (c.assigned_ddr = 879
or exists (select 1 from ddr_cdt dc
where
dc.sys_user_id = 879
and dc.document_type = c.document_type
-- makes it faster: and (dc.cdt_code is null or dc.cdt_code = ce.cpt_code)
)))

select count(*) from cids

If I uncomment the bit where it says "make it faster" I get decent response and the graphical analyze display shows the expected user+doctype+cdtcode index is being used (and nice thin lines suggesting efficient lookup).

As it is, the analyze display shows the expected user+doctype index* being used but the lines are fat, and performance is an exponential disaster.

* I created the (to me ) redundant user+doctype index trying to get Postgres to Do the Right Thing(tm), but I can see that was not the issue.

I presume the reason performance drops off a cliff is because there can be 9000 cdt_codes for one user+doctype, but I was hoping EXISTS would just look to see if there was at least one row matching user+doctype and return its decision. I have tried select *, select 1, and limit 1 on the nested select to no avail.

Am I just doing something wrong? I am a relative noob. Is there some other hint I can give the planner?

Thx, ken

pgsql-performance by date:

From: Robert Haas
Date: 23 July 2012, 16:38:53
Subject: Re: High CPU Usage

From: Merlin Moncure
Date: 23 July 2012, 18:52:50
Subject: Re: Efficiency of EXISTS?

Efficiency of EXISTS? - Mailing list pgsql-performance

Previous

Next