Efficiency of EXISTS? - Mailing list pgsql-performance

From Kenneth Tilton
Subject Efficiency of EXISTS?
Msg-id CAECCA8Zj3d7_tpXGd4N+=9fZxSKO+FrHBbzxhEB8Q=e33G1p9Q@mail.gmail.com
List pgsql-performance
My mental model of the EXISTS clause must be off. This snippet appears at the end of a series of WITH clauses I suspect are irrelevant:

with etc etc ... , cids as
  (select distinct c.id
   from ddr2 c
   join claim_entries ce on ce.claim_id = c.id
   where (c.assigned_ddr = 879
          or exists (select 1
                     from ddr_cdt dc
                     where dc.sys_user_id = 879
                       and dc.document_type = c.document_type
                       -- makes it faster: and (dc.cdt_code is null or dc.cdt_code = ce.cpt_code)
                    )))

select count(*) from cids
If I uncomment the bit where it says "makes it faster", I get decent response times, and the graphical analyze display shows the expected user+doctype+cdtcode index being used (with nice thin lines suggesting efficient lookup).

As it is, the analyze display shows the expected user+doctype index* being used but the lines are fat, and performance is an exponential disaster.

* I created the (to me) redundant user+doctype index to try to get Postgres to Do the Right Thing(tm), but I can see that was not the issue.
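
Since the graphical display is hard to paste into a mail, a textual plan may be easier to compare between the fast and slow variants. A minimal sketch of how that could be captured, assuming the preceding WITH clauses really are irrelevant and can be dropped, and that ddr2, claim_entries and ddr_cdt are ordinary tables rather than earlier CTEs (both are assumptions, not something verified here):

```sql
-- Sketch only: the earlier WITH clauses are omitted on the assumption
-- that cids does not depend on them.
-- Note: EXPLAIN ANALYZE actually executes the query, so the slow
-- variant will take as long as it normally does.
explain (analyze, buffers)
with cids as (
  select distinct c.id
  from ddr2 c
  join claim_entries ce on ce.claim_id = c.id
  where (c.assigned_ddr = 879
         or exists (select 1
                    from ddr_cdt dc
                    where dc.sys_user_id = 879
                      and dc.document_type = c.document_type))
)
select count(*) from cids;
```

Running it once as shown and once with the cdt_code condition added back should show where the row counts and loop counts diverge.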

I presume the reason performance drops off a cliff is that there can be 9000 cdt_codes for one user+doctype, but I was hoping EXISTS would stop as soon as it found one row matching user+doctype and return its answer. I have tried select *, select 1, and limit 1 in the nested select, to no avail.
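
One rewrite that might be worth trying, on the untested assumption that the OR around the EXISTS is part of the problem: split the two conditions into separate branches and combine them with UNION. UNION removes duplicate ids, so it stands in for the DISTINCT. This is only a sketch against the same tables and columns as the snippet above, with the earlier WITH clauses left out on the assumption that they are irrelevant:

```sql
-- Sketch only: should return the same set of ids as the OR/EXISTS form.
with cids as (
  -- branch 1: claims assigned directly to the user
  select c.id
  from ddr2 c
  join claim_entries ce on ce.claim_id = c.id
  where c.assigned_ddr = 879
  union
  -- branch 2: claims whose document type the user is configured for
  select c.id
  from ddr2 c
  join claim_entries ce on ce.claim_id = c.id
  where exists (select 1
                from ddr_cdt dc
                where dc.sys_user_id = 879
                  and dc.document_type = c.document_type)
)
select count(*) from cids;
```

With the EXISTS on its own in a WHERE clause, the planner is free to treat it as a semi-join; whether it does so here, and whether that helps, would have to be checked against the actual plan.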

Am I just doing something wrong? I am a relative noob. Is there some other hint I can give the planner?

Thx, ken

