Thread: optimizing a (simple?) query on a largeish table
Hi.

I'm just getting started with PostgreSQL. Porting over a huge Oracle database application for a fun first project :^)

I've got the following query which I'm trying to run on a 4.2 million row table:

    SELECT ActionItems.*
    FROM ActionItems
    WHERE
        attn=upper(SESSION_USER)
        or attn in (
            select upper(groname)
            from pg_group
            where (select oid from pg_roles where rolname = SESSION_USER) = ANY(grolist)
        )
    ORDER BY dateTimeCreated

That is, "match any ActionItem directed to me personally, or to the groups to which I belong". It currently takes about 8 seconds.

I have indexes on both used columns in the large table ("attn" and "dateTimeCreated"), but it doesn't seem to be using them --- I've attached the "EXPLAIN" result below.

Any ideas about what's going on here? How can I reduce the execution time?

Thanks,
Kurt

---

Sort  (cost=1242644.46..1247909.54 rows=2106033 width=200)
  Sort Key: datetimecreated
  ->  Seq Scan on actionitems  (cost=573.01..186430.80 rows=2106033 width=200)
        Filter: (((attn)::text = upper(("session_user"())::text)) OR (hashed subplan))
        SubPlan
          ->  Seq Scan on pg_authid  (cost=5.10..573.01 rows=2 width=64)
                Filter: ((NOT rolcanlogin) AND ($0 = ANY ((subplan))))
                InitPlan
                  ->  Seq Scan on pg_authid  (cost=0.00..5.10 rows=1 width=4)
                        Filter: (rolname = "session_user"())
                SubPlan
                  ->  Seq Scan on pg_auth_members  (cost=0.00..4.01 rows=15 width=4)
                        Filter: (roleid = $1)
"Dr. Kurt Ruff" <kurt.ruff@gmail.com> writes:
> I've got the following query which I'm trying to run on a 4.2 million row table:
>
> SELECT ActionItems.*
> FROM ActionItems
> WHERE
>     attn=upper(SESSION_USER)
>     or attn in (
>         select upper(groname)
>         from pg_group
>         where (select oid from pg_roles where rolname = SESSION_USER) = ANY(grolist)
>     )
> ORDER BY dateTimeCreated

Replacing the OR with a UNION or UNION ALL might help, though I also wonder whether you've selected a compatible datatype for "attn". The upper() calls will yield type TEXT.

[ fools around a bit... ]

Another possibility, if you're using PG 8.2 or later, is to replace the "attn IN (sub-SELECT)" with "attn = ANY (ARRAY(sub-SELECT))". This is a hack --- the planner probably ought to think of that for itself --- but currently it doesn't.

All this advice is predicated on the assumption that there are few enough matching rows that multiple indexscans really are a better plan than one seqscan. Since you didn't say how many rows you expect, it's not impossible that the plan you've got is in fact the best.

regards, tom lane
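
For reference, the two rewrites suggested above might look roughly like the following. This is only a sketch, untested against the original schema: it reuses the table, columns, and sub-select from the posted query unchanged, and whether either form actually gets an index scan depends on the datatype of "attn" and on how many rows match.

```sql
-- Variant 1 (sketch): split the OR into two branches joined by UNION ALL,
-- so that each branch has a simple condition on attn.
-- Caveat: a row whose attn equals both upper(SESSION_USER) and the upper-cased
-- name of one of the user's groups would appear twice. Plain UNION avoids that,
-- but it also collapses any rows that are identical in every column.
SELECT ActionItems.*
FROM ActionItems
WHERE attn = upper(SESSION_USER)
UNION ALL
SELECT ActionItems.*
FROM ActionItems
WHERE attn IN (
    SELECT upper(groname)
    FROM pg_group
    WHERE (SELECT oid FROM pg_roles WHERE rolname = SESSION_USER) = ANY (grolist)
)
ORDER BY dateTimeCreated;

-- Variant 2 (sketch, PG 8.2 or later): keep the OR, but turn the
-- "attn IN (sub-SELECT)" into "attn = ANY (ARRAY(sub-SELECT))".
SELECT ActionItems.*
FROM ActionItems
WHERE attn = upper(SESSION_USER)
   OR attn = ANY (ARRAY(
        SELECT upper(groname)
        FROM pg_group
        WHERE (SELECT oid FROM pg_roles WHERE rolname = SESSION_USER) = ANY (grolist)
      ))
ORDER BY dateTimeCreated;
```

Running each variant under EXPLAIN ANALYZE and comparing the plan and timing with the original is the way to tell whether either one helps. If a large fraction of the 4.2 million rows match, the sequential scan may still come out ahead.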