Thread: optimizing a (simple?) query on a largeish table

optimizing a (simple?) query on a largeish table

From

"Dr. Kurt Ruff"

Date:

19 November 2007, 03:15:32

Hi.

I'm just getting started with PostgreSQL.  Porting over an huge Oracle
database application for a fun first project :^)

I've got the following query which I'm trying to run on a 4.2 million row table:

SELECT ActionItems.*
FROM ActionItems
WHERE
    attn=upper(SESSION_USER)
    or attn in (
        select upper(groname)
        from pg_group
        where (select oid from pg_roles where rolname = SESSION_USER) = ANY(grolist)
    )
ORDER BY dateTimeCreated

That is, "match any ActionItem directed to me personally, or to the
groups to which I belong".

It currently takes about 8 seconds.

I have indexes on both used columns in the large table ("attn" and
"dateTimeCreated"), but it doesn't seem to be using them --- I've
attached the "EXPLAIN" result below.

Any ideas about what's going on here?   How can I reduce the execution time?

Thanks,
Kurt

---

Sort  (cost=1242644.46..1247909.54 rows=2106033 width=200)
  Sort Key: datetimecreated
  ->  Seq Scan on actionitems  (cost=573.01..186430.80 rows=2106033 width=200)
        Filter: (((attn)::text = upper(("session_user"())::text)) OR
(hashed subplan))
        SubPlan
          ->  Seq Scan on pg_authid  (cost=5.10..573.01 rows=2 width=64)
                Filter: ((NOT rolcanlogin) AND ($0 = ANY ((subplan))))
                InitPlan
                  ->  Seq Scan on pg_authid  (cost=0.00..5.10 rows=1 width=4)
                        Filter: (rolname = "session_user"())
                SubPlan
                  ->  Seq Scan on pg_auth_members  (cost=0.00..4.01
rows=15 width=4)
                        Filter: (roleid = $1)

Re: optimizing a (simple?) query on a largeish table

From

Tom Lane

Date:

19 November 2007, 03:40:00

"Dr. Kurt Ruff" <kurt.ruff@gmail.com> writes:
> I've got the following query which I'm trying to run on a 4.2 million row table:

> SELECT ActionItems.*
> FROM ActionItems
> WHERE
>     attn=upper(SESSION_USER)
>     or attn in (
>         select upper(groname)
>         from pg_group
>         where (select oid from pg_roles where rolname = SESSION_USER) = ANY(grolist)
>     )
> ORDER BY dateTimeCreated

Replacing the OR with a UNION or UNION ALL might help, though I also
wonder whether you've selected a compatible datatype for "attn".
The upper() calls will yield type TEXT.

[ fools around a bit... ]  Another possibility, if you're using PG 8.2
or later, is to replace the "attn IN (sub-SELECT)" with "attn = ANY
(ARRAY(sub-SELECT))".  This is a hack --- the planner probably ought to
think of that for itself --- but currently it doesn't.

All this advice is predicated on the assumption that there are few
enough matching rows that multiple indexscans really are a better plan
than one seqscan.  Since you didn't say how many rows you expect, it's
not impossible that the plan you've got is in fact the best.

            regards, tom lane