Re: Query planner question - Mailing list pgsql-general

From Stephan Szabo
Subject Re: Query planner question
Date
Msg-id 20030612145211.Y37555-100000@megazone23.bigpanda.com
Whole thread Raw
In response to Query planner question  (Ernest E Vogelsinger <ernest@vogelsinger.at>)
Responses Re: Query planner question  (Ernest E Vogelsinger <ernest@vogelsinger.at>)
List pgsql-general
On Thu, 12 Jun 2003, Ernest E Vogelsinger wrote:

>
> Maybe someone can shed a light on this:
>
> I have a (test) table, populated with 2M rows:
>     rid       int4,       -- primary key
>     dcid      varchar,
>     dsid      varchar,
>     drid      int4,
>     owid      int4,
>     nxid      int4,
>     usg       int4,
>     --- something more that's not important
>
> Indexes in question:
>     CREATE INDEX id_owid ON table (owid,nxid,usg) USING BTREE
>     CREATE INDEX id_dowid ON table (owid, dcid, dsid, drid, nxid, usg)
> USING BTREE
>     CREATE INDEX id_dictid ON table (dcid, dsid, drid, nxid) USING BTREE
>
> Test Run
>
> VACUUM ANALYZE FULL;
> VACUUM
> EXPLAIN ANALYZE SELECT DISTINCT dcid,dsid,drid FROM table WHERE owid = 1;
> NOTICE:  QUERY PLAN:
>   Unique  (cost=402.07..402.90 rows=11 width=20) (actual time=10.13..11.49
> rows=512 loops=1)
>     ->  Sort  (cost=402.07..402.07 rows=111 width=20) (actual
> time=10.13..10.46 rows=512 loops=1)
>           ->  Index Scan using id_owid on table (cost=0.00..398.30 rows=111
> width=20) (actual time=0.05..4.44 rows=512 loops=1)
> Total runtime: 11.95 msec
>
> EXPLAIN ANALYZE SELECT DISTINCT dcid,dsid,drid FROM table WHERE owid is null;
> NOTICE:  QUERY PLAN:
>   Unique  (cost=126611.73..128034.59 rows=18971 width=20) (actual
> time=27515.63..28359.88 rows=513 loops=1)
>     ->  Sort  (cost=126611.73..126611.73 rows=189714 width=20) (actual
> time=27515.62..27792.29 rows=199131 loops=1)
>           ->  Seq Scan on table  (cost=0.00..106885.04 rows=189714
> width=20) (actual time=18.76..16467.28 rows=199131 loops=1)
> Total runtime: 28633.68 msec
>
> SET enable_seqscan TO off;
> SET VARIABLE
> EXPLAIN SELECT DISTINCT dcid,dsid,drid FROM table WHERE owid is null;
> -- this process uses 34M RSS!!!
> NOTICE:  QUERY PLAN:
>   Unique  (cost=0.00..7887659.31 rows=18971 width=20) (actual
> time=2.57..711940.78 rows=513 loops=1)
>     ->  Index Scan using id_dictid on table  (cost=0.00..7886236.46
> rows=189714 width=20) (actual time=2.57..710482.07 rows=199131 loops=1)
> Total runtime: 711942.76 msec
>
> A couple of questions arise:
>
> 1) Why chooses the planner to use id_owid (and not id_dowid as it would
> seem logical) for the first query?

I think that it doesn't entirely know that owid=1, sort by dcid, dsid,
drid can be handled by that index.  I think it's possible that if you
added owid to the select list it might use id_dowid instead. I think this
is similar to the issues with order by, conditions and index choice, which
you may find useful information in the archives about)

> 2) Why is NO index used for the second query, the only difference being in
> the constraint value (owid is set vs. owid is null)?

IS NULL is not considered an indexable condition currently (there are past
discussions and hackarounds in the archives)

> 3) Why does it use id_dictid for the second query when forced to, and not
> id_owid or id_dowid?

As for #2, it doesn't think it can use an index with owid in the front.


pgsql-general by date:

Previous
From: Dmitry Tkach
Date:
Subject: Re: Query planner question
Next
From: "Jim C. Nasby"
Date:
Subject: Re: Index not being used in MAX function (7.2.3)