Question about a query plan - Mailing list pgsql-general

From Bill Moseley
Subject Question about a query plan
Date
Msg-id 20050916141825.GA17486@hank.org
Whole thread Raw
Responses Re: Question about a query plan
Re: Question about a query plan
Re: Question about a query plan
List pgsql-general
PostgreSQL 7.4.8 on i486-pc-linux-gnu, Debian
Reading: http://www.postgresql.org/docs/8.0/static/performance-tips.html

I'm just starting to look at query plans, and I'm not understanding a
few things.  I don't have that many questions, but I'm including my
examples below, so it's a bit long.


First table is "class" (as in a class taught at a school) and has an
indexed column "class_time" as timestamp(0) with time zone.


First question is why the planner is not using an index scan when I
use "now()" or CURRENT_TIMESTAMP?


    EXPLAIN ANALYZE select id from class where class_time > now();
                                                    QUERY PLAN
    ----------------------------------------------------------------------------------------------------------
     Seq Scan on "class"  (cost=0.00..655.62 rows=414 width=4) (actual time=213.327..288.407 rows=28 loops=1)
       Filter: (class_time > now())

    EXPLAIN ANALYZE select id from class where class_time > now()::timestamp(0) with time zone;
                                                  QUERY PLAN
    ------------------------------------------------------------------------------------------------------
     Seq Scan on "class"  (cost=0.00..658.72 rows=414 width=4) (actual time=2.065..5.251 rows=28 loops=1)
       Filter: (class_time > (now())::timestamp(0) with time zone)


At first I thought the planner was related to the ration of rows the
planner was expecting to return to the total number of rows.  But
using < or > uses a a scan.  But if I do "class_time = now()" then it
uses an Index Scan.


But, if I specify the timestamp then it always uses an Index Scan:

    select now()::timestamp(0) with time zone;
              now
    ------------------------
     2005-09-16 06:44:10-07

     EXPLAIN ANALYZE select id from class where class_time > '2005-09-16 06:44:10-07';
                                                                  QUERY PLAN
                  

--------------------------------------------------------------------------------------------------------------------------------------
     Index Scan using class_class_time_index on "class"  (cost=0.00..191.17 rows=50 width=4) (actual
time=66.072..66.248rows=28 loops=1) 
       Index Cond: (class_time > '2005-09-16 06:44:10-07'::timestamp with time zone)



Ok now on to the second question.  I have two other related tables.
First, I have a table "person" which you can guess what it holds.  And
a link table instructors (a class can have more than one instructor):

     \d instructors
      Table "public.instructors"
     Column |  Type   | Modifiers
    --------+---------+-----------
     person | integer | not null
     class  | integer | not null
    Indexes:
        "instructors_pkey" primary key, btree (person, "class")
        "instructors_class_index" btree ("class")
        "instructors_person_index" btree (person)
    Foreign-key constraints:
        "$2" FOREIGN KEY ("class") REFERENCES "class"(id)
        "$1" FOREIGN KEY (person) REFERENCES person(id)



I want to find out who is teaching classes in the future:


EXPLAIN ANALYZE select person, class from instructors
                where instructors.class in
                    (select class.id from class where class_time > now());
                                                    QUERY PLAN
--------------------------------------------------------------------------------------------------------------------
 Hash IN Join  (cost=656.65..687.64 rows=437 width=8) (actual time=31.741..33.443 rows=29 loops=1)
   Hash Cond: ("outer"."class" = "inner".id)
   ->  Seq Scan on instructors  (cost=0.00..20.08 rows=1308 width=8) (actual time=0.057..1.433 rows=1308 loops=1)
   ->  Hash  (cost=655.62..655.62 rows=414 width=4) (actual time=30.963..30.963 rows=0 loops=1)
         ->  Seq Scan on "class"  (cost=0.00..655.62 rows=414 width=4) (actual time=18.716..30.892 rows=28 loops=1)
               Filter: (class_time > now())

Perhaps I'm reading that incorrectly, but the sub-select is returning
28 rows of "class.id".  Then why is it doing a Seq Scan on instructors
instead of an index scan?  If I innumerate all 28 classes I get an
Index Scan.



Finally, not really a question, but my goal is to show a count of classes taught by each in
instructor.  Perhaps there's a better query?

EXPLAIN select person, first_name, count(class)
        from instructors, person
        where instructors.class in
            (select id from class where class_time > now() )
            AND person.id = instructors.person
            group by person, first_name;
                                          QUERY PLAN
----------------------------------------------------------------------------------------------
 HashAggregate  (cost=734.06..735.15 rows=437 width=17)
   ->  Merge Join  (cost=706.81..730.78 rows=437 width=17)
         Merge Cond: ("outer".id = "inner".person)
         ->  Index Scan using person_pkey on person  (cost=0.00..1703.82 rows=12246 width=13)
         ->  Sort  (cost=706.81..707.90 rows=437 width=8)
               Sort Key: instructors.person
               ->  Hash IN Join  (cost=656.65..687.64 rows=437 width=8)
                     Hash Cond: ("outer"."class" = "inner".id)
                     ->  Seq Scan on instructors  (cost=0.00..20.08 rows=1308 width=8)
                     ->  Hash  (cost=655.62..655.62 rows=414 width=4)
                           ->  Seq Scan on "class"  (cost=0.00..655.62 rows=414 width=4)
                                 Filter: (class_time > now())



--
Bill Moseley
moseley@hank.org


pgsql-general by date:

Previous
From: "Dinesh Pandey"
Date:
Subject: Re: help needed for functions
Next
From: "A. Kretschmer"
Date:
Subject: Re: help needed for functions