Re: How to get the optimizer to use an index with multiple - Mailing list pgsql-performance

From Jim C. Nasby
Subject Re: How to get the optimizer to use an index with multiple
Date
Msg-id 20030428175439.L66185@flake.decibel.org
Whole thread Raw
In response to Re: How to get the optimizer to use an index with multiple  ("scott.marlowe" <scott.marlowe@ihs.com>)
List pgsql-performance
On Mon, Apr 28, 2003 at 01:52:57PM -0600, scott.marlowe wrote:
> On Mon, 28 Apr 2003, Jim C. Nasby wrote:
>
> > select id into temp NewRetires from stats_participant where retire_to>=1
> > AND retire_date = (SELECT last_date FROM Project_statsrun WHERE
> > project_id = :ProjectID);
> >
> > results in a table with 5 values...
> >
> > explain analyze delete from email_rank where project_id=25 and id in
> > (select id from NewRetires);
> >
> >  Index Scan using email_rank__day_rank on email_rank
> > (cost=0.00..9003741627715.16 rows=45019 width=6) (actual time=408.12..9688.37 rows=3 loops=1)
> >    Index Cond: (project_id = 25)
> >    Filter: (subplan)
> >    SubPlan
> >      ->  Seq Scan on newretires  (cost=100000000.00..100000020.00 rows=1000 width=4) (actual time=0.01..0.05 rows=5
loops=91834)
> >  Total runtime: 9689.86 msec
> >
> > But, there's already an index that would fit the bill here perfectly:
> >
> >               Table "public.email_rank"
> >         Column         |  Type   |     Modifiers
> > -----------------------+---------+--------------------
> >  project_id            | integer | not null
> >  id                    | integer | not null
> >  first_date            | date    | not null
> >  last_date             | date    | not null
> >  day_rank              | integer | not null default 0
> >  day_rank_previous     | integer | not null default 0
> >  overall_rank          | integer | not null default 0
> >  overall_rank_previous | integer | not null default 0
> >  work_today            | bigint  | not null default 0
> >  work_total            | bigint  | not null default 0
> > Indexes: email_rank_pkey primary key btree (project_id, id),
> >          email_rank__day_rank btree (project_id, day_rank),
> >          email_rank__overall_rank btree (project_id, overall_rank)
> >
> > Why isn't it using email_rank_pkey instead of using day_rank then a
> > filter? The original query on sybase (see below) is essentially instant,
> > because it's using the index of (project_id, id), so it doesn't have to
> > read the whole table.
>
> It looks like the seq scan is newretires up there, from your 'id in
> (select id from NewRetires);' part of your query.  I.e. the where in() has
> to be done first, and the query planner has no stats on that table, so it
> assumes a seq scan will be faster in case we need the whole thing anyway.
>
> Try adding an analyze newretires in there between the two queries.
>
> No clue as to why it's choosing one index over the other.  I don't think
> that really matters a lot, it's the seq scan on the temp table that is
> taking your time on this.

There's no index at all on the temporary table; I fully expect it to
seqscan than. :) The issue is the choice of index on email_rank. It's
only going to hit at most 5 rows in email_rank (which it should be able
to figure out based on newretires and the fact that email_rank_pkey is
unique. I didn't show it, but I did run analyze on the temporary table
(why it doesn't have statistics I don't know...)
--
Jim C. Nasby (aka Decibel!)                    jim@nasby.net
Member: Triangle Fraternity, Sports Car Club of America
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"


pgsql-performance by date:

Previous
From: "scott.marlowe"
Date:
Subject: Re: How to get the optimizer to use an index with multiple
Next
From: Rajesh Kumar Mallah
Date:
Subject: Is 292 inserts/sec acceptable performance ?