Thread: Index scan never executed?

Index scan never executed?

From
"Chad Thompson"
Date:
I have never been very good at reading these query plans, but I have a bit
of a problem w/ my query.  So any help is appreciated.

The query is fairly self explanitory i think. 2 tables, call_results ( 6.5
Million records ) and lists ( 11 Million records )

weblink=# explain analyze
weblink-# select count(*) as count
weblink-# from call_results cr join lists l on cr.list_id = l.id
weblink-# where cr.project_id  = '55'
weblink-# and cr.start_time between '4/4/2003 0:0' and now()
weblink-# and l.list_of_lists_id = '691';

                                                                 QUERY PLAN
----------------------------------------------------------------------------
----
-------------------------------------------------------------------------Aggregate  (cost=2519.58..2519.58 rows=1
width=16)(actual
 
time=110715.45..110715.46 rows=1 loops=1)  ->  Nested Loop  (cost=0.00..2519.58 rows=1 width=16) (actual
time=110715.43..110715.43 rows=0 loops=1)        ->  Index Scan using start_time_idx on call_results cr
(cost=0.00..2021.00 rows=164 width=8) (actual time=110715.42..110715.42
rows=0 loops=1)              Index Cond: ((start_time >= '2003-04-04
00:00:00-07'::timestamp with time zone) AND (start_time <= now()))              Filter: (project_id = 55::bigint)
->  Index Scan using lists_pkey on lists l  (cost=0.00..3.03 rows=1
 
width=8) (never executed)              Index Cond: ("outer".list_id = l.id)              Filter: (list_of_lists_id =
691::bigint)Totalruntime: 110747.58 msec
 
(9 rows)

The big thing I dont understand is why it tells me (never executed) on
lists_pkey.
I also dont see where all the time is being taken up.  I thought that
(actual time=110715.42..110715.42) meant from millisecond this... TO
millisecond that, but that would mean that this index scan took no time.  So
as you can see I am very confused. :-)

TIA for any suggestions on how to make this query faster.
Chad

PS I have run vacuum full and analyze as reciently as last night :-)



Re: Index scan never executed?

From
Stephan Szabo
Date:
On Thu, 29 May 2003, Chad Thompson wrote:

> I have never been very good at reading these query plans, but I have a bit
> of a problem w/ my query.  So any help is appreciated.
>
> The query is fairly self explanitory i think. 2 tables, call_results ( 6.5
> Million records ) and lists ( 11 Million records )
>
> weblink=# explain analyze
> weblink-# select count(*) as count
> weblink-# from call_results cr join lists l on cr.list_id = l.id
> weblink-# where cr.project_id  = '55'
> weblink-# and cr.start_time between '4/4/2003 0:0' and now()
> weblink-# and l.list_of_lists_id = '691';
>
>
>                                                                   QUERY PLAN
> ----------------------------------------------------------------------------
> ----
> -------------------------------------------------------------------------
>  Aggregate  (cost=2519.58..2519.58 rows=1 width=16) (actual
> time=110715.45..110715.46 rows=1 loops=1)
>    ->  Nested Loop  (cost=0.00..2519.58 rows=1 width=16) (actual
> time=110715.43..110715.43 rows=0 loops=1)
>          ->  Index Scan using start_time_idx on call_results cr
> (cost=0.00..2021.00 rows=164 width=8) (actual time=110715.42..110715.42
> rows=0 loops=1)
>                Index Cond: ((start_time >= '2003-04-04
> 00:00:00-07'::timestamp with time zone) AND (start_time <= now()))
>                Filter: (project_id = 55::bigint)
>          ->  Index Scan using lists_pkey on lists l  (cost=0.00..3.03 rows=1
> width=8) (never executed)
>                Index Cond: ("outer".list_id = l.id)
>                Filter: (list_of_lists_id = 691::bigint)
>  Total runtime: 110747.58 msec
> (9 rows)
>
> The big thing I dont understand is why it tells me (never executed) on
> lists_pkey.

It's doing a nested loop.  AFAIK, it's saying for each row returned from
the index scan on call_results it does a scan on lists.  Since no rows
were returned from call_results it didn't need to ever scan lists.

I think you want an index on (start_time, project_id).  It looks like
there are lots of rows matching start_time conditions but none of those
have project_id 55, so you'd be doing alot of extra disk access.



Re: Index scan never executed?

From
Tom Lane
Date:
"Chad Thompson" <chad@weblinkservices.com> writes:
>  Aggregate  (cost=2519.58..2519.58 rows=1 width=16) (actual
> time=110715.45..110715.46 rows=1 loops=1)
>    ->  Nested Loop  (cost=0.00..2519.58 rows=1 width=16) (actual
> time=110715.43..110715.43 rows=0 loops=1)
>          ->  Index Scan using start_time_idx on call_results cr
> (cost=0.00..2021.00 rows=164 width=8) (actual time=110715.42..110715.42
> rows=0 loops=1)
>                Index Cond: ((start_time >= '2003-04-04
> 00:00:00-07'::timestamp with time zone) AND (start_time <= now()))
>                Filter: (project_id = 55::bigint)
>          ->  Index Scan using lists_pkey on lists l  (cost=0.00..3.03 rows=1
> width=8) (never executed)
>                Index Cond: ("outer".list_id = l.id)
>                Filter: (list_of_lists_id = 691::bigint)
>  Total runtime: 110747.58 msec
> (9 rows)

> The big thing I dont understand is why it tells me (never executed) on
> lists_pkey.

Because it was never executed --- the outer scan on call_results
produced zero rows, so there was never a reason to run the inner scan.
A nestloop join scans the inner table once for each row output by the
outer table.

> I also dont see where all the time is being taken up.  I thought that
> (actual time=110715.42..110715.42) meant from millisecond this... TO
> millisecond that, but that would mean that this index scan took no time.

No, the first "actual time" number means the time required to produce
the first output row.  So the call_results scan is what is taking all
the time (as you'd expect, given that there's nothing for any of the
other steps to do).

I guess it's a little unclear what to print for the first number when no
rows are output at all.  The code evidently is using the total time spent
in the plan node, but I think it would be at least as justifiable to
print a zero instead.  Would you have found that less confusing?  Anyone
else have an opinion about whether to change that detail?


> TIA for any suggestions on how to make this query faster.

Doesn't seem like the index on start_time is doing you much good ---
you're obviously scanning a huge lot of rows and not finding anything.
Perhaps an index on project_id would be more helpful, or a two-column
index on (project_id, start_time).
        regards, tom lane


Re: Index scan never executed?

From
"Chad Thompson"
Date:
>
> I guess it's a little unclear what to print for the first number when no
> rows are output at all.  The code evidently is using the total time spent
> in the plan node, but I think it would be at least as justifiable to
> print a zero instead.  Would you have found that less confusing?  Anyone
> else have an opinion about whether to change that detail?
>

No, that makes perfect sense now that it has been explained.

> Perhaps an index on project_id would be more helpful, or a two-column
> index on (project_id, start_time).

As usual, you are in fine form.  A two-column index brought the query from
115 sec to 6.

Thanks!
Chad



Re: Index scan never executed?

From
"Morten Tryfoss"
Date:
Hi,

This may make the query faster.

try to split the query into two parts:

first fetch all list-id's into a temp table (where cr.project_id  = '55' and cr.start_time between '4/4/2003 0:0' and
now())

then, join the temp table on lists (where l.list_of_lists_id = '691').

Morten

-----Opprinnelig melding-----
Fra: Chad Thompson [mailto:chad@weblinkservices.com]
Sendt: 29. mai 2003 19:20
Til: pgsql-sql
Emne: [SQL] Index scan never executed?


I have never been very good at reading these query plans, but I have a bit
of a problem w/ my query.  So any help is appreciated.

The query is fairly self explanitory i think. 2 tables, call_results ( 6.5
Million records ) and lists ( 11 Million records )

weblink=# explain analyze
weblink-# select count(*) as count
weblink-# from call_results cr join lists l on cr.list_id = l.id
weblink-# where cr.project_id  = '55'
weblink-# and cr.start_time between '4/4/2003 0:0' and now()
weblink-# and l.list_of_lists_id = '691';

                                                                 QUERY PLAN
----------------------------------------------------------------------------
----
-------------------------------------------------------------------------Aggregate  (cost=2519.58..2519.58 rows=1
width=16)(actual 
time=110715.45..110715.46 rows=1 loops=1)  ->  Nested Loop  (cost=0.00..2519.58 rows=1 width=16) (actual
time=110715.43..110715.43 rows=0 loops=1)        ->  Index Scan using start_time_idx on call_results cr
(cost=0.00..2021.00 rows=164 width=8) (actual time=110715.42..110715.42
rows=0 loops=1)              Index Cond: ((start_time >= '2003-04-04
00:00:00-07'::timestamp with time zone) AND (start_time <= now()))              Filter: (project_id = 55::bigint)
->  Index Scan using lists_pkey on lists l  (cost=0.00..3.03 rows=1 
width=8) (never executed)              Index Cond: ("outer".list_id = l.id)              Filter: (list_of_lists_id =
691::bigint)Totalruntime: 110747.58 msec 
(9 rows)

The big thing I dont understand is why it tells me (never executed) on
lists_pkey.
I also dont see where all the time is being taken up.  I thought that
(actual time=110715.42..110715.42) meant from millisecond this... TO
millisecond that, but that would mean that this index scan took no time.  So
as you can see I am very confused. :-)

TIA for any suggestions on how to make this query faster.
Chad

PS I have run vacuum full and analyze as reciently as last night :-)


---------------------------(end of broadcast)---------------------------
TIP 3: if posting/reading through Usenet, please send an appropriate
subscribe-nomail command to majordomo@postgresql.org so that your
message can get through to the mailing list cleanly