Re: Shouldn't the planner have a higher cost for reverse index scans? - Mailing list pgsql-performance

From Tom Lane
Subject Re: Shouldn't the planner have a higher cost for reverse index scans?
Date
Msg-id 18710.1239896563@sss.pgh.pa.us
Whole thread Raw
In response to Re: Shouldn't the planner have a higher cost for reverse index scans?  (Lists <lists@on-track.ca>)
Responses Re: Shouldn't the planner have a higher cost for reverse index scans?  (Lists <lists@on-track.ca>)
List pgsql-performance
Lists <lists@on-track.ca> writes:
> The query

>     select comment_date
>         from user_comments
>         where user_comments.uid=1
>         order by comment_date desc limit 1

>     Explain:
>     "Limit  (cost=0.00..2699.07 rows=1 width=8) (actual
>     time=52848.785..52848.787 rows=1 loops=1)"
>     "  ->  Index Scan Backward using idx_user_comments_comment_date on
>     user_comments  (cost=0.00..5789515.40 rows=2145 width=8) (actual
>     time=52848.781..52848.781 rows=1 loops=1)"
>     "        Filter: (uid = 1)"
>     "Total runtime: 52848.840 ms"

> takes 10's of seconds to complete (52 sec last run). However

>     select comment_date
>         from user_comments
>         where user_comments.uid=1
>         order by comment_date limit 1

>     Explain:
>     "Limit  (cost=0.00..2699.07 rows=1 width=8) (actual
>     time=70.402..70.403 rows=1 loops=1)"
>     "  ->  Index Scan using idx_user_comments_comment_date on
>     user_comments  (cost=0.00..5789515.40 rows=2145 width=8) (actual
>     time=70.398..70.398 rows=1 loops=1)"
>     "        Filter: (uid = 1)"
>     "Total runtime: 70.453 ms"

> takes well under 1 sec.

AFAICS this is pure chance --- it is based on when we happen to hit the
first row with uid = 1 while scanning in forward or reverse comment_date
order.  Unless you have evidence that the number of rows skipped over
is similar in both cases, there is no reason to suppose that this
example bears on Josh's concern.

As noted by Merlin, if you're willing to create another index to help
this type of query, then a two-column index on (uid, comment_date) would
be ideal.

            regards, tom lane

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: Shouldn't the planner have a higher cost for reverse index scans?
Next
From: Robert Haas
Date:
Subject: Re: Really dumb planner decision