Re: Some Improvement - Mailing list pgsql-hackers

From Tom Lane
Subject Re: Some Improvement
Date
Msg-id 27090.963462010@sss.pgh.pa.us
Whole thread Raw
In response to Re: Some Improvement  (Tim Perdue <tperdue@valinux.com>)
List pgsql-hackers
Tim Perdue <tperdue@valinux.com> writes:
> Tom Lane wrote:
>> Tim Perdue <tperdue@valinux.com> writes:
>>>> I still think there must be sorting going on, as the result is returned
>>>> instantly if you remove the ORDER BY.
>> 
>> You "think"?  What does EXPLAIN show in the two cases?

> With the ORDER BY

> db_geocrawler=# explain verbose SELECT fld_mailid, fld_mail_date,
> fld_mail_is_followup, fld_mail_from, fld_mail_subject FROM
> tbl_mail_archive WHERE fld_mail_list='35' AND fld_mail_date between
> '20000100' AND '20000199' ORDER BY fld_mail_date DESC LIMIT 51 OFFSET 0;

> NOTICE:  QUERY PLAN:

> Sort  (cost=5.03..5.03 rows=1 width=44)
>    -> Index Scan using idx_archive_list_date on tbl_mail_archive 
>       (cost=0.00..5.02 rows=1 width=44)

Well, you obviously are getting a sort step here, which you want to
avoid because the LIMIT isn't doing you much good when there's a SORT
in between --- the indexscan has to run over the whole month then.

I assume idx_archive_list_date is an index on tbl_mail_archive
(fld_mail_list, fld_mail_date) in that order?  The reason you're
getting the extra sort is that the planner believes the indexscan
will produce data ordered likeORDER BY fld_mail_list, fld_mail_date
which is not what you asked for: you asked for a sort by fld_mail_date,
period.  (Now you know and I know that since the query retrieves only
tuples with a single value of fld_mail_list, there's no practical
difference.  The planner, however, is less bright than we are and does
not make the connection.)  To avoid the extra sort, you need to specify
an ORDER BY that the planner will recognize as compatible with the
index:ORDER BY fld_mail_list DESC, fld_mail_date DESC
Note it's important that both clauses be marked DESC or neither;
otherwise the clause still won't look like it matches the index's
ordering.  But with the correct ORDER BY incantation, you should
get a plan like

Index Scan Backwards using idx_archive_list_date on tbl_mail_archive 

and then you will be happy ;-).

(Alternatively, you could declare the index on (fld_mail_date,
fld_mail_list) and then ORDER BY fld_mail_date DESC would work by
itself.  You should think about which ordering you'd want for a
query retrieving rows from more than one list before you decide.)

BTW, the 6.5 planner was quite incapable of generating a plan like
this, so I'm still not sure why you saw better performance with 6.5.
Was there anything to the theory about LOCALE slowing down the sort?
        regards, tom lane


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Re: Query 'Bout A Bug.
Next
From: Tatsuo Ishii
Date:
Subject: Re: Serious Performance Loss in 7.0.2??