Thread: Order by optimisations?

Order by optimisations?

From

Christopher Kings-Lynne

Date:

13 July 2005, 05:05:11

Hi,

Does PostgreSQL do the following optimisation:

SELECT * FROM diary WHERE date = '2005-05-01' ORDER BY date;

or in fact even better (for my situation)

SELECT * FROM diary WHERE date BETWEEN '2005-05-01' AND '2005-05-01' 
ORDER BY date;

Does it know that the input to the sort routine is already sorted and 
hence is a no-op?

Chris

Re: Order by optimisations?

From

Hannu Krosing

Date:

14 July 2005, 05:49:00

On K, 2005-07-13 at 16:08 +0800, Christopher Kings-Lynne wrote:
> Hi,
> 
> Does PostgreSQL do the following optimisation:
> 
> SELECT * FROM diary WHERE date = '2005-05-01' ORDER BY date;
> 
> or in fact even better (for my situation)
> 
> SELECT * FROM diary WHERE date BETWEEN '2005-05-01' AND '2005-05-01' 
> ORDER BY date;
> 
> Does it know that the input to the sort routine is already sorted and 
> hence is a no-op?

Yes

try EXPLAIN ;)

-- 
Hannu Krosing <hannu@skype.net>

Re: Order by optimisations?

From

Christopher Kings-Lynne

Date:

14 July 2005, 06:00:44


Hannu Krosing wrote:
> On K, 2005-07-13 at 16:08 +0800, Christopher Kings-Lynne wrote:
> 
>>Hi,
>>
>>Does PostgreSQL do the following optimisation:
>>
>>SELECT * FROM diary WHERE date = '2005-05-01' ORDER BY date;
>>
>>or in fact even better (for my situation)
>>
>>SELECT * FROM diary WHERE date BETWEEN '2005-05-01' AND '2005-05-01' 
>>ORDER BY date;
>>
>>Does it know that the input to the sort routine is already sorted and 
>>hence is a no-op?
> 
> 
> Yes
> 
> try EXPLAIN ;)

Doesn't seem like it does:

usatest=# explain select * from users_myfoods_map where 
date='2004-11-21' order by date;                                QUERY PLAN
--------------------------------------------------------------------------- Sort  (cost=17.17..17.48 rows=123 width=22)
 Sort Key: date   ->  Seq Scan on users_myfoods_map  (cost=0.00..12.90 rows=123 width=22)         Filter: (date =
'2004-11-21'::date)
(4 rows)

The sort cost is non-zero.  Or am I not looking at the right thing...

Chris

Re: Order by optimisations?

From

"Michael Paesold"

Date:

14 July 2005, 06:32:56

Christopher Kings-Lynne wrote:
> Doesn't seem like it does:
>
> usatest=# explain select * from users_myfoods_map where date='2004-11-21' 
> order by date;
>                                 QUERY PLAN
> ---------------------------------------------------------------------------
>  Sort  (cost=17.17..17.48 rows=123 width=22)
>    Sort Key: date
>    ->  Seq Scan on users_myfoods_map  (cost=0.00..12.90 rows=123 width=22)
>          Filter: (date = '2004-11-21'::date)
> (4 rows)
>
> The sort cost is non-zero.  Or am I not looking at the right thing...

You are looking at the right thing, AFAIK. Well, it seems the planner cannot 
reason that if a field should have only one value, sorting on that field is 
not needed.

I remember there are examples where the planner will know that the input to 
a sort is already sorted and will skip the sort. Tom will be able to explain 
if this here is a reasonable optimization. I *guess* it could be done, with 
some restrictions.

Best Regards,
Michael Paesold

Re: Order by optimisations?

From

"Andrew Dunstan"

Date:

14 July 2005, 09:24:23

Christopher Kings-Lynne said:
>

> usatest=# explain select * from users_myfoods_map where
> date='2004-11-21' order by date;

I assume that this is program generated SQL, as I hope a human would know
better than to write this. In which case, isn't the answer to improve the
generator rather than expect postgres to make up for its defficiencies?

cheers

andrew

Re: Order by optimisations?

From

Jochem van Dieten

Date:

14 July 2005, 09:52:28

On 7/14/05, Michael Paesold wrote:
> Christopher Kings-Lynne wrote:
>>
>> usatest=# explain select * from users_myfoods_map where date='2004-11-21'
>> order by date;
>>                                 QUERY PLAN
>> ---------------------------------------------------------------------------
>>  Sort  (cost=17.17..17.48 rows=123 width=22)
>>    Sort Key: date
>>    ->  Seq Scan on users_myfoods_map  (cost=0.00..12.90 rows=123 width=22)
>>          Filter: (date = '2004-11-21'::date)
>> (4 rows)
>>
>> The sort cost is non-zero.  Or am I not looking at the right thing...
>
> You are looking at the right thing, AFAIK. Well, it seems the planner cannot
> reason that if a field should have only one value, sorting on that field is
> not needed.

For the planner to deduct that, it should first deduct that the field
should only have one value. Is that a deduction the planner can even
make for this query if we consider for instance implicit timestamp to
date casting?

> I remember there are examples where the planner will know that the input to
>a sort is already sorted and will skip the sort.

The planner knows the output of an indexscan is sorted. With a proper
index on the "date" field (I hope that is not really the name) and
favourable statistics the planner should switch to an indexscan and
the order node should disappear.

Jochem

Re: Order by optimisations?

From

Tom Lane

Date:

14 July 2005, 10:32:32

Hannu Krosing <hannu@skype.net> writes:
> On K, 2005-07-13 at 16:08 +0800, Christopher Kings-Lynne wrote:
>> Does it know that the input to the sort routine is already sorted and 
>> hence is a no-op?

> Yes

No, but in most cases this will use an index and hence will assume that
the index is responsible for ordering.
        regards, tom lane

Re: Order by optimisations?

From

Christopher Kings-Lynne

Date:

14 July 2005, 23:29:31

> I assume that this is program generated SQL, as I hope a human would know
> better than to write this. In which case, isn't the answer to improve the
> generator rather than expect postgres to make up for its defficiencies?

Well, the issue in my case is we have user food diaries.  Usually, 
99.9999% of the time we pull up a single date of their diary.  However, 
for printing purposes we need a range.

It's a large query, so it's implemented as a simple PL/PSQL stored 
procedure.

I was trying to avoid having to c&p the entire stored proc to make a 
'range version'.

If PostgreSQL was smart enough to deal with a range of 1 day and a sort 
on it efficiently, I'd just use the range stored proc exclusively....

Chris

Re: Order by optimisations?

From

Christopher Kings-Lynne

Date:

14 July 2005, 23:34:03

>>>Does it know that the input to the sort routine is already sorted and 
>>>hence is a no-op?
> 
>>Yes
> 
> No, but in most cases this will use an index and hence will assume that
> the index is responsible for ordering.

OK, so what's going on here?

usa=> explain select * from users_myfoods_map where user_id=1 and 
date='2003-11-03' order by date;                                                    QUERY PLAN 

-------------------------------------------------------------------------------------------------------------------
Sort (cost=4.84..4.85 rows=2 width=22)   Sort Key: date   ->  Index Scan using users_myfoods_map_user_id_date_key on 
 
users_myfoods_map  (cost=0.00..4.83 rows=2 width=22)         Index Cond: ((user_id = 1) AND (date =
'2003-11-03'::date))
(4 rows)

(That's on our enormous live table)

Chris

Re: Order by optimisations?

From

Tom Lane

Date:

15 July 2005, 02:03:08

Christopher Kings-Lynne <chriskl@familyhealth.com.au> writes:
> OK, so what's going on here?

> usa=> explain select * from users_myfoods_map where user_id=1 and 
> date='2003-11-03' order by date;
>                                                      QUERY PLAN 

> -------------------------------------------------------------------------------------------------------------------
>   Sort  (cost=4.84..4.85 rows=2 width=22)
>     Sort Key: date
>     ->  Index Scan using users_myfoods_map_user_id_date_key on 
> users_myfoods_map  (cost=0.00..4.83 rows=2 width=22)
>           Index Cond: ((user_id = 1) AND (date = '2003-11-03'::date))
> (4 rows)

Well, date evidently isn't the high-order key of this index.  But why
exactly are you worried about a sort of 2 rows?
        regards, tom lane

Re: Order by optimisations?

From

Christopher Kings-Lynne

Date:

15 July 2005, 02:35:47

> Well, date evidently isn't the high-order key of this index.  But why
> exactly are you worried about a sort of 2 rows?

Aha that's nailed it:

usa=> explain select * from users_myfoods_map where user_id=1 and date 
between '2003-11-03' and '2003-11-03' order by user_id, date;                                                 QUERY
PLAN
 

------------------------------------------------------------------------------------------------------------- Index
Scanusing users_myfoods_map_user_id_date_key on 
 
users_myfoods_map  (cost=0.00..3.78 rows=1 width=22)   Index Cond: ((user_id = 1) AND (date >= '2003-11-03'::date) AND

(date <= '2003-11-03'::date))
(2 rows)


I don't care about this particular result.  But imagine it running 
thousands of times a minute, with result sets between 0 and 50 rows...

Chris

Re: Order by optimisations?

From

Christopher Kings-Lynne

Date:

17 July 2005, 22:30:25

>>>Does it know that the input to the sort routine is already sorted and 
>>>hence is a no-op?
> 
> 
>>Yes
> 
> 
> No, but in most cases this will use an index and hence will assume that
> the index is responsible for ordering.
> 
>             regards, tom lane
> 
> ---------------------------(end of broadcast)---------------------------
> TIP 1: if posting/reading through Usenet, please send an appropriate
>        subscribe-nomail command to majordomo@postgresql.org so that your
>        message can get through to the mailing list cleanly