Thread: How to avoid redundant Sort operations? (pgsql 7.1.2)

How to avoid redundant Sort operations? (pgsql 7.1.2)

From
Aris Tsois
Date:
Using the EXPLAIN command I have notice that PostgreSQL performs
redundant Sort operations in several cases.
(postgresql-7.1.2 on Linux)

The 3 examples use the following schema:
create table t (a int, b int);
create view v as select * from t order by a;

The first case containing a redundant sort operation:

explain select * from v order by a;

NOTICE:  QUERY PLAN:

Sort  (cost=0.02..0.02 rows=1 width=8) ->  Subquery Scan v  (cost=0.01..0.01 rows=1 width=8)       ->  Sort
(cost=0.01..0.01rows=1 width=8)             ->  Seq Scan on t  (cost=0.00..0.00 rows=1 width=8)
 

Although the view v is already sorted on a the plan re-sorts it.

The next similar case is:
explain select * from (select * from t order by a) as k order by a;

NOTICE:  QUERY PLAN:

Sort  (cost=0.02..0.02 rows=1 width=8) ->  Subquery Scan k  (cost=0.01..0.01 rows=1 width=8)       ->  Sort
(cost=0.01..0.01rows=1 width=8)             ->  Seq Scan on t  (cost=0.00..0.00 rows=1 width=8)
 

Also the merge-join operation sorts the input even when it is already
sorted as you can see in the followin statement:
(I forced the use of merge-sort with:
set ENABLE_HASHJOIN=0
set ENABLE_INDEXSCAN=0;
set ENABLE_NESTLOOP =0;
but I got the same behaviour on a more complex schema when the planner
choosed the merge-join)

explain select * from t NATURAL JOIN v;

NOTICE:  QUERY PLAN:

Merge Join  (cost=0.03..0.06 rows=1 width=16) ->  Sort  (cost=0.01..0.01 rows=1 width=8)       ->  Seq Scan on t
(cost=0.00..0.00rows=1 width=8) ->  Sort  (cost=0.02..0.02 rows=1 width=8)       ->  Subquery Scan v  (cost=0.01..0.01
rows=1width=8)             ->  Sort  (cost=0.01..0.01 rows=1 width=8)                   ->  Seq Scan on t
(cost=0.00..0.00rows=1 width=8)
 

The same happens with:
explain select * from v NATURAL JOIN (select * from t order by a) AS k;

NOTICE:  QUERY PLAN:

Merge Join  (cost=0.04..0.07 rows=1 width=16) ->  Sort  (cost=0.02..0.02 rows=1 width=8)       ->  Subquery Scan v
(cost=0.01..0.01rows=1 width=8)             ->  Sort  (cost=0.01..0.01 rows=1 width=8)                   ->  Seq Scan
ont  (cost=0.00..0.00 rows=1 width=8) ->  Sort  (cost=0.02..0.02 rows=1 width=8)       ->  Subquery Scan k
(cost=0.01..0.01rows=1 width=8)             ->  Sort  (cost=0.01..0.01 rows=1 width=8)                   ->  Seq Scan
ont  (cost=0.00..0.00 rows=1 width=8)
 



Question:
---------
Is there any way to avoid those redundant sort operations?
(I could even attempt to hack the source code if you can tell me where
to start looking)

There is line in the release notes of 7.0:
"Prevent descending sort if result is already sorted(Hiroshi)"
but I have tested the same examples with descending order and I get
the same results.

Thank you for your time.

Regards,
Aris.




Re: How to avoid redundant Sort operations? (pgsql 7.1.2)

From
Tom Lane
Date:
Aris Tsois <atsois@dblab.ece.ntua.gr> writes:
> Is there any way to avoid those redundant sort operations?

Information about sort order doesn't propagate up through "subquery
scan" nodes.  There are a couple of reasons for this, one being that
pathkeys aren't returned as part of a completed Plan; there'd have to be
some uglification of the APIs of subquery_planner, grouping_planner, etc
to return the appropriate pathkeys along with the finished plan.
A more subtle problem is that pathkeys of a sub-plan aren't necessarily
meaningful for an outer plan (eg, the variables being sorted on might
not even be visible to the outer query).  I think you'd have to do some
kind of translation to see if you could restate the inner pathkeys in
terms of the subplan's output variables.
        regards, tom lane