Re: [HACKERS] 6.5 beta and ORDER BY patch - Mailing list pgsql-hackers

From Hannu Krosing
Subject Re: [HACKERS] 6.5 beta and ORDER BY patch
Date
Msg-id 36B898A3.DBEA251C@trust.ee
In response to Re: [HACKERS] 6.5 beta and ORDER BY patch  (jwieck@debis.com (Jan Wieck))
Responses Re: [HACKERS] 6.5 beta and ORDER BY patch
RE: [HACKERS] 6.5 beta and ORDER BY patch
List pgsql-hackers
Jan Wieck wrote:
> 
> 
>     Ok ok ok - OK. You got me, I'll go ahead and put it in.

Thanks ;)
> > I agree that it is a hack and only a partial solution and that in an
> > ideal world the optimiser would also know about sort nodes.
> 
>     First, the executor must know better how to handle LIMIT's
>     OFFSET. For now it processes the query until OFFSET is
>     reached, simply suppressing the already produced result
>     tuples in the output. Then it stops sending once the LIMIT
>     count is reached. For joins or other complex things, it has
>     no chance to do anything different. But for an indexed single
>     table scan, where ALL the qualifications are done on the
>     index, it should handle the OFFSET by skipping index tuples
>     only.
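
Just to make this concrete, a query along these lines (table and column
names are made up):

    -- assuming a btree index on bigtable(id) and qualifications
    -- that can be checked entirely on the index
    SELECT id
      FROM bigtable
     WHERE id > 100000
     ORDER BY id
     LIMIT 20 OFFSET 1000;

currently produces and throws away the first 1000 result tuples, while
with what you describe the executor could just step over 1000 index
entries instead.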

And we must also tie this kind of scan to triggers (my guess is that
currently the triggers are fired by accessing the actual relation
data).

It probably does not affect rules as much, though it would be cool to 
define rules for index scans or sort nodes.

>     Second, the optimizer must take LIMIT into account and,
>     depending on the known number of tuples, LIMIT and OFFSET,
>     produce an index scan even if the query isn't qualified at
>     all but has an ORDER BY clause matched by the index.
> 
>     These two features would finally solve your huge table
>     problems.

Yes, it seems so.
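
Something like this, for example (just a sketch, names made up):

    -- no WHERE clause at all, but the ORDER BY matches
    -- a btree index on logtable(ts)
    SELECT *
      FROM logtable
     ORDER BY ts
     LIMIT 50;

could then walk the index on ts and stop after 50 tuples instead of
seq-scanning and sorting the whole table.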

The next thing to attack would then be aggregates, so that they too can
benefit from indexes; I can immediately think of MIN, MAX and COUNT
on simple scans. But as aggregates are user-defined, we probably
need a flag that tells the optimiser whether a given aggregate can in
fact use indexes (and what type of index).
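
For example (hypothetical table), if the optimiser knew that MIN fits
a btree ordering, then

    SELECT MIN(price) FROM items;

could be executed just like

    -- read only the first entry of a btree index on items(price)
    SELECT price FROM items ORDER BY price LIMIT 1;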

Maybe we can even cache some data (for example the tuple count) in the
backend, so that COUNT(*) can be made really fast?
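
I am thinking of something in the spirit of the reltuples value that
VACUUM already keeps in pg_class (only approximate and only as fresh
as the last VACUUM):

    SELECT reltuples FROM pg_class WHERE relname = 'bigtable';

but maintained exactly by the backend, so COUNT(*) would not need a
scan at all.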

After that, reverse index scans, so that an index can also be used for
sorting in descending order.
BTW, can this be implemented easily and efficiently in PostgreSQL, or
are our btree indexes optimised for forward scans only?
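
What I mean is that a query like (names made up):

    -- btree index on events(created), built in ascending order
    SELECT *
      FROM events
     ORDER BY created DESC
     LIMIT 10;

could be answered by walking that same index from the end backwards,
with no sort step at all.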

Also, how do indexes interact with the transaction manager (are there
any docs on it)?

---------------------
Hannu

