Re: Sort time - Mailing list pgsql-performance

From Tom Lane
Subject Re: Sort time
Date
Msg-id 10091.1037577282@sss.pgh.pa.us
Whole thread Raw
In response to Re: Sort time  (Hannu Krosing <hannu@tm.ee>)
List pgsql-performance
Hannu Krosing <hannu@tm.ee> writes:
> Some speed could be gained by doing the original --> "sorting string"
> conversion only once for each line, but that will probably require a
> major rewrite of sorting code - in essence

> select loctxt,a,b,c,d,e,f,g from mytab sort by localestring;

> should become

> select loctxt,a,b,c,d,e,f,g from (
>    select localestring,a,b,c,d,e,f,g
>      from mytab
>   sort by sorting_string(loctxt)
> ) t;

> or even

> select loctxt,a,b,c,d,e,f,g from (
>   select localestring,a,b,c,d,e,f,g, ss  from (
>     select localestring,a,b,c,d,e,f,g, sorting_string(loctxt) as ss from
>       from mytab
>     )
>   sort by ss
> ) t;

> depending on how the second form is implemented (i.e. if
> sorting_string(loctxt) is evaluated once per row or one per compare)

Indeed the function call will be evaluated only once per row, so it
wouldn't be too hard to kluge up a prototype implementation to test what
the real speed difference turns out to be.  You'd basically need
(a) a non-locale-aware set of comparison operators for type text ---
you might as well build a whole index opclass, so that non-locale-aware
indexes could be made (this'd be a huge win for LIKE optimization too);
(b) a strxfrm() function to produce the sortable strings.

If it turns out to be a big win, which is looking probable from the
comparisons Stephan and I just reported, then the next question is how
to make the transformation occur automatically.  I think it'd be
relatively simple to put a hack in the planner to do this when it's
emitting a SORT operation that uses the locale-aware sort operators.
It'd be kind of an ugly special case, but surely no worse than the ones
that are in there already for LIKE and some other operators.

            regards, tom lane

pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: Sort time
Next
From: pginfo
Date:
Subject: Re: Sort time