Thread: Multicolumn order by

Multicolumn order by

From
Theo Kramer
Date:
Hi

Apologies if this has already been raised...

PostgreSQL 8.1.3 and prior versions. Vacuum done.

Assuming a single table with columns named c1 to cn and a requirement to
select from a particular position in multiple column order.

The column values in my simple example below denoted by 'cnv' a typical
query would look as follows

select * from mytable where
  (c1 = 'c1v' and c2 = 'c2v' and c3 >= 'c3v') or
  (c1 = 'c1v' and c2 > 'c2v') or
  (c1 > 'c1v')
  order by c1, c2, c3;

In real life with the table containing many rows (>9 Million) and
a single multicolumn index on the required columns existing I get the
following

explain analyse
 SELECT
 tran_subledger,
 tran_subaccount,
 tran_mtch,
 tran_self,
 tran_Rflg FROM tran
WHERE ((tran_subledger = 2 AND tran_subaccount = 'ARM                 '
AND tran_mtch = 0 AND tran_self >= 0 )
OR (tran_subledger = 2 AND tran_subaccount = 'ARM                 ' AND
tran_mtch > 0 )
OR (tran_subledger = 2 AND tran_subaccount > 'ARM                 ' )
OR (tran_subledger > 2 ))
ORDER BY tran_subledger,
 tran_subaccount,
 tran_mtch,
 tran_self
limit 10;

 Limit  (cost=0.00..25.21 rows=10 width=36) (actual
time=2390271.832..2390290.305 rows=10 loops=1)
   ->  Index Scan using tran_mtc_idx on tran  (cost=0.00..13777295.04
rows=5465198 width=36) (actual time=2390271.823..2390290.252 rows=10
loops=1)
         Filter: (((tran_subledger = 2) AND (tran_subaccount = 'ARM
'::bpchar) AND (tran_mtch = 0) AND (tran_self >= 0)) OR ((tran_subledger
= 2) AND (tran_subaccount = 'ARM                 '::bpchar) AND
(tran_mtch > 0)) OR ((tran_subledger = 2) AND (tran_subaccount >
'ARM                 '::bpchar)) OR (tran_subledger > 2))
 Total runtime: 2390290.417 ms

Any suggestions/comments/ideas appreciated.
--
Regards
Theo


Re: Multicolumn order by

From
Tom Lane
Date:
Theo Kramer <theo@flame.co.za> writes:
> select * from mytable where
>   (c1 = 'c1v' and c2 = 'c2v' and c3 >= 'c3v') or
>   (c1 = 'c1v' and c2 > 'c2v') or
>   (c1 > 'c1v')
>   order by c1, c2, c3;

Yeah ... what you really want is the SQL-spec row comparison operator

select ... where (c1,c2,c3) >= ('c1v','c2v','c3v') order by c1,c2,c3;

This does not work properly in any current PG release :-( but it does
work and is optimized well in CVS HEAD.  See eg this thread
http://archives.postgresql.org/pgsql-hackers/2006-02/msg00209.php

            regards, tom lane

Re: Multicolumn order by

From
"Jim C. Nasby"
Date:
Assuming stats are accurate, you're reading through 5.5M index rows in
order to run that limit query. You didn't say what the index was
actually on, but you might want to try giving each column it's own
index. That might make a bitmap scan feasable.

I know this doesn't help right now, but 8.2 will also allow you to do
this using a row comparitor. You might want to compile cvs HEAD and see
how that does with this query (specifically if using a row comparitor
performs better than the query below).

On Wed, Apr 19, 2006 at 12:07:55AM +0200, Theo Kramer wrote:
> Hi
>
> Apologies if this has already been raised...
>
> PostgreSQL 8.1.3 and prior versions. Vacuum done.
>
> Assuming a single table with columns named c1 to cn and a requirement to
> select from a particular position in multiple column order.
>
> The column values in my simple example below denoted by 'cnv' a typical
> query would look as follows
>
> select * from mytable where
>   (c1 = 'c1v' and c2 = 'c2v' and c3 >= 'c3v') or
>   (c1 = 'c1v' and c2 > 'c2v') or
>   (c1 > 'c1v')
>   order by c1, c2, c3;
>
> In real life with the table containing many rows (>9 Million) and
> a single multicolumn index on the required columns existing I get the
> following
>
> explain analyse
>  SELECT
>  tran_subledger,
>  tran_subaccount,
>  tran_mtch,
>  tran_self,
>  tran_Rflg FROM tran
> WHERE ((tran_subledger = 2 AND tran_subaccount = 'ARM                 '
> AND tran_mtch = 0 AND tran_self >= 0 )
> OR (tran_subledger = 2 AND tran_subaccount = 'ARM                 ' AND
> tran_mtch > 0 )
> OR (tran_subledger = 2 AND tran_subaccount > 'ARM                 ' )
> OR (tran_subledger > 2 ))
> ORDER BY tran_subledger,
>  tran_subaccount,
>  tran_mtch,
>  tran_self
> limit 10;
>
>  Limit  (cost=0.00..25.21 rows=10 width=36) (actual
> time=2390271.832..2390290.305 rows=10 loops=1)
>    ->  Index Scan using tran_mtc_idx on tran  (cost=0.00..13777295.04
> rows=5465198 width=36) (actual time=2390271.823..2390290.252 rows=10
> loops=1)
>          Filter: (((tran_subledger = 2) AND (tran_subaccount = 'ARM
> '::bpchar) AND (tran_mtch = 0) AND (tran_self >= 0)) OR ((tran_subledger
> = 2) AND (tran_subaccount = 'ARM                 '::bpchar) AND
> (tran_mtch > 0)) OR ((tran_subledger = 2) AND (tran_subaccount >
> 'ARM                 '::bpchar)) OR (tran_subledger > 2))
>  Total runtime: 2390290.417 ms
>
> Any suggestions/comments/ideas appreciated.
> --
> Regards
> Theo
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 2: Don't 'kill -9' the postmaster
>

--
Jim C. Nasby, Sr. Engineering Consultant      jnasby@pervasive.com
Pervasive Software      http://pervasive.com    work: 512-231-6117
vcard: http://jim.nasby.net/pervasive.vcf       cell: 512-569-9461

Re: Multicolumn order by

From
Theo Kramer
Date:
On Wed, 2006-04-19 at 01:08, Tom Lane wrote:
> Theo Kramer <theo@flame.co.za> writes:
> > select * from mytable where
> >   (c1 = 'c1v' and c2 = 'c2v' and c3 >= 'c3v') or
> >   (c1 = 'c1v' and c2 > 'c2v') or
> >   (c1 > 'c1v')
> >   order by c1, c2, c3;
>
> Yeah ... what you really want is the SQL-spec row comparison operator
>
> select ... where (c1,c2,c3) >= ('c1v','c2v','c3v') order by c1,c2,c3;
>
> This does not work properly in any current PG release :-( but it does
> work and is optimized well in CVS HEAD.  See eg this thread
> http://archives.postgresql.org/pgsql-hackers/2006-02/msg00209.php

That is awesome - been fighting with porting my isam based stuff onto
sql for a long time and the row comparison operator is exactly what I
have been looking for.

I tried this on my test system running 8.1.3 and appears to work fine.
Appreciate it if you could let me know in what cases it does not work
properly.

--
Regards
Theo


Re: Multicolumn order by

From
Theo Kramer
Date:
On Wed, 2006-04-19 at 08:00, Theo Kramer wrote:

> I tried this on my test system running 8.1.3 and appears to work fine.
> Appreciate it if you could let me know in what cases it does not work
> properly.

Please ignore - 'Explain is your friend' - got to look at the tips :)
--
Regards
Theo