Thread: Multicolumn order by
Hi Apologies if this has already been raised... PostgreSQL 8.1.3 and prior versions. Vacuum done. Assuming a single table with columns named c1 to cn and a requirement to select from a particular position in multiple column order. The column values in my simple example below denoted by 'cnv' a typical query would look as follows select * from mytable where (c1 = 'c1v' and c2 = 'c2v' and c3 >= 'c3v') or (c1 = 'c1v' and c2 > 'c2v') or (c1 > 'c1v') order by c1, c2, c3; In real life with the table containing many rows (>9 Million) and a single multicolumn index on the required columns existing I get the following explain analyse SELECT tran_subledger, tran_subaccount, tran_mtch, tran_self, tran_Rflg FROM tran WHERE ((tran_subledger = 2 AND tran_subaccount = 'ARM ' AND tran_mtch = 0 AND tran_self >= 0 ) OR (tran_subledger = 2 AND tran_subaccount = 'ARM ' AND tran_mtch > 0 ) OR (tran_subledger = 2 AND tran_subaccount > 'ARM ' ) OR (tran_subledger > 2 )) ORDER BY tran_subledger, tran_subaccount, tran_mtch, tran_self limit 10; Limit (cost=0.00..25.21 rows=10 width=36) (actual time=2390271.832..2390290.305 rows=10 loops=1) -> Index Scan using tran_mtc_idx on tran (cost=0.00..13777295.04 rows=5465198 width=36) (actual time=2390271.823..2390290.252 rows=10 loops=1) Filter: (((tran_subledger = 2) AND (tran_subaccount = 'ARM '::bpchar) AND (tran_mtch = 0) AND (tran_self >= 0)) OR ((tran_subledger = 2) AND (tran_subaccount = 'ARM '::bpchar) AND (tran_mtch > 0)) OR ((tran_subledger = 2) AND (tran_subaccount > 'ARM '::bpchar)) OR (tran_subledger > 2)) Total runtime: 2390290.417 ms Any suggestions/comments/ideas appreciated. -- Regards Theo
Theo Kramer <theo@flame.co.za> writes: > select * from mytable where > (c1 = 'c1v' and c2 = 'c2v' and c3 >= 'c3v') or > (c1 = 'c1v' and c2 > 'c2v') or > (c1 > 'c1v') > order by c1, c2, c3; Yeah ... what you really want is the SQL-spec row comparison operator select ... where (c1,c2,c3) >= ('c1v','c2v','c3v') order by c1,c2,c3; This does not work properly in any current PG release :-( but it does work and is optimized well in CVS HEAD. See eg this thread http://archives.postgresql.org/pgsql-hackers/2006-02/msg00209.php regards, tom lane
Assuming stats are accurate, you're reading through 5.5M index rows in order to run that limit query. You didn't say what the index was actually on, but you might want to try giving each column it's own index. That might make a bitmap scan feasable. I know this doesn't help right now, but 8.2 will also allow you to do this using a row comparitor. You might want to compile cvs HEAD and see how that does with this query (specifically if using a row comparitor performs better than the query below). On Wed, Apr 19, 2006 at 12:07:55AM +0200, Theo Kramer wrote: > Hi > > Apologies if this has already been raised... > > PostgreSQL 8.1.3 and prior versions. Vacuum done. > > Assuming a single table with columns named c1 to cn and a requirement to > select from a particular position in multiple column order. > > The column values in my simple example below denoted by 'cnv' a typical > query would look as follows > > select * from mytable where > (c1 = 'c1v' and c2 = 'c2v' and c3 >= 'c3v') or > (c1 = 'c1v' and c2 > 'c2v') or > (c1 > 'c1v') > order by c1, c2, c3; > > In real life with the table containing many rows (>9 Million) and > a single multicolumn index on the required columns existing I get the > following > > explain analyse > SELECT > tran_subledger, > tran_subaccount, > tran_mtch, > tran_self, > tran_Rflg FROM tran > WHERE ((tran_subledger = 2 AND tran_subaccount = 'ARM ' > AND tran_mtch = 0 AND tran_self >= 0 ) > OR (tran_subledger = 2 AND tran_subaccount = 'ARM ' AND > tran_mtch > 0 ) > OR (tran_subledger = 2 AND tran_subaccount > 'ARM ' ) > OR (tran_subledger > 2 )) > ORDER BY tran_subledger, > tran_subaccount, > tran_mtch, > tran_self > limit 10; > > Limit (cost=0.00..25.21 rows=10 width=36) (actual > time=2390271.832..2390290.305 rows=10 loops=1) > -> Index Scan using tran_mtc_idx on tran (cost=0.00..13777295.04 > rows=5465198 width=36) (actual time=2390271.823..2390290.252 rows=10 > loops=1) > Filter: (((tran_subledger = 2) AND (tran_subaccount = 'ARM > '::bpchar) AND (tran_mtch = 0) AND (tran_self >= 0)) OR ((tran_subledger > = 2) AND (tran_subaccount = 'ARM '::bpchar) AND > (tran_mtch > 0)) OR ((tran_subledger = 2) AND (tran_subaccount > > 'ARM '::bpchar)) OR (tran_subledger > 2)) > Total runtime: 2390290.417 ms > > Any suggestions/comments/ideas appreciated. > -- > Regards > Theo > > > ---------------------------(end of broadcast)--------------------------- > TIP 2: Don't 'kill -9' the postmaster > -- Jim C. Nasby, Sr. Engineering Consultant jnasby@pervasive.com Pervasive Software http://pervasive.com work: 512-231-6117 vcard: http://jim.nasby.net/pervasive.vcf cell: 512-569-9461
On Wed, 2006-04-19 at 01:08, Tom Lane wrote: > Theo Kramer <theo@flame.co.za> writes: > > select * from mytable where > > (c1 = 'c1v' and c2 = 'c2v' and c3 >= 'c3v') or > > (c1 = 'c1v' and c2 > 'c2v') or > > (c1 > 'c1v') > > order by c1, c2, c3; > > Yeah ... what you really want is the SQL-spec row comparison operator > > select ... where (c1,c2,c3) >= ('c1v','c2v','c3v') order by c1,c2,c3; > > This does not work properly in any current PG release :-( but it does > work and is optimized well in CVS HEAD. See eg this thread > http://archives.postgresql.org/pgsql-hackers/2006-02/msg00209.php That is awesome - been fighting with porting my isam based stuff onto sql for a long time and the row comparison operator is exactly what I have been looking for. I tried this on my test system running 8.1.3 and appears to work fine. Appreciate it if you could let me know in what cases it does not work properly. -- Regards Theo
On Wed, 2006-04-19 at 08:00, Theo Kramer wrote: > I tried this on my test system running 8.1.3 and appears to work fine. > Appreciate it if you could let me know in what cases it does not work > properly. Please ignore - 'Explain is your friend' - got to look at the tips :) -- Regards Theo