Thread: Index usage in order by with multiple columns in order-by-clause
I have the following test-case: CREATE TABLE test( name varchar PRIMARY KEY, value varchar NOT NULL, created timestamp not null ); create index test_lowernamevalue_idx ON test ((lower(name) || lower(value))); create index test_lowernamevaluecreated_idx ON test ((lower(name) || lower(value)), created); andreak=# EXPLAIN ANALYZE select * from test order by lower(name) || lower(value) ASC, created ASC; QUERY PLAN -----------------------------------------------------------------------------------------------------------------------------------------Index Scanusing test_lowernamevaluecreated_idx on test (cost=0.00..61.58 rows=770 width=72) (actual time=0.013..0.013 rows=0 loops=1)Total runtime: 0.127 ms (2 rows) andreak=# EXPLAIN ANALYZE select * from test order by lower(name) || lower(value) ASC, created DESC; QUERY PLAN --------------------------------------------------------------------------------------------------------Sort (cost=60.39..62.32rows=770 width=72) (actual time=0.034..0.034 rows=0 loops=1) Sort Key: (lower((name)::text) || lower((value)::text)), created -> Seq Scan on test (cost=0.00..23.47 rows=770width=72) (actual time=0.004..0.004 rows=0 loops=1)Total runtime: 0.123 ms (4 rows) As the EXPLAIN-output shows, the index is not used when sort-ordering differs in the two order-by-columns. Is there a way I can have multiple columns in the ORDER BY clause, each with different ASC/DESC-order and still use an index to speed up sorting? In my application I often have a need to sort by more than 3 columns, so I'm really wondering if there is a way to make sorting of multiple columsn (each which may have different sort-order) use an index? Preferrably without having to create 2^N indexes. -- Andreas Joseph Krogh <andreak@officenet.no> Senior Software Developer / Manager ------------------------+---------------------------------------------+ OfficeNet AS | The most difficult thing in the world is to | Karenslyst Allé 11 | know how to do a thing and to watch | PO. Box 529 Skøyen | somebody else doing it wrong, without | 0214 Oslo | comment. | NORWAY | | Tlf: +47 24 15 38 90 | | Fax: +47 24 15 38 91 | | Mobile: +47 909 56 963 | | ------------------------+---------------------------------------------+
Andreas Joseph Krogh <andreak@officenet.no> writes: > Is there a way I can have multiple columns in the ORDER BY clause, each with > different ASC/DESC-order and still use an index to speed up sorting? A btree index isn't magic, it's just an ordered list of entries. So you can't just randomly flip the ordering of individual columns. For instance, the natural sort order of a 2-column index on (x,y) is like x y 1 11 21 32 12 22 33 13 23 3 If you scan this index forwards, you get the equivalent ofORDER BY x ASC, y ASC If you scan it backwards, you get the equivalent ofORDER BY x DESC, y DESC But there is no way to get the equivalent of x ASC, y DESC from a scan of this index, nor x DESC, y ASC. If you have a specific requirement for one of those combinations, what you can do is build an index in which one of the columns is "reverse sorted". For instance, if we reverse-sort y, the index ordering looks like x y 1 31 21 12 32 22 13 33 23 1 Now we can get ORDER BY x ASC, y DESC from a forwards indexscan, or ORDER BY x DESC, y ASC from a backwards scan. But there's no way to get ASC/ASC or DESC/DESC from this index. If you really need all four orderings to be available, you're stuck with maintaining two indexes. Reverse-sorted index columns are possible but not well supported in existing PG releases (you need a custom operator class, and the planner is not all that bright about using them). 8.3 will have full support. regards, tom lane
Re: Index usage in order by with multiple columns in order-by-clause
From
Andreas Joseph Krogh
Date:
On Friday 10 August 2007 23:30:14 Tom Lane wrote: > Andreas Joseph Krogh <andreak@officenet.no> writes: > > Is there a way I can have multiple columns in the ORDER BY clause, each > > with different ASC/DESC-order and still use an index to speed up sorting? > > A btree index isn't magic, it's just an ordered list of entries. So you > can't just randomly flip the ordering of individual columns. For > instance, the natural sort order of a 2-column index on (x,y) is like > > x y > > 1 1 > 1 2 > 1 3 > 2 1 > 2 2 > 2 3 > 3 1 > 3 2 > 3 3 > > If you scan this index forwards, you get the equivalent of > ORDER BY x ASC, y ASC > If you scan it backwards, you get the equivalent of > ORDER BY x DESC, y DESC > But there is no way to get the equivalent of x ASC, y DESC from > a scan of this index, nor x DESC, y ASC. > > If you have a specific requirement for one of those combinations, > what you can do is build an index in which one of the columns is > "reverse sorted". For instance, if we reverse-sort y, the index > ordering looks like > > x y > > 1 3 > 1 2 > 1 1 > 2 3 > 2 2 > 2 1 > 3 3 > 3 2 > 3 1 > > Now we can get ORDER BY x ASC, y DESC from a forwards indexscan, > or ORDER BY x DESC, y ASC from a backwards scan. But there's no > way to get ASC/ASC or DESC/DESC from this index. If you really need > all four orderings to be available, you're stuck with maintaining > two indexes. > > Reverse-sorted index columns are possible but not well supported in > existing PG releases (you need a custom operator class, and the planner > is not all that bright about using them). 8.3 will have full support. Thank you for your in-depth reply (a always)! How exactly do I build an index in which one of the columns is "reverse sorted" in 8.2 (and 8.3)? This may be *the* reason to upgrade for me if 8.3 is better at this. -- Andreas Joseph Krogh <andreak@officenet.no> Senior Software Developer / Manager ------------------------+---------------------------------------------+ OfficeNet AS | The most difficult thing in the world is to | Karenslyst Allé 11 | know how to do a thing and to watch | PO. Box 529 Skøyen | somebody else doing it wrong, without | 0214 Oslo | comment. | NORWAY | | Tlf: +47 24 15 38 90 | | Fax: +47 24 15 38 91 | | Mobile: +47 909 56 963 | | ------------------------+---------------------------------------------+
Re: Index usage in order by with multiple columns in order-by-clause
From
hubert depesz lubaczewski
Date:
On Fri, Aug 10, 2007 at 04:53:12PM +0200, Andreas Joseph Krogh wrote: > I have the following test-case: > > CREATE TABLE test( > name varchar PRIMARY KEY, > value varchar NOT NULL, > created timestamp not null > ); > > create index test_lowernamevalue_idx ON test ((lower(name) || lower(value))); > create index test_lowernamevaluecreated_idx ON test ((lower(name) || > lower(value)), created); > andreak=# EXPLAIN ANALYZE select * from test order by lower(name) || > lower(value) ASC, created DESC; > QUERY PLAN > -------------------------------------------------------------------------------------------------------- > Sort (cost=60.39..62.32 rows=770 width=72) (actual time=0.034..0.034 rows=0 > loops=1) > Sort Key: (lower((name)::text) || lower((value)::text)), created > -> Seq Scan on test (cost=0.00..23.47 rows=770 width=72) (actual > time=0.004..0.004 rows=0 loops=1) > Total runtime: 0.123 ms > (4 rows) > In my application I often have a need to sort by more than 3 columns, so I'm > really wondering if there is a way to make sorting of multiple columsn (each > which may have different sort-order) use an index? Preferrably without having > to create 2^N indexes. first of all - you can try with separate indexes on lower()||lower(), and created. then - you can use a trick. create a function that will reverse order of your date (using a simple "-" operator) and then index your lower() and output of this function. you will need to modify the query, but it's perfectly doable. for example: create function test_ts(timestamp) returns interval as $BODY$ begin return '2000-01-01 00:00:00'::timestamp-$1; end; $BODY$ language plpgsql immutable; of course this particular date is irrelevant, we just have to substract from something. then: create index test_lowernamevaluecreated_idx2 ON test ((lower(name) || lower(value)), test_ts(created)); and change your query to: select * from test order by lower(name) || lower(value) ASC, test_ts(created); it would show you what you need. depesz -- quicksil1er: "postgres is excellent, but like any DB it requires a highly paid DBA. here's my CV!" :) http://www.depesz.com/ - blog dla ciebie (i moje CV)
Re: Index usage in order by with multiple columns in order-by-clause
From
Andreas Joseph Krogh
Date:
On Saturday 11 August 2007 21:05:22 hubert depesz lubaczewski wrote: > On Fri, Aug 10, 2007 at 04:53:12PM +0200, Andreas Joseph Krogh wrote: > > I have the following test-case: > > > > CREATE TABLE test( > > name varchar PRIMARY KEY, > > value varchar NOT NULL, > > created timestamp not null > > ); > > > > create index test_lowernamevalue_idx ON test ((lower(name) || > > lower(value))); create index test_lowernamevaluecreated_idx ON test > > ((lower(name) || lower(value)), created); > > andreak=# EXPLAIN ANALYZE select * from test order by lower(name) || > > lower(value) ASC, created DESC; > > QUERY PLAN > > ------------------------------------------------------------------------- > >------------------------------- Sort (cost=60.39..62.32 rows=770 > > width=72) (actual time=0.034..0.034 rows=0 loops=1) > > Sort Key: (lower((name)::text) || lower((value)::text)), created > > -> Seq Scan on test (cost=0.00..23.47 rows=770 width=72) (actual > > time=0.004..0.004 rows=0 loops=1) > > Total runtime: 0.123 ms > > (4 rows) > > In my application I often have a need to sort by more than 3 columns, so > > I'm really wondering if there is a way to make sorting of multiple > > columsn (each which may have different sort-order) use an index? > > Preferrably without having to create 2^N indexes. > > first of all - you can try with separate indexes on lower()||lower(), > and created. > > then - you can use a trick. > create a function that will reverse order of your date (using a simple > "-" operator) > and then index your lower() and output of this function. > > you will need to modify the query, but it's perfectly doable. > > for example: > create function test_ts(timestamp) returns interval as $BODY$ > begin > return '2000-01-01 00:00:00'::timestamp-$1; > end; > $BODY$ language plpgsql immutable; > > of course this particular date is irrelevant, we just have to substract > from something. > > then: > create index test_lowernamevaluecreated_idx2 ON test ((lower(name) || > lower(value)), test_ts(created)); > > and change your query to: > select * from test order by lower(name) || lower(value) ASC, > test_ts(created); it would show you what you need. > > depesz Thanks. I actaully do have an index on lower(a) || lower(b). Then, as Tom Lane explained, I need to have lots of indexes if I want to sort with different ordering (ASC|DESC) on each column. -- Andreas Joseph Krogh <andreak@officenet.no> Senior Software Developer / Manager ------------------------+---------------------------------------------+ OfficeNet AS | The most difficult thing in the world is to | Karenslyst Allé 11 | know how to do a thing and to watch | PO. Box 529 Skøyen | somebody else doing it wrong, without | 0214 Oslo | comment. | NORWAY | | Tlf: +47 24 15 38 90 | | Fax: +47 24 15 38 91 | | Mobile: +47 909 56 963 | | ------------------------+---------------------------------------------+
Andreas Joseph Krogh <andreak@officenet.no> writes: > On Friday 10 August 2007 23:30:14 Tom Lane wrote: >> Reverse-sorted index columns are possible but not well supported in >> existing PG releases (you need a custom operator class, and the planner >> is not all that bright about using them). 8.3 will have full support. > How exactly do I build an index in which one of the columns is "reverse > sorted" in 8.2 (and 8.3)? Here's a minimal example (tested in 8.2). The pain-in-the-neck part is creating a btree comparison function that reverses the normal one's comparisons. For the example I just did it in plpgsql, but if you were to do this sort of thing on large tables you'd probably find you needed a function written in C for speed: regression=# create function btrevfloat8cmp(float8,float8) returns int as regression-# $$begin return btfloat8cmp($2, $1); end$$ regression-# language plpgsql strict immutable; CREATE FUNCTION You then make the opclass using the regular comparison operators listed in backwards order, plus the reverse comparison function: regression=# create operator class rev_float8_ops for type float8 using btree regression-# as regression-# operator 1 > , regression-# operator 2 >= , regression-# operator 3 = , regression-# operator 4 <= , regression-# operator 5 < , regression-# function 1 btrevfloat8cmp(float8,float8) ; CREATE OPERATOR CLASS And you're off: regression=# create table myt (f1 float8, f2 float8); CREATE TABLE regression=# create index myi on myt using btree (f1, f2 rev_float8_ops); CREATE INDEX regression=# insert into myt values(1,1),(1,2),(1,3),(2,1),(2,2),(2,3); INSERT 0 6 regression=# explain select * from myt order by f1 asc, f2 desc; QUERY PLAN --------------------------------------------------------------------Index Scan using myi on myt (cost=0.00..72.70 rows=1630width=16) (1 row) regression=# select * from myt order by f1 asc, f2 desc;f1 | f2 ----+---- 1 | 3 1 | 2 1 | 1 2 | 3 2 | 2 2 | 1 (6 rows) regression=# explain select * from myt order by f1 desc, f2 asc; QUERY PLAN -----------------------------------------------------------------------------Index Scan Backward using myi on myt (cost=0.00..72.70rows=1630 width=16) (1 row) regression=# select * from myt order by f1 desc, f2 asc;f1 | f2 ----+---- 2 | 1 2 | 2 2 | 3 1 | 1 1 | 2 1 | 3 (6 rows) This is only a minimal example because I didn't bother with any cross-type comparisons; you might need those depending on how much use you expect to get out of the index. The main problem with this is that you don't have any control over the NULLS FIRST/LAST behavior. Pre-8.3, btree indexes will always put nulls at the end; the opclass has no control over that. So the effective sort order here is like ORDER BY f1 ASC NULLS LAST, f2 DESC NULLS LAST (or NULLS FIRST for the backward scan), which might not be what you'd want. I'm also pretty sure that the pre-8.3 planner will not figure out how to use such an index for mergejoins (and it might not work if it did figure it out, because of the nulls-ordering issue). You might be able to finesse all that if you can choose to put the reverse-sort opclass on a NOT NULL column. regards, tom lane
Re: Index usage in order by with multiple columns in order-by-clause
From
Andreas Joseph Krogh
Date:
On Saturday 11 August 2007 21:55:49 Tom Lane wrote: > Andreas Joseph Krogh <andreak@officenet.no> writes: > > On Friday 10 August 2007 23:30:14 Tom Lane wrote: > >> Reverse-sorted index columns are possible but not well supported in > >> existing PG releases (you need a custom operator class, and the planner > >> is not all that bright about using them). 8.3 will have full support. > > > > How exactly do I build an index in which one of the columns is "reverse > > sorted" in 8.2 (and 8.3)? > > Here's a minimal example (tested in 8.2). The pain-in-the-neck part > is creating a btree comparison function that reverses the normal one's > comparisons. For the example I just did it in plpgsql, but if you were > to do this sort of thing on large tables you'd probably find you needed > a function written in C for speed: > > regression=# create function btrevfloat8cmp(float8,float8) returns int as > regression-# $$begin return btfloat8cmp($2, $1); end$$ > regression-# language plpgsql strict immutable; > CREATE FUNCTION > > You then make the opclass using the regular comparison operators listed > in backwards order, plus the reverse comparison function: > > regression=# create operator class rev_float8_ops for type float8 using > btree regression-# as > regression-# operator 1 > , > regression-# operator 2 >= , > regression-# operator 3 = , > regression-# operator 4 <= , > regression-# operator 5 < , > regression-# function 1 btrevfloat8cmp(float8,float8) ; > CREATE OPERATOR CLASS > > And you're off: > > regression=# create table myt (f1 float8, f2 float8); > CREATE TABLE > regression=# create index myi on myt using btree (f1, f2 rev_float8_ops); > CREATE INDEX > regression=# insert into myt values(1,1),(1,2),(1,3),(2,1),(2,2),(2,3); > INSERT 0 6 > regression=# explain select * from myt order by f1 asc, f2 desc; > QUERY PLAN > -------------------------------------------------------------------- > Index Scan using myi on myt (cost=0.00..72.70 rows=1630 width=16) > (1 row) > > regression=# select * from myt order by f1 asc, f2 desc; > f1 | f2 > ----+---- > 1 | 3 > 1 | 2 > 1 | 1 > 2 | 3 > 2 | 2 > 2 | 1 > (6 rows) > > regression=# explain select * from myt order by f1 desc, f2 asc; > QUERY PLAN > --------------------------------------------------------------------------- >-- Index Scan Backward using myi on myt (cost=0.00..72.70 rows=1630 > width=16) (1 row) > > regression=# select * from myt order by f1 desc, f2 asc; > f1 | f2 > ----+---- > 2 | 1 > 2 | 2 > 2 | 3 > 1 | 1 > 1 | 2 > 1 | 3 > (6 rows) > > This is only a minimal example because I didn't bother with any > cross-type comparisons; you might need those depending on how much > use you expect to get out of the index. > > The main problem with this is that you don't have any control over the > NULLS FIRST/LAST behavior. Pre-8.3, btree indexes will always put nulls > at the end; the opclass has no control over that. So the effective sort > order here is like ORDER BY f1 ASC NULLS LAST, f2 DESC NULLS LAST (or > NULLS FIRST for the backward scan), which might not be what you'd want. > I'm also pretty sure that the pre-8.3 planner will not figure out how to > use such an index for mergejoins (and it might not work if it did figure > it out, because of the nulls-ordering issue). You might be able to > finesse all that if you can choose to put the reverse-sort opclass on a > NOT NULL column. Thank you, really neat stuff. -- Andreas Joseph Krogh <andreak@officenet.no> Senior Software Developer / Manager ------------------------+---------------------------------------------+ OfficeNet AS | The most difficult thing in the world is to | Karenslyst Allé 11 | know how to do a thing and to watch | PO. Box 529 Skøyen | somebody else doing it wrong, without | 0214 Oslo | comment. | NORWAY | | Tlf: +47 24 15 38 90 | | Fax: +47 24 15 38 91 | | Mobile: +47 909 56 963 | | ------------------------+---------------------------------------------+