Thread: order by and limit with multi-column index, bug?

order by and limit with multi-column index, bug?

From
Harald Krake
Date:
as a workaround for min()/max() on indexed columns forcing an index-scan
I tried "order by" with "limit 1". Works fine for the first record
but fails for the last. I don't know why.

Here's the setup:

A table "journal" containing several million records with an index

"CREATE INDEX journal_kblattsoll ON journal(sollkontoid,waehrungid,periode,belegdatum,journalnr)"

with periode being an INT, belegdatum DATE, the rest is INT8.

As a replacement for

"select min(periode) from journal 
where sollkontoid=266122::int8 and waehrungid=17::int8"

which for some reason in the design of postgres scans the whole index, 
I tried:

"select   sollkontoid,waehrungid,periode,belegdatum,journalnr from journal where sollkontoid=266122::int8 and
waehrungid=17::int8order by sollkontoid,waehrungid,periode,belegdatum,journalnr asc limit 1;"
 

this yields:sollkontoid | waehrungid | periode | belegdatum | journalnr 
-------------+------------+---------+------------+-----------     266122 |         17 |       0 | 2002-01-01 |
411

which is correct and works in a fraction of a second as expected.

now, doing the same with "desc" instead of "asc" should return 
"periode = 12" (see below) for the last record, but it doesn't!

After a fairly long time I get:
sollkontoid | waehrungid | periode | belegdatum | journalnr 
-------------+------------+---------+------------+-----------     266122 |         17 |       0 | 2002-01-01 |
2783

ooops???! periode = 0???

Query plan:Limit  (cost=491999.72..491999.73 rows=1 width=32)  ->  Sort  (cost=491999.72..492309.30 rows=123828
width=32)       Sort Key: sollkontoid, waehrungid, periode, belegdatum, journalnr        ->  Index Scan using
journal_kblattsollon journal  
 
(cost=0.00..481525.10 rows=123828 width=32)              Index Cond: ((sollkontoid = 266122::bigint) AND (waehrungid =

17::bigint))


Surprisingly enough, reducing the fields in the order-by clause
returns the correct value for "periode":

select   sollkontoid,waehrungid,periode,belegdatum,journalnr from journal
where sollkontoid=266122::int8 and waehrungid=17::int8 
order by sollkontoid,waehrungid,periode desc limit 1;
sollkontoid | waehrungid | periode | belegdatum | journalnr 
-------------+------------+---------+------------+-----------     266122 |         17 |      12 | 2002-12-09 |
303609



min/max-checks:

select max(periode)from journal where sollkontoid=266122::int8 andwaehrungid=17::int8;

returns 12.
select max(belegdatum) from journal where sollkontoid=266122::int8 andwaehrungid=17::int8 and periode=12;

returns "2002-12-10"
select max(journalnr) from journal where sollkontoid=266122::int8 andwaehrungid=17::int8 and periode=12 and
belegdatum='2002-12-10';

returns 305098.

Consequently, the last record according to the order by clause should be:
sollkontoid | waehrungid | periode | belegdatum | journalnr 
-------------+------------+---------+------------+-----------     266122 |         17 |      12 | 2002-12-10 |
305098




questions:

- what's this???
- why does it take that long?
- is "ORDER BY ... DESC" broken? (even after dropping the index I get the same results)
- am I missing something (friday, 13th, ...) ?

thanx for any answer,
harald.

(postgres 7.3 on redhat 8.0)



Re: order by and limit with multi-column index, bug?

From
Jakub Ouhrabka
Date:
hi,

could it be that "order by col1, col2 desc" is different from "order by
col1 desc, col2 desc" ? these are different and it's correct.

i'm not sure if this is your problem since i haven't digged into it...
sorry... but check it to be sure...

hth,

kuba



On Fri, 13 Dec 2002, Harald Krake wrote:

> as a workaround for min()/max() on indexed columns forcing an index-scan
> I tried "order by" with "limit 1". Works fine for the first record
> but fails for the last. I don't know why.
>
> Here's the setup:
>
> A table "journal" containing several million records with an index
>
> "CREATE INDEX journal_kblattsoll ON journal
>  (sollkontoid,waehrungid,periode,belegdatum,journalnr)"
>
> with periode being an INT, belegdatum DATE, the rest is INT8.
>
> As a replacement for
>
> "select min(periode) from journal
> where sollkontoid=266122::int8 and waehrungid=17::int8"
>
> which for some reason in the design of postgres scans the whole index,
> I tried:
>
> "select   sollkontoid,waehrungid,periode,belegdatum,journalnr from journal
>  where sollkontoid=266122::int8 and waehrungid=17::int8
>  order by sollkontoid,waehrungid,periode,belegdatum,journalnr asc limit 1;"
>
> this yields:
>  sollkontoid | waehrungid | periode | belegdatum | journalnr
> -------------+------------+---------+------------+-----------
>       266122 |         17 |       0 | 2002-01-01 |       411
>
> which is correct and works in a fraction of a second as expected.
>
> now, doing the same with "desc" instead of "asc" should return
> "periode = 12" (see below) for the last record, but it doesn't!
>
> After a fairly long time I get:
>
>  sollkontoid | waehrungid | periode | belegdatum | journalnr
> -------------+------------+---------+------------+-----------
>       266122 |         17 |       0 | 2002-01-01 |      2783
>
> ooops???! periode = 0???
>
> Query plan:
>  Limit  (cost=491999.72..491999.73 rows=1 width=32)
>    ->  Sort  (cost=491999.72..492309.30 rows=123828 width=32)
>          Sort Key: sollkontoid, waehrungid, periode, belegdatum, journalnr
>          ->  Index Scan using journal_kblattsoll on journal
> (cost=0.00..481525.10 rows=123828 width=32)
>                Index Cond: ((sollkontoid = 266122::bigint) AND (waehrungid =
> 17::bigint))
>
>
> Surprisingly enough, reducing the fields in the order-by clause
> returns the correct value for "periode":
>
> select   sollkontoid,waehrungid,periode,belegdatum,journalnr from journal
> where sollkontoid=266122::int8 and waehrungid=17::int8
> order by sollkontoid,waehrungid,periode desc limit 1;
>
>  sollkontoid | waehrungid | periode | belegdatum | journalnr
> -------------+------------+---------+------------+-----------
>       266122 |         17 |      12 | 2002-12-09 |    303609
>
>
>
> min/max-checks:
>
> select max(periode)from journal where sollkontoid=266122::int8 and
>  waehrungid=17::int8;
>
> returns 12.
>
> select max(belegdatum) from journal where sollkontoid=266122::int8 and
>  waehrungid=17::int8 and periode=12;
>
> returns "2002-12-10"
>
>  select max(journalnr) from journal where sollkontoid=266122::int8 and
>  waehrungid=17::int8 and periode=12 and belegdatum='2002-12-10';
>
> returns 305098.
>
> Consequently, the last record according to the order by clause should be:
>
>  sollkontoid | waehrungid | periode | belegdatum | journalnr
> -------------+------------+---------+------------+-----------
>       266122 |         17 |      12 | 2002-12-10 |    305098
>
>
>
>
> questions:
>
> - what's this???
> - why does it take that long?
> - is "ORDER BY ... DESC" broken? (even after dropping the index I get
>   the same results)
> - am I missing something (friday, 13th, ...) ?
>
> thanx for any answer,
> harald.
>
> (postgres 7.3 on redhat 8.0)
>
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
>



Re: order by and limit with multi-column index, bug?

From
Harald Krake
Date:
On Friday 13 December 2002 03:47 pm, Jakub Ouhrabka wrote:
> hi,
>
> could it be that "order by col1, col2 desc" is different from "order by
> col1 desc, col2 desc" ? these are different and it's correct.
>
> i'm not sure if this is your problem since i haven't digged into it...
> sorry... but check it to be sure...
>
> hth,
>
> kuba

bingo!
good morning.... 

thanx a lot!