Thread: BUG #4224: issue with LIMIT and ORDER BY

BUG #4224: issue with LIMIT and ORDER BY

From
"Lawrence Cohan"
Date:
The following bug has been logged online:

Bug reference:      4224
Logged by:          Lawrence Cohan
Email address:      lawrencec@1shoppingcart.com
PostgreSQL version: 8.2.5
Operating system:   red hat 4.1.1-52
Description:        issue with LIMIT and ORDER BY
Details:

The following queries run FOREVER in PG if an index exists on the "id" column,
which is an integer (serial) and the primary key on the table.

SELECT id FROM orders WHERE merchant_id = xxxxxx ORDER BY id DESC LIMIT 31
-- or 30, 29, 28, 27, 26, 25

or

SELECT id FROM clients WHERE merchant_id = XXXXXX ORDER BY id LIMIT 3 -- or
1, 2.

With different limits we get different results, but the queries are running
forever with DESC as well.
This is a serious issue, as the PG documentation says:

"When using LIMIT, it is important to use an ORDER BY clause that constrains
the result rows into a unique order. Otherwise you will get an unpredictable
subset of the query's rows. You might be asking for the tenth through
twentieth rows, but tenth through twentieth in what ordering?"
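For anyone hitting the same symptom, the first diagnostic step is to look at the query plan. The sketch below assumes the table and column names from the report above; the merchant_id value is a placeholder:

```sql
-- Show the actual plan and timing for the slow query. In cases like this
-- one, the plan typically reveals a backward scan of the primary-key
-- index on id, filtering each row on merchant_id until LIMIT rows match.
EXPLAIN ANALYZE
SELECT id
FROM orders
WHERE merchant_id = 12345  -- placeholder, as in the report
ORDER BY id DESC
LIMIT 31;
```

If very few rows match the merchant_id filter, such a scan can read most of the index before finding enough matches, which is why the query appears to run forever.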

Re: BUG #4224: issue with LIMIT and ORDER BY

From
hubert depesz lubaczewski
Date:
On Thu, Jun 05, 2008 at 06:15:29PM +0000, Lawrence Cohan wrote:
> The following queries run FOREVER in PG if an index exists on the "id" column,
> which is an integer (serial) and the primary key on the table.
> SELECT id FROM orders WHERE merchant_id = xxxxxx ORDER BY id DESC LIMIT 31
> -- or 30, 29, 28, 27, 26, 25
> or
> SELECT id FROM clients WHERE merchant_id = XXXXXX ORDER BY id LIMIT 3 -- or
> 1, 2.
> With different limits we get different results, but the queries are running
> forever with DESC as well.

my guess is that you:
1. don't have an index on merchant_id
2. have a lot of rows in this table
3. very few rows have the given merchant_id

you can easily fix the situation with:
create index q on clients (merchant_id, id);

depesz
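The suggested fix works because a composite index on (merchant_id, id) lets the planner satisfy both the WHERE clause and the ORDER BY ... LIMIT from a single index scan: it can jump straight to the entries for one merchant_id, read them already sorted by id, and stop after LIMIT rows, with no sort step. A sketch using the table names from the thread (the index names are arbitrary, like the "q" in the reply above):

```sql
-- Composite index: the equality-filtered column first,
-- then the ORDER BY column.
CREATE INDEX orders_merchant_id_id  ON orders  (merchant_id, id);
CREATE INDEX clients_merchant_id_id ON clients (merchant_id, id);

-- Both problem queries can now be answered by walking the matching
-- index entries in id order (forward or backward) and stopping early:
--   SELECT id FROM orders  WHERE merchant_id = ... ORDER BY id DESC LIMIT 31;
--   SELECT id FROM clients WHERE merchant_id = ... ORDER BY id LIMIT 3;
```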

Re: BUG #4224: issue with LIMIT and ORDER BY

From
"Lawrence Cohan"
Date:
Many thanks for the quick reply and suggestion! Indeed we do have many
records in these tables - 20 to 50 million rows - and we already have an
index on merchant_id, which is a NOT NULL column as well. In my opinion
the duplicate index we have on the "id" column, which is also the
(nonclustered) primary key, is confusing the optimizer, because if we
drop it all goes well. The problem is that without it a few other
queries we run for reporting run forever, which is really hard to
understand, because the primary key implies that an index is created by
default. I tried analyze, then vacuum/analyze/reindex/analyze, and even
after that the results were the same. I just tried your suggestion and
IT WORKED!
Thanks a lot again,

Lawrence Cohan.
