Thread: Query is slow when order by and limit clause are used in the query

Query is slow when order by and limit clause are used in the query

From
sreekanth vajrapu
Date:

Hi Team,

We are having a slow query issue in one of our applications. The query is slow (5 seconds) when we use both ORDER BY and LIMIT 30, whereas the same query performs very well (200 ms) when using only ORDER BY. Also note that the query performed very fast (200 ms) when we increased the LIMIT to 100 along with ORDER BY.


Can you please let us know if there are any known bugs related to this? Or can someone kindly suggest a solution to this issue?



--
Thanks & Regards,
Sreekanth

Re: Query is slow when order by and limit clause are used in the query

From
Bharath Rupireddy
Date:
On Mon, May 24, 2021 at 4:19 PM sreekanth vajrapu
<sreekanthvajrapu@gmail.com> wrote:
>
>
> Hi Team,
>
> We are having a slow query issue in one of our applications. The query is slow (5 seconds) when we use both
> ORDER BY and LIMIT 30, whereas the same query performs very well (200 ms) when using only ORDER BY. Also note
> that the query performed very fast (200 ms) when we increased the LIMIT to 100 along with ORDER BY.
>
>
> Can you please let us know if there are any known bugs related to this? Or can someone kindly suggest a solution
> to this issue?

I think it's good to provide the explain (analyze, buffers, costs,
verbose, summary) output of the query and a reproducible test case (if
possible). That will make it easier for those who investigate the issue.
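
For example, with placeholder table and column names (note that the
summary option needs PostgreSQL 10 or later; drop it on older versions):

explain (analyze, buffers, costs, verbose, summary)
select * from your_table order by your_column limit 30;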

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com



Re: Query is slow when order by and limit clause are used in the query

From
sreekanth vajrapu
Date:
Hi Bharath,

Thanks for the quick reply. I have attached the execution plans for the three scenarios below. Our application uses the 1st scenario.

1) WITH ORDER BY AND LIMIT 30 -- Very slow (3 to 5 seconds)
2) WITH ORDER BY WITHOUT LIMIT -- Very fast (160 ms)
3) WITH ORDER BY AND LIMIT 100 -- Very fast (160 ms)

Kindly let me know if you need any more details on this.

Thanks,
Sreekanth.



On Mon, May 24, 2021 at 4:33 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote:
On Mon, May 24, 2021 at 4:19 PM sreekanth vajrapu
<sreekanthvajrapu@gmail.com> wrote:
>
>
> Hi Team,
>
> We are having a slow query issue in one of our applications. The query is slow (5 seconds) when we use both ORDER BY and LIMIT 30, whereas the same query performs very well (200 ms) when using only ORDER BY. Also note that the query performed very fast (200 ms) when we increased the LIMIT to 100 along with ORDER BY.
>
>
> Can you please let us know if there are any known bugs related to this? Or can someone kindly suggest a solution to this issue?

I think it's good to provide the explain (analyze, buffers, costs,
verbose, summary) output of the query and a reproducible test case (if
possible). That will make it easier for those who investigate the issue.

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com


--
Thanks & Regards,
Sreekanth
Attachment

Re: Query is slow when order by and limit clause are used in the query

From
Bharath Rupireddy
Date:
On Mon, May 24, 2021 at 5:01 PM sreekanth vajrapu
<sreekanthvajrapu@gmail.com> wrote:
>
> Hi Bharath,
>
> Thanks for the quick reply. I have attached the execution plans for the three scenarios below. Our application uses
> the 1st scenario.
>
> 1) WITH ORDER BY AND LIMIT 30 -- Very slow (3 to 5 seconds)
> 2) WITH ORDER BY WITHOUT LIMIT -- Very fast (160 ms)
> 3) WITH ORDER BY AND LIMIT 100 -- Very fast (160 ms)
>
> Kindly let me know if you need any more details on this.

I see a huge number of heap fetches (Heap Fetches: 599354) with the
LIMIT 30 clause vs. Heap Fetches: 11897 without the LIMIT clause; maybe
that is the reason for the slowness. I'm not sure why this happens only
with the LIMIT 30 clause. Does the issue happen every time? Say, if you
run with LIMIT 30 and the query finishes in 3-5 sec, does running
without a LIMIT clause immediately afterwards complete in 160 ms? Is
vacuum running successfully on the tables and indexes that show the
huge number of heap fetches?
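
If in doubt, you can also vacuum manually and check the table stats,
for example (your_table is a placeholder):

vacuum (analyze, verbose) your_table;

select relname, n_dead_tup, last_vacuum, last_autovacuum
from pg_stat_user_tables where relname = 'your_table';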

I have no further thoughts on this; other hackers may have better
suggestions, though.

BTW, which version of PostgreSQL are you using?

With Regards,
Bharath Rupireddy.
EnterpriseDB: http://www.enterprisedb.com



Re: Query is slow when order by and limit clause are used in the query

From
Tomas Vondra
Date:

On 5/24/21 2:28 PM, Bharath Rupireddy wrote:
> On Mon, May 24, 2021 at 5:01 PM sreekanth vajrapu
> <sreekanthvajrapu@gmail.com> wrote:
>>
>> Hi Bharath,
>>
>> Thanks for the quick reply. I have attached the execution plans for the three scenarios below. Our application uses
>> the 1st scenario.
>>
>> 1) WITH ORDER BY AND LIMIT 30 -- Very slow (3 to 5 seconds)
>> 2) WITH ORDER BY WITHOUT LIMIT -- Very fast (160 ms)
>> 3) WITH ORDER BY AND LIMIT 100 -- Very fast (160 ms)
>>
>> Kindly let me know if you need any more details on this.
> 
> I see a huge number of heap fetches (Heap Fetches: 599354) with the
> LIMIT 30 clause vs. Heap Fetches: 11897 without the LIMIT clause; maybe
> that is the reason for the slowness. I'm not sure why this happens only
> with the LIMIT 30 clause. Does the issue happen every time? Say, if you
> run with LIMIT 30 and the query finishes in 3-5 sec, does running
> without a LIMIT clause immediately afterwards complete in 160 ms? Is
> vacuum running successfully on the tables and indexes that show the
> huge number of heap fetches?
> 

But the heap fetches are in the index-only part of the plan, and they
match the number of loops for that node. There are 521996 loops and
599354 heap fetches, so roughly 1:1, i.e. roughly the same ratio as in
the faster plans.

It's hard to give advice when we only have the plans, not the original
queries and index definitions, and when the plans are "anonymized" in a
rather inconsistent way (hard to say how the tables/indexes match
between the queries).

If I had to guess, I'd say this is a case of the usual LIMIT problem,
where the optimizer assumes the matching rows are uniformly distributed
in the input relation, when in reality it's "concentrated" at the end.

For example, imagine you have a table with two random columns:

    CREATE TABLE t (a int, b int);

    INSERT INTO t SELECT 100 * random(), 100 * random()
      FROM generate_series(1,1000000) g;

    CREATE INDEX ON t (a);

Now imagine you do this:

    SELECT * FROM t WHERE a = 10 AND b = 20 LIMIT 1;

In this case there are ~10000 rows with a=10, and we can fetch them from
the index. And we know there are ~100 groups with different b values,
distributed uniformly across those 10k rows. So if we start scanning the
index, after ~100 rows we should hit a row with b=20, at which point the
LIMIT 1 is satisfied.

Now imagine the values are not distributed uniformly like this, but
instead we have this:

    INSERT INTO t SELECT i/10000, i/10000
      FROM generate_series(1,1000000) s(i);

In this case the DB will still believe it'll only scan ~100 rows, but
there are no rows with a=10 and b=20, so it'll end up scanning all 10k
rows with a=10. Or maybe all the "b=20" rows are at the very end of the
index, or something like that. Perhaps the "deleted" column makes it
behave like this.

This is only an issue for queries with LIMIT, because that pushes the
planner to pick a plan with a very low startup cost. But those plans
often have a very high total cost, and degrade badly when the
uniformity assumption is incorrect.
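
As a rough illustration of the costing: a Limit node is charged about
startup_cost + (k/N) * total_cost of its input, where k is the limit
and N is the estimated row count. In the LIMIT 31 plan posted later in
this thread, 339827.40 * (31/10192) is ~1034, which matches the Limit
node's cost=1.14..1034.75. The planner is betting that the 31 matching
rows will be found early in the scan.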

Hard to say, though, confirming it would require looking at the data
more closely. The one thing I'd suggest is changing the xxxx_index to
also include the "deleted" column, but it's a stab in the dark.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: Query is slow when order by and limit clause are used in the query

From
David Rowley
Date:
On Tue, 25 May 2021 at 02:19, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
> If I had to guess, I'd say this is a case of the usual LIMIT problem,
> where the optimizer assumes the matching rows are uniformly distributed
> in the input relation, when in reality it's "concentrated" at the end.

I'd guess that too. But it's hard to say due to the inconsistent
anonymisation of the plan.

> Hard to say, though, confirming it would require looking at the data
> more closely. The one thing I'd suggest is changing the xxxx_index to
> also include the "deleted" column, but it's a stab in the dark.

I'd say, provided xxxx_item and xxxxx_item are actually the same
table and just anonymised poorly, then an index such as:

create index on xxxx_item (COALESCE(deleted,false), name);

should help. Then change the query so that instead of doing WHERE NOT
deleted OR deleted IS NULL, it does WHERE NOT COALESCE(deleted,false).

Without the query change, there's no hope of that index being used.

I think this would improve the situation, as the LIMIT 30 plan is using
xxxxx_index to provide presorted results for the ORDER BY but can only
do index filtering on: (((NOT deleted) OR (deleted IS NULL)) AND
(SubPlan 6)).  So provided not too many rows are filtered out by
SubPlan 6, that should reduce the Rows Removed by Filter. However, if
the majority of those rows are filtered out by SubPlan 6, then the
index won't help much.

It would be nice if the schema were designed so that the deleted column
could only be true or false, though.
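
A sketch of how to get there, assuming the NULLs really mean "not
deleted" (backfill first; note that SET NOT NULL has to scan the table
to validate):

UPDATE xxxx_item SET deleted = false WHERE deleted IS NULL;

ALTER TABLE xxxx_item
  ALTER COLUMN deleted SET DEFAULT false,
  ALTER COLUMN deleted SET NOT NULL;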

sreekanth, for the future, you can use https://explain.depesz.com/ to
anonymise your plans. It changes the names of things in a consistent
way that people can still follow.

David



Re: Query is slow when order by and limit clause are used in the query

From
sreekanth vajrapu
Date:
Hi Team,

Thank you so much for the quick responses on my issue.

I minimized the query to the problematic part. I also tried all the
options you suggested, but I am still seeing the slowness when I use
LIMIT 30.

Answers to your questions.

1) Statistics are up to date.
2) The Postgres version is "PostgreSQL 9.5.21".

Below are the execution plans along with the query and index definitions. Kindly help me to resolve this issue.


SCENARIO 1: QUERY PLAN WITH ORDER BY ASC AND WITH LIMIT 31 (this is the actual one used by the application)
explain analyze
SELECT item.*
FROM (
    SELECT t1.name,
           t1.deleted
      FROM t1
      JOIN t2 USING (id)
      JOIN t3 USING (id)
) item
WHERE (deleted = FALSE OR deleted IS NULL)
ORDER BY item.name
LIMIT 31 OFFSET 0;

QUERY PLAN WITH ORDER BY ASC AND WITH LIMIT 31
--------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1.14..1034.75 rows=31 width=26) (actual time=1206.826..1207.108 rows=31 loops=1)
   ->  Nested Loop  (cost=1.14..339827.40 rows=10192 width=26) (actual time=1206.824..1207.102 rows=31 loops=1)
         ->  Nested Loop  (cost=0.71..334957.35 rows=10415 width=58) (actual time=1206.811..1206.951 rows=31 loops=1)
               ->  Index Scan using t1_name_index on t1  (cost=0.42..165174.97 rows=542766 width=42) (actual time=0.012..636.150 rows=521998 loops=1)
                     Filter: ((NOT deleted) OR (deleted IS NULL))
               ->  Index Only Scan using t2_pkey on t2  (cost=0.29..0.30 rows=1 width=16) (actual time=0.001..0.001 rows=0 loops=521998)
                     Index Cond: (id = t1.id)
                     Heap Fetches: 0
         ->  Index Only Scan using t3_pkey on t3  (cost=0.42..0.46 rows=1 width=16) (actual time=0.004..0.004 rows=1 loops=31)
               Index Cond: (id = t1.id)
               Heap Fetches: 0
 Planning time: 0.624 ms
 Execution time: 1207.162 ms
(13 rows)

SCENARIO 2: QUERY PLAN WITH ORDER BY ASC AND WITHOUT LIMIT
explain analyze
SELECT item.*
FROM (
    SELECT t1.name,
           t1.deleted
      FROM t1
      JOIN t2 USING (id)
      JOIN t3 USING (id)
) item
WHERE (deleted = FALSE OR deleted IS NULL)
ORDER BY item.name;

QUERY PLAN WITH ORDER BY ASC AND WITHOUT LIMIT
-----------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=18989.88..19015.36 rows=10192 width=26) (actual time=166.891..167.575 rows=10415 loops=1)
   Sort Key: t1.name
   Sort Method: quicksort  Memory: 1198kB
   ->  Nested Loop  (cost=291.76..18311.34 rows=10192 width=26) (actual time=1.912..139.370 rows=10415 loops=1)
         Join Filter: (t2.id = t1.id)
         ->  Hash Join  (cost=291.34..10571.03 rows=10415 width=32) (actual time=1.897..93.228 rows=10415 loops=1)
               Hash Cond: (t3.id = t2.id)
               ->  Seq Scan on t3  (cost=0.00..8183.67 rows=531167 width=16) (actual time=0.005..30.359 rows=531167 loops=1)
               ->  Hash  (cost=161.15..161.15 rows=10415 width=16) (actual time=1.861..1.861 rows=10415 loops=1)
                     Buckets: 16384  Batches: 1  Memory Usage: 617kB
                     ->  Seq Scan on t2  (cost=0.00..161.15 rows=10415 width=16) (actual time=0.002..0.753 rows=10415 loops=1)
         ->  Index Scan using t1_pkey on t1  (cost=0.42..0.73 rows=1 width=42) (actual time=0.004..0.004 rows=1 loops=10415)
               Index Cond: (id = t3.id)
               Filter: ((NOT deleted) OR (deleted IS NULL))
 Planning time: 0.562 ms
 Execution time: 167.973 ms
(16 rows)

SCENARIO 3: QUERY PLAN WITH ORDER BY DESC AND WITH LIMIT 31
explain analyze
SELECT item.*
FROM (
    SELECT t1.name,
           t1.deleted
      FROM t1
      JOIN t2 USING (id)
      JOIN t3 USING (id)
) item
WHERE (deleted = FALSE OR deleted IS NULL)
ORDER BY item.name DESC
LIMIT 31 OFFSET 0;

QUERY PLAN WITH ORDER BY DESC AND WITH LIMIT 31
---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1.14..1034.75 rows=31 width=26) (actual time=0.039..0.431 rows=31 loops=1)
   ->  Nested Loop  (cost=1.14..339827.40 rows=10192 width=26) (actual time=0.038..0.428 rows=31 loops=1)
         ->  Nested Loop  (cost=0.71..334957.35 rows=10415 width=58) (actual time=0.021..0.261 rows=31 loops=1)
               ->  Index Scan Backward using t1_name_index on t1  (cost=0.42..165174.97 rows=542766 width=42) (actual time=0.012..0.091 rows=61 loops=1)
                     Filter: ((NOT deleted) OR (deleted IS NULL))
               ->  Index Only Scan using t2_pkey on t2  (cost=0.29..0.30 rows=1 width=16) (actual time=0.003..0.003 rows=1 loops=61)
                     Index Cond: (id = t1.id)
                     Heap Fetches: 0
         ->  Index Only Scan using t3_pkey on t3  (cost=0.42..0.46 rows=1 width=16) (actual time=0.005..0.005 rows=1 loops=31)
               Index Cond: (id = t1.id)
               Heap Fetches: 0
 Planning time: 0.663 ms
 Execution time: 0.491 ms
(13 rows)


SCENARIO 4: QUERY PLAN WITH ORDER BY DESC AND WITHOUT LIMIT
explain analyze
SELECT item.*
FROM (
    SELECT t1.name,
           t1.deleted
      FROM t1
      JOIN t2 USING (id)
      JOIN t3 USING (id)
) item
WHERE (deleted = FALSE OR deleted IS NULL)
ORDER BY item.name DESC;


QUERY PLAN WITH ORDER BY DESC AND WITHOUT LIMIT
-----------------------------------------------------------------------------------------------------------------------------------------------------------
 Sort  (cost=18989.88..19015.36 rows=10192 width=26) (actual time=177.915..178.513 rows=10415 loops=1)
   Sort Key: t1.name DESC
   Sort Method: quicksort  Memory: 1198kB
   ->  Nested Loop  (cost=291.76..18311.34 rows=10192 width=26) (actual time=2.052..149.175 rows=10415 loops=1)
         Join Filter: (t2.id = t1.id)
         ->  Hash Join  (cost=291.34..10571.03 rows=10415 width=32) (actual time=2.038..96.798 rows=10415 loops=1)
               Hash Cond: (t3.id = t2.id)
               ->  Seq Scan on t3  (cost=0.00..8183.67 rows=531167 width=16) (actual time=0.005..31.440 rows=531167 loops=1)
               ->  Hash  (cost=161.15..161.15 rows=10415 width=16) (actual time=1.993..1.993 rows=10415 loops=1)
                     Buckets: 16384  Batches: 1  Memory Usage: 617kB
                     ->  Seq Scan on t2  (cost=0.00..161.15 rows=10415 width=16) (actual time=0.004..0.822 rows=10415 loops=1)
         ->  Index Scan using t1_pkey on t1  (cost=0.42..0.73 rows=1 width=42) (actual time=0.005..0.005 rows=1 loops=10415)
               Index Cond: (id = t3.id)
               Filter: ((NOT deleted) OR (deleted IS NULL))
 Planning time: 0.585 ms
 Execution time: 178.915 ms
(16 rows)

SCENARIO 5: QUERY PLAN WITH ORDER BY ASC AND LIMIT 31, CHANGING THE QUERY TO (NOT COALESCE(deleted, false))

explain analyze
SELECT item.*
FROM (
    SELECT t1.name,
           t1.deleted
      FROM t1
      JOIN t2 USING (id)
      JOIN t3 USING (id)
) item
WHERE NOT COALESCE(deleted, false)
-- previously: deleted = FALSE OR deleted IS NULL
ORDER BY item.name
LIMIT 31 OFFSET 0;


QUERY PLAN WITH ORDER BY ASC AND LIMIT 31, CHANGING THE QUERY TO (NOT COALESCE(deleted, false))
--------------------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=1.14..1034.75 rows=31 width=26) (actual time=1230.705..1230.983 rows=31 loops=1)
   ->  Nested Loop  (cost=1.14..339827.40 rows=10192 width=26) (actual time=1230.704..1230.980 rows=31 loops=1)
         ->  Nested Loop  (cost=0.71..334957.35 rows=10415 width=58) (actual time=1230.693..1230.835 rows=31 loops=1)
               ->  Index Scan using t1_name_index on t1  (cost=0.42..165174.97 rows=542766 width=42) (actual time=0.017..666.816 rows=521998 loops=1)
                     Filter: (NOT COALESCE(deleted, false))
               ->  Index Only Scan using t2_pkey on t2  (cost=0.29..0.30 rows=1 width=16) (actual time=0.001..0.001 rows=0 loops=521998)
                     Index Cond: (id = t1.id)
                     Heap Fetches: 0
         ->  Index Only Scan using t3_pkey on t3  (cost=0.42..0.46 rows=1 width=16) (actual time=0.004..0.004 rows=1 loops=31)
               Index Cond: (id = t1.id)
               Heap Fetches: 0
 Planning time: 0.848 ms
 Execution time: 1231.039 ms
(13 rows)

Time: 1255.381 ms (00:01.255)

Index Definitions:

ALTER TABLE t1 ADD CONSTRAINT t1_pkey PRIMARY KEY (id);

CREATE INDEX t1_name_index ON t1 USING btree (name COLLATE pg_catalog."default" ASC NULLS LAST);
CREATE INDEX t1_name_deleted_peteam1 ON t1 USING btree ((COALESCE(deleted, false)), name);
CREATE INDEX t1_name_deleted_peteam2 ON t1 USING btree (name, (COALESCE(deleted, false)));
CREATE INDEX t1_name_deleted_peteam3 ON t1 USING btree ((COALESCE(deleted, false)), name DESC);
CREATE INDEX t1_name_deleted_peteam4 ON t1 USING btree (name DESC, (COALESCE(deleted, false)));
CREATE INDEX t1_name_index_peteam5   ON t1 USING btree (name, deleted);

ALTER TABLE t2 ADD CONSTRAINT t2_pkey PRIMARY KEY (id);

ALTER TABLE t3 ADD CONSTRAINT t3_pkey PRIMARY KEY (id);




On Tue, May 25, 2021 at 5:35 AM David Rowley <dgrowleyml@gmail.com> wrote:
On Tue, 25 May 2021 at 02:19, Tomas Vondra
<tomas.vondra@enterprisedb.com> wrote:
> If I had to guess, I'd say this is a case of the usual LIMIT problem,
> where the optimizer assumes the matching rows are uniformly distributed
> in the input relation, when in reality it's "concentrated" at the end.

I'd guess that too. But it's hard to say due to the inconsistent
anonymisation of the plan.

> Hard to say, though, confirming it would require looking at the data
> more closely. The one thing I'd suggest is changing the xxxx_index to
> also include the "deleted" column, but it's a stab in the dark.

I'd say, provided xxxx_item and xxxxx_item are actually the same
table and just anonymised poorly, then an index such as:

create index on xxxx_item (COALESCE(deleted,false), name);

should help. Then change the query so that instead of doing WHERE NOT
deleted OR deleted IS NULL, it does WHERE NOT COALESCE(deleted,false).

Without the query change, there's no hope of that index being used.

I think this would improve the situation, as the LIMIT 30 plan is using
xxxxx_index to provide presorted results for the ORDER BY but can only
do index filtering on: (((NOT deleted) OR (deleted IS NULL)) AND
(SubPlan 6)).  So provided not too many rows are filtered out by
SubPlan 6, that should reduce the Rows Removed by Filter. However, if
the majority of those rows are filtered out by SubPlan 6, then the
index won't help much.

It would be nice if the schema were designed so that the deleted column
could only be true or false, though.

sreekanth, for the future, you can use https://explain.depesz.com/ to
anonymise your plans. It changes the names of things in a consistent
way that people can still follow.

David


--
Thanks & Regards,
Sreekanth

Re: Query is slow when order by and limit clause are used in the query

From
David Rowley
Date:
On Fri, 28 May 2021 at 21:40, sreekanth vajrapu
<sreekanthvajrapu@gmail.com> wrote:
> 2) The Postgres version is "PostgreSQL 9.5.21".
>
> Below are the execution plans along with the query and index definitions. Kindly help me to resolve this issue.

The following works ok for me all the way back to PostgreSQL 10.

create table t1 (id int primary key, name text not null, deleted bool);
create table t2 (id int primary key);
create table t3 (id int primary key);
create index on t1 (coalesce(deleted,false),name);

explain (costs off)
SELECT item.*
FROM (
    SELECT t1.name,
           t1.deleted
      FROM t1
      JOIN t2 USING (id)
      JOIN t3 USING (id)
) item
WHERE NOT COALESCE(deleted,false)
ORDER BY item.name
LIMIT 31 OFFSET 0;
                             QUERY PLAN
--------------------------------------------------------------------
 Limit
   ->  Nested Loop
         Join Filter: (t1.id = t3.id)
         ->  Nested Loop
               ->  Index Scan using t1_coalesce_name_idx on t1
                     Index Cond: (COALESCE(deleted, false) = false)
               ->  Index Only Scan using t2_pkey on t2
                     Index Cond: (id = t1.id)
         ->  Index Only Scan using t3_pkey on t3
               Index Cond: (id = t2.id)
(10 rows)

However, I see it does not work on 9.5.  Something must have been
changed in 10 to allow the index to be used.  I don't really see any
indication of what that might be from the release notes and I'm too
lazy to git bisect to find out what the change was.

Either way, you should likely upgrade to a supported version of
PostgreSQL. 9.5 went out of support in Feb.  See:
https://www.postgresql.org/support/versioning/

If you need it to work on 9.5, you'll likely get it working if you use
a partial index:

create index on t1 (name) where not coalesce(deleted,false);

David.



Re: Query is slow when order by and limit clause are used in the query

From
sreekanth vajrapu
Date:
Thank you David for the quick response. 

The "create index on t1 (name) where not coalesce(deleted,false)" index worked, and the execution time dropped from 1200 ms to 900 ms.

I have also recommended that the application team upgrade to a newer PostgreSQL version.

Thank you so much for the help and guidance. 

Thanks.
Sreekanth.






On Fri, May 28, 2021 at 6:39 PM David Rowley <dgrowleyml@gmail.com> wrote:
On Fri, 28 May 2021 at 21:40, sreekanth vajrapu
<sreekanthvajrapu@gmail.com> wrote:
> 2) The Postgres version is "PostgreSQL 9.5.21".
>
> Below are the execution plans along with the query and index definitions. Kindly help me to resolve this issue.

The following works ok for me all the way back to PostgreSQL 10.

create table t1 (id int primary key, name text not null, deleted bool);
create table t2 (id int primary key);
create table t3 (id int primary key);
create index on t1 (coalesce(deleted,false),name);

explain (costs off)
SELECT item.*
FROM (
    SELECT t1.name,
           t1.deleted
      FROM t1
      JOIN t2 USING (id)
      JOIN t3 USING (id)
) item
WHERE NOT COALESCE(deleted,false)
ORDER BY item.name
LIMIT 31 OFFSET 0;
                             QUERY PLAN
--------------------------------------------------------------------
 Limit
   ->  Nested Loop
         Join Filter: (t1.id = t3.id)
         ->  Nested Loop
               ->  Index Scan using t1_coalesce_name_idx on t1
                     Index Cond: (COALESCE(deleted, false) = false)
               ->  Index Only Scan using t2_pkey on t2
                     Index Cond: (id = t1.id)
         ->  Index Only Scan using t3_pkey on t3
               Index Cond: (id = t2.id)
(10 rows)

However, I see it does not work on 9.5.  Something must have been
changed in 10 to allow the index to be used.  I don't really see any
indication of what that might be from the release notes and I'm too
lazy to git bisect to find out what the change was.

Either way, you should likely upgrade to a supported version of
PostgreSQL. 9.5 went out of support in Feb.  See:
https://www.postgresql.org/support/versioning/

If you need it to work on 9.5, you'll likely get it working if you use
a partial index:

create index on t1 (name) where not coalesce(deleted,false);

David.


--
Thanks & Regards,
Sreekanth

Re: Query is slow when order by and limit clause are used in the query

From
David Fetter
Date:
On Mon, May 24, 2021 at 04:19:01PM +0530, sreekanth vajrapu wrote:
> Hi Team,
> 
> We are having a slow query issue for one of our applications. We are seeing
> slowness(5 seconds) when we use both ORDER BY and LIMIT 30 clause whereas
> the same query is performing very good(200 MS) when using only ORDER BY
> clause.  Also note that the query performed very fast(200 MS) when we
> increased LIMIT to 100 along with ORDER  BY.
> 
> Can you please help us if there are any bugs related to this?  OR Can
> someone kindly provide some solution to this issue?

This is not, at base, a bug. Instead, it's behavior that compliance
with the SQL standard mandates. You're doing pagination, and
unfortunately you're doing it in a way that, while it appears simple
and intuitive, guarantees poor performance for later pages.

Here's a concise description of the fundamental problem you're
encountering, along with some suggestions on how to do this more
efficiently, i.e. consistently faster:

https://use-the-index-luke.com/no-offset
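
Roughly, keyset ("seek") pagination remembers the last row of the
previous page instead of counting past rows with OFFSET. A sketch
against the simplified t1 table from earlier in this thread
('last_name' and 42 are placeholders for the last row already shown;
id breaks ties since name may not be unique):

-- first page
SELECT name, id
  FROM t1
 WHERE NOT COALESCE(deleted, false)
 ORDER BY name, id
 LIMIT 31;

-- subsequent pages: seek past the last (name, id) already shown
SELECT name, id
  FROM t1
 WHERE NOT COALESCE(deleted, false)
   AND (name, id) > ('last_name', 42)
 ORDER BY name, id
 LIMIT 31;

With a matching partial index, e.g. create index on t1 (name, id)
where not coalesce(deleted,false); each page can start at the right
spot in the index instead of scanning and discarding rows.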

Here are some more references on pagination and how to do it
efficiently:

http://www.depesz.com/2007/08/29/better-results-paging-in-postgresql-82/
https://wiki.postgresql.org/wiki/File:Pagination_Done_the_PostgreSQL_Way.pdf
https://coderwall.com/p/lkcaag
https://www.citusdata.com/blog/2016/03/30/five-ways-to-paginate/
https://ask.use-the-index-luke.com/questions/205/how-to-query-for-previous-page-with-keyset-pagination

Best,
David.
-- 
David Fetter <david(at)fetter(dot)org> http://fetter.org/
Phone: +1 415 235 3778

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate