Thread: BUG #6426: Complex query runs 10 times longer with "LIMIT 20"

BUG #6426: Complex query runs 10 times longer with "LIMIT 20"

From

aschetinin@gmail.com

Date:

01 February 2012, 16:32:53

The following bug has been logged on the website:

Bug reference:      6426
Logged by:          Andrew Schetinin
Email address:      aschetinin@gmail.com
PostgreSQL version: 9.1.2
Operating system:   Debian Linux
Description:=20=20=20=20=20=20=20=20

I have a complex query that, when unlimited, runs in 2.5 seconds and returns
400+ records.

The same query, with added "LIMIT 20 OFFSET 0" at the end, runs 25 seconds!
(and returns correct 20 records)

I think there is a bug in the query optimizer. Other variations of this
query (with slightly different constraints and joins) work fast with and
without LIMIT.=20

The same query works fast in PostgreSQL 8.4 (tested on another machine with
Ubuntu and not exactly the same but very-very similar database).

I can provide the query execution plans of the two cases (as long as they
are not published on the Web); but not the database because there is
customer's data in it.

Please let me know if anybody is interested in researching the problem.

Regards,

Andrew Schetinin

Re: BUG #6426: Complex query runs 10 times longer with "LIMIT 20"

From

Alex Lai

Date:

01 February 2012, 18:43:10

aschetinin@gmail.com wrote:
> The following bug has been logged on the website:
>
> Bug reference:      6426
> Logged by:          Andrew Schetinin
> Email address:      aschetinin@gmail.com
> PostgreSQL version: 9.1.2
> Operating system:   Debian Linux
> Description:
>
> I have a complex query that, when unlimited, runs in 2.5 seconds and returns
> 400+ records.
>
> The same query, with added "LIMIT 20 OFFSET 0" at the end, runs 25 seconds!
> (and returns correct 20 records)
>
> I think there is a bug in the query optimizer. Other variations of this
> query (with slightly different constraints and joins) work fast with and
> without LIMIT.
>
> The same query works fast in PostgreSQL 8.4 (tested on another machine with
> Ubuntu and not exactly the same but very-very similar database).
>
> I can provide the query execution plans of the two cases (as long as they
> are not published on the Web); but not the database because there is
> customer's data in it.
>
> Please let me know if anybody is interested in researching the problem.
>
> Regards,
>
> Andrew Schetinin
>
>
>
Hi Andrew,
I posted for another post, its may give you a workaround.
I still not fully understand how PG choose execute plan that slow down
so much.

I had the same situation in one of my query.
Use the subquery can speed up almost by 100 times faster.

explain analyse
select FileId as "FileId", ESDT as "ESDT",1 as "Position" from
V_FileMeta_L3
          where Archiveset = 61000 and ESDT= 'ESDT123'
                    and Source = 'SOURCE1234'
                    and (
                       (StartTime between '2012-01-28
05:59:57.000000Z'::timestamp - '-135 minutes'::interval
                                     and '2012-01-28
07:41:27.000000Z'::timestamp + '100 days'::interval)
                          or
                        (EndTime between '2012-01-28
05:59:57.000000Z'::timestamp - '-135 minutes'::interval
                                    and '2012-01-28
07:41:27.000000Z'::timestamp + '100 days'::interval)
                         )
                order by starttime
limit 1;

Limit  (cost=0.00..15.20 rows=1 width=22) (actual time=200.048..200.048
rows=1 loops=1)
  ->  Nested Loop  (cost=0.00..117596.32 rows=7736 width=22) (actual
time=200.046..200.046 rows=1 loops=1)
        ->  Index Scan using ak_filemeta_l3_esdt_starttime_endtime on
filemeta_l3 b  (cost=0.00..77200.55 rows=7736 width=22) (actual
time=199.986..199.989
rows=2 loops=1)
              Index Cond: ((esdt)::text = 'ROLPT'::text)
              Filter: (((source)::text = 'OMPS-NPP'::text) AND
(((starttime >= '2012-01-28 08:14:57'::timestamp without time zone) AND
(starttime <= '2012-
05-07 07:41:27'::timestamp without time zone)) OR ((endtime >=
'2012-01-28 08:14:57'::timestamp without time zone) AND (endtime <=
'2012-05-07 07:41:27'::ti
mestamp without time zone))))
        ->  Index Scan using pk_filemeta_archiveset on
filemeta_archiveset a  (cost=0.00..5.21 rows=1 width=4) (actual
time=0.025..0.025 rows=0 loops=2)
              Index Cond: ((a.fileid = b.fileid) AND (a.archiveset =
61000))
Total runtime: 200.102 ms
(8 rows)

explain analyse
select FileId as "FileId", ESDT as "ESDT",1 as "Position" from
V_FileMeta_L3
where FileId in (select fileid from V_FileMeta_L3
          where Archiveset = 61000 and ESDT= 'ESDT123'
                    and Source = 'SOUCE1234'
                    and (
                       (StartTime between '2012-01-28
05:59:57.000000Z'::timestamp - '-135 minutes'::interval
                                     and '2012-01-28
07:41:27.000000Z'::timestamp + '100 days'::interval)
                          or
                        (EndTime between '2012-01-28
05:59:57.000000Z'::timestamp - '-135 minutes'::interval
                                    and '2012-01-28
07:41:27.000000Z'::timestamp + '100 days'::interval)
                         )
                order by starttime)
limit 1;

Limit  (cost=53038.95..53039.41 rows=1 width=14) (actual
time=2.502..2.502 rows=1 loops=1)
 ->  Nested Loop  (cost=53038.95..6661196.32 rows=14611464 width=14)
(actual time=2.480..2.480 rows=1 loops=1)
       ->  Nested Loop  (cost=53038.95..53716.80 rows=2549631 width=18)
(actual time=2.464..2.464 rows=1 loops=1)
             ->  HashAggregate  (cost=53038.95..53040.95 rows=200
width=4) (actual time=2.445..2.445 rows=1 loops=1)
                   ->  Sort  (cost=52923.05..52942.37 rows=7727
width=12) (actual time=2.346..2.372 rows=43 loops=1)
                         Sort Key: b.starttime
                         Sort Method:  quicksort  Memory: 27kB
                         ->  Nested Loop  (cost=209.87..52424.05
rows=7727 width=12) (actual time=0.358..2.251 rows=43 loops=1)
                               ->  Bitmap Heap Scan on filemeta_l3 b
(cost=209.87..12077.66 rows=7727 width=12) (actual time=0.262..0.685
rows=108 loops=1)
                                     Recheck Cond: ((((esdt)::text =
'ROLPT'::text) AND (starttime >= '2012-01-28 08:14:57'::timestamp
without time zone) AND (starttime <= '2012-05-07 07:41:2
7'::timestamp without time zone)) OR (((esdt)::text = 'ROLPT'::text) AND
(endtime >= '2012-01-28 08:14:57'::timestamp without time zone) AND
(endtime <= '2012-05-07 07:41:27'::timestamp without
time zone)))
                                     Filter: ((source)::text =
'OMPS-NPP'::text)
                                     ->  BitmapOr  (cost=209.87..209.87
rows=9895 width=0) (actual time=0.223..0.223 rows=0 loops=1)
                                           ->  Bitmap Index Scan on
ak_filemeta_l3_esdt_starttime_endtime  (cost=0.00..107.72 rows=4961
width=0) (actual time=0.127..0.127 rows=106 loops=1)
                                                 Index Cond:
(((esdt)::text = 'ROLPT'::text) AND (starttime >= '2012-01-28
08:14:57'::timestamp without time zone) AND (starttime <= '2012-05-0
7 07:41:27'::timestamp without time zone))
                                           ->  Bitmap Index Scan on
ak_filemeta_l3_esdt_endtime  (cost=0.00..98.29 rows=4934 width=0)
(actual time=0.093..0.093 rows=108 loops=1)
                                                 Index Cond:
(((esdt)::text = 'ROLPT'::text) AND (endtime >= '2012-01-28
08:14:57'::timestamp without time zone) AND (endtime <= '2012-05-07 07
:41:27'::timestamp without time zone))
                               ->  Index Scan using
pk_filemeta_archiveset on filemeta_archiveset a  (cost=0.00..5.21 rows=1
width=4) (actual time=0.011..0.011 rows=0 loops=108)
                                     Index Cond: ((a.fileid = b.fileid)
AND (a.archiveset = 61000))
             ->  Index Scan using pk_filemeta_l3 on filemeta_l3 b
(cost=0.00..3.37 rows=1 width=14) (actual time=0.015..0.015 rows=1 loops=1)
                   Index Cond: (b.fileid = a.fileid)
       ->  Index Scan using ak_filemeta_archiveset_fileid on
filemeta_archiveset a  (cost=0.00..2.52 rows=6 width=4) (actual
time=0.012..0.012 rows=1 loops=1)
             Index Cond: (a.fileid = b.fileid)
Total runtime: 2.711 ms

Hope this help.


Best Regards,
Alex Lai

--
Best regards,


Alex Lai
OMI SIPS DBA ADNET Systems , Inc.
7515 Mission Drive,
Suite A100 Lanham, MD 20706
301-352-4657 (phone)
301-352-0437 (fax)
alai@sesda2.com

Re: BUG #6426: Complex query runs 10 times longer with "LIMIT 20"

From

Alex Lai

Date:

01 February 2012, 19:20:15

Andrew Schetinin wrote:
> Hi Alex,
>
> Thank you. I played with subqueries, but in this case I did not see a
> good enough improvement.
>
> In my specific case, what I've seen from the query execution plans, is
> that without LIMIT the query uses Hash Joins, but once I add LIMIT, it
> starts using Nested Loop Joins almost everywhere.
>
> Regards,
>
> Andrew
>
> On Wed, Feb 1, 2012 at 9:42 PM, Alex Lai <alai@sesda2.com
> <mailto:alai@sesda2.com>> wrote:
>
>     Hi Andrew,
>     I posted for another post, its may give you a workaround.
>     I still not fully understand how PG choose execute plan that slow
>     down so much.
>
>     I had the same situation in one of my query.
>     Use the subquery can speed up almost by 100 times faster.
>
>     ..........
>
>
>
>     Hope this help.
>
>
>     Best Regards,
>     Alex Lai
>
>     --
>     Best regards,
>
>
>     Alex Lai
>     OMI SIPS DBA ADNET Systems , Inc. 7515 Mission Drive, Suite A100
>     Lanham, MD 20706 301-352-4657 (phone) 301-352-0437 (fax)
>     alai@sesda2.com <mailto:alai@sesda2.com>
>
>
>
>
> --
> --
> Andrew Schetinin
Hi Andrew,

I am interested in see your query.

--
Best regards,


Alex Lai
OMI SIPS DBA ADNET Systems , Inc.
7515 Mission Drive,
Suite A100 Lanham, MD 20706
301-352-4657 (phone)
301-352-0437 (fax)
alai@sesda2.com

Re: BUG #6426: Complex query runs 10 times longer with "LIMIT 20"

From

Andrew Schetinin

Date:

01 February 2012, 19:42:38

Hi Alex,

Thank you. I played with subqueries, but in this case I did not see a good
enough improvement.

In my specific case, what I've seen from the query execution plans, is that
without LIMIT the query uses Hash Joins, but once I add LIMIT, it starts
using Nested Loop Joins almost everywhere.

Regards,

Andrew

On Wed, Feb 1, 2012 at 9:42 PM, Alex Lai <alai@sesda2.com> wrote:

> Hi Andrew,
> I posted for another post, its may give you a workaround.
> I still not fully understand how PG choose execute plan that slow down so
> much.
>
> I had the same situation in one of my query.
> Use the subquery can speed up almost by 100 times faster.
>
> ..........
>

> Hope this help.
>
>
> Best Regards,
> Alex Lai
>
> --
> Best regards,
>
>
> Alex Lai
> OMI SIPS DBA ADNET Systems , Inc. 7515 Mission Drive, Suite A100 Lanham,
> MD 20706 301-352-4657 (phone) 301-352-0437 (fax) alai@sesda2.com
>
>

--
--
Andrew Schetinin

Re: BUG #6426: Complex query runs 10 times longer with "LIMIT 20"

From

Tom Lane

Date:

01 February 2012, 20:48:49

Andrew Schetinin <aschetinin@gmail.com> writes:
> In my specific case, what I've seen from the query execution plans, is that
> without LIMIT the query uses Hash Joins, but once I add LIMIT, it starts
> using Nested Loop Joins almost everywhere.

Usually, that's an appropriate change for a small LIMIT.  It's certainly
not a priori evidence of a planner bug.

If you want useful comments about this, please review
http://wiki.postgresql.org/wiki/Slow_Query_Questions
about how to provide an adequate description of your problem.

            regards, tom lane

Re: BUG #6426: Complex query runs 10 times longer with "LIMIT 20"

From

Andrew Schetinin

Date:

02 February 2012, 15:10:49

Hi Tom,

Thanks for pointing the FAQ out, I did not see it.
I especially liked the link to http://explain.depesz.com - it's a useful
tool.

I succeeded to fix my problem by changing the order of JOINs (the query
remained exactly the same otherwise). According to EXPLAIN ANALIZE, it
eliminated those problematic nested loop joins. BTW, changing the order of
JOINs did not affect the times of the query without LIMIT - it works the
same way, while for the query with LIMIT it fixed the problem.

Previously I always thought that the order of JOINs or conditions in WHERE
is irrelevant, and query optimizer rearranges the order according to its
logic. Now it appears that sometimes it may be important.

Regards,

Andrew

On Wed, Feb 1, 2012 at 11:48 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:

> Andrew Schetinin <aschetinin@gmail.com> writes:
> > In my specific case, what I've seen from the query execution plans, is
> that
> > without LIMIT the query uses Hash Joins, but once I add LIMIT, it starts
> > using Nested Loop Joins almost everywhere.
>
> Usually, that's an appropriate change for a small LIMIT.  It's certainly
> not a priori evidence of a planner bug.
>
> If you want useful comments about this, please review
> http://wiki.postgresql.org/wiki/Slow_Query_Questions
> about how to provide an adequate description of your problem.
>
>                        regards, tom lane
>

--
--
Andrew Schetinin

Re: BUG #6426: Complex query runs 10 times longer with "LIMIT 20"

From

Tom Lane

Date:

02 February 2012, 15:40:47

Andrew Schetinin <aschetinin@gmail.com> writes:
> Previously I always thought that the order of JOINs or conditions in WHERE
> is irrelevant, and query optimizer rearranges the order according to its
> logic. Now it appears that sometimes it may be important.

If you have more than join_collapse_limit JOINs in the query, the order
can matter.

            regards, tom lane