Re: Using LIMIT changes index used by planner - Mailing list pgsql-performance

From Sven Willenberger
Subject Re: Using LIMIT changes index used by planner
Date
Msg-id 41BE1270.7060908@dmv.com
Whole thread Raw
In response to Re: Using LIMIT changes index used by planner  (Andrew McMillan <andrew@catalyst.net.nz>)
Responses Re: Using LIMIT changes index used by planner
List pgsql-performance

Andrew McMillan wrote:
> On Mon, 2004-12-13 at 01:13 -0500, Sven Willenberger wrote:
>
>>I have a question regarding a serious performance hit taken when using a
>>LIMIT clause. I am using version 7.4.6 on FreeBSD 4.10-STABLE with 2GB
>>of memory. The table in question contains some 25 million rows with a
>>bigserial primary key, orderdate index and a referrer index. The 2
>>select statements are as follow:
>
>
> It's an interesting question, but to be able to get answers from this
> list you will need to provide "EXPLAIN ANALYZE ..." rather than just
> "EXPLAIN ...".
>

A) Query without limit clause:
explain analyze select storelocation,order_number from custacct where
referrer = 1365  and orderdate between '2004-12-07' and '2004-12-07
12:00:00' order by custacctid;

        QUERY PLAN


-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Sort  (cost=1226485.32..1226538.78 rows=21382 width=43) (actual
time=30340.322..30426.274 rows=21432 loops=1)
    Sort Key: custacctid
    ->  Index Scan using orderdate_idx on custacct
(cost=0.00..1224947.52 rows=21382 width=43) (actual
time=159.218..30196.686 rows=21432 loops=1)
          Index Cond: ((orderdate >= '2004-12-07 00:00:00'::timestamp
without time zone) AND (orderdate <= '2004-12-07 12:00:00'::timestamp
without time zone))
          Filter: (referrer = 1365)
  Total runtime: 30529.151 ms
(6 rows)

************************************

A2) Same query run again, to see effect of caching:
explain analyze select storelocation,order_number from custacct where
referrer = 1365  and orderdate between '2004-12-07' and '2004-12-07
12:00:00' order by custacctid;

        QUERY PLAN


-----------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Sort  (cost=1226485.32..1226538.78 rows=21382 width=43) (actual
time=1402.410..1488.395 rows=21432 loops=1)
    Sort Key: custacctid
    ->  Index Scan using orderdate_idx on custacct
(cost=0.00..1224947.52 rows=21382 width=43) (actual time=0.736..1259.964
rows=21432 loops=1)
          Index Cond: ((orderdate >= '2004-12-07 00:00:00'::timestamp
without time zone) AND (orderdate <= '2004-12-07 12:00:00'::timestamp
without time zone))
          Filter: (referrer = 1365)
  Total runtime: 1590.675 ms
(6 rows)

***********************************

B) Query run with LIMIT

explain analyze select storelocation,order_number from custacct where
referrer = 1365  and orderdate between '2004-12-07' and '2004-12-07
12:00:00' order by custacctid limit 10;

                  QUERY PLAN


-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=0.00..43065.76 rows=10 width=43) (actual
time=1306957.216..1307072.111 rows=10 loops=1)
    ->  Index Scan using custacct2_pkey on custacct
(cost=0.00..92083209.38 rows=21382 width=43) (actual
time=1306957.205..1307072.017 rows=10 loops=1)
          Filter: ((referrer = 1365) AND (orderdate >= '2004-12-07
00:00:00'::timestamp without time zone) AND (orderdate <= '2004-12-07
12:00:00'::timestamp without time zone))
  Total runtime: 1307072.231 ms
(4 rows)

************************************

C) Query using the subselect variation

explain analyze select foo.storelocation, foo.order_number from (select
storelocation,order_number from custacct where referrer = 1365  and
orderdate between '2004-12-07' and '2004-12-07 12:00:00' order by
custacctid) as foo  limit 10;

              QUERY PLAN


-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=1226485.32..1226485.45 rows=10 width=100) (actual
time=1413.829..1414.024 rows=10 loops=1)
    ->  Subquery Scan foo  (cost=1226485.32..1226752.60 rows=21382
width=100) (actual time=1413.818..1413.933 rows=10 loops=1)
          ->  Sort  (cost=1226485.32..1226538.78 rows=21382 width=43)
(actual time=1413.798..1413.834 rows=10 loops=1)
                Sort Key: custacctid
                ->  Index Scan using orderdate_idx on custacct
(cost=0.00..1224947.52 rows=21382 width=43) (actual time=0.740..1272.380
rows=21432 loops=1)
                      Index Cond: ((orderdate >= '2004-12-07
00:00:00'::timestamp without time zone) AND (orderdate <= '2004-12-07
12:00:00'::timestamp without time zone))
                      Filter: (referrer = 1365)
  Total runtime: 1418.964 ms
(8 rows)


Thanks,
Sven

pgsql-performance by date:

Previous
From: Alvaro Nunes Melo
Date:
Subject: Re: Similar tables, different indexes performance
Next
From: Tom Lane
Date:
Subject: Re: Using LIMIT changes index used by planner