Re: OFFSET impact on Performance??? - Mailing list pgsql-performance

From David Brown
Subject Re: OFFSET impact on Performance???
Date
Msg-id 41F86501.1060006@bigpond.net.au
Whole thread Raw
In response to OFFSET impact on Performance???  ("Andrei Bintintan" <klodoma@ar-sd.net>)
List pgsql-performance
Although larger offsets have some effect, your real problem is the sort
(of 42693 rows).

Try:

SELECT r.id_order
FROM report r
WHERE r.id_order IN
   (SELECT id
   FROM orders
   WHERE id_status = 6
   ORDER BY 1
   LIMIT 10 OFFSET 1000)
ORDER BY 1

The subquery doesn't *have* to sort because the table is already ordered
on the primary key.
You can still add a join to orders outside the subselect without
significant cost.

Incidentally, I don't know how you got the first plan - it should
include a sort as well.

Andrei Bintintan wrote:

 > explain analyze
 > SELECT o.id
 > FROM report r
 > INNER JOIN orders o ON o.id=r.id_order AND o.id_status=6
 > ORDER BY 1 LIMIT 10 OFFSET 10
 >
 > Limit  (cost=44.37..88.75 rows=10 width=4) (actual time=0.160..0.275
rows=10 loops=1)
 >   ->  Merge Join  (cost=0.00..182150.17 rows=41049 width=4) (actual
time=0.041..0.260 rows=20 loops=1)
 >         Merge Cond: ("outer".id_order = "inner".id)
 >         ->  Index Scan using report_id_order_idx on report r
(cost=0.00..157550.90 rows=42862 width=4) (actual time=0.018..0.075
rows=20 loops=1)
 >         ->  Index Scan using orders_pkey on orders o
(cost=0.00..24127.04 rows=42501 width=4) (actual time=0.013..0.078
rows=20 loops=1)
 >               Filter: (id_status = 6)
 > Total runtime: 0.373 ms
 >
 > explain analyze
 > SELECT o.id
 > FROM report r
 > INNER JOIN orders o ON o.id=r.id_order AND o.id_status=6
 > ORDER BY 1 LIMIT 10 OFFSET 1000000
 > Limit  (cost=31216.85..31216.85 rows=1 width=4) (actual
time=1168.152..1168.152 rows=0 loops=1)
 >   ->  Sort  (cost=31114.23..31216.85 rows=41049 width=4) (actual
time=1121.769..1152.246 rows=42693 loops=1)
 >         Sort Key: o.id
 >         ->  Hash Join  (cost=2329.99..27684.03 rows=41049 width=4)
(actual time=441.879..925.498 rows=42693 loops=1)
 >               Hash Cond: ("outer".id_order = "inner".id)
 >               ->  Seq Scan on report r  (cost=0.00..23860.62
rows=42862 width=4) (actual time=38.634..366.035 rows=42864 loops=1)
 >               ->  Hash  (cost=2077.74..2077.74 rows=42501 width=4)
(actual time=140.200..140.200 rows=0 loops=1)
 >                     ->  Seq Scan on orders o  (cost=0.00..2077.74
rows=42501 width=4) (actual time=0.059..96.890 rows=42693 loops=1)
 >                           Filter: (id_status = 6)
 > Total runtime: 1170.586 ms

pgsql-performance by date:

Previous
From: Josh Berkus
Date:
Subject: Re: Ideal disk setup for Postgresql 7.4?
Next
From: Alex Turner
Date:
Subject: Re: [SQL] OFFSET impact on Performance???