Re: Efficiently selecting single row from a select with window functions row_number, lag and lead - Mailing list pgsql-general

From David Rowley
Subject Re: Efficiently selecting single row from a select with window functions row_number, lag and lead
Date
Msg-id CAKJS1f-KbrYRxc3vdRUoA8kW46mfNbf9PtFDf2vqSAsq6JaDRA@mail.gmail.com
Whole thread Raw
In response to Efficiently selecting single row from a select with window functions row_number, lag and lead  (Andrew Bailey <hazlorealidad@gmail.com>)
List pgsql-general
On 2 January 2016 at 16:39, Andrew Bailey <hazlorealidad@gmail.com> wrote:
I would like to do the following:

select id, row_number() over w as rownum, lag(id, 1) over w as prev, lead(id, 1) over w as next from route where id=1350 window w as (order by shortname, id asc rows between 1 preceding and 1 following)  order by shortname, id ;

However this gives the result
1350;1;;


This does not work due to the id=1350 is always applied before the rows make it into the window therefore you only have rows which match id=1350, which is not what you want in this case.
 
The following query gives the result I am expecting

select * from (select id, row_number() over w as rownum,
lag(id, 1) over w as prev, lead(id, 1) over w as next
from route window w as (order by shortname, id
rows between 1 preceding and 1 following) order by shortname, id) as s where id=1350

1350;3;1815;1813


This works because the id=1350 is not pushed down into the subquery which contain the windowing functions, this also means that the entire route table is processed and you may suffer from performance problems if the route table is, or gets big. You'll be able to confirm this by looking at the EXPLAIN output and noticing the lack of filter on the seqscan.
 
The explain plan is
"Subquery Scan on s  (cost=0.14..15.29 rows=1 width=32)"
"  Filter: (s.id = 1350)"
"  ->  WindowAgg  (cost=0.14..13.51 rows=143 width=12)"
"        ->  Index Only Scan using route_idx on route  (cost=0.14..10.29 rows=143 width=12)"

as it makes use of the index created as follows

CREATE INDEX route_idx
  ON route
  USING btree
  (shortname COLLATE pg_catalog."default", id);

I believe that the index has all the data that is needed to obtain the results in a single query.
Is it possible to write the query as a single select and if so how?

why not just write it as: select id, (select max(id) from route where id < 1350) as prev, (select min(id) from route where id > 1350) as next from route where id=2; ?
That should be much more efficient for a larger table as it should avoid the seqscan and allow the index to be used for all 3 numbers.

--
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services

pgsql-general by date:

Previous
From: Jim Nasby
Date:
Subject: Re: Efficiently selecting single row from a select with window functions row_number, lag and lead
Next
From: Vitaly Burovoy
Date:
Subject: Re: Efficiently selecting single row from a select with window functions row_number, lag and lead