Home > mailing lists

Re: Efficiently selecting single row from a select with window functions row_number, lag and lead - Mailing list pgsql-general

From	David Rowley
Subject	Re: Efficiently selecting single row from a select with window functions row_number, lag and lead
Date	January 6, 2016 05:10:15
Msg-id	CAKJS1f-KbrYRxc3vdRUoA8kW46mfNbf9PtFDf2vqSAsq6JaDRA@mail.gmail.com Whole thread Raw
In response to	Efficiently selecting single row from a select with window functions row_number, lag and lead (Andrew Bailey <hazlorealidad@gmail.com>)
List	pgsql-general

Tree view

On 2 January 2016 at 16:39, Andrew Bailey <hazlorealidad@gmail.com> wrote:

I would like to do the following:

select id, row_number() over w as rownum, lag(id, 1) over w as prev, lead(id, 1) over w as next from route where id=1350 window w as (order by shortname, id asc rows between 1 preceding and 1 following) order by shortname, id ;

However this gives the result
1350;1;;

This does not work due to the id=1350 is always applied before the rows make it into the window therefore you only have rows which match id=1350, which is not what you want in this case.

The following query gives the result I am expecting

select * from (select id, row_number() over w as rownum,
lag(id, 1) over w as prev, lead(id, 1) over w as next
from route window w as (order by shortname, id
rows between 1 preceding and 1 following) order by shortname, id) as s where id=1350

1350;3;1815;1813

This works because the id=1350 is not pushed down into the subquery which contain the windowing functions, this also means that the entire route table is processed and you may suffer from performance problems if the route table is, or gets big. You'll be able to confirm this by looking at the EXPLAIN output and noticing the lack of filter on the seqscan.

The explain plan is
"Subquery Scan on s (cost=0.14..15.29 rows=1 width=32)"
" Filter: (s.id = 1350)"
" -> WindowAgg (cost=0.14..13.51 rows=143 width=12)"
" -> Index Only Scan using route_idx on route (cost=0.14..10.29 rows=143 width=12)"

as it makes use of the index created as follows

CREATE INDEX route_idx
ON route
USING btree
(shortname COLLATE pg_catalog."default", id);

I believe that the index has all the data that is needed to obtain the results in a single query.
Is it possible to write the query as a single select and if so how?

why not just write it as: select id, (select max(id) from route where id < 1350) as prev, (select min(id) from route where id > 1350) as next from route where id=2; ?

That should be much more efficient for a larger table as it should avoid the seqscan and allow the index to be used for all 3 numbers.

David Rowley http://www.2ndQuadrant.com/

PostgreSQL Development, 24x7 Support, Training & Services

pgsql-general by date:

From: Jim Nasby
Date: 06 January 2016, 04:52:10
Subject: Re: Efficiently selecting single row from a select with window functions row_number, lag and lead

From: Vitaly Burovoy
Date: 06 January 2016, 05:15:34
Subject: Re: Efficiently selecting single row from a select with window functions row_number, lag and lead

Re: Efficiently selecting single row from a select with window functions row_number, lag and lead - Mailing list pgsql-general

Previous

Next