Re: FETCH FIRST clause WITH TIES option - Mailing list pgsql-hackers

From Andrew Gierth
Subject Re: FETCH FIRST clause WITH TIES option
Date
Msg-id 87muqw7luv.fsf@news-spur.riddles.org.uk
Whole thread Raw
In response to Re: FETCH FIRST clause WITH TIES option  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: FETCH FIRST clause WITH TIES option  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Re: FETCH FIRST clause WITH TIES option  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
>>>>> "Tomas" == Tomas Vondra <tomas.vondra@2ndquadrant.com> writes:

 >> I still think that this is the wrong approach. Implementing WITH
 >> TIES and PERCENT together using an implicit window function call
 >> kills two birds with one very small stone (the only executor change
 >> needed would be teaching LIMIT to be able to stop on a boolean
 >> condition), with maximum reuse of existing facilities.

 Tomas> Hmmm, maybe. How would that work, exactly? Wouldn't that mean
 Tomas> extra overhead (the window functions are hardly free) and
 Tomas> limitations?

They're not free, but then this case probably shouldn't be treated as a
particularly hot code path.

The basic idea is this: extend the Limit node to be able to stop on a
boolean expression, such that the first false value ends the result.

Then FETCH FIRST N WITH TIES becomes "stop when the expression
  rank() over (order by ...) <= N
is no longer true" (where the ... is the existing top level order by)

and FETCH FIRST N PERCENT is the same but with percent_rank (or
cume_dist, I'd have to check the exact semantics)

rank() doesn't need to read ahead in the input. percent_rank of course
does, but it's clearly impossible to implement PERCENT without doing
that.

Also, one could consider extending LIMIT to support arbitrary
expressions, with some syntax like LIMIT WHEN (condition) which would be
the general form.

 Tomas> FWIW, I doubt the patch can be much smaller/simpler - a
 Tomas> significant part of the new stuff is in gram.y and node read/out
 Tomas> infrastructure, the changes to LIMIT node are fairly minimal.

It's not a matter of patch size as such, but reuse of mechanisms rather
than adding new ones. Also doing WITH TIES as a special case in the
Limit node doesn't make adding PERCENT any easier at all, whereas the
above gets it basically for free.

-- 
Andrew (irc:RhodiumToad)


pgsql-hackers by date:

Previous
From: Vik Fearing
Date:
Subject: Re: date_trunc() in a specific time zone
Next
From: Paul A Jungwirth
Date:
Subject: Re: date_trunc() in a specific time zone