On Wed, 16 Mar 2022 at 10:24, Greg Stark <stark@mit.edu> wrote:
>
> This looks like an awesome addition.
Thanks
> I have one technical questions...
>
> Is it possible to actually transform the row_number case into a LIMIT
> clause or make the planner support for this case equivalent to it (in
> which case we can replace the LIMIT clause planning to transform into
> a window function)?
Currently, I have only coded it to support monotonically increasing
and decreasing functions. Putting a <= <const> type condition on a
row_number() function with no PARTITION BY clause I think is logically
the same as a LIMIT clause, but that's not the case for rank() and
dense_rank(). There may be multiple peer rows with the same rank in
those cases. We'd have no way to know what the LIMIT should be set to.
I don't really want to just do this for row_number().
> The reason I ask is because the Limit plan node is actually quite a
> bit more optimized than the general window function plan node. It
> calculates cost estimates based on the limit and can support Top-N
> sort nodes.
This is true. There's perhaps no reason why an additional property
could not be added to allow the prosupport function to optionally set
*exactly* the maximum number of rows that could match the condition.
e.g. for select * from (select *,row_number() over (order by c) rn
from ..) w where rn <= 10; that could be set to 10, and if we used
rank() instead of row_number(), it could just be left unset.
I think this is probably worth thinking about at some future date. I
don't really want to make it part of this effort. I also don't think
I'm doing anything here that would need to be undone to make that
work.
David