Thread: table function: limit, offset, order

table function: limit, offset, order

From
Vadim Menshakov
Date:
Hello,

I've got questions on the use of table functions feature in PostgreSQL 7.3.2.
I'm using FuncCallContext structure to define the maximum number of calls of
my "SRF returning a composite type":

FuncCallContext  *funcctx;
if (SRF_IS_FIRSTCALL())
{
      funcctx->max_calls = 10000;
      [.....]
}

10000 here is the maximum number of rows one can request.
When I want to select only 3 rows, I use my function My_Table_Func in the
following way:

     select * from My_Table_Func ( arg1, arg2, .... ) limit 3;

expecting the function to execute only 3 times. But it executes 10000 times
(taking lots of time), and THEN returns only 3 rows. Of course, I can pass
the limit into my function as an argument, but this will increase the number
of arguments (in fact, there's plenty of them already).
The question is, is there another way to know that the limit is exceeded ?
Another question is the same thing on OFFSET clause - is there some way to
use it inside the table function, e.g. by initializing funcctx->call_cntr to
OFFSET on the first call? I mean, except from passing it into the function as
an argument.
And the final question is about an order of rows returned by a table
function. Can it change? And if it can, in which cases it changes? Can it
change when I specify limit?




wbr Vadim Menshakov



Re: table function: limit, offset, order

From
Joe Conway
Date:
Vadim Menshakov wrote:
>      select * from My_Table_Func ( arg1, arg2, .... ) limit 3;
>
> expecting the function to execute only 3 times. But it executes 10000 times
> (taking lots of time), and THEN returns only 3 rows. Of course, I can pass
> the limit into my function as an argument, but this will increase the number
> of arguments (in fact, there's plenty of them already).

I don't think there is any way for the function to determine that there
is a limit clause. An argument is most likely your only choice. If you
are running out of function arguments, perhaps you could combine a few
related ones into an array.

> Another question is the same thing on OFFSET clause - is there some way to
> use it inside the table function, e.g. by initializing funcctx->call_cntr to
> OFFSET on the first call? I mean, except from passing it into the function as
> an argument.

Same answer.

> And the final question is about an order of rows returned by a table
> function. Can it change? And if it can, in which cases it changes? Can it
> change when I specify limit?
>

The rows will be returned in whatever order you produce them in your
function.

The bottom line on your questions seems to be a reluctance to pass
controlling parameters in to your function. As I said above, I'd
recommend that if the number of function arguments is bumping into the
max allowed, look into using arrays to pass in parameters.

Joe


Re: table function: limit, offset, order

From
Tom Lane
Date:
Joe Conway <mail@joeconway.com> writes:
> Vadim Menshakov wrote:
>> select * from My_Table_Func ( arg1, arg2, .... ) limit 3;
>> expecting the function to execute only 3 times. But it executes 10000 times
>> (taking lots of time), and THEN returns only 3 rows. Of course, I can pass
>> the limit into my function as an argument, but this will increase the number
>> of arguments (in fact, there's plenty of them already).

> I don't think there is any way for the function to determine that there
> is a limit clause. An argument is most likely your only choice.

This is one of the limitations of the present table function
implementation; there should be a way for the function to return one row
per call.  (We talked about that back in the early stages of the table
function project, but it remains undone.)  Given that, a LIMIT would
simply cause the executor to stop fetching rows from the function.

I'm not sure the plpgsql implementation could support such an operating
mode, but the SQL-function implementation could do it easily; and of
course C functions could do it if they don't mind saving their state
from call to call.

Actually, the pieces are all in place for this already, now that
TupleStore can support interleaved read and write.  For a set function
using the row-per-call behavior, it'd be possible to run the function
only when new rows are actually demanded from the FunctionScan node.
However, making this work in parallel with the single-call-returns-a-
TupleStore mode could make the code pretty ugly...

            regards, tom lane


Re: table function: limit, offset, order

From
Joe Conway
Date:
Tom Lane wrote:
> This is one of the limitations of the present table function
> implementation; there should be a way for the function to return one row
> per call.  (We talked about that back in the early stages of the table
> function project, but it remains undone.)  Given that, a LIMIT would
> simply cause the executor to stop fetching rows from the function.

Yup, I remember that.

> I'm not sure the plpgsql implementation could support such an operating
> mode, but the SQL-function implementation could do it easily; and of
> course C functions could do it if they don't mind saving their state
> from call to call.

It would be really ugly to get a stream mode working for plpgsql (or
plr, pltcl, plperl, plpython, etc for that matter), but it does seem
ultimately desirable for sql and C.

> Actually, the pieces are all in place for this already, now that
> TupleStore can support interleaved read and write.  For a set function
> using the row-per-call behavior, it'd be possible to run the function
> only when new rows are actually demanded from the FunctionScan node.
> However, making this work in parallel with the single-call-returns-a-
> TupleStore mode could make the code pretty ugly...

I always intended to get back to this, and it seems your recent
innovations will make it easier. I'll be mired in the array support
coding for at least a few weeks (although I hope to have a "phase 1"
patch in by Monday), but after that I'll try to pick back up on the
table function streaming mode support.

Joe