Thread: modifying the table function
Hi there
I am trying to modify the execution of the table function to work in
iterator fashion instead of materializing the output. I have been digging
through the PostgreSQL source code for about a month now and I can't
figure out where the execution of the table function happens. I would be
very grateful if anyone could tell me where to begin, as my project is
due in only 10 days.
Regards
Islam Hegazy

On Sun, Mar 18, 2007 at 01:54:55PM -0600, Islam Hegazy wrote:
> I am trying to modify the execution of the table function to work in
> iterator fashion instead of materializing the output. I have been
> digging through the PostgreSQL source code for about a month now and
> I can't figure out where the execution of the table function happens.
> I would be very grateful if anyone could tell me where to begin, as
> my project is due in only 10 days.

I've been thinking recently about why this is so difficult. It occurs
to me that the problem is that the language interpreters don't lend
themselves to being an iterator.

What you want is that when you call a Perl table function, as soon as
the Perl function returns a row, that row is returned to the caller.
That means the Perl interpreter has to be able to save all its state,
return to the caller, and when next called resume where it left off. I
don't know if it can do that, but it would have to be implemented for
each language (or use threads).

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org>  http://svana.org/kleptog/
> From each according to his ability. To each according to his ability
> to litigate.

Martijn van Oosterhout wrote:
> What you want is that when you call a Perl table function, as soon as
> the Perl function returns a row, that row is returned to the caller.
> That means the Perl interpreter has to be able to save all its state,
> return to the caller, and when next called resume where it left off.
> I don't know if it can do that, but it would have to be implemented
> for each language (or use threads).

We haven't even worked out how to do that cleanly for plpgsql, which we
control, let alone for any third-party interpreter.

I'm not convinced it would be a huge gain anyway. Switching madly in
and out of the Perl interpreter, at least, is a known performance
problem, IIRC - Perl XML objects have, or used to have, problems with
that (and they still don't perform terribly well).

cheers

andrew

Andrew Dunstan wrote:
> I'm not convinced it would be a huge gain anyway. Switching madly in
> and out of the Perl interpreter, at least, is a known performance
> problem, IIRC

Returning control to the backend for every row returned would likely be
excessive, but you could return once every k rows and get most of the
benefits of both approaches (k might be on the order of 1000). The
problem with the current approach is that it makes returning large
result sets from PL functions very expensive, since they need to be
spooled to disk.

As for using threads, that's pretty much a non-starter: we can't safely
allow calls into the backend from multiple concurrent threads, and I
doubt that will change any time soon.

-Neil

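[For intuition, here is a minimal, self-contained sketch in plain C of
the batching idea Neil describes - all names are invented for
illustration. The producer hands control back once per batch of k rows
rather than once per row or once at the very end.]

    #include <stdio.h>

    #define BATCH 4   /* "k" -- Neil suggests ~1000 in practice */

    /* Fill a small buffer; the caller regains control once per batch
     * instead of per row. Returns rows produced; 0 means done. */
    static int fill_batch(int *buf, int *next, int limit)
    {
        int n = 0;

        while (n < BATCH && *next <= limit)
            buf[n++] = (*next)++;
        return n;
    }

    int main(void)
    {
        int buf[BATCH], next = 1, n, i;

        while ((n = fill_batch(buf, &next, 10)) > 0)
        {
            /* the "executor" side consumes one batch per resume */
            for (i = 0; i < n; i++)
                printf("row %d\n", buf[i]);
        }
        return 0;
    }

[The tradeoff is the usual one: larger k amortizes the cost of each
control transfer, while smaller k reduces the latency to the first row
and the buffering requirement.]
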
Returning every k rows would be a reasonable solution, but which
functions need to be modified to achieve this?

----- Original Message -----
From: "Neil Conway" <neilc@samurai.com>
To: "Andrew Dunstan" <andrew@dunslane.net>
Cc: "Martijn van Oosterhout" <kleptog@svana.org>; "Islam Hegazy"
<islheg@hotmail.com>; <pgsql-hackers@postgresql.org>
Sent: Sunday, March 18, 2007 4:57 PM
Subject: Re: [HACKERS] modifying the table function

> Andrew Dunstan wrote:
>> I'm not convinced it would be a huge gain anyway. Switching madly in
>> and out of the Perl interpreter, at least, is a known performance
>> problem, IIRC
>
> Returning control to the backend for every row returned would likely
> be excessive, but you could return once every k rows and get most of
> the benefits of both approaches (k might be on the order of 1000).
> The problem with the current approach is that it makes returning
> large result sets from PL functions very expensive, since they need
> to be spooled to disk.
>
> As for using threads, that's pretty much a non-starter: we can't
> safely allow calls into the backend from multiple concurrent threads,
> and I doubt that will change any time soon.
>
> -Neil

Neil Conway <neilc@samurai.com> writes:
> Returning control to the backend for every row returned would likely
> be excessive, but you could return once every k rows and get most of
> the benefits of both approaches (k might be on the order of 1000).

However, this still leaves us with no idea how to persuade Perl, Tcl,
Python, et al to cooperate. I think you are underestimating the cost of
suspending/resuming any of those interpreters, and overestimating the
cost of a tuplestore, which on a per-tuple basis is really pretty
cheap. It's quite likely that the proposed project would produce
piddling or negative gains, after expending a huge amount of work. (A
tenth of the effort on optimizing tuplestore some more would probably
be a better investment.)

A cross-check on this theory could be made without a lot of effort:
hack SQL functions to use a tuplestore (fed via the tuplestore
DestReceiver, so as not to exit the executor) instead of
return-after-every-tuple. Compare performance. I kinda suspect you'll
find it a loss even there.

regards, tom lane

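[For concreteness, the materialize path under discussion looks roughly
like this in a C set-returning function: rows are spooled into a
tuplestore and the whole store is handed back at once. This is a hedged
sketch against the 8.2-era API; header locations and a couple of
signatures have drifted in later releases, as noted in the comments,
and error checks such as testing rsinfo->allowedModes are omitted.]

    #include "postgres.h"
    #include "fmgr.h"
    #include "funcapi.h"
    #include "miscadmin.h"            /* work_mem */
    #include "access/htup_details.h"  /* heap_form_tuple; declared
                                         elsewhere in older releases */
    #include "catalog/pg_type.h"      /* INT4OID */
    #include "utils/tuplestore.h"

    PG_MODULE_MAGIC;
    PG_FUNCTION_INFO_V1(three_ints);

    /* Return the integers 1..3 by materializing into a tuplestore. */
    Datum
    three_ints(PG_FUNCTION_ARGS)
    {
        ReturnSetInfo   *rsinfo = (ReturnSetInfo *) fcinfo->resultinfo;
        TupleDesc        tupdesc;
        Tuplestorestate *store;
        MemoryContext    oldcxt;
        int              i;

        /* build the result in per-query memory so it outlives this call */
        oldcxt = MemoryContextSwitchTo(rsinfo->econtext->ecxt_per_query_memory);

        tupdesc = CreateTemplateTupleDesc(1, false);  /* one-arg since PG 12 */
        TupleDescInitEntry(tupdesc, (AttrNumber) 1, "n", INT4OID, -1, 0);

        store = tuplestore_begin_heap(true, false, work_mem);

        for (i = 1; i <= 3; i++)
        {
            Datum   values[1] = {Int32GetDatum(i)};
            bool    nulls[1] = {false};

            /* every row is spooled before the caller sees anything */
            tuplestore_puttuple(store, heap_form_tuple(tupdesc, values, nulls));
        }

        rsinfo->returnMode = SFRM_Materialize;
        rsinfo->setResult = store;
        rsinfo->setDesc = tupdesc;

        MemoryContextSwitchTo(oldcxt);
        return (Datum) 0;
    }
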
Tom Lane wrote:
> Neil Conway <neilc@samurai.com> writes:
>> Returning control to the backend for every row returned would likely
>> be excessive, but you could return once every k rows and get most of
>> the benefits of both approaches (k might be on the order of 1000).
>
> However, this still leaves us with no idea how to persuade Perl, Tcl,
> Python, et al to cooperate.

It seems like a useful optimization for C functions, though. I was
caught by surprise a while ago when I realized that the way I've been
using to create simple test data quickly:

INSERT INTO foo SELECT key FROM generate_series(1, <large number>) key

materializes the generate_series result set first. I'd like to have
that changed, even if we leave the behavior as it is for PLs.

Another affected use case is using dblink to copy large tables.

--
Heikki Linnakangas
EnterpriseDB http://www.enterprisedb.com

Heikki Linnakangas wrote:
> Tom Lane wrote:
>> Neil Conway <neilc@samurai.com> writes:
>>> Returning control to the backend for every row returned would
>>> likely be excessive, but you could return once every k rows and get
>>> most of the benefits of both approaches (k might be on the order of
>>> 1000).
>>
>> However, this still leaves us with no idea how to persuade Perl,
>> Tcl, Python, et al to cooperate.
>
> It seems like a useful optimization for C functions, though. I was
> caught by surprise a while ago when I realized that the way I've been
> using to create simple test data quickly:

Actually, I think we could teach the PLs to do it - just not
transparently, so we'd need to mark which functions used the new
protocol. Such functions would get a state object as an implied first
argument, so in plperl it might work like this (for a
generate_series-like function):

    my $state = shift;
    my $low   = shift;
    my $high  = shift;

    if ($state->{callstatus} eq 'firstcall') {
        $state->{counter} = $low;
    }
    elsif ($state->{callstatus} eq 'cleanup') {
        # do cleanup here
        $state->{return_status} = 'cleaned';
        return;
    }

    my $next = $state->{counter}++;
    $state->{return_status} = $next < $high ? 'result' : 'last_result';
    return $next;

To support this I think we'd need to do something like:

    create function mygs(int, int)
    returns setof int
    language plperl
    with srfstate
    as $$ ... $$;

cheers

andrew

Andrew Dunstan wrote: > > Actually, I think we could teach the PLs to do it - just not > transparently, so we'd need to mark which functions used the new > protocol. Such functions would get a state object as an implied first > argument, so in plperl it might work like this (for a > generate_series-like function): > To support this I think we'd need to do something like: > > create function mygs(int, int) > returns setof int > language plperl > with srfstate > as $$ ... $$; Is this not what we do with aggregate functions at present? -- Richard Huxton Archonet Ltd
Richard Huxton wrote:
> Andrew Dunstan wrote:
>> Actually, I think we could teach the PLs to do it - just not
>> transparently, so we'd need to mark which functions used the new
>> protocol. Such functions would get a state object as an implied
>> first argument, so in plperl it might work like this (for a
>> generate_series-like function):
>>
>> To support this I think we'd need to do something like:
>>
>>     create function mygs(int, int)
>>     returns setof int
>>     language plperl
>>     with srfstate
>>     as $$ ... $$;
>
> Is this not what we do with aggregate functions at present?

Yes, more or less. That's what made me think of it.

OTOH, before we rush out and do it, someone needs to show that it's a
net win. I agree with Tom that making tuplestore faster would probably
be a much better investment of time.

cheers

andrew

Andrew Dunstan <andrew@dunslane.net> writes:
> Richard Huxton wrote:
>> Is this not what we do with aggregate functions at present?

> Yes, more or less. That's what made me think of it.
> OTOH, before we rush out and do it, someone needs to show that it's a
> net win.

Yeah, because this isn't doing anything to address the problem of
entry/exit overhead from calling a PL function many times. I kinda
dislike shoving the problem onto the heads of PL programmers anyway...

regards, tom lane

"Andrew Dunstan" <andrew@dunslane.net> writes: > Yes, more or less. That's what made me think of it. > > OTOH, before we rush out and do it someone needs to show that it's a net win. I > agree with Tom that making tuplestore faster would probably be a much better > investment of time. I don't think the problem with the tuplestore is a matter of speed. It's a matter of scalability and flexibility. It limits the types of applications that can use SRFs and the amount of data they can manipulate before it becomes impractical. Consider applications like dblink that have SRFs that read data from a slow network sources. Or that generate more data than the server can actually store at any one time. Or that overflow work_mem but are used in queries that could return quickly based on the first few records. Unfortunately, I don't think there's a simple fix that'll work for all PLs using the current interface. Even languages with iterators themselves (python, I think) probably don't expect to be called externally while an iterator is in progress. It seems to me the way to fix it is to abandon the iterator style interface and an interface that allows you to implement a SRF by providing a function that returns just the "next" record. It would have to save enough state for the next iteration explicitly in a data structure rather than being able to depend on the entire program state being restored. You could argue you could already do this using a non-SRF but there are two problems: 1) there's no convenient way to stash the state anywhere and 2) it wouldn't be convenient to use in SQL in FROM clauses the way SRFs are. IIRC there is already a hack in at least one of the PLs to stash data in a place where you can access it in repeated invocations. It doesn't work correctly if you call your function from two different places from a query. It would take executor support for such state data structures to fix that problem. -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Gregory Stark wrote:
> "Andrew Dunstan" <andrew@dunslane.net> writes:
>> Yes, more or less. That's what made me think of it.
>>
>> OTOH, before we rush out and do it, someone needs to show that it's
>> a net win. I agree with Tom that making tuplestore faster would
>> probably be a much better investment of time.
>
> I don't think the problem with the tuplestore is a matter of speed.
> It's a matter of scalability and flexibility. It limits the types of
> applications that can use SRFs and the amount of data they can
> manipulate before it becomes impractical.
>
> Consider applications like dblink that have SRFs that read data from
> slow network sources. Or that generate more data than the server can
> actually store at any one time. Or that overflow work_mem but are
> used in queries that could return quickly based on the first few
> records.
>
> Unfortunately, I don't think there's a simple fix that'll work for
> all PLs using the current interface. Even languages with iterators
> themselves (Python, I think) probably don't expect to be called
> externally while an iterator is in progress.

Just a thought - I believe that there are portable user-space thread
implementations that contain little or no machine-specific code. What
if postgres used one of those to switch from the PL into the executor
and back after, say, 1000 rows were returned by the SRF?

What would be needed is basically some enhanced version of
setjmp/longjmp that actually saves the stack, and not one that just
resets the stack pointer.

Since context switching would occur only at two well-defined places
(some return_next_row function that PLs call when an SRF returns a row,
and in the executor if no more previously returned rows from that SRF
are available), this wouldn't introduce the usual multithreading
headache, but would still allow switching in and out of the PL
interpreter.

greetings, Florian Pflug

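[The POSIX ucontext primitives (getcontext/makecontext/swapcontext,
which come up later in this thread) are one way to build exactly that
save-the-whole-stack switch. A minimal, self-contained illustration of
the idea, outside Postgres, with all names invented and the
stack-overflow and portability caveats raised below deliberately
ignored:]

    #include <stdio.h>
    #include <ucontext.h>

    static ucontext_t main_ctx, srf_ctx;
    static char srf_stack[64 * 1024];  /* dedicated stack for the "PL" side */
    static int current_row;

    /* The "PL" side: produces rows, yielding to the "executor" after each. */
    static void
    producer(void)
    {
        int i;

        for (i = 1; i <= 3; i++)
        {
            current_row = i;                  /* hand one row to the caller */
            swapcontext(&srf_ctx, &main_ctx); /* suspend; resumes right here */
        }
        current_row = -1;                     /* signal end of rows */
    }

    int
    main(void)
    {
        getcontext(&srf_ctx);
        srf_ctx.uc_stack.ss_sp = srf_stack;
        srf_ctx.uc_stack.ss_size = sizeof(srf_stack);
        srf_ctx.uc_link = &main_ctx;          /* resume here when producer ends */
        makecontext(&srf_ctx, producer, 0);

        /* The "executor" side: pull one row per resume. */
        for (;;)
        {
            swapcontext(&main_ctx, &srf_ctx);
            if (current_row < 0)
                break;
            printf("got row %d\n", current_row);
        }
        return 0;
    }

[A real implementation would presumably yield once per k rows rather
than per row, and would have to face the problems Tom raises below:
sizing the alternate stack safely under an arbitrary interpreter, and
the portability of ucontext itself.]
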
Florian G. Pflug wrote:
> Just a thought - I believe that there are portable user-space thread
> implementations that contain little or no machine-specific code. What
> if postgres used one of those to switch from the PL into the executor
> and back after, say, 1000 rows were returned by the SRF?
>
> What would be needed is basically some enhanced version of
> setjmp/longjmp that actually saves the stack, and not one that just
> resets the stack pointer.
>
> Since context switching would occur only at two well-defined places
> (some return_next_row function that PLs call when an SRF returns a
> row, and in the executor if no more previously returned rows from
> that SRF are available), this wouldn't introduce the usual
> multithreading headache, but would still allow switching in and out
> of the PL interpreter.

This just sounds horribly fragile.

Are we really sure that this isn't a solution in search of a problem?

cheers

andrew

So, I understand from all these opinions that much of the work is to be
done in the PL's language interpreter, not in the PostgreSQL code
itself. Am I right, or have I missed something?

Regards
Islam Hegazy

----- Original Message -----
From: "Andrew Dunstan" <andrew@dunslane.net>
To: "Florian G. Pflug" <fgp@phlo.org>
Cc: "Gregory Stark" <stark@enterprisedb.com>; "Richard Huxton"
<dev@archonet.com>; "Heikki Linnakangas" <heikki@enterprisedb.com>;
"Tom Lane" <tgl@sss.pgh.pa.us>; "Neil Conway" <neilc@samurai.com>;
"Martijn van Oosterhout" <kleptog@svana.org>; "Islam Hegazy"
<islheg@hotmail.com>; <pgsql-hackers@postgresql.org>
Sent: Monday, March 19, 2007 12:18 PM
Subject: Re: [HACKERS] modifying the table function

> Florian G. Pflug wrote:
>> Just a thought - I believe that there are portable user-space thread
>> implementations that contain little or no machine-specific code.
>> What if postgres used one of those to switch from the PL into the
>> executor and back after, say, 1000 rows were returned by the SRF?
>>
>> What would be needed is basically some enhanced version of
>> setjmp/longjmp that actually saves the stack, and not one that just
>> resets the stack pointer.
>>
>> Since context switching would occur only at two well-defined places
>> (some return_next_row function that PLs call when an SRF returns a
>> row, and in the executor if no more previously returned rows from
>> that SRF are available), this wouldn't introduce the usual
>> multithreading headache, but would still allow switching in and out
>> of the PL interpreter.
>
> This just sounds horribly fragile.
>
> Are we really sure that this isn't a solution in search of a problem?
>
> cheers
>
> andrew

Andrew Dunstan wrote:
> Are we really sure that this isn't a solution in search of a problem?

The need for value-per-call is real (examples mentioned down-thread)
and was anticipated from day one of the SRF implementation (in fact the
first patch I wrote was value-per-call, not materialize). But when we
realized that value-per-call was not going to work very well for any PL
*except* C functions, we switched to SFRM_Materialize as the only
supported mode, with SFRM_ValuePerCall left as a to-be-coded-later
option (see SetFunctionReturnMode in execnodes.h).

Personally I think it is worth having SFRM_ValuePerCall even if only C
functions can make use of it.

Joe

Joe Conway wrote:
> Andrew Dunstan wrote:
>> Are we really sure that this isn't a solution in search of a
>> problem?
>
> The need for value-per-call is real (examples mentioned down-thread)
> and was anticipated from day one of the SRF implementation (in fact
> the first patch I wrote was value-per-call, not materialize). But
> when we realized that value-per-call was not going to work very well
> for any PL *except* C functions, we switched to SFRM_Materialize as
> the only supported mode, with SFRM_ValuePerCall left as a
> to-be-coded-later option (see SetFunctionReturnMode in execnodes.h).
>
> Personally I think it is worth having SFRM_ValuePerCall even if only
> C functions can make use of it.

Yeah, makes plenty of sense for C funcs. I don't think there's an
argument about that. But for that we don't need any threading
infrastructure.

cheers

andrew

Andrew Dunstan wrote:
> Joe Conway wrote:
>> Andrew Dunstan wrote:
>>> Are we really sure that this isn't a solution in search of a
>>> problem?
>> The need for value-per-call is real (examples mentioned down-thread)
>> and was anticipated from day one of the SRF implementation (in fact
>> the first patch I wrote was value-per-call, not materialize). But
>> when we realized that value-per-call was not going to work very well
>> for any PL *except* C functions, we switched to SFRM_Materialize as
>> the only supported mode, with SFRM_ValuePerCall left as a
>> to-be-coded-later option (see SetFunctionReturnMode in execnodes.h).
>>
>> Personally I think it is worth having SFRM_ValuePerCall even if only
>> C functions can make use of it.
>
> Yeah, makes plenty of sense for C funcs. I don't think there's an
> argument about that. But for that we don't need any threading
> infrastructure.

Oh sure -- sorry I wasn't clear. I wasn't trying to support the idea of
threading so much as the idea that value-per-call itself has merit for
a number of use cases.

Joe

"Florian G. Pflug" <fgp@phlo.org> writes: > Since context switching would occur only at two well-defined places > (Some return_next_row function that PLs call when a SFR returns a row, > and in the executor if no more previously returned rows from that SFR > are available), this wouldn't introduce the usual multithreading > headache... Yes it would. Consider what happens if the PL function calls into SPI to execute a query.... -- Gregory Stark EnterpriseDB http://www.enterprisedb.com
Andrew Dunstan wrote:
> Florian G. Pflug wrote:
>> Just a thought - I believe that there are portable user-space thread
>> implementations that contain little or no machine-specific code.
>> What if postgres used one of those to switch from the PL into the
>> executor and back after, say, 1000 rows were returned by the SRF?
>>
>> What would be needed is basically some enhanced version of
>> setjmp/longjmp that actually saves the stack, and not one that just
>> resets the stack pointer.
>>
>> Since context switching would occur only at two well-defined places
>> (some return_next_row function that PLs call when an SRF returns a
>> row, and in the executor if no more previously returned rows from
>> that SRF are available), this wouldn't introduce the usual
>> multithreading headache, but would still allow switching in and out
>> of the PL interpreter.
>
> This just sounds horribly fragile.

Why would it be? It's about the same as running PostgreSQL in one
thread and some PL in another. This should only cause trouble if both
use some non-reentrant libc functions. But even that wouldn't matter,
because of the well-defined context-switching points.

Here is a paper about portable userspace threads that I just googled:
http://www.gnu.org/software/pth/rse-pmt.ps

> Are we really sure that this isn't a solution in search of a problem?

I think this really depends on how you define "problem". Some people
might think that "select * from myfunc(...) limit 1" should stop and
return a result after myfunc(...) has returned one row. Others will say
"well, just use a different software design that doesn't depend on this
optimization".

greetings, Florian Pflug

Gregory Stark wrote:
> "Florian G. Pflug" <fgp@phlo.org> writes:
>> Since context switching would occur only at two well-defined places
>> (some return_next_row function that PLs call when an SRF returns a
>> row, and in the executor if no more previously returned rows from
>> that SRF are available), this wouldn't introduce the usual
>> multithreading headache...
>
> Yes it would. Consider what happens if the PL function calls into SPI
> to execute a query....

I don't think that would cause trouble. Postgres and the PL would still
not run concurrently - in fact, from a control-flow point of view, the
PL would run at _exactly_ the same place as it does now: precisely when
someone tries to pull rows out of some executor node. The only
difference is that its execution would be stopped and restarted
multiple times by some sort of stack switching.

There might be trouble if a second function has to be executed with the
same PL as an already running (but currently "stopped") function. This
would only work for a PL that is thread-safe in some way.

greetings, Florian Pflug

"Florian G. Pflug" <fgp@phlo.org> writes: > Here is a paper about portable userspace threads that I just googled. > http://www.gnu.org/software/pth/rse-pmt.ps I'm not impressed --- the list of unsolved problems is a bit too long. One that seems a showstopper for our purposes is lack of ability to deal reliably with stack overflow on alternate stacks. If we're going to be doing anything as loosely defined as running a third-party language interpreter in these stacks, that just won't do. Another little problem is it's LGPL. regards, tom lane
"Florian G. Pflug" <fgp@phlo.org> writes: > There might be trouble if a second function has to be executed with > the same PL as an already running (but currently "stopped") > function. This would only work for PL that is thread-safe in some way. Seems a bit iffy. It strikes me that at least for plpgsql, it might be possible to support value-per-call mode without any thread support. What you'd need to do is get rid of the current arrangement whereby the control structure of the plpgsql code is modeled on-the-fly by the call stack of the C code, and instead have block nesting, loops, etc represented by explicit data structures that're manipulated by C code with a flat call stack. If the function wants to do a RETURN NEXT, you just return, leaving its current state all nicely tucked in a data structure. This would be a little tedious but is in principle a straightforward change. I'm not sure if there'd be any meaningful performance impact. The tricky part is what about exception handling? If the function does RETURN NEXT inside a BEGIN/EXCEPTION block, what do you do ... what does that even mean? There be equally nasty dragons lurking behind that question for a threaded implementation, of course. It might be that we could get away with decreeing that RETURN NEXT inside EXCEPTION isn't legal. regards, tom lane
Florian G. Pflug wrote:
>> Are we really sure that this isn't a solution in search of a
>> problem?
>
> I think this really depends on how you define "problem". Some people
> might think that "select * from myfunc(...) limit 1" should stop and
> return a result after myfunc(...) has returned one row. Others will
> say "well, just use a different software design that doesn't depend
> on this optimization".

I think that's a very thin use case to justify all the scaffolding you
propose, so put me in with the others, I guess.

If we can provide a per-call mechanism for C functions, that should be
adequate, I think. If you need heavy optimization then you need C
functions anyway.

cheers

andrew

On Tue, 2007-03-20 at 09:35, Andrew Dunstan wrote:
> Florian G. Pflug wrote:
>>> Are we really sure that this isn't a solution in search of a
>>> problem?
>> I think this really depends on how you define "problem". Some people
>> might think that "select * from myfunc(...) limit 1" should stop and
>> return a result after myfunc(...) has returned one row. Others will
>> say "well, just use a different software design that doesn't depend
>> on this optimization".

The same argument could be used to justify leaving out any
optimisations.

> I think that's a very thin use case to justify all the scaffolding
> you propose, so put me in with the others, I guess.
>
> If we can provide a per-call mechanism for C functions, that should
> be adequate, I think.

Sure, having it at least at the C level would make the job of PL
implementors wanting to do per-call returns a little easier.

> If you need heavy optimization then you need C functions anyway.

Check out pl/proxy (http://pgfoundry.org/projects/plproxy), which is a
dblink replacement and database partitioning tool in one. A short
summary (which may be a bit outdated) is here:
https://developer.skype.com/SkypeGarage/DbProjects/PlProxy

I'm sure there would be use cases (like moving huge tables) where
per-call (or per-kilocall :) ) mechanisms would come in handy.

One example would be doing a huge sort split between N partitions, so
that each partition sorts its part of the data and the results are then
merged in order from all partitions in-line, without re-sorting on the
master. With a function returning the full recordset, the sorting can't
be distributed very effectively, as the whole query result needs to be
saved on the master node before returning.

--
----------------
Hannu Krosing
Database Architect
Skype Technologies OÜ
Akadeemia tee 21 F, Tallinn, 12618, Estonia

Skype me: callto:hkrosing
Get Skype for free: http://www.skype.com

Tom Lane wrote:
> "Florian G. Pflug" <fgp@phlo.org> writes:
>> Here is a paper about portable userspace threads that I just
>> googled:
>> http://www.gnu.org/software/pth/rse-pmt.ps
>
> I'm not impressed --- the list of unsolved problems is a bit too
> long. One that seems a showstopper for our purposes is the lack of
> ability to deal reliably with stack overflow on alternate stacks. If
> we're going to be doing anything as loosely defined as running a
> third-party language interpreter in these stacks, that just won't do.
>
> Another little problem is that it's LGPL.

I don't think using the GnuPT library the paper mentions is a viable
approach for postgres. I just posted that link because it shows that
this is not impossible to do.

What would actually be interesting is to find out whether the ucontext
functions getcontext/makecontext/swapcontext/setcontext mentioned in
the paper are available on the platforms where postgres is used in
production for larger datasets. I'd assume that people who need this PL
optimization will run a quite up-to-date version of their particular
OS, so maybe the portability problems of those functions wouldn't be a
problem for postgres - the PL optimization would just be disabled at
configure time if they are not available.

The main question is: IF the stability problems like stack overflow can
be addressed, would this in principle be considered a feature that
people would like to have? Or is it considered not worth the effort?

greetings, Florian Pflug