Thread: Arguments to foreign tables?

Arguments to foreign tables?

From
Jeff Davis
Date:
Is there any fundamental or philosophical reason why a foreign table
can't accept arguments? Should that be a TODO?

Right now, to accept arguments to a from-clause item, you have to use an
SRF, which is much more limited than a foreign table. If foreign tables
could accept arguments, then SRFs could just be a simplified special
case of foreign tables.

I looked for some previous discussions on this topic and nothing turned
up.

Regards,Jeff Davis




Re: Arguments to foreign tables?

From
Tom Lane
Date:
Jeff Davis <pgsql@j-davis.com> writes:
> Is there any fundamental or philosophical reason why a foreign table
> can't accept arguments?

That isn't a table; it's some sort of function.  Now that we have
LATERAL, there is no good reason to contort SQL's syntax and semantics
in the direction you suggest.
        regards, tom lane



Re: Arguments to foreign tables?

From
Willem Leenen
Date:

I don't understand the question. A table accepting arguments? What do you mean with that? Can you give an code example in an other RDBMS (like oracle)?

Perhaps nothing came up because there are more people not understanding what you want to accomplish?

regards,
Willem

> Subject: [HACKERS] Arguments to foreign tables?
> From: pgsql@j-davis.com
> To: pgsql-hackers@postgresql.org
> Date: Sun, 4 Nov 2012 11:59:20 -0800
>
> Is there any fundamental or philosophical reason why a foreign table
> can't accept arguments? Should that be a TODO?
>
> Right now, to accept arguments to a from-clause item, you have to use an
> SRF, which is much more limited than a foreign table. If foreign tables
> could accept arguments, then SRFs could just be a simplified special
> case of foreign tables.
>
> I looked for some previous discussions on this topic and nothing turned
> up.
>
> Regards,
> Jeff Davis
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers

Re: Arguments to foreign tables?

From
John R Pierce
Date:
On 11/04/12 11:59 AM, Jeff Davis wrote:
> Is there any fundamental or philosophical reason why a foreign table
> can't accept arguments? Should that be a TODO?

what does that even mean?    how would 'data' accept 'arguments' ??!



-- 
john r pierce                            N 37, W 122
santa cruz ca                         mid-left coast




Re: Arguments to foreign tables?

From
Willem Leenen
Date:

>
> what does that even mean? how would 'data' accept 'arguments' ??!
>
>

That's Sherlock Holmes!  "Don't argue with your data"


Re: Arguments to foreign tables?

From
Jeff Davis
Date:
On Sun, 2012-11-04 at 15:13 -0500, Tom Lane wrote:
> Jeff Davis <pgsql@j-davis.com> writes:
> > Is there any fundamental or philosophical reason why a foreign table
> > can't accept arguments?
> 
> That isn't a table; it's some sort of function.  Now that we have
> LATERAL, there is no good reason to contort SQL's syntax and semantics
> in the direction you suggest.

Maybe I should rephrase this as a problem with SRFs: you don't get to
define the init/exec/end executor functions, and you don't get access to
the optimizer information.

It seems like foreign tables are a better mechanism (except for the
simple cases where you don't care about the details), and the only thing
an SRF can do that a foreign table can't is accept arguments. So, I
thought maybe it would make more sense to combine the mechanisms
somehow.

Take something as simple as generate_series: right now, it materializes
the entire thing if it's in the FROM clause, but it wouldn't need to if
it could use the foreign table mechanism.

Regards,Jeff Davis




Re: Arguments to foreign tables?

From
Ronan Dunklau
Date:
Maybe you could set some options on the foreign table before selecting from it ?
Another way you could achieve the same result would be to give some column a special meaning (like it is done in the twitter_fdw for example).

If you don't mind, do you have a specific use-case for this ?

--
Ronan Dunklau

2012/11/6 Jeff Davis <pgsql@j-davis.com>
On Sun, 2012-11-04 at 15:13 -0500, Tom Lane wrote:
> Jeff Davis <pgsql@j-davis.com> writes:
> > Is there any fundamental or philosophical reason why a foreign table
> > can't accept arguments?
>
> That isn't a table; it's some sort of function.  Now that we have
> LATERAL, there is no good reason to contort SQL's syntax and semantics
> in the direction you suggest.

Maybe I should rephrase this as a problem with SRFs: you don't get to
define the init/exec/end executor functions, and you don't get access to
the optimizer information.

It seems like foreign tables are a better mechanism (except for the
simple cases where you don't care about the details), and the only thing
an SRF can do that a foreign table can't is accept arguments. So, I
thought maybe it would make more sense to combine the mechanisms
somehow.

Take something as simple as generate_series: right now, it materializes
the entire thing if it's in the FROM clause, but it wouldn't need to if
it could use the foreign table mechanism.

Regards,
        Jeff Davis



--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: Arguments to foreign tables?

From
Dimitri Fontaine
Date:
Jeff Davis <pgsql@j-davis.com> writes:
> Take something as simple as generate_series: right now, it materializes
> the entire thing if it's in the FROM clause, but it wouldn't need to if
> it could use the foreign table mechanism.

So, my understanding of your proposal is that a good way to implement
streaming SRF would be on top of the internals of Foreign Data Wrappers?

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support



Re: Arguments to foreign tables?

From
Simon Riggs
Date:
On 6 November 2012 06:09, Jeff Davis <pgsql@j-davis.com> wrote:
> On Sun, 2012-11-04 at 15:13 -0500, Tom Lane wrote:
>> Jeff Davis <pgsql@j-davis.com> writes:
>> > Is there any fundamental or philosophical reason why a foreign table
>> > can't accept arguments?
>>
>> That isn't a table; it's some sort of function.  Now that we have
>> LATERAL, there is no good reason to contort SQL's syntax and semantics
>> in the direction you suggest.
>
> Maybe I should rephrase this as a problem with SRFs: you don't get to
> define the init/exec/end executor functions, and you don't get access to
> the optimizer information.
>
> It seems like foreign tables are a better mechanism (except for the
> simple cases where you don't care about the details), and the only thing
> an SRF can do that a foreign table can't is accept arguments. So, I
> thought maybe it would make more sense to combine the mechanisms
> somehow.
>
> Take something as simple as generate_series: right now, it materializes
> the entire thing if it's in the FROM clause, but it wouldn't need to if
> it could use the foreign table mechanism.

This seems a useful thought to me.

Tom is complaining about conflating the two features, which does seem
unnecessary. But we can still merge code.

Providing run-time parameters or table-level options to Foreign Scans
seems sensible. I can see why you'd want to pass through parameters
like "work_mem" to the foreign scan and yet not have those as
permanent table options.

(On another thought: why do Foreign Scans get to avoid the
materialization we hate with SRFs?)

-- Simon Riggs                   http://www.2ndQuadrant.com/PostgreSQL Development, 24x7 Support, Training & Services



File Corruption recovery

From
"md@rpzdesign.com"
Date:
I have been working on external replication on Postgresql 9.2 for a 
little while
(with too many interruptions blocking my progress!)

Who knows a good utility to aggressively analyze
and recover Postgresql Databases?

It seems the standard reply that I see
is "Make regular backups", but that guarantees maximum full data loss
defined by the backup time interval.

Our MariaDB Mysql/ExtraDB/Innodb friends and Aria_check and some other tools
to "recover" as much as possible up to the moment of failure.

While full replication is the ultimate safeguard, in "split brain" mode, 
I could
see a hardware failure causing loss of data up to the last replication 
exchange
or last backup interval.

During a data crash, I want the recovery tool to HELP me get as much 
data recovered
and get back to operations.  What I do not want to do is a bunch of 
manual command line
file copy and deletes to "guess" my way back to operational mode (some 
data loss is inevitable)

I could make a daily snapshot of the system catalog to assist the 
recovery tool in
restoring the database.

Who has ideas on this?







Re: Arguments to foreign tables?

From
Tom Lane
Date:
Dimitri Fontaine <dimitri@2ndQuadrant.fr> writes:
> Jeff Davis <pgsql@j-davis.com> writes:
>> Take something as simple as generate_series: right now, it materializes
>> the entire thing if it's in the FROM clause, but it wouldn't need to if
>> it could use the foreign table mechanism.

> So, my understanding of your proposal is that a good way to implement
> streaming SRF would be on top of the internals of Foreign Data Wrappers?

I'm inclined to think that this proposal is at best premature.  The FDW
mechanisms are still very much a work in progress, and I will *not*
promise anybody that we aren't going to change those APIs repeatedly
before all the dust has settled.  As such, I think we'd better confine
the audience to those few souls brave enough to try to write actual
foreign data wrappers.  If we put up signs telling the average Joe to
use those APIs for SRFs, we're going to have a lot of pressure to
preserve the existing APIs, no matter how broken they turn out to be.

I'd also opine that the FDW APIs are pretty darn heavyweight for an SRF.
There might be a small number of SRFs for which it's actually worth
dealing with the planner in full generality, but surely not very many.
        regards, tom lane



Re: Arguments to foreign tables?

From
Tom Lane
Date:
Simon Riggs <simon@2ndQuadrant.com> writes:
> (On another thought: why do Foreign Scans get to avoid the
> materialization we hate with SRFs?)

[ shrug... ]  That works both ways.  If it would be convenient for an
FDW to return a tuplestore, which is hardly unlikely, it's out of luck;
it has to do the expansion itself.

The fact that nodeFunctionscan forces materialization is not by any
means a fundamental property.  It's just that nobody bothered to
implement two code paths there.
        regards, tom lane



Re: Arguments to foreign tables?

From
Jeff Davis
Date:
On Tue, 2012-11-06 at 08:35 +0000, Simon Riggs wrote:
> Tom is complaining about conflating the two features, which does seem
> unnecessary. But we can still merge code.

That's a good way to put it.

> Providing run-time parameters or table-level options to Foreign Scans
> seems sensible. I can see why you'd want to pass through parameters
> like "work_mem" to the foreign scan and yet not have those as
> permanent table options.

Right, if you want to set the foreign work_mem differently than the
local work_mem.

> (On another thought: why do Foreign Scans get to avoid the
> materialization we hate with SRFs?)

I don't have an authoritative answer for this, but I *think* it's
because you can specify the begin/iterate/end for the FDW, which allows
you to do proper initialization and teardown regardless of whether the
function runs to completion. Also, you can specify the rescan, which
means that it's up to the FDW author to make sure they produce the same
results again (an SRF isn't required to because it can just output the
materialized results again).

There's probably more work to be done here in FDWs for a few remaining
edge cases, like if the results can't be reproduced on the foreign side
(e.g. twitter feed or something) and parameters are changing.

Regards,Jeff Davis





Re: Arguments to foreign tables?

From
Jeff Davis
Date:
On Tue, 2012-11-06 at 09:39 +0100, Dimitri Fontaine wrote:
> Jeff Davis <pgsql@j-davis.com> writes:
> > Take something as simple as generate_series: right now, it materializes
> > the entire thing if it's in the FROM clause, but it wouldn't need to if
> > it could use the foreign table mechanism.
> 
> So, my understanding of your proposal is that a good way to implement
> streaming SRF would be on top of the internals of Foreign Data Wrappers?

Yes, I was talking about combining the mechanisms and combining the
capabilities of each mechanism. That would hopefully be a net decrease
in code, and offer the ability of FDWs to be strictly more powerful than
SRFs (though perhaps more challenging to develop).

Regards,Jeff Davis





Re: Arguments to foreign tables?

From
Jeff Davis
Date:
On Tue, 2012-11-06 at 09:19 +0100, Ronan Dunklau wrote:
> Maybe you could set some options on the foreign table before selecting
> from it ?
> Another way you could achieve the same result would be to give some
> column a special meaning (like it is done in the twitter_fdw for
> example).
> 
> 
> If you don't mind, do you have a specific use-case for this ?
> 
Let's say that your remote data source is a stream of data that is not
actually being stored anywhere, e.g. network events. The data you want
to retrieve are all events with a timestamp less than X, and you assume
that the timestamp is monotonically increasing (so as soon as you get to
X, the read is finished). 

An SRF isn't good enough because it always materializes (and that's the
only way it allows you to control initialization and teardown of the
stream connection). But you don't want to have to define a new foreign
table each time. It would be better if there were a way to pass the
argument X to the FDW mechanism.

Regards,Jeff Davis





Re: Arguments to foreign tables?

From
Pavel Stehule
Date:
2012/11/6 Jeff Davis <pgsql@j-davis.com>:
> On Tue, 2012-11-06 at 09:19 +0100, Ronan Dunklau wrote:
>> Maybe you could set some options on the foreign table before selecting
>> from it ?
>> Another way you could achieve the same result would be to give some
>> column a special meaning (like it is done in the twitter_fdw for
>> example).
>>
>>
>> If you don't mind, do you have a specific use-case for this ?
>>
> Let's say that your remote data source is a stream of data that is not
> actually being stored anywhere, e.g. network events. The data you want
> to retrieve are all events with a timestamp less than X, and you assume
> that the timestamp is monotonically increasing (so as soon as you get to
> X, the read is finished).
>
> An SRF isn't good enough because it always materializes (and that's the
> only way it allows you to control initialization and teardown of the
> stream connection). But you don't want to have to define a new foreign
> table each time. It would be better if there were a way to pass the
> argument X to the FDW mechanism.

An SRF must not materialize always - depends on implementation

Regards

Pavel

>
> Regards,
>         Jeff Davis
>
>
>
>
> --
> Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-hackers



Re: Arguments to foreign tables?

From
Jeff Davis
Date:
On Tue, 2012-11-06 at 10:55 -0500, Tom Lane wrote:
> I'd also opine that the FDW APIs are pretty darn heavyweight for an SRF.
> There might be a small number of SRFs for which it's actually worth
> dealing with the planner in full generality, but surely not very many.

I was not suggesting that we remove the user-facing simplicity of SRFs.

Granted, this is not very well thought-through from an API standpoint.
But I was imagining adding something to the FDW API that allows for
arguments to be passed from the FROM clause down the the FDW code.

Regards,Jeff Davis




Re: Arguments to foreign tables?

From
Tom Lane
Date:
Jeff Davis <pgsql@j-davis.com> writes:
> On Tue, 2012-11-06 at 10:55 -0500, Tom Lane wrote:
>> I'd also opine that the FDW APIs are pretty darn heavyweight for an SRF.
>> There might be a small number of SRFs for which it's actually worth
>> dealing with the planner in full generality, but surely not very many.

> I was not suggesting that we remove the user-facing simplicity of SRFs.

> Granted, this is not very well thought-through from an API standpoint.
> But I was imagining adding something to the FDW API that allows for
> arguments to be passed from the FROM clause down the the FDW code.

Well, that's not terribly exciting.  An FDW can already look at the RTE
for the relation, which would contain whatever information is available
about parameters-in-FROM.  But the real problem is how do you get into
the FDW code path to start with, when the FROM item looks like a
function call?
        regards, tom lane



Re: Arguments to foreign tables?

From
Tom Lane
Date:
Jeff Davis <pgsql@j-davis.com> writes:
> On Tue, 2012-11-06 at 09:19 +0100, Ronan Dunklau wrote:
>> If you don't mind, do you have a specific use-case for this ?

> Let's say that your remote data source is a stream of data that is not
> actually being stored anywhere, e.g. network events. The data you want
> to retrieve are all events with a timestamp less than X, and you assume
> that the timestamp is monotonically increasing (so as soon as you get to
> X, the read is finished). 

> An SRF isn't good enough because it always materializes (and that's the
> only way it allows you to control initialization and teardown of the
> stream connection). But you don't want to have to define a new foreign
> table each time. It would be better if there were a way to pass the
> argument X to the FDW mechanism.

That particular example can be handled perfectly well today, with
select * from stream_table where tscol < 'whatever';

The FDW could be coded to throw an error if the query doesn't provide a
WHERE clause that constrains the timestamp column suitably.  It does
mean that you have to expose the "argument" as a result column, but at
least for this use-case that hardly seems like a problem.
        regards, tom lane



Re: Arguments to foreign tables?

From
Jeff Davis
Date:
On Tue, 2012-11-06 at 12:57 -0500, Tom Lane wrote:
> That particular example can be handled perfectly well today, with
> 
>     select * from stream_table where tscol < 'whatever';
> 
> The FDW could be coded to throw an error if the query doesn't provide a
> WHERE clause that constrains the timestamp column suitably.  It does
> mean that you have to expose the "argument" as a result column, but at
> least for this use-case that hardly seems like a problem.

Oh, interesting idea to require a WHERE clause that way. I guess, in
theory, you could have an operator that always returns TRUE, and pass
everything down in WHERE clauses that way.

I think there is a case for actual arguments though: remote settings for
GUCs (as Simon pointed out); maybe the host to get the data from if it's
available from several locations; and if we support defining the schema
on the fly (like SRFs that return RECORD) then you could pass column
lists.

Regards,Jeff Davis