Thread: FDW: should GetFdwRoutine be called when drop table?

FDW: should GetFdwRoutine be called when drop table?

From
Feng Tian
Date:
Hi, Hackers,

I have an FDW in which each foreign table acquires some persistent resource,
in my case some files in the file system.  To drop the table cleanly, I have written
an object_access_hook that removes those files.  The hook is installed in _PG_init.

It all worked well except in one case.  Suppose a user logs in and the very first command is
DROP FOREIGN TABLE.  DROP FOREIGN TABLE does not load the module, so the hook
is not installed and the files are not cleaned up.

Should DROP FOREIGN TABLE call GetFdwRoutine?  _PG_init is the only entry point
that I know of for registering hooks, so I feel we need to trigger a load for all DML/DDL on
an FDW, including DROP.  Does this make sense?

Thanks,
Feng
   


Re: FDW: should GetFdwRoutine be called when drop table?

From
Peter Eisentraut
Date:
On 2/19/16 12:21 PM, Feng Tian wrote:
> I have an fdw that each foreign table will acquire some persisted resource.
> In my case, some files in file system.   To drop the table cleanly, I
> have written
> an object_access_hook that remove those files.  The hook is installed in
> _PG_init.  
> 
> It all worked well except one case. Suppose a user login, the very first
> command is 
> drop foreign table.  Drop foreign table will not load the module, so
> that the hook 
> is not installed and the files are not properly cleaned up.

You could load your library with one of the *_preload_libraries settings
to make sure the hook is always present.

But foreign data wrappers are meant to be wrappers around data managed
elsewhere, not their own storage managers (although that is clearly
tempting), so there might well be other places where this breaks down.
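
[Editor's sketch] The preload suggestion above could look roughly like the following postgresql.conf fragment. The library name my_fdw is a placeholder, not something named in this thread, and which of the settings fits best depends on the deployment.

```
# postgresql.conf -- "my_fdw" is a hypothetical library name
# Load the library at the start of every new session, so _PG_init runs
# (and installs the hook) before the session's first command.
session_preload_libraries = 'my_fdw'

# Alternative: load once at server start; changing it requires a restart.
# shared_preload_libraries = 'my_fdw'
```

This only ensures the hook is installed; it does not address the transactional problem raised later in the thread.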




Re: FDW: should GetFdwRoutine be called when drop table?

From
Andres Freund
Date:
On 2016-02-19 14:18:19 -0500, Peter Eisentraut wrote:
> On 2/19/16 12:21 PM, Feng Tian wrote:
> > I have an fdw that each foreign table will acquire some persisted resource.
> > In my case, some files in file system.   To drop the table cleanly, I
> > have written
> > an object_access_hook that remove those files.  The hook is installed in
> > _PG_init.  
> > 
> > It all worked well except one case. Suppose a user login, the very first
> > command is 
> > drop foreign table.  Drop foreign table will not load the module, so
> > that the hook 
> > is not installed and the files are not properly cleaned up.
> 
> You could load your library with one of the *_library_preload settings
> to make sure the hook is always present.
> 
> But foreign data wrappers are meant to be wrappers around data managed
> elsewhere, not their own storage managers (although that is clearly
> tempting), so there might well be other places where this breaks down.

Sounds like even a BEGIN;DROP TABLE foo;ROLLBACK; will break this
approach.

Andres



Re: FDW: should GetFdwRoutine be called when drop table?

From
Tom Lane
Date:
Andres Freund <andres@anarazel.de> writes:
> On 2016-02-19 14:18:19 -0500, Peter Eisentraut wrote:
>> On 2/19/16 12:21 PM, Feng Tian wrote:
>>> I have an fdw that each foreign table will acquire some persisted resource.

>> But foreign data wrappers are meant to be wrappers around data managed
>> elsewhere, not their own storage managers (although that is clearly
>> tempting), so there might well be other places where this breaks down.

> Sounds like even a BEGIN;DROP TABLE foo;ROLLBACK; will break this
> approach.

Yes, that's exactly the problem: you'd need some sort of atomic commit
mechanism to make this work safely.

It's possible we could give FDWs a bunch of hooks that would let them
manage post-commit cleanup the same way smgr does, but it's a far larger
project than it might have seemed.
        regards, tom lane
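
[Editor's sketch] To illustrate the rollback problem and the "post-commit cleanup" idea, here is a toy, self-contained C model. It is not PostgreSQL code: PendingDeletes, schedule_delete, at_commit, and at_abort are hypothetical names invented for this example. The point is only that a drop hook should record the files to remove, and the actual removal should wait until the transaction is known to have committed.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

#define MAX_PENDING  16
#define MAX_PATH_LEN 256

/* Hypothetical per-transaction list of files owned by dropped tables. */
typedef struct PendingDeletes
{
    char paths[MAX_PENDING][MAX_PATH_LEN];
    int  count;
} PendingDeletes;

/* Called where the drop hook would fire: remember the file, do not touch it. */
bool schedule_delete(PendingDeletes *pd, const char *path)
{
    assert(pd != NULL && path != NULL);
    if (pd->count >= MAX_PENDING || strlen(path) >= MAX_PATH_LEN)
        return false;
    strcpy(pd->paths[pd->count++], path);
    return true;
}

/* Commit: the DROP is now permanent, so remove the files.
 * Returns the number of files actually removed. */
int at_commit(PendingDeletes *pd)
{
    int removed = 0;

    for (int i = 0; i < pd->count; i++)
        if (remove(pd->paths[i]) == 0)
            removed++;
    pd->count = 0;
    return removed;
}

/* Abort (ROLLBACK): the table still exists, so forget the list and keep the files. */
void at_abort(PendingDeletes *pd)
{
    pd->count = 0;
}

/* Helper for checking the outcome. */
bool file_exists(const char *path)
{
    FILE *f = fopen(path, "r");

    if (f == NULL)
        return false;
    fclose(f);
    return true;
}
```

With this shape, BEGIN; DROP ...; ROLLBACK; maps to schedule_delete followed by at_abort, and the file survives. A hook that unlinks immediately would have destroyed it. Even this model is not fully safe: a crash between commit and the removal would leave orphaned files, which is part of why the real mechanism is a larger project.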



Re: FDW: should GetFdwRoutine be called when drop table?

From
Robert Haas
Date:
On Sat, Feb 20, 2016 at 1:43 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andres Freund <andres@anarazel.de> writes:
>> On 2016-02-19 14:18:19 -0500, Peter Eisentraut wrote:
>>> On 2/19/16 12:21 PM, Feng Tian wrote:
>>>> I have an fdw that each foreign table will acquire some persisted resource.
>
>>> But foreign data wrappers are meant to be wrappers around data managed
>>> elsewhere, not their own storage managers (although that is clearly
>>> tempting), so there might well be other places where this breaks down.
>
>> Sounds like even a BEGIN;DROP TABLE foo;ROLLBACK; will break this
>> approach.
>
> Yes, that's exactly the problem: you'd need some sort of atomic commit
> mechanism to make this work safely.
>
> It's possible we could give FDWs a bunch of hooks that would let them
> manage post-commit cleanup the same way smgr does, but it's a far larger
> project than it might have seemed.

I've been thinking about the idea of letting foreign data wrappers
have either (a) a relfilenode that is not zero, representing local
storage; or perhaps even (b) an array of relfilenodes.  The
relfilenode, or relfilenodes, would be automatically dropped.  It
seems like this would be handy for things like cstore_fdw or the
problem mentioned here, where you do want to manage local storage.  If
you then also had the generic XLOG patch, maybe you could make it
WAL-logged, too, if you wanted...

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



Re: FDW: should GetFdwRoutine be called when drop table?

From
Tom Lane
Date:
Robert Haas <robertmhaas@gmail.com> writes:
> On Sat, Feb 20, 2016 at 1:43 AM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Yes, that's exactly the problem: you'd need some sort of atomic commit
>> mechanism to make this work safely.
>> 
>> It's possible we could give FDWs a bunch of hooks that would let them
>> manage post-commit cleanup the same way smgr does, but it's a far larger
>> project than it might have seemed.

> I've been thinking about the idea of letting foreign data wrappers
> have either (a) a relfilenode that is not zero, representing local
> storage; or perhaps even (b) an array of relfilenodes.  The
> relfilenode, or relfilenodes, would be automatically dropped.  It
> seems like this would be handy for things like cstore_fdw or the
> problem mentioned here, where you do want to manage local storage.

Hmm, mumble.  This assumes that the "FDW" is willing to keep its data
in something that looks externally just like a Postgres heap file (as
opposed to, say, keeping it somewhere else in the filesystem).  That
pretty much gives up the notion that this is "foreign" data access and
instead means that you're trying to force-fit an alternate storage
manager into our FDW-shaped slot.  I doubt it will fit very well.
For one thing, we have never supposed that FDWs were 100% responsible
for managing the data they access, which is why they are not hooked up
to either DROP or a boatload of other maintenance activities like VACUUM,
CLUSTER, REINDEX, etc.  Not to mention that an alternate storage manager
might have its own maintenance activities that don't really fit any of
those concepts.

> If you then also had the generic XLOG patch, maybe you could make it
> WAL-logged, too, if you wanted...

While I've not paid close attention, I had the idea that the "generic
XLOG" patches that have been discussed would still be restricted to
dealing with data that fits into Postgres-style pages (because, for
example, it would have to pass bufmgr's page sanity checks).  That's a
restriction that an alternate storage manager would likely not want.

My point remains that building anything actually useful in this space
is not a small task.
        regards, tom lane