Thread: [HACKERS] [PATCH] Call RelationDropStorage() for broader range of object drops.

[HACKERS] [PATCH] Call RelationDropStorage() for broader range of object drops.

From
Hadi Moshayedi
Date:
Motivation for this patch is that some FDWs (notably, cstore_fdw) try utilizing PostgreSQL internal storage. PostgreSQL assigns relfilenode's to foreign tables, but doesn't clean up storage for foreign tables when dropping tables. Therefore, in cstore_fdw we have to do some tricks to handle dropping objects that lead to dropping of cstore table properly.

As far as I can see in the code, the requirement for RelationDropStorage(rel) is having valid rel->rd_node.relNode, but it doesn't actually require the storage files to actually exist. We don't emit warning messages in mdunlinkfork() if the result of unlink() is ENOENT. So we can RelationDropStorage() regardless of storage files existing or not, given that the relation has valid relfilenode.

So I am suggesting to change the check at heap_drop_with_catalog() at src/backend/catalog/heap.c:

-    if (rel->rd_rel->relkind != RELKIND_VIEW &&
-        rel->rd_rel->relkind != RELKIND_COMPOSITE_TYPE &&
-        rel->rd_rel->relkind != RELKIND_FOREIGN_TABLE &&
-        rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
+    if (OidIsValid(rel->rd_node.relNode))
     {
         RelationDropStorage(rel);
     }

Any feedback on this?

Thanks,
Hadi
Attachment

Re: [HACKERS] [PATCH] Call RelationDropStorage() for broader range ofobject drops.

From
Michael Paquier
Date:
On Wed, Sep 13, 2017 at 2:40 AM, Hadi Moshayedi <hadi@citusdata.com> wrote:
> Motivation for this patch is that some FDWs (notably, cstore_fdw) try
> utilizing PostgreSQL internal storage. PostgreSQL assigns relfilenode's to
> foreign tables, but doesn't clean up storage for foreign tables when
> dropping tables. Therefore, in cstore_fdw we have to do some tricks to
> handle dropping objects that lead to dropping of cstore table properly.

Foreign tables do not have physical storage assigned to by default. At
least heap_create() tells so, create_storage being set to false for a
foreign table. So there is nothing to clean up normally. Or is
cstore_fdw using directly heap_create with its own relfilenode set,
creating a physical storage?

> So I am suggesting to change the check at heap_drop_with_catalog() at
> src/backend/catalog/heap.c:
>
> -    if (rel->rd_rel->relkind != RELKIND_VIEW &&
> -        rel->rd_rel->relkind != RELKIND_COMPOSITE_TYPE &&
> -        rel->rd_rel->relkind != RELKIND_FOREIGN_TABLE &&
> -        rel->rd_rel->relkind != RELKIND_PARTITIONED_TABLE)
> +    if (OidIsValid(rel->rd_node.relNode))
>      {
>          RelationDropStorage(rel);
>      }
>
> Any feedback on this?

I agree that there is an inconsistency here if a module calls
heap_create() with an enforced relfilenode.
-- 
Michael


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Call RelationDropStorage() for broader range ofobject drops.

From
Hadi Moshayedi
Date:
On Wed, Sep 13, 2017 at 12:12 AM, Michael Paquier <michael.paquier@gmail.com> wrote: 
Foreign tables do not have physical storage assigned to by default. At
least heap_create() tells so, create_storage being set to false for a
foreign table. So there is nothing to clean up normally. Or is
cstore_fdw using directly heap_create with its own relfilenode set,
creating a physical storage?

cstore_fdw (in store_data_in_internal_storage branch) calls RelationCreateStorage() after CREATE FOREIGN TABLE completes [1]. Later it also creates the FSM fork and uses it for storing some metadata.


Re: [HACKERS] [PATCH] Call RelationDropStorage() for broader range ofobject drops.

From
Hadi Moshayedi
Date:
To provide more context, in cstore_fdw creating the storage is easy, we only need to hook into CREATE FOREIGN TABLE using event triggers. Removing the storage is not that easy, for DROP FOREIGN TABLE we can use event triggers. But when we do DROP EXTENSION, the event triggers don't get fired (because they have already been dropped), so to handle DROP EXTENSION, we need to hook into the process utility hook. Now to implement this, (1) we get a list of all cstore tables (2) call postgres's utility hook, (3) if #2 succeeds clean-up all cstore table storage's. But when #3 happens the relation isn't there anymore, so we create a pseudo-relation [1] and call RelationDropStorage().

Implementing all of this seemed messy, so we thought maybe postgres could try to clean-up storage for every relation it assigns a relfilenode for. We are open to ideas.


Re: [HACKERS] [PATCH] Call RelationDropStorage() for broader rangeof object drops.

From
Alvaro Herrera
Date:
Hadi Moshayedi wrote:
> To provide more context, in cstore_fdw creating the storage is easy, we
> only need to hook into CREATE FOREIGN TABLE using event triggers. Removing
> the storage is not that easy, for DROP FOREIGN TABLE we can use event
> triggers.

This all sounds a little more complicated than it should ... but since
FDW are not really IMO the right interface to be implementing a
different storage format, I'm not terribly on board with supporting this
more completely.  So what you're doing now is probably acceptable.

> But when we do DROP EXTENSION, the event triggers don't get fired
> (because they have already been dropped),

This however sounds like a silly design bug ... surely the event
triggers should have been fired when the table is dropped, before
dropping the event triggers themselves.  I can't offhand think of any
good way to fix that, however.

> so to handle DROP EXTENSION, we need to hook into the process utility
> hook. Now to implement this, (1) we get a list of all cstore tables
> (2) call postgres's utility hook, (3) if #2 succeeds clean-up all
> cstore table storage's. But when #3 happens the relation isn't there
> anymore, so we create a pseudo-relation [1] and call
> RelationDropStorage().

This sounds terrible.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Call RelationDropStorage() for broader rangeof object drops.

From
Andres Freund
Date:
On 2017-09-14 12:05:37 +0200, Alvaro Herrera wrote:
> This sounds terrible.

Welcome to the life of writing extensions.


-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Re: [HACKERS] [PATCH] Call RelationDropStorage() for broader range ofobject drops.

From
Alexander Korotkov
Date:
On Thu, Sep 14, 2017 at 1:05 PM, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
Hadi Moshayedi wrote:
> To provide more context, in cstore_fdw creating the storage is easy, we
> only need to hook into CREATE FOREIGN TABLE using event triggers. Removing
> the storage is not that easy, for DROP FOREIGN TABLE we can use event
> triggers.

This all sounds a little more complicated than it should ... but since
FDW are not really IMO the right interface to be implementing a
different storage format, I'm not terribly on board with supporting this
more completely.  So what you're doing now is probably acceptable.

+1,
FDW looks OK for prototyping pluggable storage, but it doesn't seem suitable for permanent implementation.
BTW, Hadi, could you visit "Pluggable storage" thread and check how suitable upcoming pluggable storage API is for cstore?

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company 

Re: [HACKERS] [PATCH] Call RelationDropStorage() for broader range ofobject drops.

From
Michael Paquier
Date:
On Fri, Sep 15, 2017 at 4:36 AM, Alexander Korotkov
<a.korotkov@postgrespro.ru> wrote:
> +1,
> FDW looks OK for prototyping pluggable storage, but it doesn't seem suitable
> for permanent implementation.
> BTW, Hadi, could you visit "Pluggable storage" thread and check how suitable
> upcoming pluggable storage API is for cstore?

I cannot make a clear opinion about this patch. Not changing the
situation, or changing it have both downsides and upsides. The
suggestion from Alexander is surely something to look at. I am bumping
this to next CF for now..
-- 
Michael


On Wed, Nov 29, 2017 at 10:33 PM, Michael Paquier
<michael.paquier@gmail.com> wrote:
> On Fri, Sep 15, 2017 at 4:36 AM, Alexander Korotkov
> <a.korotkov@postgrespro.ru> wrote:
>> +1,
>> FDW looks OK for prototyping pluggable storage, but it doesn't seem suitable
>> for permanent implementation.
>> BTW, Hadi, could you visit "Pluggable storage" thread and check how suitable
>> upcoming pluggable storage API is for cstore?
>
> I cannot make a clear opinion about this patch. Not changing the
> situation, or changing it have both downsides and upsides. The
> suggestion from Alexander is surely something to look at. I am bumping
> this to next CF for now..

Well, I think there's no question that this patch is not really the
right way of solving the problem; the right way is pluggable storage.
What remains to be decided is whether we should adopt this approach to
make life easier for extension authors between now and then.  I don't
have a big problem with adopting this approach *provided that* we're
confident that it won't ever put us at risk of removing the wrong
files.  For example, if it were possible for rel->rd_node.relNode,
which we expect to be pointing at nothing in the case of a foreign
table, to instead be pointing at somebody else's relfilenode, this
patch would be bad news.

And I'm worried that might actually be a problem, because the only
hard guarantee GetNewRelFileNode() provides is that the file doesn't
exist on disk.  If the pg_class argument to that function is passed as
non-NULL, we additionally guarantee that the OID is not in use in
pg_class.oid, but not all callers do that, and it anyway doesn't
guarantee that the OID is not in use in pg_class.relfilenode.  So I
think it might be possible for a table that gets rewritten by CLUSTER
or VACUUM FULL to end up with the same relfilenode that was previously
assigned to a foreign table; dropping the foreign table would then
nuke the other relation's storage.  This probably wouldn't be a
problem in Citus's case, because they would create a file for the
FDW's relfilenode and that would prevent it from getting used -- but
it would be a problem for postgres_fdw or any other FDW which is doing
the expected thing of not creating files on disk.

Unless somebody can prove convincingly that this argument is wrong and
that there are no other possible problems either, and memorialize that
argument in the form of a detailed comment, I think we should reject
this patch.  http://postgr.es/m/CA+Tgmoa7TJA6-MvjWdcb_bouwFKx9apnU+Ok9m94TkTZTYmqjw@mail.gmail.com
from earlier this morning is good evidence for the proposition that we
must be careful to document the reasons for the changes we make, even
seemingly minor ones, if we don't want developers to be guessing in
ten years whether those changes were actually safe and correct.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Greetings,

* Robert Haas (robertmhaas@gmail.com) wrote:
> Unless somebody can prove convincingly that this argument is wrong and
> that there are no other possible problems either, and memorialize that
> argument in the form of a detailed comment, I think we should reject
> this patch.  http://postgr.es/m/CA+Tgmoa7TJA6-MvjWdcb_bouwFKx9apnU+Ok9m94TkTZTYmqjw@mail.gmail.com
> from earlier this morning is good evidence for the proposition that we
> must be careful to document the reasons for the changes we make, even
> seemingly minor ones, if we don't want developers to be guessing in
> ten years whether those changes were actually safe and correct.

Based on Robert's feedback (which, given his comments, I agree with),
I'm going to mark this as rejected.  The approach for dealing with this
seems to be what Alvaro was getting at above where we should have a way
for an extension to get control to handle cleaning things up during
DROP EXTENSION, or call the appropriate event triggers before we drop
the functions, or, well, something else, but certainly not by just
making this change.

Thanks!

Stephen

Attachment