extensions are hitting the ceiling - Mailing list pgsql-hackers

From Eric Hanson
Subject extensions are hitting the ceiling
Date
Msg-id CACA6kxjhMM0-Vcb+foF4+n+kHW4kCSzOuK5hSuKqoUzqdfZ9qw@mail.gmail.com
Whole thread Raw
Responses Re: extensions are hitting the ceiling
Re: extensions are hitting the ceiling
Re: extensions are hitting the ceiling
List pgsql-hackers
Hi folks,

After months and years of really trying to make EXTENSIONs meet the requirements of my machinations, I have come to the conclusion that either a) I am missing something or b) they are architecturally flawed.  Or possibly both.

Admittedly, I might be trying to push extensions beyond what the great elephant in the sky ever intended. The general bent here is to try to achieve a level of modular reusable components similar to those in "traditional" programming environments like pip, gem, npm, cpan, etc. Personally, I am trying to migrate as much of my dev stack as possible away from the filesystem and into the database. Files, especially code, configuration, templates, permissions, manifests and other development files, would be much happier in a database where they have constraints and an information model and can be queried!

Regardless, it would be really great to be able to install an extension, and have it cascade down to multiple other extensions, which in turn cascade down to more, and have everything just work. Clearly, this was considered in the extension architecture, but I'm running into some problems making it a reality.  So here they are.


#1: Dependencies

Let's say we have two extensions, A and B, both of which depend on a third extension C, let's just say C is hstore.  A and B are written by different developers, and both contain in their .control file the line

        requires = 'hstore'

When A is installed, if A creates a schema, it puts hstore in that schema. If not, hstore is already installed, it uses it in that location.  How does the extension know where to reference hstore?

Then, when B is installed, it checks to see if extension hstore is installed, sees that it is, and moves on.  What if it expects it in a different place than A does? The hstore extension can only be installed once, in a single schema, but if multiple extensions depend on it and look for it in different places, they are incompatible.

I have heard talk of a way to write extensions so that they dynamically reference the schema of their dependencies, but sure don't know how that would work if it's possible.  The @extschema@ variable references the *current* extension's schema, but not there is no dynamic variable to reference the schema of a dependency.

Also it is possible in theory to dynamically set search_path to contain every schema of every dependency in play and then just not specify a schema when you use something in a dependency.  But this ANDs together all the scopes of all the dependencies of an extension, introducing potential for collisions, and is generally kind of clunky.


#2:  Data in Extensions

Extensions that are just a collection of functions and types seem to be the norm.  Extensions can contain what the docs call "configuration" data, but rows are really second class citizens:  They aren't tracked with pg_catalog.pg_depend, they aren't deleted when the extension is dropped, etc.

Sometimes it would make sense for an extension to contain *only* data, or insert some rows in a table that the extension doesn't "own", but has as a dependency.  For example, a "webserver" extension might contain a "resource" table that serves up the content of resources in the table at a specified path. But then, another extension, say an app, might want to just list the webserver extension as a dependency, and insert a few resource rows into it.  This is really from what I can tell beyond the scope of what extensions are capable of.


#3 pg_dump and Extensions

Tables created by extensions are skipped by pg_dump unless they are flagged at create time with:

        pg_catalog.pg_extension_config_dump('my_table', 'where id < 20')

However, there's no way that I can tell to mix and match rows and tables across multiple extensions, so pg_dump can't keep track of multiple extensions that contain rows in the same table.


I'd like an extension framework that can contain data as first class citizens, and can gracefully handle a dependency chain and share dependencies.  I have some ideas for a better approach, but they are pretty radical.  I thought I would send this out and see what folks think.

Thanks,
Eric
--
http://aquameta.org/

pgsql-hackers by date:

Previous
From: Tatsuro Yamada
Date:
Subject: Re: [HACKERS] CLUSTER command progress monitor
Next
From: Kyotaro HORIGUCHI
Date:
Subject: Re: Problem with default partition pruning