Re: Creating a read/write virtual table? - Mailing list pgsql-general

From Ben Martin
Subject Re: Creating a read/write virtual table?
Date
Msg-id 1271065925.23273.139.camel@sam.localdomain
In response to Re: Creating a read/write virtual table?  (Craig Ringer <craig@postnewspapers.com.au>)
Responses Re: Creating a read/write virtual table?
List pgsql-general
On Mon, 2010-04-12 at 13:15 +0800, Craig Ringer wrote:
> On 12/04/10 11:34, Ben Martin wrote:
>
> >    In my SQLite vtable both of the above restrictions don't apply. To
> > mitigate (1) I was hopeful that there was a way to get the connection to
> > PG on the server to fork/setuid
>
> The pg backends are designed to run as one user. You might have issues
> setting it up to do otherwise.

I suspected this would be the case. With Apache it's a somewhat different
situation, being able to set the uid and then access the filesystem, but
having multiple users and processes accessing the PG data files would be
a much messier thing.

>
> Consider talking to your storage backend over a unix socket, pipe,
> tcp/ip socket, shared memory, or other suitable IPC mechanism from Pg.
> This also protects Pg from any issues caused by the storage backend
> doing things like spawning unexpected threads, corrupting the
> stack/heap, outright crashing, etc.

Hmm, this might be a good way to go anyway, just for the isolation. Actually,
fork/exec'ing ferrisls --xml and massaging that output into the return
value might be a fairly quick and robust initial implementation.

Thinking more broadly, being able to run "a program" and massage
XML/JSON/YAML output back into the return value of a custom function
might be a handy generic thing to have. I'll have a quick search around
to make sure such a thing doesn't already exist...
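
In the meantime, here's roughly what I'm picturing, sketched in PL/Python
rather than C (this assumes the plpythonu language is installed; the
ls_xml name, and the idea that ferrisls --xml emits one child element per
directory entry with a name attribute, are just my placeholders):

CREATE OR REPLACE FUNCTION ls_xml(path text)
RETURNS SETOF text AS $$
    import subprocess
    import xml.etree.ElementTree as ET
    # shell out to the external lister and capture its XML output
    out = subprocess.Popen(['ferrisls', '--xml', path],
                           stdout=subprocess.PIPE).communicate()[0]
    root = ET.fromstring(out)
    # one row per child element; the 'name' attribute is an assumption
    # about what ferrisls --xml actually emits
    return [child.get('name') for child in root]
$$ LANGUAGE plpythonu;

-- e.g. SELECT * FROM ls_xml('/tmp');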

>
> >    I think using a function restricts the results to being read only.
>
> If you're planning on writing a C-level UDF loaded via 'CREATE
> FUNCTION', then yes it'll have to return a materialized data set and
> there's no way to UPDATE/INSERT/DELETE against the function's results.
>
> > Perhaps having a second function that allows updates to be processed
> > separately.
>
> Yep, though that's clumsy, I expect it'd work, and honestly it's probably
> your best bet.

OK
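
To make that concrete, I imagine the split would look something like this
(a sketch only; external_ls and external_set_attr are hypothetical
functions):

-- Read side: a set-returning function that can appear in FROM and be
-- joined like any table (declared RETURNS SETOF record, hence the
-- column definition list).
SELECT f.name, f.size
  FROM external_ls('/srv/data') AS f(name text, size bigint)
 WHERE f.size > 1048576;

-- Write side: a separate, explicitly-called function instead of
-- UPDATE/INSERT against the virtual "table".
SELECT external_set_attr('/srv/data/report.txt', 'owner', 'ben');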

>
> >    Any pointers / suggestions on how to do this sort of thing with
> > postgresql would be wonderful. Please CC any such replies as I'm not
> > currently on list.
>
> AFAIK Pg's innards aren't currently very friendly toward extension with
> random user-provided table/database storage backends. Unlike
> (say) MySQL, there's no pluggable storage engine system.
>
> That means Pg's planner/executor can make a lot of decisions based on
> things it "knows" about how Pg's data storage works, speeding and
> simplifying it and letting it be much smarter. Pg can enforce strict
> rules about transactional scope, ACID, error handling, etc based on
> knowledge of the data storage's capabilities and expectations. On the
> other hand, it means it's hard to plug in your own storage system.

This all sounds very reasonable. Things also become a quagmire when the
database needs to ask the table implementation about virtual indexes,
their selectivity, and so on. Once again, that's not too much of a
problem for smallish data sets in SQLite, where a full virtual table scan
gives acceptable performance.

>
> After all, how would your external system handle something like this:
>
> BEGIN;
> INSERT INTO my_external_storage(a, b) VALUES (1,2);
> SAVEPOINT sp_a;
> INSERT INTO my_external_storage(a, b) VALUES (3,4);
> ROLLBACK TO SAVEPOINT sp_a;
> INSERT INTO my_external_storage(a, b) VALUES (5,6);
> COMMIT;
>
> How would it honour SERIALIZABLE isolation if it doesn't itself
> maintain MVCC snapshots and have some way of mapping Pg's xids to its
> internal versions? Etc.

This indeed gets very hairy. I have thought about trying to implement
transactions in the filesystem, but for some virtual filesystems it is
not really possible, especially when rollback itself could fail.

>
>
> Pg would have to be able to error out on many commands with "not
> supported by underlying storage system" or the like. Yet sometimes it
> won't even know that until it's well into executing something.
>
> SQLite has the advantage of simplicity. It doesn't have to worry about
> complicated concurrency rules, versioning, user-defined data type
> storage, etc etc etc. And MySQL was designed to handle pluggable
> backends, so it's prepared for storage backends not to be able to
> perform certain operations, and it makes some big sacrifices in terms of
> its broader capabilities to permit pluggable backends.
>
> I guess at this point I have to ask "why do you want to do this? What
> will you gain?"
>

A few things I do with the SQLite engine are analysing Apache log files
or other delimited files (like running awk with SQL as the query
language), and interacting with web services as tables, e.g. selecting
data from a Google spreadsheet and a local SQLite database in a single
query. As a fair amount of my use cases simply involve introducing and
joining data from other sources into queries, I think starting out with
just a record-set-returning function would be a good thing.

This neatly sidesteps the issue that the filesystem can't really do MVCC
or transactions properly, while still giving access to more diverse data
sources within a query for folks who want them.
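
As a concrete example of that record-set-returning approach, something
along these lines is what I'd start with (again a sketch in PL/Python,
assuming plpythonu; apache_log and the common-log-format regex are just
illustrative):

CREATE OR REPLACE FUNCTION apache_log(logfile text)
RETURNS TABLE(client text, ts text, request text, status int) AS $$
    import re
    # common log format: host ident authuser [date] "request" status bytes
    pat = re.compile(r'(\S+) \S+ \S+ \[([^\]]+)\] "([^"]*)" (\d+)')
    rows = []
    for line in open(logfile):
        m = pat.match(line)
        if m:
            rows.append((m.group(1), m.group(2), m.group(3), int(m.group(4))))
    return rows
$$ LANGUAGE plpythonu;

-- Joining/aggregating the external data like any other relation:
-- SELECT status, count(*)
--   FROM apache_log('/var/log/apache2/access.log')
--  GROUP BY status ORDER BY 2 DESC;

Of course the function runs inside the backend, so the log file has to be
readable by the postgres user, which circles back to the uid question
above.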


