Re: delta relations in AFTER triggers - Mailing list pgsql-hackers
| From | Thomas Munro |
|---|---|
| Subject | Re: delta relations in AFTER triggers |
| Msg-id | CAEepm=3NvS1MZqzJ5T-rh2xtshqEXwCRCL16Bwhksy1EY03g6A@mail.gmail.com |
| In response to | Re: delta relations in AFTER triggers (Kevin Grittner <kgrittn@gmail.com>) |
| List | pgsql-hackers |
On Sat, Nov 5, 2016 at 5:09 AM, Kevin Grittner <kgrittn@gmail.com> wrote:
> Hearing none, done.  Hopefully that makes what remains easier to
> review.

So, as of commit 8c48375e we have the standard syntax REFERENCING NEW/OLD TABLE AS, so that you can provide names for the transition relations that trigger functions can access; they are filled with before and after tuples as appropriate and made available to trigger code via the AfterTriggersData struct. So now we can write triggers in C and access those tuplestores. Great!
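For reference, here's a minimal sketch of the committed syntax (all names are made up; at this point only a trigger function written in C can actually read old_rows and new_rows):

    -- Throwaway table and a trivial trigger function, just to show where the
    -- REFERENCING clause goes; the function body doesn't touch the transition
    -- tables, since SQL-level access is what the remaining patches are about.
    CREATE TABLE audit_target (id int, val text);

    CREATE FUNCTION audit_update_func() RETURNS trigger
    LANGUAGE plpgsql AS $$ BEGIN RETURN NULL; END; $$;

    CREATE TRIGGER audit_update
        AFTER UPDATE ON audit_target
        REFERENCING OLD TABLE AS old_rows NEW TABLE AS new_rows
        FOR EACH STATEMENT
        EXECUTE PROCEDURE audit_update_func();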
I spent some time last week reviewing the remaining patches in the series. They've bit-rotted a bit and one is incomplete, but it's clear enough what's going on. First, we have transition-noapi-v7.diff, which introduces a new executor node "TuplestoreScan" and a named tuplestore registry mechanism called "Tsrcache". Then we have a choice of:

1. transition-tsr-v7.diff, which allows SPI client code to register tuplestores + associated metadata through a new SPI interface, and which passes Tsrcache objects to all the places they are needed.

2. transition-via-params-v7.diff, which allows SPI client code to provide a new parser hook to resolve references to the new relations, analogous to the way that plpgsql deals with variables referenced in SQL statements, and then does some juggling to get the tuplestore to the executor node via a parameter slot.

Both ways have attracted criticism: the first involves touching basically every core function that might eventually parse, plan or execute a query to make it accept a Tsrcache and pass it on, and the second involves a bunch of Rube Goldberg machine-like callback/variable/parameter code.

I spent some time investigating whether a third way would be viable: use ParamListInfo's setupParser hook and add an analogous one for the executor, so that there is no registry equivalent to Tsrcache, but also no dealing with param slots or (in plpgsql's case) new kinds of variables. Instead, there would just be two callbacks: one for asking the tuplestore provider for metadata (statistics, TupleDesc) at planning time, and another for asking for the tuplestore by name at execution time. One problem with this approach is that it requires using the SPI_*_with_paramlist interfaces, not the more commonly used arg-based versions, and requires using a ParamListInfo even when you otherwise have no reason to because you have no parameters. Also, dealing with callbacks to register your tuplestore supplier is a little clunky. More seriously, requiring providers of those callbacks to write code that directly manipulates ParserState and EState and calls addRangeTableEntryXXX seems like a modularity violation -- especially for PLs that are less integrated with core Postgres code than plpgsql. I got this more or less working, but didn't like it much and didn't think it would pass muster.

After going through that experience, I now agree with Kevin: a new SPI interface that lets PLs push a named tuplestore into the SPI connection to make it available to SQL seems like the simplest and tidiest way. I do have some feedback and suggestions though:

1. Naming: Using tuplestores in AfterTriggersData makes perfect sense to me, but I don't think it follows that the exact thing we should expose to the executor to allow them to be scanned is a "TuplestoreScan". We have other nodes that essentially scan a tuplestore, like WorkTableScan and Material, but they have names that tell you what they are for. This node is for scanning data that has either been conjured up or passed on by SPI client code and exposed to SQL queries. How about SpiRelationScan, SpiDataScan, SpiRelVarScan, ...? Also, Tsrcache is strangely named: it's not exactly a cache, it's more of a registry.

2. Scope of changes: If we're going to touch functions all over the source tree to get the Tsrcache where it needs to be for parsing and execution, then I wonder if we should think a bit more about where this is going. What if we had a thing called a QueryEnvironment, and we passed a pointer to that into all those places, and it could contain the named tuplestore registry? Then we wouldn't have to change all those interfaces again in the future if someone wants to allow more kinds of transient objects to be injected into the SQL namespace via SPI. For example, future uses could include transient functions (i.e. uncatalogued functions that are made available to a query by the SPI client using a callback, to support private and nested functions), or relations provided via SPI that emit tuples one at a time, or more sophisticated kinds of transient relations that support writes, indexes and constraints along the lines of those in SQL Server's T-SQL.

See attached patches:

* spi-relation-v1.patch, which provides (1) an SpiRelationScan executor node, (2) the SPI interface required to feed data to it, and (3) a new QueryEnvironment struct which is used to convey SpiRelations into the right bits of the planner and executor; this patch is based on fragments extracted from the -noapi and -tsr patches, modified as described above

* spi-relation-plpgsql-v1.patch, to teach plpgsql how to expose the new and old tables to SQL via the above

* spi-relation-plpython-v1.patch, ditto for plpython; this patch makes the OLD TABLE and NEW TABLE automatically available to any query you run via plpy.execute, and is intended to show that the code required to make this work for each PL is small, in support of Kevin's earlier argument (a more featureful patch would presumably also expose the contents of the tuplestores directly to Python code, and let Python code create arbitrary new tuplestores and expose those to SQL queries); a rough sketch of the intended usage is below
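Something along these lines is the kind of usage the plpython patch is aiming at (all names are illustrative rather than taken from the patch; it reuses the audit_target table from the sketch above):

    -- Hypothetical sketch only: an AFTER statement trigger in plpythonu whose
    -- queries, run via plpy.execute, can see the NEW TABLE under the name
    -- given in the REFERENCING clause.
    CREATE FUNCTION report_changes() RETURNS trigger
    LANGUAGE plpythonu AS $$
        rv = plpy.execute("SELECT count(*) AS n FROM new_rows")
        plpy.notice("statement changed %s row(s)" % rv[0]["n"])
    $$;

    CREATE TRIGGER report_changes_trigger
        AFTER UPDATE ON audit_target
        REFERENCING NEW TABLE AS new_rows
        FOR EACH STATEMENT
        EXECUTE PROCEDURE report_changes();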
Thoughts?

--
Thomas Munro
http://www.enterprisedb.com

Attachment