Re: TABLESAMPLE patch is really in pretty sad shape - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: TABLESAMPLE patch is really in pretty sad shape |
Date | |
Msg-id | 5438.1436740611@sss.pgh.pa.us Whole thread Raw |
In response to | TABLESAMPLE patch is really in pretty sad shape (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: TABLESAMPLE patch is really in pretty sad shape
Re: TABLESAMPLE patch is really in pretty sad shape |
List | pgsql-hackers |
I wrote: > The two contrib modules this patch added are nowhere near fit for public > consumption. They cannot clean up after themselves when dropped: > ... > Raw inserts into system catalogs just > aren't a sane thing to do in extensions. I had some thoughts about how we might fix that, without going to the rather tedious lengths of creating a complete set of DDL infrastructure for CREATE/DROP TABLESAMPLE METHOD. First off, the extension API designed for the tablesample patch is evidently modeled on the index AM API, which was not a particularly good precedent --- it's not a coincidence that index AMs can't be added or dropped on-the-fly. Modeling a server internal API as a set of SQL-visible functions is problematic when the call signatures of those functions can't be accurately described by SQL datatypes, and it's rather pointless and inefficient when none of the functions in question is meant to be SQL-callable. It's even less attractive if you don't think you've got a completely stable API spec, because adding, dropping, or changing signature of any one of the API functions then involves a pile of easy-to-mess-up catalog changes on top of the actually useful work. Not to mention then having to think about backwards compatibility of your CREATE command's arguments. We have a far better model to follow already, namely the foreign data wrapper API. In that, there's a single SQL-visible function that returns a dummy datatype indicating that it's an FDW handler, and when called, it hands back a struct containing pointers to all the other functions that the particular wrapper needs to supply (and, if necessary, the struct could have non-function-pointer fields containing other info the core system might need to know about the handler). We could similarly invent a pseudotype "tablesample_handler" and represent each tablesample method by a single SQL function returning tablesample_handler. Any future changes in the API for tablesample handlers would then appear as changes in the C definition of the struct returned by the handler, which requires no SQL-visible thrashing, hence creates no headaches for pg_upgrade, and it is pretty easy to design it so that failure to update an external module that implements the API results in C compiler errors and/or sane fallback behavior. Once we've done that, I think we don't even need a special catalog representing tablesample methods. Given "FROM foo TABLESAMPLE bernoulli(...)", the parser could just look for a function bernoulli() returning tablesample_handler, and it's done. The querytree would have an ordinary function dependency on that function, which would be sufficient to handle DROP dependency behaviors properly. (On reflection, maybe better if it's "bernoulli(internal) returns tablesample_handler", so as to guarantee that such a function doesn't create a conflict with any user-defined function of the same name.) Thoughts? regards, tom lane PS: now that I've written this rant, I wonder why we don't redesign the index AM API along the same lines. It probably doesn't matter much at the moment, but if we ever get serious about supporting index AM extensions, I think we ought to consider doing that.
pgsql-hackers by date: