Thread: Dry run through input function for a given built-in data type
Hello Our business is writing a small batch insert engine with constraint aware validation (pg_version(): PostgreSQL 9.2.10 on x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat 4.4.7-11), 64-bit) The the goal is to run abstracted multi-line INSERT/UPDATES and return a result set detailing what was successfully inserted and what domain/constraint violations would have happened if an attempt were to be made at inserting those bad rows. Everything happens in a single big auto generated CTE cascade to mitigate the risk of race conditions (serializable isolation on top of that would be our last resort). Metadata is retrieved from pg_catalog, custom table and domain CHECK expression are nicely turned into boolean compliance states, EXISTS() calls do all the required look-aside for foreign keys existence and so on. Everything is working as intended so far. Ironically, I'm hitting a wall when it comes to native types. The only ways I've found to reliably create the validation SQL expression is to either hard code it or attempt a cast in a PL/PGSQL function and catch the exception (much slower). Is there any cleaner way to, say, only run the validation part of a type input function to infer domain compatibility? Or maybe is there a built-in dictionary of regular expressions? Or else? Many thanks guys F
Is there any cleaner way to, say, only run the validation part of a type input function [...]
This pre-supposes that said type input function has a distinct validation phase as opposed to simply performing its parse and failing when it encounters something it cannot handle.
I would suggest you attempt to separate the non-context-aware stuff and the context-aware stuff into two separate phases; and only data that passes the non-context-aware tests would then be bundled up into the full CTE where you then check constraints and the like. The how is beyond me but this is an open source project so you have access to all of the relevant functions. How you would go about using them without having to invoke the engine I do not know but that seems like the only performant option since the PostgreSQL executor isn't designed to handle your usage pattern.
Without considerably more understanding of the how suggestions are difficult to make - even were I to know the innards of PostgreSQL and using C. Others more knowledgeable will likely chime in but I suspect that I'm pretty close to the mark on this one.
Yes, you could make a regexp-base validation library...but that screams "maintenance nightmare" to me. I'm doubtful such a thing already exists.
David J.