Thread: Dry run through input function for a given built-in data type

Dry run through input function for a given built-in data type

From
Fabio Ugo Venchiarutti
Date:
Hello


Our business is writing a small batch insert engine with constraint
aware validation (pg_version(): PostgreSQL 9.2.10 on
x86_64-unknown-linux-gnu, compiled by gcc (GCC) 4.4.7 20120313 (Red Hat
4.4.7-11), 64-bit)


The the goal is to run abstracted multi-line INSERT/UPDATES and return a
result set detailing what was successfully inserted and what
domain/constraint violations would have happened if an attempt were to
be made at inserting those bad rows.

Everything happens in a single big auto generated CTE cascade to
mitigate the risk of race conditions (serializable isolation on top of
that would be our last resort).

Metadata is retrieved from pg_catalog, custom table and domain CHECK
expression are nicely turned into boolean compliance states, EXISTS()
calls do all the required look-aside for foreign keys existence and so
on. Everything is working as intended so far.

Ironically, I'm hitting a wall when it comes to native types. The only
ways I've found to reliably create the validation SQL expression is to
either hard code it or attempt a cast in a PL/PGSQL function and catch
the exception (much slower).


Is there any cleaner way to, say, only run the validation part of a type
input function to infer domain compatibility? Or maybe is there a
built-in dictionary of regular expressions? Or else?


Many thanks guys


F


Re: Dry run through input function for a given built-in data type

From
"David G. Johnston"
Date:
On Tue, May 12, 2015 at 11:23 PM, Fabio Ugo Venchiarutti <fabio@vuole.me> wrote:
Is there any cleaner way to, say, only run the validation part of a type input function
​ [...]​

​This pre-supposes that said type input function has a distinct validation phase as opposed to simply performing its parse and failing when it encounters something it cannot handle.

​I would suggest you attempt to separate the non-context-aware stuff and the context-aware stuff into two separate phases; and only data that passes the non-context-aware tests would then be bundled up into the full CTE where you then check constraints and the like.​  The how is beyond me but this is an open source project so you have access to all of the relevant functions.  How you would go about using them without having to invoke the engine I do not know but that seems like the only performant option since the PostgreSQL executor isn't designed to handle your usage pattern.

Without considerably more understanding of the how suggestions are difficult to make - even were I to know the innards of PostgreSQL and using C.  Others more knowledgeable will likely chime in but I suspect that I'm pretty close to the mark on this one.

Yes, you could make a regexp-base validation library...but that screams "maintenance nightmare" to me.  I'm doubtful such a thing already exists.

David J.