Fixing domain input - Mailing list pgsql-hackers

From Tom Lane
Subject Fixing domain input
Date
Msg-id 28421.1120858543@sss.pgh.pa.us
Whole thread Raw
Responses Re: Fixing domain input
List pgsql-hackers
We've seen a couple of bug reports now about how domain constraints
aren't checked during input of a parameter that's been deduced to be
of a domain type, eg
http://archives.postgresql.org/pgsql-interfaces/2005-07/msg00009.php
http://archives.postgresql.org/pgsql-bugs/2005-07/msg00084.php
There's also the long-standing bugaboo that plpgsql doesn't enforce
domain constraints.

In the first of these threads, I suggested hacking the parameter type
resolution rules so that parameters wouldn't be assigned inferred types
that are domains, but only their base types.  However, that only fixes
things when the parameter type is inferred --- if it's specified as a
domain by the client, we'd still see the problem.  And it does nothing
for plpgsql.

It occurs to me that a cleaner solution would be to stop giving domain
types the same typinput routines as their base types.  Instead, give
them all a specialized routine domain_in (comparable to array_in) that
first invokes the base type's input function and then applies any
relevant constraint checks.  Likewise for typreceive (but we'd not need
to touch the output functions).  This has a number of attractions:

* Solves both cases of the domain-parameter problem.

* Since plpgsql does all type coercions by calling output and input
functions, I believe this would automatically fix the bugs in plpgsql.

* Allows us to eliminate special cases for domains in parse_coerce.c,
copy.c, possibly other places.

The main disadvantage of it is that for domains that have CHECK
constraints, it's necessary to set up an ExprContext in which the check
expressions can be evaluated; and in turn that requires an
ExecutorState, plus ExecInitExpr, etc.  So there's a pretty fair amount
of setup overhead involved, and doing that repeatedly in a series of
calls is not attractive from a performance standpoint.  (This may be why
we didn't do it that way originally, though I don't recall any more
whether it was even considered.)

We could eliminate this overhead in the case of COPY by adding an API
kluge that lets domain_in() detect whether it's being called inside COPY
IN, and let it piggyback on COPY's EState, so that the setup overhead is
still only paid once per COPY command.

In other scenarios such as plpgsql I'm not sure we can afford to try to
amortize the setup across multiple calls --- plpgsql is pretty cavalier
about the context it calls things in, and I think we'd see huge memory
leaks if we didn't free the EState before returning from domain_in().
Still, a slow feature is better than silently failing to apply the
constraint, which is where we are now.

Thoughts?
        regards, tom lane


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Checkpoint cost, looks like it is WAL/CRC
Next
From: "Rodrigo Moreno"
Date:
Subject: RES: Pg_autovacuum on FreeBSD