Re: plpgsql versus domains - Mailing list pgsql-hackers
From | Pavel Stehule |
---|---|
Subject | Re: plpgsql versus domains |
Date | |
Msg-id | CAFj8pRANCc=jmeNkFT64Z+1jL6RBtTHtpOSao3sCzqw-rgowLQ@mail.gmail.com Whole thread Raw |
In response to | plpgsql versus domains (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
Hi
2015-02-26 18:31 GMT+01:00 Tom Lane <tgl@sss.pgh.pa.us>:
+1
At the behest of Salesforce, I've been looking into improving plpgsql's
handling of variables of domain types, which is currently pretty awful.
It's really slow, because any assignment to a domain variable goes through
the slow double-I/O-conversion path in exec_cast_value, even if no actual
work is needed. And I noticed that the domain's constraints get looked up
only once per function per session; for example this script gets
unexpected results:
create domain di as int;
create function foo1(int) returns di as $$
declare d di;
begin
d := $1;
return d;
end
$$ language plpgsql immutable;
select foo1(0); -- call once to set up cache
alter domain di add constraint pos check (value > 0);
select 0::di; -- fails, as expected
select foo1(0); -- should fail, does not
\c -
select foo1(0); -- now it fails
The reason for this is that domain_in looks up the domain's constraints
and caches them under the calling FmgrInfo struct. That behavior was
designed to ensure we'd do just one constraint-lookup per query, which
I think is reasonable. But plpgsql sets up an FmgrInfo in the variable's
PLpgSQL_type record, which persists for the whole session unless the
function's parse tree is flushed for some reason. So the constraints
only get looked up once.
The rough sketch I have in mind for fixing this is:
1. Arrange to cache the constraints for domain types in typcache.c,
so as to reduce the number of trips to the actual catalogs. I think
typcache.c will have to flush these cache items upon seeing any sinval
change in pg_constraint, but that's still cheaper than searching
pg_constraint for every query.
2. In plpgsql, explicitly model a domain type as being its base type
plus some constraints --- that is, stop depending on domain_in() to
enforce the constraints, and do it ourselves. This would mean that
the FmgrInfo in a PLpgSQL_type struct would reference the base type's
input function rather than domain_in, so we'd not have to be afraid
to cache that. The constraints would be represented as a separate list,
which we'd arrange to fetch from the typcache once per transaction.
(I think checking for new constraints once per transaction is sufficient
for reasonable situations, though I suppose that could be argued about.)
The advantage of this approach is first that we could avoid an I/O
conversion when the incoming value to be assigned matches the domain's
base type, which is the typical case; and second that a domain without
any CHECK constraints would become essentially free, which is also a
common case.
I like this variant
There can be some good optimization with scalar types: text, varchars, numbers and reduce IO cast.
3. We could still have domains.c responsible for most of the details;
the domain_check() API may be good enough as-is, though it seems to lack
any simple way to force re-lookup of the domain constraints once per xact.
Thoughts, better ideas?
Given the lack of field complaints, I don't feel a need to try to
back-patch a fix for this, but I do want to see it fixed for 9.5.
+1
Regards
Pavel
regards, tom lane
--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers
pgsql-hackers by date: