Re: plpgsql versus domains - Mailing list pgsql-hackers

From Pavel Stehule
Subject Re: plpgsql versus domains
Date
Msg-id CAFj8pRANCc=jmeNkFT64Z+1jL6RBtTHtpOSao3sCzqw-rgowLQ@mail.gmail.com
Whole thread Raw
In response to plpgsql versus domains  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi

2015-02-26 18:31 GMT+01:00 Tom Lane <tgl@sss.pgh.pa.us>:
At the behest of Salesforce, I've been looking into improving plpgsql's
handling of variables of domain types, which is currently pretty awful.
It's really slow, because any assignment to a domain variable goes through
the slow double-I/O-conversion path in exec_cast_value, even if no actual
work is needed.  And I noticed that the domain's constraints get looked up
only once per function per session; for example this script gets
unexpected results:

create domain di as int;

create function foo1(int) returns di as $$
declare d di;
begin
  d := $1;
  return d;
end
$$ language plpgsql immutable;

select foo1(0); -- call once to set up cache

alter domain di add constraint pos check (value > 0);

select 0::di;   -- fails, as expected

select foo1(0); -- should fail, does not

\c -

select foo1(0); -- now it fails

The reason for this is that domain_in looks up the domain's constraints
and caches them under the calling FmgrInfo struct.  That behavior was
designed to ensure we'd do just one constraint-lookup per query, which
I think is reasonable.  But plpgsql sets up an FmgrInfo in the variable's
PLpgSQL_type record, which persists for the whole session unless the
function's parse tree is flushed for some reason.  So the constraints
only get looked up once.

The rough sketch I have in mind for fixing this is:

1. Arrange to cache the constraints for domain types in typcache.c,
so as to reduce the number of trips to the actual catalogs.  I think
typcache.c will have to flush these cache items upon seeing any sinval
change in pg_constraint, but that's still cheaper than searching
pg_constraint for every query.

2. In plpgsql, explicitly model a domain type as being its base type
plus some constraints --- that is, stop depending on domain_in() to
enforce the constraints, and do it ourselves.  This would mean that
the FmgrInfo in a PLpgSQL_type struct would reference the base type's
input function rather than domain_in, so we'd not have to be afraid
to cache that.  The constraints would be represented as a separate list,
which we'd arrange to fetch from the typcache once per transaction.
(I think checking for new constraints once per transaction is sufficient
for reasonable situations, though I suppose that could be argued about.)
The advantage of this approach is first that we could avoid an I/O
conversion when the incoming value to be assigned matches the domain's
base type, which is the typical case; and second that a domain without
any CHECK constraints would become essentially free, which is also a
common case.

I like this variant

There can be some good optimization with  scalar types: text, varchars, numbers and reduce IO cast.

 

3. We could still have domains.c responsible for most of the details;
the domain_check() API may be good enough as-is, though it seems to lack
any simple way to force re-lookup of the domain constraints once per xact.

Thoughts, better ideas?

Given the lack of field complaints, I don't feel a need to try to
back-patch a fix for this, but I do want to see it fixed for 9.5.

+1

Regards

Pavel
 

                        regards, tom lane


--
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

pgsql-hackers by date:

Previous
From: Jim Nasby
Date:
Subject: Re: Partitioning WIP patch
Next
From: Jim Nasby
Date:
Subject: Re: Partitioning WIP patch