Thread: CHECK that involves a function call behaves differently during bulk load?

I have run into a problem when using dump/restore to move an existing system to the current version 17.  The restore is failing a column constraint that works perfectly on the existing system, and also on the new system once the data is loaded, but fails during the "restore" loading via psql.  I first did the dump in COPY from STDIN mode, then when I encountered the issue switched to --inserts with no change.

First, here is the function defined for use in column constraints, defined thus:

    -- function is implicitly 'CALLED ON NULL INPUT' 
    -- since a NULL return value won't work as this function is applied

    CREATE FUNCTION public.valid_upc_ean(target character varying) RETURNS boolean
        LANGUAGE plpgsql
        AS $$
    DECLARE
        len integer;
        holder varchar;

    begin
        if target is null then
            return true;
        end if;
        len = length(trim(target));
        if  len = 12 then
            holder := target::upc;
            return true;
        elsif len = 13  then
            holder := target::ean13;
            return true;
        elsif len = 0 then
            return true;
        else
            return false;
        end if;
    exception when others then
        return false;
    end;
    $$;

It's used like this:

    CREATE TABLE public.inv_variant
    (
    id integer NOT NULL,
    prod_id integer NOT NULL,
    sku character varying(126) NOT NULL,
    upc character varying(14),
    ...
    CONSTRAINT var_upc_check CHECK (public.valid_upc_ean(upc))
    );

As mentioned about, the function fires and works perfectly once the data is loaded, but the bulk load gives this error:

    ERROR:  new row for relation "inv_variant" violates check constraint "var_upc_check"
    DETAIL:  Failing row contains (2264, 2170, BOX-1, 012345678912, ...

Every row with NULL in this column inserts successfully, every one with data fails in this way.  The data itself is valid -- I can copy the text starting with the '(' from the DETAIL: Failing row... message, paste it after "insert into inv_variant", and it executes perfectly--this in the very same psql where that exact set of insert data is failing on the redirected dump-file input.

I am quite at a loss about how to even go about troubleshooting this.  Since valid_upc_ean() is a pure function with no side effects or references to anything other than the table in question, and since it DOES get called on NULL input... is there a problem with what I'm expecting the tools to produce as far as dump/restore between versions, or is this exposing an actual problem or bug somewhere?  Note the entire database does have quite a few other CHECK constraints, but the others all involve only calculations on the row data, with no calls to user-defined functions.

Thanks!

--Kirk

Re: CHECK that involves a function call behaves differently during bulk load?

From
"David G. Johnston"
Date:
On Thursday, March 13, 2025, Kirk Parker <khp@equatoria.us> wrote:

    CREATE FUNCTION public.valid_upc_ean(target character varying) RETURNS boolean
        LANGUAGE plpgsql
        AS $$
    DECLARE
        len integer;
        holder varchar;

    begin
        if target is null then
            return true;
        end if;
        len = length(trim(target));
        if  len = 12 then
            holder := target::upc;
            return true;
        elsif len = 13  then
            holder := target::ean13;
            return true;
        elsif len = 0 then
            return true;
        else
            return false;
        end if;
    exception when others then
        return false;
    end;
    $$;

It's used like this:

    CREATE TABLE public.inv_variant
    (
    id integer NOT NULL,
    prod_id integer NOT NULL,
    sku character varying(126) NOT NULL,
    upc character varying(14),
    ...
    CONSTRAINT var_upc_check CHECK (public.valid_upc_ean(upc))
    );

As mentioned about, the function fires and works perfectly once the data is loaded, but the bulk load gives this error:

    ERROR:  new row for relation "inv_variant" violates check constraint "var_upc_check"
    DETAIL:  Failing row contains (2264, 2170, BOX-1, 012345678912, ...

Every row with NULL in this column inserts successfully, every one with data fails in this way.  The data itself is valid -- I can copy the text starting with the '(' from the DETAIL: Failing row... message, paste it after "insert into inv_variant", and it executes perfectly--this in the very same psql where that exact set of insert data is failing on the redirected dump-file input.

I am quite at a loss about how to even go about troubleshooting this. 

Not discarding useful error messages as you do in the exception block is a good start.



  Since valid_upc_ean() is a pure function with no side effects or references to anything other than the table in question,


And two non-core data types that lack schema qualifications.
 
David J.



On Thu, Mar 13, 2025 at 2:05 PM David G. Johnston <david.g.johnston@gmail.com> wrote:
On Thursday, March 13, 2025, Kirk Parker <khp@equatoria.us> wrote:

    CREATE FUNCTION public.valid_upc_ean(target character varying) RETURNS boolean
        LANGUAGE plpgsql
        AS $$
    DECLARE
        len integer;
        holder varchar;

    begin
        if target is null then
            return true;
        end if;
        len = length(trim(target));
        if  len = 12 then
            holder := target::upc;
            return true;
        elsif len = 13  then
            holder := target::ean13;
            return true;
        elsif len = 0 then
            return true;
        else
            return false;
        end if;
    exception when others then
        return false;
    end;
    $$;

It's used like this:

    CREATE TABLE public.inv_variant
    (
    id integer NOT NULL,
    prod_id integer NOT NULL,
    sku character varying(126) NOT NULL,
    upc character varying(14),
    ...
    CONSTRAINT var_upc_check CHECK (public.valid_upc_ean(upc))
    );

As mentioned about, the function fires and works perfectly once the data is loaded, but the bulk load gives this error:

    ERROR:  new row for relation "inv_variant" violates check constraint "var_upc_check"
    DETAIL:  Failing row contains (2264, 2170, BOX-1, 012345678912, ...

Every row with NULL in this column inserts successfully, every one with data fails in this way.  The data itself is valid -- I can copy the text starting with the '(' from the DETAIL: Failing row... message, paste it after "insert into inv_variant", and it executes perfectly--this in the very same psql where that exact set of insert data is failing on the redirected dump-file input.

I am quite at a loss about how to even go about troubleshooting this. 

Not discarding useful error messages as you do in the exception block is a good start.



  Since valid_upc_ean() is a pure function with no side effects or references to anything other than the table in question,


And two non-core data types that lack schema qualifications.
 
David J.


Schema-qualifying the casts to non-core types did the trick, thanks!

But it does raise two questions:

(1) why does it work with individual inserts sent via psql, but not with the redirected input from the dump? 
(2) everything is in the public schema so why wouldn't it find the type anyway?

Finally, is this maybe something to take up on the documentation list?  There are quite a few example code snippets in the  PL/pgSQL  chapter, and not a single instance of a schema-qualifying name among them.

Re: CHECK that involves a function call behaves differently during bulk load?

From
"David G. Johnston"
Date:
On Thu, Mar 13, 2025 at 3:33 PM Kirk Parker <khp@equatoria.us> wrote:

(1) why does it work with individual inserts sent via psql, but not with the redirected input from the dump? 
(2) everything is in the public schema so why wouldn't it find the type anyway?



System maintenance and actions use a secure search_path.

David J.

"David G. Johnston" <david.g.johnston@gmail.com> writes:
> On Thu, Mar 13, 2025 at 3:33 PM Kirk Parker <khp@equatoria.us> wrote:
>> (1) why does it work with individual inserts sent via psql, but not with
>> the redirected input from the dump?

> https://www.postgresql.org/docs/current/ddl-schemas.html#DDL-SCHEMAS-PATTERNS
> System maintenance and actions use a secure search_path.

Even more to the point, see the first compatibility entry at [1]:

    * Change functions to use a safe search_path during maintenance
      operations (Jeff Davis)

      This prevents maintenance operations (ANALYZE, CLUSTER, CREATE
      INDEX, CREATE MATERIALIZED VIEW, REFRESH MATERIALIZED VIEW,
      REINDEX, or VACUUM) from performing unsafe access. Functions
      used by expression indexes and materialized views that need to
      reference non-default schemas must specify a search path during
      function creation.

            regards, tom lane

[1] https://www.postgresql.org/docs/17/release-17.html#RELEASE-17-MIGRATION