Re: FDW INSERT batching can change behavior - Mailing list pgsql-bugs

From Tomas Vondra
Subject Re: FDW INSERT batching can change behavior
Date
Msg-id ef565ecb-4807-491c-8eb7-577eab6522aa@vondra.me
Whole thread Raw
In response to FDW INSERT batching can change behavior  (git@jasonk.me)
List pgsql-bugs
On 8/13/24 06:57, Jason Kim wrote:
> On 2024-08-09T21:55:00+0200, Tomas Vondra wrote:
>> Yeah, we don't seem to check for this. I don't recall if it didn't occur
>> to me we could have DEFAULT on the foreign table.
>>
>> We could/should disable batching, but I'm not quite sure what exactly to
>> check. AFAIK this can happen only when there are default expressions on
>> the foreign table, so maybe that? Or maybe only when the DEFAULT calls a
>> volatile function?
> 
> I didn't bother checking, but if CHECK constraints can call volatile functions,
> that is another avenue for differing behavior.
> 

While we technically allow volatile functions in CHECK constraints, the
documentation [1] says:

  PostgreSQL assumes that CHECK constraints' conditions are immutable,
  that is, they will always give the same result for the same input row.

So the expectation is that CHECK only calls immutable stuff.


[1] https://www.postgresql.org/docs/current/ddl-constraints.html

> Here is a completely different example concerning the way batches to different
> partitions are flushed per-partition resulting in out-of-order insertion:
> 
>     CREATE EXTENSION postgres_fdw;
> 
>     CREATE SERVER testserver1 FOREIGN DATA WRAPPER postgres_fdw;
>     DO $d$
>     BEGIN
>         EXECUTE $$CREATE SERVER loopback FOREIGN DATA WRAPPER postgres_fdw
>             OPTIONS (dbname '$$||current_database()||$$',
>                      port '$$||current_setting('port')||$$'
>             )$$;
>     END;
>     $d$;
> 
>     CREATE USER MAPPING FOR CURRENT_USER SERVER loopback;
> 
>     create table itrtest (a int) partition by range (a);
>     create table loct1 (a int check (a % 100 != 3));
>     create foreign table remp1 (a int check (a % 100 != 3)) server loopback options (table_name 'loct1');
>     create table loct2 (a int check (a % 100 != 3));
>     create foreign table remp2 (a int check (a % 100 != 3)) server loopback options (table_name 'loct2');
>     alter table itrtest attach partition remp1 for values from (1) to (100);
>     alter table itrtest attach partition remp2 for values from (101) to (200);
> 
>     insert into itrtest values (1), (2), (101), (103), (3);
>     truncate itrtest;
>     alter server loopback options (add batch_size '3');
>     insert into itrtest values (1), (2), (101), (103), (3);
> 
> The first insert (non-batched) gives
> 
>     ERROR:  new row for relation "loct2" violates check constraint "loct2_a_check"
>     DETAIL:  Failing row contains (103).
>     CONTEXT:  remote SQL command: INSERT INTO public.loct2(a) VALUES ($1)
> 
> But the second insert (batched) gives
> 
>     ERROR:  new row for relation "loct1" violates check constraint "loct1_a_check"
>     DETAIL:  Failing row contains (3).
>     CONTEXT:  remote SQL command: INSERT INTO public.loct1(a) VALUES ($1), ($2), ($3)
> 
> This is because (103) is queued up in the batch and not actually inserted.
> There might be a more severe example than this, but I did not think too much
> about it.
> 

I agree this is a bit annoying, because it makes the observed error a
bit unpredictable. But as long as it does not allow inserting rows
violating the CHECK constraint, I wouldn't call this a bug.

The only way to prevent this would be to ensure we inserts actually
happen in the input order, and that's inherently against the whole idea
of batching. So if you want that, you have to disable batching.


regards

-- 
Tomas Vondra



pgsql-bugs by date:

Previous
From: PG Bug reporting form
Date:
Subject: BUG #18581: psql symbol append_history not found when quitting
Next
From: Kuntal Ghosh
Date:
Subject: Re: BUG #18559: Crash after detaching a partition concurrently from another session