Thread: BUG #17413: update of partitioned table via postgres_fdw updates to much rows

BUG #17413: update of partitioned table via postgres_fdw updates to much rows

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      17413
Logged by:          Stepan Yankevych
Email address:      stepya@ukr.net
PostgreSQL version: 14.2
Operating system:   CentOS
Description:

We noticed that update foreign table in some cases passes following update
to the remote DB 
update part_update_test
set field=$2
where ctid=$1
In that case one row from each partition can be updated. 
See steps to reproduce

-- pgprod1 
drop table if exists trash.part_update_test;

CREATE TABLE trash.part_update_test (
    id serial,
    date_id int4 NOT NULL,
    simple_text varchar
) PARTITION BY RANGE (date_id);

CREATE TABLE trash.part_update_test_20220221 PARTITION OF
trash.part_update_test FOR VALUES FROM (20220221) TO (20220222);
CREATE TABLE trash.part_update_test_20220222 PARTITION OF
trash.part_update_test FOR VALUES FROM (20220222) TO (20220223);
CREATE TABLE trash.part_update_test_20220223 PARTITION OF
trash.part_update_test FOR VALUES FROM (20220223) TO (20220224);

insert into trash.part_update_test (date_id, simple_text)
values (20220221, 'Im 20220221'),(20220222, 'I amd 20220222'),(20220223, 'I
am the third row ');

select ctid, *
from trash.part_update_test;

ctid |id|date_id |simple_text        |
-----+--+--------+-------------------+
(0,1)| 1|20220221|Im 20220221        |
(0,1)| 2|20220222|I amd 20220222     |
(0,1)| 3|20220223|I am the third row |

--  pgprod2 
DROP FOREIGN TABLE if EXISTS staging.part_update_test;
IMPORT FOREIGN SCHEMA "trash" LIMIT TO (part_update_test)
FROM SERVER postgresprod
into staging;


with ids as materialized (select 1 as id, 20220221 as date_id  )
update staging.part_update_test t
set simple_text = 'I am  updated version of 20220221 '
from ids
where t.id = ids.id
and t.date_id = ids.date_id ;

select ctid, *
from staging.part_update_test;

 ctid |id|date_id |simple_text                       |
-----+--+--------+----------------------------------+
(0,2)| 1|20220221|I am  updated version of 20220221 |
(0,2)| 2|20220222|I am  updated version of 20220221 |
(0,2)| 3|20220223|I am  updated version of 20220221 |

As you can see all of rows that had (0,1) where updated , but the only first
row with ID =1 had to be updated .
The same was reproducible at least in PG14.1


On Tue, 22 Feb 2022 at 18:03, PG Bug reporting form <noreply@postgresql.org> wrote:
> The following bug has been logged on the website:
>
> Bug reference:      17413
> Logged by:          Stepan Yankevych
> Email address:      stepya@ukr.net
> PostgreSQL version: 14.2
> Operating system:   CentOS
> Description:        
>
> We noticed that update foreign table in some cases passes following update
> to the remote DB 
> update part_update_test
> set field=$2
> where ctid=$1
> In that case one row from each partition can be updated. 
> See steps to reproduce
>
> -- pgprod1 
> drop table if exists trash.part_update_test;
>
> CREATE TABLE trash.part_update_test (
>     id serial,
>     date_id int4 NOT NULL,
>     simple_text varchar
> ) PARTITION BY RANGE (date_id);
>
> CREATE TABLE trash.part_update_test_20220221 PARTITION OF
> trash.part_update_test FOR VALUES FROM (20220221) TO (20220222);
> CREATE TABLE trash.part_update_test_20220222 PARTITION OF
> trash.part_update_test FOR VALUES FROM (20220222) TO (20220223);
> CREATE TABLE trash.part_update_test_20220223 PARTITION OF
> trash.part_update_test FOR VALUES FROM (20220223) TO (20220224);
>
> insert into trash.part_update_test (date_id, simple_text)
> values (20220221, 'Im 20220221'),(20220222, 'I amd 20220222'),(20220223, 'I
> am the third row ');
>
> select ctid, *
> from trash.part_update_test;
>
> ctid |id|date_id |simple_text        |
> -----+--+--------+-------------------+
> (0,1)| 1|20220221|Im 20220221        |
> (0,1)| 2|20220222|I amd 20220222     |
> (0,1)| 3|20220223|I am the third row |
>
> --  pgprod2 
> DROP FOREIGN TABLE if EXISTS staging.part_update_test;
> IMPORT FOREIGN SCHEMA "trash" LIMIT TO (part_update_test)
> FROM SERVER postgresprod
> into staging;
>
>
> with ids as materialized (select 1 as id, 20220221 as date_id  )
> update staging.part_update_test t
> set simple_text = 'I am  updated version of 20220221 '
> from ids
> where t.id = ids.id
> and t.date_id = ids.date_id ;
>
> select ctid, *
> from staging.part_update_test;
>
>  ctid |id|date_id |simple_text                       |
> -----+--+--------+----------------------------------+
> (0,2)| 1|20220221|I am  updated version of 20220221 |
> (0,2)| 2|20220222|I am  updated version of 20220221 |
> (0,2)| 3|20220223|I am  updated version of 20220221 |
>
> As you can see all of rows that had (0,1) where updated , but the only first
> row with ID =1 had to be updated .
> The same was reproducible at least in PG14.1

Yeah, it is a bug IMO.  The deparseUpdateSql() function assume that the tuple
id is as unique, however, it ignores the partitioned table, in which the tuple
id may be duplicated.

-- 
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.



RE: BUG #17413: update of partitioned table via postgres_fdw updates to much rows

From
Stepan Yankevych
Date:
I propose to add secondary check of all field values from the where condition. 
So in that particular case it could be something like that 

update part_update_test 
set field=$2 
where  ctid=$1
and id = $3
and date_id = $4;

Generally speaking it can even improve performance dur to partition pruning. 

-----Original Message-----
From: Japin Li <japinli@hotmail.com> 
Sent: Tuesday, February 22, 2022 7:09 PM
To: stepya@ukr.net; pgsql-bugs@lists.postgresql.org
Subject: Re: BUG #17413: update of partitioned table via postgres_fdw updates to much rows


On Tue, 22 Feb 2022 at 18:03, PG Bug reporting form <noreply@postgresql.org> wrote:
> The following bug has been logged on the website:
>
> Bug reference:      17413
> Logged by:          Stepan Yankevych
> Email address:      stepya@ukr.net
> PostgreSQL version: 14.2
> Operating system:   CentOS
> Description:        
>
> We noticed that update foreign table in some cases passes following 
> update to the remote DB update part_update_test set field=$2 where 
> ctid=$1 In that case one row from each partition can be updated.
> See steps to reproduce
>
> -- pgprod1
> drop table if exists trash.part_update_test;
>
> CREATE TABLE trash.part_update_test (
>     id serial,
>     date_id int4 NOT NULL,
>     simple_text varchar
> ) PARTITION BY RANGE (date_id);
>
> CREATE TABLE trash.part_update_test_20220221 PARTITION OF 
> trash.part_update_test FOR VALUES FROM (20220221) TO (20220222); 
> CREATE TABLE trash.part_update_test_20220222 PARTITION OF 
> trash.part_update_test FOR VALUES FROM (20220222) TO (20220223); 
> CREATE TABLE trash.part_update_test_20220223 PARTITION OF 
> trash.part_update_test FOR VALUES FROM (20220223) TO (20220224);
>
> insert into trash.part_update_test (date_id, simple_text) values 
> (20220221, 'Im 20220221'),(20220222, 'I amd 20220222'),(20220223, 'I 
> am the third row ');
>
> select ctid, *
> from trash.part_update_test;
>
> ctid |id|date_id |simple_text        |
> -----+--+--------+-------------------+
> (0,1)| 1|20220221|Im 20220221        |
> (0,1)| 2|20220222|I amd 20220222     |
> (0,1)| 3|20220223|I am the third row |
>
> --  pgprod2
> DROP FOREIGN TABLE if EXISTS staging.part_update_test; IMPORT FOREIGN 
> SCHEMA "trash" LIMIT TO (part_update_test) FROM SERVER postgresprod 
> into staging;
>
>
> with ids as materialized (select 1 as id, 20220221 as date_id  ) 
> update staging.part_update_test t set simple_text = 'I am  updated 
> version of 20220221 '
> from ids
> where t.id = ids.id
> and t.date_id = ids.date_id ;
>
> select ctid, *
> from staging.part_update_test;
>
>  ctid |id|date_id |simple_text                       |
> -----+--+--------+----------------------------------+
> (0,2)| 1|20220221|I am  updated version of 20220221 | (0,2)| 
> 2|20220222|I am  updated version of 20220221 | (0,2)| 3|20220223|I am  
> updated version of 20220221 |
>
> As you can see all of rows that had (0,1) where updated , but the only 
> first row with ID =1 had to be updated .
> The same was reproducible at least in PG14.1

Yeah, it is a bug IMO.  The deparseUpdateSql() function assume that the tuple id is as unique, however, it ignores the
partitionedtable, in which the tuple id may be duplicated.
 

--
Regrads,
Japin Li.
ChengDu WenWu Information Technology Co.,Ltd.

Re: BUG #17413: update of partitioned table via postgres_fdw updates to much rows

From
Etsuro Fujita
Date:
On Wed, Feb 23, 2022 at 2:09 AM Japin Li <japinli@hotmail.com> wrote:
> On Tue, 22 Feb 2022 at 18:03, PG Bug reporting form <noreply@postgresql.org> wrote:
> > The following bug has been logged on the website:

> > See steps to reproduce
> >
> > -- pgprod1
> > drop table if exists trash.part_update_test;
> >
> > CREATE TABLE trash.part_update_test (
> >       id serial,
> >       date_id int4 NOT NULL,
> >       simple_text varchar
> > ) PARTITION BY RANGE (date_id);
> >
> > CREATE TABLE trash.part_update_test_20220221 PARTITION OF
> > trash.part_update_test FOR VALUES FROM (20220221) TO (20220222);
> > CREATE TABLE trash.part_update_test_20220222 PARTITION OF
> > trash.part_update_test FOR VALUES FROM (20220222) TO (20220223);
> > CREATE TABLE trash.part_update_test_20220223 PARTITION OF
> > trash.part_update_test FOR VALUES FROM (20220223) TO (20220224);
> >
> > insert into trash.part_update_test (date_id, simple_text)
> > values (20220221, 'Im 20220221'),(20220222, 'I amd 20220222'),(20220223, 'I
> > am the third row ');
> >
> > select ctid, *
> > from trash.part_update_test;
> >
> > ctid |id|date_id |simple_text        |
> > -----+--+--------+-------------------+
> > (0,1)| 1|20220221|Im 20220221        |
> > (0,1)| 2|20220222|I amd 20220222     |
> > (0,1)| 3|20220223|I am the third row |
> >
> > --  pgprod2
> > DROP FOREIGN TABLE if EXISTS staging.part_update_test;
> > IMPORT FOREIGN SCHEMA "trash" LIMIT TO (part_update_test)
> > FROM SERVER postgresprod
> > into staging;
> >
> >
> > with ids as materialized (select 1 as id, 20220221 as date_id  )
> > update staging.part_update_test t
> > set simple_text = 'I am  updated version of 20220221 '
> > from ids
> > where t.id = ids.id
> > and t.date_id = ids.date_id ;
> >
> > select ctid, *
> > from staging.part_update_test;
> >
> >  ctid |id|date_id |simple_text                       |
> > -----+--+--------+----------------------------------+
> > (0,2)| 1|20220221|I am  updated version of 20220221 |
> > (0,2)| 2|20220222|I am  updated version of 20220221 |
> > (0,2)| 3|20220223|I am  updated version of 20220221 |
> >
> > As you can see all of rows that had (0,1) where updated , but the only first
> > row with ID =1 had to be updated .
> > The same was reproducible at least in PG14.1
>
> Yeah, it is a bug IMO.  The deparseUpdateSql() function assume that the tuple
> id is as unique, however, it ignores the partitioned table, in which the tuple
> id may be duplicated.

Yeah, this is a known issue [1].  I created a patch for the issue, but
the patch needs more work.  It’s on my TODO list to revisit the patch.

Best regards,
Etsuro Fujita

[1] https://www.postgresql.org/message-id/CAFjFpRfcgwsHRmpvoOK-GUQi-n8MgAS%2BOxcQo%3DaBDn1COywmcg%40mail.gmail.com