Thread: BUG #17413: update of partitioned table via postgres_fdw updates to much rows
BUG #17413: update of partitioned table via postgres_fdw updates to much rows
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 17413 Logged by: Stepan Yankevych Email address: stepya@ukr.net PostgreSQL version: 14.2 Operating system: CentOS Description: We noticed that update foreign table in some cases passes following update to the remote DB update part_update_test set field=$2 where ctid=$1 In that case one row from each partition can be updated. See steps to reproduce -- pgprod1 drop table if exists trash.part_update_test; CREATE TABLE trash.part_update_test ( id serial, date_id int4 NOT NULL, simple_text varchar ) PARTITION BY RANGE (date_id); CREATE TABLE trash.part_update_test_20220221 PARTITION OF trash.part_update_test FOR VALUES FROM (20220221) TO (20220222); CREATE TABLE trash.part_update_test_20220222 PARTITION OF trash.part_update_test FOR VALUES FROM (20220222) TO (20220223); CREATE TABLE trash.part_update_test_20220223 PARTITION OF trash.part_update_test FOR VALUES FROM (20220223) TO (20220224); insert into trash.part_update_test (date_id, simple_text) values (20220221, 'Im 20220221'),(20220222, 'I amd 20220222'),(20220223, 'I am the third row '); select ctid, * from trash.part_update_test; ctid |id|date_id |simple_text | -----+--+--------+-------------------+ (0,1)| 1|20220221|Im 20220221 | (0,1)| 2|20220222|I amd 20220222 | (0,1)| 3|20220223|I am the third row | -- pgprod2 DROP FOREIGN TABLE if EXISTS staging.part_update_test; IMPORT FOREIGN SCHEMA "trash" LIMIT TO (part_update_test) FROM SERVER postgresprod into staging; with ids as materialized (select 1 as id, 20220221 as date_id ) update staging.part_update_test t set simple_text = 'I am updated version of 20220221 ' from ids where t.id = ids.id and t.date_id = ids.date_id ; select ctid, * from staging.part_update_test; ctid |id|date_id |simple_text | -----+--+--------+----------------------------------+ (0,2)| 1|20220221|I am updated version of 20220221 | (0,2)| 2|20220222|I am updated version of 20220221 | (0,2)| 3|20220223|I am updated version of 20220221 | As you can see all of rows that had (0,1) where updated , but the only first row with ID =1 had to be updated . The same was reproducible at least in PG14.1
Re: BUG #17413: update of partitioned table via postgres_fdw updates to much rows
From
Japin Li
Date:
On Tue, 22 Feb 2022 at 18:03, PG Bug reporting form <noreply@postgresql.org> wrote: > The following bug has been logged on the website: > > Bug reference: 17413 > Logged by: Stepan Yankevych > Email address: stepya@ukr.net > PostgreSQL version: 14.2 > Operating system: CentOS > Description: > > We noticed that update foreign table in some cases passes following update > to the remote DB > update part_update_test > set field=$2 > where ctid=$1 > In that case one row from each partition can be updated. > See steps to reproduce > > -- pgprod1 > drop table if exists trash.part_update_test; > > CREATE TABLE trash.part_update_test ( > id serial, > date_id int4 NOT NULL, > simple_text varchar > ) PARTITION BY RANGE (date_id); > > CREATE TABLE trash.part_update_test_20220221 PARTITION OF > trash.part_update_test FOR VALUES FROM (20220221) TO (20220222); > CREATE TABLE trash.part_update_test_20220222 PARTITION OF > trash.part_update_test FOR VALUES FROM (20220222) TO (20220223); > CREATE TABLE trash.part_update_test_20220223 PARTITION OF > trash.part_update_test FOR VALUES FROM (20220223) TO (20220224); > > insert into trash.part_update_test (date_id, simple_text) > values (20220221, 'Im 20220221'),(20220222, 'I amd 20220222'),(20220223, 'I > am the third row '); > > select ctid, * > from trash.part_update_test; > > ctid |id|date_id |simple_text | > -----+--+--------+-------------------+ > (0,1)| 1|20220221|Im 20220221 | > (0,1)| 2|20220222|I amd 20220222 | > (0,1)| 3|20220223|I am the third row | > > -- pgprod2 > DROP FOREIGN TABLE if EXISTS staging.part_update_test; > IMPORT FOREIGN SCHEMA "trash" LIMIT TO (part_update_test) > FROM SERVER postgresprod > into staging; > > > with ids as materialized (select 1 as id, 20220221 as date_id ) > update staging.part_update_test t > set simple_text = 'I am updated version of 20220221 ' > from ids > where t.id = ids.id > and t.date_id = ids.date_id ; > > select ctid, * > from staging.part_update_test; > > ctid |id|date_id |simple_text | > -----+--+--------+----------------------------------+ > (0,2)| 1|20220221|I am updated version of 20220221 | > (0,2)| 2|20220222|I am updated version of 20220221 | > (0,2)| 3|20220223|I am updated version of 20220221 | > > As you can see all of rows that had (0,1) where updated , but the only first > row with ID =1 had to be updated . > The same was reproducible at least in PG14.1 Yeah, it is a bug IMO. The deparseUpdateSql() function assume that the tuple id is as unique, however, it ignores the partitioned table, in which the tuple id may be duplicated. -- Regrads, Japin Li. ChengDu WenWu Information Technology Co.,Ltd.
RE: BUG #17413: update of partitioned table via postgres_fdw updates to much rows
From
Stepan Yankevych
Date:
I propose to add secondary check of all field values from the where condition. So in that particular case it could be something like that update part_update_test set field=$2 where ctid=$1 and id = $3 and date_id = $4; Generally speaking it can even improve performance dur to partition pruning. -----Original Message----- From: Japin Li <japinli@hotmail.com> Sent: Tuesday, February 22, 2022 7:09 PM To: stepya@ukr.net; pgsql-bugs@lists.postgresql.org Subject: Re: BUG #17413: update of partitioned table via postgres_fdw updates to much rows On Tue, 22 Feb 2022 at 18:03, PG Bug reporting form <noreply@postgresql.org> wrote: > The following bug has been logged on the website: > > Bug reference: 17413 > Logged by: Stepan Yankevych > Email address: stepya@ukr.net > PostgreSQL version: 14.2 > Operating system: CentOS > Description: > > We noticed that update foreign table in some cases passes following > update to the remote DB update part_update_test set field=$2 where > ctid=$1 In that case one row from each partition can be updated. > See steps to reproduce > > -- pgprod1 > drop table if exists trash.part_update_test; > > CREATE TABLE trash.part_update_test ( > id serial, > date_id int4 NOT NULL, > simple_text varchar > ) PARTITION BY RANGE (date_id); > > CREATE TABLE trash.part_update_test_20220221 PARTITION OF > trash.part_update_test FOR VALUES FROM (20220221) TO (20220222); > CREATE TABLE trash.part_update_test_20220222 PARTITION OF > trash.part_update_test FOR VALUES FROM (20220222) TO (20220223); > CREATE TABLE trash.part_update_test_20220223 PARTITION OF > trash.part_update_test FOR VALUES FROM (20220223) TO (20220224); > > insert into trash.part_update_test (date_id, simple_text) values > (20220221, 'Im 20220221'),(20220222, 'I amd 20220222'),(20220223, 'I > am the third row '); > > select ctid, * > from trash.part_update_test; > > ctid |id|date_id |simple_text | > -----+--+--------+-------------------+ > (0,1)| 1|20220221|Im 20220221 | > (0,1)| 2|20220222|I amd 20220222 | > (0,1)| 3|20220223|I am the third row | > > -- pgprod2 > DROP FOREIGN TABLE if EXISTS staging.part_update_test; IMPORT FOREIGN > SCHEMA "trash" LIMIT TO (part_update_test) FROM SERVER postgresprod > into staging; > > > with ids as materialized (select 1 as id, 20220221 as date_id ) > update staging.part_update_test t set simple_text = 'I am updated > version of 20220221 ' > from ids > where t.id = ids.id > and t.date_id = ids.date_id ; > > select ctid, * > from staging.part_update_test; > > ctid |id|date_id |simple_text | > -----+--+--------+----------------------------------+ > (0,2)| 1|20220221|I am updated version of 20220221 | (0,2)| > 2|20220222|I am updated version of 20220221 | (0,2)| 3|20220223|I am > updated version of 20220221 | > > As you can see all of rows that had (0,1) where updated , but the only > first row with ID =1 had to be updated . > The same was reproducible at least in PG14.1 Yeah, it is a bug IMO. The deparseUpdateSql() function assume that the tuple id is as unique, however, it ignores the partitionedtable, in which the tuple id may be duplicated. -- Regrads, Japin Li. ChengDu WenWu Information Technology Co.,Ltd.
Re: BUG #17413: update of partitioned table via postgres_fdw updates to much rows
From
Etsuro Fujita
Date:
On Wed, Feb 23, 2022 at 2:09 AM Japin Li <japinli@hotmail.com> wrote: > On Tue, 22 Feb 2022 at 18:03, PG Bug reporting form <noreply@postgresql.org> wrote: > > The following bug has been logged on the website: > > See steps to reproduce > > > > -- pgprod1 > > drop table if exists trash.part_update_test; > > > > CREATE TABLE trash.part_update_test ( > > id serial, > > date_id int4 NOT NULL, > > simple_text varchar > > ) PARTITION BY RANGE (date_id); > > > > CREATE TABLE trash.part_update_test_20220221 PARTITION OF > > trash.part_update_test FOR VALUES FROM (20220221) TO (20220222); > > CREATE TABLE trash.part_update_test_20220222 PARTITION OF > > trash.part_update_test FOR VALUES FROM (20220222) TO (20220223); > > CREATE TABLE trash.part_update_test_20220223 PARTITION OF > > trash.part_update_test FOR VALUES FROM (20220223) TO (20220224); > > > > insert into trash.part_update_test (date_id, simple_text) > > values (20220221, 'Im 20220221'),(20220222, 'I amd 20220222'),(20220223, 'I > > am the third row '); > > > > select ctid, * > > from trash.part_update_test; > > > > ctid |id|date_id |simple_text | > > -----+--+--------+-------------------+ > > (0,1)| 1|20220221|Im 20220221 | > > (0,1)| 2|20220222|I amd 20220222 | > > (0,1)| 3|20220223|I am the third row | > > > > -- pgprod2 > > DROP FOREIGN TABLE if EXISTS staging.part_update_test; > > IMPORT FOREIGN SCHEMA "trash" LIMIT TO (part_update_test) > > FROM SERVER postgresprod > > into staging; > > > > > > with ids as materialized (select 1 as id, 20220221 as date_id ) > > update staging.part_update_test t > > set simple_text = 'I am updated version of 20220221 ' > > from ids > > where t.id = ids.id > > and t.date_id = ids.date_id ; > > > > select ctid, * > > from staging.part_update_test; > > > > ctid |id|date_id |simple_text | > > -----+--+--------+----------------------------------+ > > (0,2)| 1|20220221|I am updated version of 20220221 | > > (0,2)| 2|20220222|I am updated version of 20220221 | > > (0,2)| 3|20220223|I am updated version of 20220221 | > > > > As you can see all of rows that had (0,1) where updated , but the only first > > row with ID =1 had to be updated . > > The same was reproducible at least in PG14.1 > > Yeah, it is a bug IMO. The deparseUpdateSql() function assume that the tuple > id is as unique, however, it ignores the partitioned table, in which the tuple > id may be duplicated. Yeah, this is a known issue [1]. I created a patch for the issue, but the patch needs more work. It’s on my TODO list to revisit the patch. Best regards, Etsuro Fujita [1] https://www.postgresql.org/message-id/CAFjFpRfcgwsHRmpvoOK-GUQi-n8MgAS%2BOxcQo%3DaBDn1COywmcg%40mail.gmail.com