Thread: Incorrect explain output for updates/delete operations with returning-list on partitioned tables
Incorrect explain output for updates/delete operations with returning-list on partitioned tables
From
SAIKIRAN AVULA
Date:
Hi PostgreSQL Community,
I have been working on partitioned tables recently, and I have noticed something that doesn't seem correct with the EXPLAIN output of an update/delete query with a returning list.
For example, consider two partitioned tables, "t1" and "t2," with partitions "t11," "t12," and "t21," "t22," respectively. The table definitions are as follows:
```sql
postgres=# \d+ t1
Partitioned table "public.t1"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
--------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
a | integer | | | | plain | | |
b | integer | | | | plain | | |
c | integer | | | | plain | | |
Partition key: RANGE (a)
Partitions: t11 FOR VALUES FROM (0) TO (1000),
t12 FOR VALUES FROM (1000) TO (10000)
postgres=# \d+ t2
Partitioned table "public.t2"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
--------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
a | integer | | | | plain | | |
b | integer | | | | plain | | |
c | integer | | | | plain | | |
Partition key: RANGE (a)
Partitions: t21 FOR VALUES FROM (0) TO (1000),
t22 FOR VALUES FROM (1000) TO (10000)
```
The EXPLAIN output for an update query with a returning list doesn't seem correct to me. Here are the examples (the part that doesn't seem right is highlighted in bold):
Query1:
```
I have been working on partitioned tables recently, and I have noticed something that doesn't seem correct with the EXPLAIN output of an update/delete query with a returning list.
For example, consider two partitioned tables, "t1" and "t2," with partitions "t11," "t12," and "t21," "t22," respectively. The table definitions are as follows:
```sql
postgres=# \d+ t1
Partitioned table "public.t1"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
--------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
a | integer | | | | plain | | |
b | integer | | | | plain | | |
c | integer | | | | plain | | |
Partition key: RANGE (a)
Partitions: t11 FOR VALUES FROM (0) TO (1000),
t12 FOR VALUES FROM (1000) TO (10000)
postgres=# \d+ t2
Partitioned table "public.t2"
Column | Type | Collation | Nullable | Default | Storage | Compression | Stats target | Description
--------+---------+-----------+----------+---------+---------+-------------+--------------+-------------
a | integer | | | | plain | | |
b | integer | | | | plain | | |
c | integer | | | | plain | | |
Partition key: RANGE (a)
Partitions: t21 FOR VALUES FROM (0) TO (1000),
t22 FOR VALUES FROM (1000) TO (10000)
```
The EXPLAIN output for an update query with a returning list doesn't seem correct to me. Here are the examples (the part that doesn't seem right is highlighted in bold):
Query1:
```
postgres=# explain verbose update t1 set b = 10 from t2 where t1.a = t2.a returning t1.c;
QUERY PLAN
-------------------------------------------------------------------------------------------
Update on public.t1 (cost=0.00..125187.88 rows=41616 width=14)
Output: t1_1.c -----> something not right??
Update on public.t11 t1_1
Update on public.t12 t1_2
-> Append (cost=0.00..125187.88 rows=41616 width=14)
-> Nested Loop (cost=0.00..62489.90 rows=20808 width=14)
Output: 10, t1_1.tableoid, t1_1.ctid
Join Filter: (t1_1.a = t2_1.a)
-> Seq Scan on public.t11 t1_1 (cost=0.00..30.40 rows=2040 width=14)
Output: t1_1.a, t1_1.tableoid, t1_1.ctid
-> Materialize (cost=0.00..40.60 rows=2040 width=4)
Output: t2_1.a
-> Seq Scan on public.t21 t2_1 (cost=0.00..30.40 rows=2040 width=4)
Output: t2_1.a
-> Nested Loop (cost=0.00..62489.90 rows=20808 width=14)
Output: 10, t1_2.tableoid, t1_2.ctid
Join Filter: (t1_2.a = t2_2.a)
-> Seq Scan on public.t12 t1_2 (cost=0.00..30.40 rows=2040 width=14)
Output: t1_2.a, t1_2.tableoid, t1_2.ctid
-> Materialize (cost=0.00..40.60 rows=2040 width=4)
Output: t2_2.a
-> Seq Scan on public.t22 t2_2 (cost=0.00..30.40 rows=2040 width=4)
Output: t2_2.a
(23 rows)
```
Query2:
QUERY PLAN
-------------------------------------------------------------------------------------------
Update on public.t1 (cost=0.00..125187.88 rows=41616 width=14)
Output: t1_1.c -----> something not right??
Update on public.t11 t1_1
Update on public.t12 t1_2
-> Append (cost=0.00..125187.88 rows=41616 width=14)
-> Nested Loop (cost=0.00..62489.90 rows=20808 width=14)
Output: 10, t1_1.tableoid, t1_1.ctid
Join Filter: (t1_1.a = t2_1.a)
-> Seq Scan on public.t11 t1_1 (cost=0.00..30.40 rows=2040 width=14)
Output: t1_1.a, t1_1.tableoid, t1_1.ctid
-> Materialize (cost=0.00..40.60 rows=2040 width=4)
Output: t2_1.a
-> Seq Scan on public.t21 t2_1 (cost=0.00..30.40 rows=2040 width=4)
Output: t2_1.a
-> Nested Loop (cost=0.00..62489.90 rows=20808 width=14)
Output: 10, t1_2.tableoid, t1_2.ctid
Join Filter: (t1_2.a = t2_2.a)
-> Seq Scan on public.t12 t1_2 (cost=0.00..30.40 rows=2040 width=14)
Output: t1_2.a, t1_2.tableoid, t1_2.ctid
-> Materialize (cost=0.00..40.60 rows=2040 width=4)
Output: t2_2.a
-> Seq Scan on public.t22 t2_2 (cost=0.00..30.40 rows=2040 width=4)
Output: t2_2.a
(23 rows)
```
Query2:
```
postgres=# explain verbose update t1 set b = 10 from t2 where t1.a = t2.a returning t2.c;
QUERY PLAN
-------------------------------------------------------------------------------------------
Update on public.t1 (cost=0.00..125187.88 rows=41616 width=18)
Output: t2.c
Update on public.t11 t1_1
Update on public.t12 t1_2
-> Append (cost=0.00..125187.88 rows=41616 width=18)
-> Nested Loop (cost=0.00..62489.90 rows=20808 width=18)
Output: 10, t2_1.c, t1_1.tableoid, t1_1.ctid
Join Filter: (t1_1.a = t2_1.a)
-> Seq Scan on public.t11 t1_1 (cost=0.00..30.40 rows=2040 width=14)
Output: t1_1.a, t1_1.tableoid, t1_1.ctid
-> Materialize (cost=0.00..40.60 rows=2040 width=8)
Output: t2_1.c, t2_1.a
-> Seq Scan on public.t21 t2_1 (cost=0.00..30.40 rows=2040 width=8)
Output: t2_1.c, t2_1.a
-> Nested Loop (cost=0.00..62489.90 rows=20808 width=18)
Output: 10, t2_2.c, t1_2.tableoid, t1_2.ctid
Join Filter: (t1_2.a = t2_2.a)
-> Seq Scan on public.t12 t1_2 (cost=0.00..30.40 rows=2040 width=14)
Output: t1_2.a, t1_2.tableoid, t1_2.ctid
-> Materialize (cost=0.00..40.60 rows=2040 width=8)
Output: t2_2.c, t2_2.a
-> Seq Scan on public.t22 t2_2 (cost=0.00..30.40 rows=2040 width=8)
Output: t2_2.c, t2_2.a
(23 rows)
```
After further investigation into the code, I noticed following:
postgres=# explain verbose update t1 set b = 10 from t2 where t1.a = t2.a returning t2.c;
QUERY PLAN
-------------------------------------------------------------------------------------------
Update on public.t1 (cost=0.00..125187.88 rows=41616 width=18)
Output: t2.c
Update on public.t11 t1_1
Update on public.t12 t1_2
-> Append (cost=0.00..125187.88 rows=41616 width=18)
-> Nested Loop (cost=0.00..62489.90 rows=20808 width=18)
Output: 10, t2_1.c, t1_1.tableoid, t1_1.ctid
Join Filter: (t1_1.a = t2_1.a)
-> Seq Scan on public.t11 t1_1 (cost=0.00..30.40 rows=2040 width=14)
Output: t1_1.a, t1_1.tableoid, t1_1.ctid
-> Materialize (cost=0.00..40.60 rows=2040 width=8)
Output: t2_1.c, t2_1.a
-> Seq Scan on public.t21 t2_1 (cost=0.00..30.40 rows=2040 width=8)
Output: t2_1.c, t2_1.a
-> Nested Loop (cost=0.00..62489.90 rows=20808 width=18)
Output: 10, t2_2.c, t1_2.tableoid, t1_2.ctid
Join Filter: (t1_2.a = t2_2.a)
-> Seq Scan on public.t12 t1_2 (cost=0.00..30.40 rows=2040 width=14)
Output: t1_2.a, t1_2.tableoid, t1_2.ctid
-> Materialize (cost=0.00..40.60 rows=2040 width=8)
Output: t2_2.c, t2_2.a
-> Seq Scan on public.t22 t2_2 (cost=0.00..30.40 rows=2040 width=8)
Output: t2_2.c, t2_2.a
(23 rows)
```
After further investigation into the code, I noticed following:
1. In the 'grouping_planner()' function, while generating paths for the final relation (https://github.com/postgres/postgres/blob/master/src/backend/optimizer/plan/planner.c#L1857), we only take care of adjusting the append_rel_attributes in returningList for resultRelation. Shouldn't we do that for other relations as well in query? Example for Query2 above, adjust_appendrel_attrs_multilevel is a no-op.
2. After plan creation (https://github.com/postgres/postgres/blob/master/src/backend/optimizer/plan/createplan.c#L351), shouldn't we perform tlist labeling for the `returningList` as well? I suspect this is resulting in incorrect output in Query1.
I suspect that similar issues might also be present for `withCheckOptions`, `mergeActionList`, and `mergeJoinCondition`.
I would appreciate it if the community could provide insights or clarifications regarding this observation.
Thank you for your time and consideration.
Regards
Saikiran Avula,
SDE, Amazon Web Services.
Re: Incorrect explain output for updates/delete operations with returning-list on partitioned tables
From
Tom Lane
Date:
SAIKIRAN AVULA <avulasaikiranreddy@gmail.com> writes: > I have been working on partitioned tables recently, and I have noticed > something that doesn't seem correct with the EXPLAIN output of an > update/delete query with a returning list. What do you think is not right exactly? The output has to use some one of the correlation names for the partitioned table. I think it generally chooses the one corresponding to the first Append arm, but really any would be good enough for EXPLAIN's purposes. > 1. In the 'grouping_planner()' function, while generating paths for the > final relation ( > https://github.com/postgres/postgres/blob/master/src/backend/optimizer/plan/planner.c#L1857), > we only take care of adjusting the append_rel_attributes in returningList > for resultRelation. Shouldn't we do that for other relations as well in > query? If the only difference is which way variables get labeled in EXPLAIN, I'd be kind of disinclined to spend extra cycles. But in any case, I rather suspect you'll find that this actively breaks things. Whether we change the varno on a Var isn't really optional, and there are cross-checks in setrefs.c to make sure things match up. regards, tom lane
Re: Incorrect explain output for updates/delete operations with returning-list on partitioned tables
From
David Rowley
Date:
On Tue, 7 May 2024 at 09:18, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > SAIKIRAN AVULA <avulasaikiranreddy@gmail.com> writes: > > I have been working on partitioned tables recently, and I have noticed > > something that doesn't seem correct with the EXPLAIN output of an > > update/delete query with a returning list. > > What do you think is not right exactly? The output has to use some > one of the correlation names for the partitioned table. I think > it generally chooses the one corresponding to the first Append arm, > but really any would be good enough for EXPLAIN's purposes. Also looks harmless to me. But just a slight correction, you're talking about the deparse Append condition that's in set_deparse_plan(). Whereas the code that controls this for the returningList is the following in nodeModifyTable.c: /* * Initialize result tuple slot and assign its rowtype using the first * RETURNING list. We assume the rest will look the same. */ mtstate->ps.plan->targetlist = (List *) linitial(node->returningLists); David