Use ctid in where clause in update from statement - Mailing list pgsql-general
From | Dirk Mika |
---|---|
Subject | Use ctid in where clause in update from statement |
Date | |
Msg-id | C8D480EC-AFF2-40CC-8BD3-754A1204011A@mikatiming.de Whole thread Raw |
Responses |
Re: Use ctid in where clause in update from statement
Re: Use ctid in where clause in update from statement Re: Use ctid in where clause in update from statement |
List | pgsql-general |
Hello,
I come from the Oracle world and am trying to convert some queries to PostgreSQL syntax. One of these queries is a MERGE statement, which I converted into an UPDATE SET FROM WHERE construct. In the original query I use the pseudo column ROWID to match a source row with a target row.
This is a simplified version of such a query:
MERGE INTO test_large d
USING (SELECT ROWID, test_large.*
FROM test_large
WHERE grp = 1) s
ON (d.ROWID = s.ROWID)
WHEN MATCHED
THEN
UPDATE SET d.grp = s.grp;
It has the following execution plan:
---------------------------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost (%CPU)| Time |
---------------------------------------------------------------------------------------------
| 0 | MERGE STATEMENT | | 1000 | 16000 | 2006 (0)| 00:00:25 |
| 1 | MERGE | TEST_LARGE | | | | |
| 2 | VIEW | | | | | |
| 3 | NESTED LOOPS | | 1000 | 220K| 2006 (0)| 00:00:25 |
| 4 | TABLE ACCESS BY INDEX ROWID| TEST_LARGE | 1000 | 110K| 1006 (0)| 00:00:13 |
|* 5 | INDEX RANGE SCAN | IX_TL_GRP | 1000 | | 6 (0)| 00:00:01 |
| 6 | TABLE ACCESS BY USER ROWID | TEST_LARGE | 1 | 113 | 1 (0)| 00:00:01 |
---------------------------------------------------------------------------------------------
In the PostgreSQL version I use the column ctid for this. The above query becomes:
UPDATE test_large d
SET grp = s.grp
FROM (SELECT ctid, test_large.*
FROM test_large
WHERE grp = 1) s
WHERE d.ctid = s.ctid;
I noticed that the execution plan for such a statement seems to be suboptimal:
Update on test_large d (cost=155346.98..160367.08 rows=996 width=53)
-> Merge Join (cost=155346.98..160367.08 rows=996 width=53)
Merge Cond: (d.ctid = test_large.ctid)
-> Sort (cost=154330.57..156833.16 rows=1001033 width=43)
Sort Key: d.ctid
-> Seq Scan on test_large d (cost=0.00..19366.33 rows=1001033 width=43)
-> Sort (cost=1016.40..1018.89 rows=996 width=10)
Sort Key: test_large.ctid
-> Index Scan using ix_tl_grp on test_large (cost=0.42..966.80 rows=996 width=10)
Index Cond: (grp = 1)
I expected a tid scan to be used instead of a Seq scan. I did a VACUUM test_large to make sure statistics are valid.
I would suspect that a tid scan is even a bit faster than accessing via the primary key, since reading the index is not required.
I know I could change the WHERE clause to use the primary key, which results in the following plan:
Update on test_large d (cost=0.85..3344.12 rows=995 width=57)
-> Nested Loop (cost=0.85..3344.12 rows=995 width=57)
-> Index Scan using ix_tl_grp on test_large (cost=0.42..994.84 rows=995 width=14)
Index Cond: (grp = 1)
-> Index Scan using pk_test_large on test_large d (cost=0.42..2.36 rows=1 width=47)
Index Cond: (id = test_large.id)
So my question is more "Is a tid scan not possible / useful for an update" than "how do I accelerate this query".
BR
Dirk
-- Dirk Mika Software Developer |
|
fon +49 2202 2401-1197 |
AG Köln HRB 47509 * WEEE-Reg.-Nr. DE 90029884 Geschäftsführer: Harald Mika, Jörg Mika |
Attachment
pgsql-general by date: