Home > mailing lists

Use ctid in where clause in update from statement - Mailing list pgsql-general

From	Dirk Mika
Subject	Use ctid in where clause in update from statement
Date	July 1, 2019 12:13:19
Msg-id	C8D480EC-AFF2-40CC-8BD3-754A1204011A@mikatiming.de Whole thread Raw
Responses	Re: Use ctid in where clause in update from statement Re: Use ctid in where clause in update from statement Re: Use ctid in where clause in update from statement
List	pgsql-general

Tree view

Hello,

I come from the Oracle world and am trying to convert some queries to PostgreSQL syntax. One of these queries is a MERGE statement, which I converted into an UPDATE SET FROM WHERE construct. In the original query I use the pseudo column ROWID to match a source row with a target row.

This is a simplified version of such a query:

MERGE INTO test_large d

USING (SELECT ROWID, test_large.*

FROM test_large

WHERE grp = 1) s

ON (d.ROWID = s.ROWID)

WHEN MATCHED

THEN

UPDATE SET d.grp = s.grp;

It has the following execution plan:

---------------------------------------------------------------------------------------------

---------------------------------------------------------------------------------------------

| 0 | MERGE STATEMENT | | 1000 | 16000 | 2006 (0)| 00:00:25 |

| 2 | VIEW | | | | | |

| 3 | NESTED LOOPS | | 1000 | 220K| 2006 (0)| 00:00:25 |

| 4 | TABLE ACCESS BY INDEX ROWID| TEST_LARGE | 1000 | 110K| 1006 (0)| 00:00:13 |

|* 5 | INDEX RANGE SCAN | IX_TL_GRP | 1000 | | 6 (0)| 00:00:01 |

| 6 | TABLE ACCESS BY USER ROWID | TEST_LARGE | 1 | 113 | 1 (0)| 00:00:01 |

---------------------------------------------------------------------------------------------

In the PostgreSQL version I use the column ctid for this. The above query becomes:

UPDATE test_large d

SET grp = s.grp

FROM (SELECT ctid, test_large.*

FROM test_large

WHERE grp = 1) s

WHERE d.ctid = s.ctid;

I noticed that the execution plan for such a statement seems to be suboptimal:

Update on test_large d (cost=155346.98..160367.08 rows=996 width=53)

-> Merge Join (cost=155346.98..160367.08 rows=996 width=53)

Merge Cond: (d.ctid = test_large.ctid)

-> Sort (cost=154330.57..156833.16 rows=1001033 width=43)

Sort Key: d.ctid

-> Seq Scan on test_large d (cost=0.00..19366.33 rows=1001033 width=43)

-> Sort (cost=1016.40..1018.89 rows=996 width=10)

Sort Key: test_large.ctid

-> Index Scan using ix_tl_grp on test_large (cost=0.42..966.80 rows=996 width=10)

Index Cond: (grp = 1)

I expected a tid scan to be used instead of a Seq scan. I did a VACUUM test_large to make sure statistics are valid.

I would suspect that a tid scan is even a bit faster than accessing via the primary key, since reading the index is not required.

I know I could change the WHERE clause to use the primary key, which results in the following plan:

Update on test_large d (cost=0.85..3344.12 rows=995 width=57)

-> Nested Loop (cost=0.85..3344.12 rows=995 width=57)

-> Index Scan using ix_tl_grp on test_large (cost=0.42..994.84 rows=995 width=14)

Index Cond: (grp = 1)

-> Index Scan using pk_test_large on test_large d (cost=0.42..2.36 rows=1 width=47)

Index Cond: (id = test_large.id)

So my question is more "Is a tid scan not possible / useful for an update" than "how do I accelerate this query".

Dirk

--
Dirk Mika
Software Developer

mika:timing GmbH
Strundepark - Kürtener Str. 11b
51465 Bergisch Gladbach
Germany

fon +49 2202 2401-1197
dirk.mika@mikatiming.de
www.mikatiming.de

AG Köln HRB 47509 * WEEE-Reg.-Nr. DE 90029884
Geschäftsführer: Harald Mika, Jörg Mika

Attachment

pgsql-general by date:

From: Achilleas Mantzios
Date: 01 July 2019, 10:12:51
Subject: Re: iterate over partitions

From: Achilleas Mantzios
Date: 01 July 2019, 12:27:16
Subject: Re: Use ctid in where clause in update from statement

Use ctid in where clause in update from statement - Mailing list pgsql-general

Attachment

Previous

Next