Thread: Use ctid in where clause in update from statement

Use ctid in where clause in update from statement

From
Dirk Mika
Date:

Hello,

 

I come from the Oracle world and am trying to convert some queries to PostgreSQL syntax. One of these queries is a MERGE statement, which I converted into an UPDATE SET FROM WHERE construct. In the original query I use the pseudo column ROWID to match a source row with a target row.

This is a simplified version of such a query:

 

MERGE INTO test_large d

     USING (SELECT ROWID, test_large.*

              FROM test_large

             WHERE grp = 1) s

        ON (d.ROWID = s.ROWID)

WHEN MATCHED

THEN

   UPDATE SET d.grp = s.grp;

 

It has the following execution plan:

 

---------------------------------------------------------------------------------------------

| Id  | Operation                      | Name       | Rows  | Bytes | Cost (%CPU)| Time     |

---------------------------------------------------------------------------------------------

|   0 | MERGE STATEMENT                |            |  1000 | 16000 |  2006   (0)| 00:00:25 |

|   1 |  MERGE                         | TEST_LARGE |       |       |            |          |

|   2 |   VIEW                         |            |       |       |            |          |

|   3 |    NESTED LOOPS                |            |  1000 |   220K|  2006   (0)| 00:00:25 |

|   4 |     TABLE ACCESS BY INDEX ROWID| TEST_LARGE |  1000 |   110K|  1006   (0)| 00:00:13 |

|*  5 |      INDEX RANGE SCAN          | IX_TL_GRP  |  1000 |       |     6   (0)| 00:00:01 |

|   6 |     TABLE ACCESS BY USER ROWID | TEST_LARGE |     1 |   113 |     1   (0)| 00:00:01 |

---------------------------------------------------------------------------------------------

 

In the PostgreSQL version I use the column ctid for this. The above query becomes:

 

UPDATE test_large d

   SET grp = s.grp

  FROM (SELECT ctid, test_large.*

          FROM test_large

         WHERE grp = 1) s

 WHERE d.ctid = s.ctid;

 

I noticed that the execution plan for such a statement seems to be suboptimal:

 

Update on test_large d  (cost=155346.98..160367.08 rows=996 width=53)

  ->  Merge Join  (cost=155346.98..160367.08 rows=996 width=53)

        Merge Cond: (d.ctid = test_large.ctid)

        ->  Sort  (cost=154330.57..156833.16 rows=1001033 width=43)

              Sort Key: d.ctid

              ->  Seq Scan on test_large d  (cost=0.00..19366.33 rows=1001033 width=43)

        ->  Sort  (cost=1016.40..1018.89 rows=996 width=10)

              Sort Key: test_large.ctid

              ->  Index Scan using ix_tl_grp on test_large  (cost=0.42..966.80 rows=996 width=10)

                    Index Cond: (grp = 1)

 

I expected a tid scan to be used instead of a Seq scan. I did a VACUUM test_large to make sure statistics are valid.

I would suspect that a tid scan is even a bit faster than accessing via the primary key, since reading the index is not required.

 

I know I could change the WHERE clause to use the primary key, which results in the following plan:

 

Update on test_large d  (cost=0.85..3344.12 rows=995 width=57)

  ->  Nested Loop  (cost=0.85..3344.12 rows=995 width=57)

        ->  Index Scan using ix_tl_grp on test_large  (cost=0.42..994.84 rows=995 width=14)

              Index Cond: (grp = 1)

        ->  Index Scan using pk_test_large on test_large d  (cost=0.42..2.36 rows=1 width=47)

              Index Cond: (id = test_large.id)

 

So my question is more "Is a tid scan not possible / useful for an update" than "how do I accelerate this query".

 

BR

Dirk

 

--
Dirk Mika
Software Developer



mika:timing GmbH
Strundepark - Kürtener Str. 11b
51465 Bergisch Gladbach
Germany

fon +49 2202 2401-1197
dirk.mika@mikatiming.de
www.mikatiming.de

AG Köln HRB 47509 * WEEE-Reg.-Nr. DE 90029884
Geschäftsführer: Harald Mika, Jörg Mika


 
Attachment

Re: Use ctid in where clause in update from statement

From
Achilleas Mantzios
Date:
On 1/7/19 12:13 μ.μ., Dirk Mika wrote:

Hello,

 

I come from the Oracle world and am trying to convert some queries to PostgreSQL syntax. One of these queries is a MERGE statement, which I converted into an UPDATE SET FROM WHERE construct. In the original query I use the pseudo column ROWID to match a source row with a target row.

This is a simplified version of such a query:

 

Postgresql supports upserts : https://www.postgresql.org/docs/11/sql-insert.html "ON CONFLICT"

 

BR

Dirk

 

--
Dirk Mika
Software Developer



mika:timing GmbH
Strundepark - Kürtener Str. 11b
51465 Bergisch Gladbach
Germany

fon +49 2202 2401-1197
dirk.mika@mikatiming.de
www.mikatiming.de

AG Köln HRB 47509 * WEEE-Reg.-Nr. DE 90029884
Geschäftsführer: Harald Mika, Jörg Mika



 


-- 
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt
Attachment

Re: Use ctid in where clause in update from statement

From
Thomas Kellerer
Date:
> I come from the Oracle world and am trying to convert some queries to
> PostgreSQL syntax. One of these queries is a MERGE statement, which I
> converted into an UPDATE SET FROM WHERE construct. In the original
> query I use the pseudo column ROWID to match a source row with a
> target row.
> 
> In the PostgreSQL version I use the column ctid for this. The above query becomes:
> 
> *UPDATE*test_large d
>    *SET* grp = s.grp
>   *FROM* (*SELECT* ctid, test_large.*
>           *FROM* test_large
>          *WHERE* grp = 1) s
>  *WHERE* d.ctid = s.ctid;

Why don't you join on the primary key column? 
The ctid comparison is typically quite slow.

Thomas



Re: Use ctid in where clause in update from statement

From
Achilleas Mantzios
Date:
On 1/7/19 12:13 μ.μ., Dirk Mika wrote:

 

UPDATE test_large d

   SET grp = s.grp

  FROM (SELECT ctid, test_large.*

          FROM test_large

         WHERE grp = 1) s

 WHERE d.ctid = s.ctid;

 


Besides, what the above does is select for rows that have grp=1 and then set grp to the same value.

 

BR

Dirk

 

--
Dirk Mika
Software Developer



mika:timing GmbH
Strundepark - Kürtener Str. 11b
51465 Bergisch Gladbach
Germany

fon +49 2202 2401-1197
dirk.mika@mikatiming.de
www.mikatiming.de

AG Köln HRB 47509 * WEEE-Reg.-Nr. DE 90029884
Geschäftsführer: Harald Mika, Jörg Mika



 


-- 
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt
Attachment

Re: Use ctid in where clause in update from statement

From
Dirk Mika
Date:
I know I can join using the pk, but in oracle using the rowid is faster so I wanted to know, if this is possible in
PostgreSQLas well.
 

Dirk


--
Dirk Mika
Software Developer

mika:timing GmbH
Strundepark - Kürtener Str. 11b
51465 Bergisch Gladbach
Germany

fon +49 2202 2401-1197
dirk.mika@mikatiming.de
www.mikatiming.de

AG Köln HRB 47509 * WEEE-Reg.-Nr. DE 90029884
Geschäftsführer: Harald Mika, Jörg Mika



## How2Use
## the ChampionChip by mika:timing
## https://youtu.be/qfOFXrpSKLQ
Am 01.07.19, 11:36 schrieb "Thomas Kellerer" <spam_eater@gmx.net>:

    > I come from the Oracle world and am trying to convert some queries to
    > PostgreSQL syntax. One of these queries is a MERGE statement, which I
    > converted into an UPDATE SET FROM WHERE construct. In the original
    > query I use the pseudo column ROWID to match a source row with a
    > target row.
    >
    > In the PostgreSQL version I use the column ctid for this. The above query becomes:
    >
    > *UPDATE*test_large d
    >    *SET* grp = s.grp
    >   *FROM* (*SELECT* ctid, test_large.*
    >           *FROM* test_large
    >          *WHERE* grp = 1) s
    >  *WHERE* d.ctid = s.ctid;

    Why don't you join on the primary key column?
    The ctid comparison is typically quite slow.

    Thomas





Re: Use ctid in where clause in update from statement

From
Dirk Mika
Date:

This is just a very simplified sample and doesn’t makes sense, but it shows the execution plan. The original query is more complex.  It selects more values and updates more columns.

 

Dirk

 

--
Dirk Mika
Software Developer



mika:timing GmbH
Strundepark - Kürtener Str. 11b
51465 Bergisch Gladbach
Germany

fon +49 2202 2401-1197
dirk.mika@mikatiming.de
www.mikatiming.de

AG Köln HRB 47509 * WEEE-Reg.-Nr. DE 90029884
Geschäftsführer: Harald Mika, Jörg Mika


 

Von: Achilleas Mantzios <achill@matrix.gatewaynet.com>
Datum: Montag, 1. Juli 2019 um 11:50
An: "pgsql-general@lists.postgresql.org" <pgsql-general@lists.postgresql.org>
Betreff: Re: Use ctid in where clause in update from statement

 

On 1/7/19 12:13 μ.μ., Dirk Mika wrote:

 

UPDATE test_large d

   SET grp = s.grp

  FROM (SELECT ctid, test_large.*

          FROM test_large

         WHERE grp = 1) s

 WHERE d.ctid = s.ctid;

 


Besides, what the above does is select for rows that have grp=1 and then set grp to the same value.


 

BR

Dirk

 

--
Dirk Mika
Software Developer

cid:part1.F9658A75.C84DBA7E@matrix.gatewaynet.com

mika:timing GmbH

Strundepark - Kürtener Str. 11b
51465 Bergisch Gladbach
Germany

fon +49 2202 2401-1197
dirk.mika@mikatiming.de
www.mikatiming.de

AG Köln HRB 47509 * WEEE-Reg.-Nr. DE 90029884
Geschäftsführer: Harald Mika, Jörg Mika

 

cid:part2.3A5504AA.38B3DF7A@matrix.gatewaynet.com


 




-- 
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt
Attachment

Re: Use ctid in where clause in update from statement

From
Dirk Mika
Date:

Hi,

 

The problem with the INSERT ON CONFLICT is that an insert is tried here first, which may fire triggers.

In my case there is a fairly expensive INSERT Trigger ON EACH ROW, which I would like to avoid.

 

Since the SELECT statement only returns records whose PK already exists, I thought an UPDATE FROM would be the better choice.

As I said, with the PK as JOIN condition it works fine too. I'm only interested in why no tid scan is used.

 

Dirk

 

--
Dirk Mika
Software Developer



mika:timing GmbH
Strundepark - Kürtener Str. 11b
51465 Bergisch Gladbach
Germany

fon +49 2202 2401-1197
dirk.mika@mikatiming.de
www.mikatiming.de

AG Köln HRB 47509 * WEEE-Reg.-Nr. DE 90029884
Geschäftsführer: Harald Mika, Jörg Mika


 

Von: Achilleas Mantzios <achill@matrix.gatewaynet.com>
Datum: Montag, 1. Juli 2019 um 11:27
An: "pgsql-general@lists.postgresql.org" <pgsql-general@lists.postgresql.org>
Betreff: Re: Use ctid in where clause in update from statement

 

On 1/7/19 12:13 μ.μ., Dirk Mika wrote:

Hello,

 

I come from the Oracle world and am trying to convert some queries to PostgreSQL syntax. One of these queries is a MERGE statement, which I converted into an UPDATE SET FROM WHERE construct. In the original query I use the pseudo column ROWID to match a source row with a target row.

This is a simplified version of such a query:

 

Postgresql supports upserts : https://www.postgresql.org/docs/11/sql-insert.html "ON CONFLICT"


 

BR

Dirk

 

--
Dirk Mika
Software Developer

cid:part1.679C32D5.E7787ACA@matrix.gatewaynet.com

mika:timing GmbH

Strundepark - Kürtener Str. 11b
51465 Bergisch Gladbach
Germany

fon +49 2202 2401-1197
dirk.mika@mikatiming.de
www.mikatiming.de

AG Köln HRB 47509 * WEEE-Reg.-Nr. DE 90029884
Geschäftsführer: Harald Mika, Jörg Mika

 

cid:part2.E58C80B6.31D80ADB@matrix.gatewaynet.com


 




-- 
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt
Attachment

Re: Use ctid in where clause in update from statement

From
Thomas Kellerer
Date:
Dirk Mika schrieb am 01.07.2019 um 12:02:
> I know I can join using the pk, but in oracle using the rowid is
> faster so I wanted to know, if this is possible in PostgreSQL as
> well.
Well, in Postgres  ctid is not necessarily faster. 



Re: Use ctid in where clause in update from statement

From
Thomas Kellerer
Date:
Dirk Mika schrieb am 01.07.2019 um 13:18:
> The problem with the INSERT ON CONFLICT is that an insert is tried here first, which may fire triggers.
> 
> In my case there is a fairly expensive INSERT Trigger ON EACH ROW, which I would like to avoid.

The insert trigger will only be fired if an INSERT actually takes place. 

If INSERT ON CONFLICT results in an UPDATE (or a DO NOTHING), no INSERT trigger will be fired.




Re: Use ctid in where clause in update from statement

From
Dirk Mika
Date:
I've tried it with the following trigger:

CREATE TRIGGER tr_tl_test1
   BEFORE INSERT
   ON public.test_large
   FOR EACH ROW
EXECUTE PROCEDURE tf_tr_tl_test1 ()

The trigger function does nothing special:

CREATE OR REPLACE FUNCTION public.tf_tr_tl_test1 ()
   RETURNS trigger
   LANGUAGE 'plpgsql'
   VOLATILE
   NOT LEAKPROOF
   SECURITY INVOKER
   PARALLEL UNSAFE
AS
$$
BEGIN
   RAISE NOTICE 'Trigger called with: %', new;
   RETURN new;
END;
$$

If I do a

EXPLAIN ANALYZE
    INSERT INTO test_large (id)
         VALUES (2)
    ON CONFLICT
       ON CONSTRAINT pk_test_large
       DO NOTHING;

I get the following:

NOTICE:  Trigger called with: (2,,)
                                            QUERY PLAN
---------------------------------------------------------------------------------------------------
 Insert on test_large  (cost=0.00..0.01 rows=1 width=40) (actual time=0.153..0.153 rows=0 loops=1)
   Conflict Resolution: NOTHING
   Conflict Arbiter Indexes: pk_test_large
   Tuples Inserted: 0
   Conflicting Tuples: 1
   ->  Result  (cost=0.00..0.01 rows=1 width=40) (actual time=0.003..0.004 rows=1 loops=1)
 Planning Time: 0.142 ms
 Trigger tr_tl_test1: time=0.116 calls=1
 Execution Time: 0.180 ms

As you can see the trigger function is called for the row I try to insert, but you can also see that there's no tuple
insertedbut one conflicting.
 

Dirk



--
Dirk Mika
Software Developer

mika:timing GmbH
Strundepark - Kürtener Str. 11b
51465 Bergisch Gladbach
Germany

fon +49 2202 2401-1197
dirk.mika@mikatiming.de
www.mikatiming.de

AG Köln HRB 47509 * WEEE-Reg.-Nr. DE 90029884
Geschäftsführer: Harald Mika, Jörg Mika



## How2Use
## the ChampionChip by mika:timing
## https://youtu.be/qfOFXrpSKLQ
Am 01.07.19, 13:33 schrieb "Thomas Kellerer" <spam_eater@gmx.net>:

    Dirk Mika schrieb am 01.07.2019 um 13:18:
    > The problem with the INSERT ON CONFLICT is that an insert is tried here first, which may fire triggers.
    >
    > In my case there is a fairly expensive INSERT Trigger ON EACH ROW, which I would like to avoid.

    The insert trigger will only be fired if an INSERT actually takes place.

    If INSERT ON CONFLICT results in an UPDATE (or a DO NOTHING), no INSERT trigger will be fired.






Re: Use ctid in where clause in update from statement

From
Achilleas Mantzios
Date:
Hello Dirk,

pls don't top post, that's the rule here

On 1/7/19 2:40 μ.μ., Dirk Mika wrote:
> I've tried it with the following trigger:
>
> CREATE TRIGGER tr_tl_test1
>     BEFORE INSERT
>     ON public.test_large
>     FOR EACH ROW
> EXECUTE PROCEDURE tf_tr_tl_test1 ()
>
> The trigger function does nothing special:
>
> CREATE OR REPLACE FUNCTION public.tf_tr_tl_test1 ()
>     RETURNS trigger
>     LANGUAGE 'plpgsql'
>     VOLATILE
>     NOT LEAKPROOF
>     SECURITY INVOKER
>     PARALLEL UNSAFE
> AS
> $$
> BEGIN
>     RAISE NOTICE 'Trigger called with: %', new;
>     RETURN new;
> END;
> $$
>
> If I do a
>
> EXPLAIN ANALYZE
>      INSERT INTO test_large (id)
>           VALUES (2)
>      ON CONFLICT
>         ON CONSTRAINT pk_test_large
>         DO NOTHING;
>
> I get the following:
>
> NOTICE:  Trigger called with: (2,,)
>                                              QUERY PLAN
> ---------------------------------------------------------------------------------------------------
>   Insert on test_large  (cost=0.00..0.01 rows=1 width=40) (actual time=0.153..0.153 rows=0 loops=1)
>     Conflict Resolution: NOTHING
>     Conflict Arbiter Indexes: pk_test_large
>     Tuples Inserted: 0
>     Conflicting Tuples: 1
>     ->  Result  (cost=0.00..0.01 rows=1 width=40) (actual time=0.003..0.004 rows=1 loops=1)
>   Planning Time: 0.142 ms
>   Trigger tr_tl_test1: time=0.116 calls=1
>   Execution Time: 0.180 ms
>
> As you can see the trigger function is called for the row I try to insert, but you can also see that there's no tuple
insertedbut one conflicting.
 

A before trigger cannot know the actual. So a BEFORE INSERT trigger will behave as such even if no INSERT takes place.
If you want to skip your particular "normal" trigger , just run :
set session_replication_role to 'replica';
and run your upsert.

>
> Dirk
>
>
>
> --
> Dirk Mika
> Software Developer
>
> mika:timing GmbH
> Strundepark - Kürtener Str. 11b
> 51465 Bergisch Gladbach
> Germany
>
> fon +49 2202 2401-1197
> dirk.mika@mikatiming.de
> www.mikatiming.de
>
> AG Köln HRB 47509 * WEEE-Reg.-Nr. DE 90029884
> Geschäftsführer: Harald Mika, Jörg Mika
>
>
>
> ## How2Use
> ## the ChampionChip by mika:timing
> ## https://youtu.be/qfOFXrpSKLQ
> Am 01.07.19, 13:33 schrieb "Thomas Kellerer" <spam_eater@gmx.net>:
>
>      Dirk Mika schrieb am 01.07.2019 um 13:18:
>      > The problem with the INSERT ON CONFLICT is that an insert is tried here first, which may fire triggers.
>      >
>      > In my case there is a fairly expensive INSERT Trigger ON EACH ROW, which I would like to avoid.
>
>      The insert trigger will only be fired if an INSERT actually takes place.
>
>      If INSERT ON CONFLICT results in an UPDATE (or a DO NOTHING), no INSERT trigger will be fired.
>
>
>
>
>


-- 
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt




Re: Use ctid in where clause in update from statement

From
Thomas Kellerer
Date:
> As you can see the trigger function is called for the row I try to
> insert, but you can also see that there's no tuple inserted but one
> conflicting.

Ah, right. 

Thinking about it, it _has_ to call any BEFORE trigger function 
as that might change values of the row to be inserted that 
could possibly change the outcome of the test. 







Re: Use ctid in where clause in update from statement

From
Dirk Mika
Date:
Hi

    Thinking about it, it _has_ to call any BEFORE trigger function
    as that might change values of the row to be inserted that
    could possibly change the outcome of the test.

Yeah, that was my thought, too. Unfortunately, the affected trigger changes two columns, so I can't change it to an
AFTERROW trigger.
 

Dirk







--
Dirk Mika
Software Developer

mika:timing GmbH
Strundepark - Kürtener Str. 11b
51465 Bergisch Gladbach
Germany

fon +49 2202 2401-1197
dirk.mika@mikatiming.de
www.mikatiming.de

AG Köln HRB 47509 * WEEE-Reg.-Nr. DE 90029884
Geschäftsführer: Harald Mika, Jörg Mika



## How2Use
## the ChampionChip by mika:timing
## https://youtu.be/qfOFXrpSKLQ

Re: Use ctid in where clause in update from statement

From
Tom Lane
Date:
Dirk Mika <Dirk.Mika@mikatiming.de> writes:
> I know I can join using the pk, but in oracle using the rowid is faster so I wanted to know, if this is possible in
PostgreSQLas well. 

Existing Postgres releases are not very bright about joins on CTID ---
basically merge join is the only plan type you can get for that.
v12 will improve that somewhat.

            regards, tom lane