Re: lifetime of the old CTID - Mailing list pgsql-general

From Laurenz Albe
Subject Re: lifetime of the old CTID
Date
Msg-id 193fd8fb631c7e8078de9d25bffcb1a04f6e1943.camel@cybertec.at
Whole thread Raw
In response to Re: lifetime of the old CTID  (Matthias Apitz <guru@unixarea.de>)
Responses Re: lifetime of the old CTID  (Karsten Hilbert <Karsten.Hilbert@gmx.net>)
List pgsql-general
On Wed, 2022-07-06 at 08:18 +0200, Matthias Apitz wrote:
> El día martes, julio 05, 2022 a las 10:44:23p. m. -0700, Christophe Pettus escribió:
> 
> > 
> > 
> > > On Jul 5, 2022, at 22:35, Matthias Apitz <guru@unixarea.de> wrote:
> > > Internally, in the DB layer, the read_where() builds the row list matching
> > > the WHERE clause as a SCROLLED CURSOR of
> > > 
> > >    SELECT ctid, * FROM d01buch WHERE ...
> > > 
> > > and each fetch() delivers the next row from this cursor. The functions
> > > start_transaction() and end_transaction() do what their names suggest and
> > > rewrite_actual_row() does a new SELECT based on the ctid of the actual row
> > > 
> > >    SELECT * FROM d01buch WHERE ctid = ... FOR UPDATE
> > >    ...
> > >    UPDATE ...
> > 
> > On first glance, it appears that you are using the ctid as a primary key for a row, and that's highly
not-recommended. The ctid is never intended to be stable in the database, as you have
 
> > discovered.  There are really no particular guarantees about ctid values being retained.
> > 
> > I'd suggest having a proper primary key column on the table, and using that instead.
> 
> Ofc, each table has its own primary key(s), used for example for the 
> SELECT ctid, * FROM d01buch WHERE ...
> 
> As I said, we came to PostgreSQL from Sybase (and Oracle) and Sybase has
> for each table a so called SYB_IDENTITY_COLUMN which is static for the
> table and its value does not change. When we would add now to some 400 tables an
> additional INTEGER column (and triggers to fill this on INSERT) this
> would be a big change in our DB layer and migration of databases in the
> field. Your suggesting (thanks for it in any case) is not that easy to
> implement, and no option at the moment.
> 
> At the moment the DB layer informs the application layer correctly when
> the row can not be found by the ctid for an UPDATE and the application
> must handle this situation correctly (logging, 2nd pass through with a
> new CURSOR etc.).
> 
> If I understand Laurenz correctly, there seems to be a way to keep the
> tuple old-ctid : new-ctid for some time, at least until the ney
> autovacuum.

No, there isn't.  I said that you might be able to mutilate the PostgreSQL
code so that it does that, but I think that would be a bad idea.

Using the primary key is the proper solution.  To be safe from concurrent
modifications, use a logic like in this pseudo-code:

FOR b IN SELECT pk, other_columns FROM books WHERE condition
   UPDATE books SET ... WHERE pk = ... AND condition

Checking the condition again on the inner UPDATE will detect concurrent
data modifications.  If the UPDATE changes nothing, then a book has been
removed or updated by a concurrent transaction, and you ignore it.

Yours,
Laurenz Albe
-- 
Cybertec | https://www.cybertec-postgresql.com



pgsql-general by date:

Previous
From: Matthias Apitz
Date:
Subject: Re: lifetime of the old CTID
Next
From: Laurenz Albe
Date:
Subject: Re: How to upgrade postgres version 8 to 13