Re: lifetime of the old CTID - Mailing list pgsql-general

From Matthias Apitz
Subject Re: lifetime of the old CTID
Date
Msg-id 20220706122600.GA24@sh4-5.1blu.de
Whole thread Raw
In response to Re: lifetime of the old CTID  (Karsten Hilbert <Karsten.Hilbert@gmx.net>)
Responses Re: lifetime of the old CTID  ("Peter J. Holzer" <hjp-pgsql@hjp.at>)
Re: lifetime of the old CTID  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Re: lifetime of the old CTID  (Ilya Anfimov <ilan@tzirechnoy.com>)
List pgsql-general
El día Mittwoch, Juli 06, 2022 a las 11:45:14 +0200, Karsten Hilbert escribió:

> Am Wed, Jul 06, 2022 at 08:18:42AM +0200 schrieb Matthias Apitz:
> 
> > > On first glance, it appears that you are using the ctid as a primary key for a row, and that's highly
not-recommended. The ctid is never intended to be stable in the database, as you have discovered.  There are really no
particularguarantees about ctid values being retained.
 
> > >
> > > I'd suggest having a proper primary key column on the table, and using that instead.
> >
> > Ofc, each table has its own primary key(s), used for example for the
> > SELECT ctid, * FROM d01buch WHERE ...
> >
> > As I said, we came to PostgreSQL from Sybase (and Oracle) and Sybase has
> > for each table a so called SYB_IDENTITY_COLUMN which is static for the
> > table and its value does not change. When we would add now to some 400 tables an
> > additional INTEGER column (and triggers to fill this on INSERT) this
> > would be a big change in our DB layer and migration of databases in the
> > field. Your suggesting (thanks for it in any case) is not that easy to
> > implement, and no option at the moment.
> 
> Christopher suggested to *use* the primary key, not to *add*
> one.
> 
> You said that there *is* a primary key.
> 
> So, more thought/explanation would need to go into why that
> cannot be used.

This is because of the generic, generated structure of the DB layer for
all the 400 tables. The provided pseudo-code example should only explain
from where our problem comes from. The big CURSOR is done in the DB layer by
the application calling something like

    read_where(d01buch, "WHERE d01status=4");

i.e. the application provides the table name (d01buch) and a
WHERE-clause of its interest ("WHERE d01status=4"), here all books which
are lent to patrons. This gives in the DB layer a CURSOR of say 100.000
rows of the 3.000.000 in the table. Now the application fetches row
by row and see if something should be done with the row. If so, the
DB layer must LOCK the row for update. It does so using the CTID. Of
course there is a key in the row (d01gsi, the signature of the book),
but this is not uniqu and can't be used to lock exactly this row for update.

The same problem could occure in any or the other 400 tables.

Interestingly, I tested today morning how long the new CTID can be seen
with currtid2(). I did 10 or more updates of a row and the then new CTID
could always be seen with the old CTID from the moment before the 10 updates. I even
found no way to get this tuple broken. Only deletion of the row helped
to make currtid2() loosing the relation. This is with a 14.1 server. Why
the 13.1 behaves different?

    matthias

-- 
Matthias Apitz, ✉ guru@unixarea.de, http://www.unixarea.de/ +49-176-38902045
Public GnuPG key: http://www.unixarea.de/key.pub



pgsql-general by date:

Previous
From: Ludwig Isaac Lim
Date:
Subject: Error when pasting function blocks into psql
Next
From: Alvaro Herrera
Date:
Subject: Re: lifetime of the old CTID