Re: Questions about update, delete, ctid... - Mailing list pgsql-general

From Martijn van Oosterhout
Subject Re: Questions about update, delete, ctid...
Date
Msg-id 20060731093757.GB27863@svana.org
Whole thread Raw
In response to Re: Questions about update, delete, ctid...  (DANTE Alexandra <Alexandra.Dante@bull.net>)
Responses Re: Questions about update, delete, ctid...  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Questions about update, delete, ctid...  (DANTE Alexandra <Alexandra.Dante@bull.net>)
List pgsql-general
On Mon, Jul 31, 2006 at 11:04:58AM +0200, DANTE Alexandra wrote:
> I've just seen that I've done a mistake in my example. My question was :
> is it correct to think that the ctid of the old version of the tuple is
> a link to newer version ? In my example, is it correct to think that the
> tuple :
> 140049    |    0 | (0,12) |          11 | IRAQ
> has become :
> new value |    0 | (0,26) |          11 | *IRAQ*
> Could you give me more details about the link between the old and the
> new version, please ?
> For me, the link is the c_tid, but maybe I'm wrong...

Well, in your case where there are no other transactions running, yes.
In the general case there may have been other updates so all you know
is that the new tuple is a descendant of the old one. The chain of
t_ctid links can be arbitrarily long.

Note: with multiple psql sessions you can see some of this happening.

======= Session 1 =======
test=# begin;
BEGIN
test=# set TRANSACTION ISOLATION LEVEL SERIALIZABLE ;
SET
test=# select xmin, xmax, cmin, cmax, * from a;
  xmin  | xmax | cmin | cmax | country
--------+------+------+------+---------
 277264 |    0 |    0 |    0 | IRAQ
(1 row)

======= Session 2 =======
test=# select xmin, xmax, cmin, cmax, * from a;
  xmin  | xmax | cmin | cmax | country
--------+------+------+------+---------
 277264 |    0 |    0 |    0 | IRAQ
(1 row)

test=# update a set country = 'ITALY';
UPDATE 1
test=# select xmin, xmax, cmin, cmax, * from a;
  xmin  | xmax | cmin | cmax | country
--------+------+------+------+---------
 277269 |    0 |    0 |    0 | ITALY
(1 row)

======= Session 1 again =======
test=# select xmin, xmax, cmin, cmax, * from a;
  xmin  |  xmax  |  cmin  | cmax | country
--------+--------+--------+------+---------
 277264 | 277269 | 277269 |    0 | IRAQ
(1 row)

As you can see now, both sessions are seeing different views of the
same table. The old tuple has now updated xmax and cmin values. If
session two updated the tuple again it would get a cmax value also. You
can't see the t_ctid link here, but there is one from the old row to
the new one.

> My question about the "infomask" strucutre was linked to the code of
> VACUUM. I've seen in the "lazy_scan_heap method that the
> "HeapTupleSatisfiesVacuum" method is called. In this method, according
> to the value of "infomask", a tuple is defined as "dead" or not.
> That's why I wonder if the "infomask" structure is changed after an
> commited update or delete, and what are the values set ?

It was Tom who pointed this out to me first: the infomask is not the
important part. The infomask is just a cache of the results of tests.

The problem is that checking if a transaction has been committed or not
can be reasonably expensive since it might have to check on disk. If
you had to do that every time you looked up a tuple the performence
would be terrible. So what happens is that the first time someone looks
up the status of a transaction and finds it's committed or aborted, it
sets that bit so no-one else has to do the test.

The basic result is that you can read the code as if the infomask was
blank and the result should be the same. The only difference is that
various bits allow the code to skip certain tests because somebody has
already done them before. The end result should be the same.

> So, consequently, it is not necessary to rebuild the B-tree index after
> an update or a delete.
> Is it correct ?

You never have to rebuild the index. The whole system is designed so
many people can be reading and writing the index simultaneously without
getting in eachothers way.

Hope this helps,
--
Martijn van Oosterhout   <kleptog@svana.org>   http://svana.org/kleptog/
> From each according to his ability. To each according to his ability to litigate.

Attachment

pgsql-general by date:

Previous
From: DANTE Alexandra
Date:
Subject: Re: Questions about update, delete, ctid...
Next
From: Teodor Sigaev
Date:
Subject: Re: number of distinct values in tsearch2 gist index