Hi Community,
Let me use a small example to demonstrate my observation.
Step 1: create a simple table, insert a tuple and update it.
create table ta (id int, name varchar(32), age int);
insert into ta values(1, 'aa', 4);
update ta set age=99 where id=1;
Step 2: with pageinspect, we can the 2 version of the tuple:
SELECT * FROM heap_page_items(get_raw_page('ta', 0));
lp | lp_off | lp_flags | lp_len | t_xmin | t_xmax | t_field3 | t_ctid | t_infomask2 | t_infomask | t_hoff | t_bits | t_oid | t_data
----+--------+----------+--------+--------+--------+----------+--------+-------------+------------+--------+--------+-------+----------------------------
1 | 8152 | 1 | 36 | 765 | 765 | 0 | (0,2) | 16387 | 34 | 24 | | | \x010000000761610004000000
2 | 8112 | 1 | 36 | 765 | 0 | 2 | (0,2) | 32771 | 10242 | 24 | | | \x010000000761610063000000
(2 rows)
The old version's ctid now points to (0,2) which is expected.
Step 3: create a index on the table
create index idx_ta_age on ta(age);
Step 4: view the index page
evantest=# SELECT * FROM bt_page_items('idx_ta_age', 1);
itemoffset | ctid | itemlen | nulls | vars | data | dead | htid | tids
------------+-------+---------+-------+------+-------------------------+------+-------+------
1 | (0,1) | 16 | f | f | 63 00 00 00 00 00 00 00 | f | (0,1) |
(1 row)
Here comes my question, why the index entry's ctid points to the old version tuple?
I understand that, for updated tuples, old version's ctid points to new version, that builds a chain of all versions. But my confusion is that, when an index is created, older transactions and in-progress transactions won't see the newly created index. So, it should be ok for the index to point to the newest version of tuple version that is visible to the index.
Can anyone please explain me about that?
Thanks and regards,
Chao Li (Evan)
------------------------------