Thread: Trying to understand page structures in PG

Trying to understand page structures in PG

From
Rakesh Kumar
Date:
Hello

I understand that when an update of say 100,000 rows are made, PG
writes the updated rows as a new row. These new rows are not visible
to any sessions except the one creating it. At commit time PG flips
something internally to make these rows visible to all.

My Q: what happens to those rows which use to contain the values
before the update. Shouldn't something change in those rows to
indicate that those rows are no longer valid. Who does it chain those
rows to the new rows.

thanks.


Re: Trying to understand page structures in PG

From
Albe Laurenz
Date:
Rakesh Kumar wrote:
> I understand that when an update of say 100,000 rows are made, PG
> writes the updated rows as a new row. These new rows are not visible
> to any sessions except the one creating it. At commit time PG flips
> something internally to make these rows visible to all.
> 
> My Q: what happens to those rows which use to contain the values
> before the update. Shouldn't something change in those rows to
> indicate that those rows are no longer valid. Who does it chain those
> rows to the new rows.

Every row has two system columns associated with it: xmin and xmax

xmin is the transaction ID that created the row, while xmax is
the transaction ID that removed the row.

So when an update takes place, xmax of the original row and xmin
of the new row are set to the transaction ID of the current transaction.

Furthermore, the commit log (CLOG) logs for each transaction whether
it was committed or rolled back.

Now when a backend examines a row, it first checks if the row is
visible, i.e. xmin must be less or equal to the current transaction ID
and xmax must be 0 or greater than the current transaction ID
or belong to a transaction that was rolled back.

To save CLOG lookups, the first reader who consults the CLOG will
save the result in so-called "hint bits" on the row itself.

Yours,
Laurenz Albe

Re: Trying to understand page structures in PG

From
"Joshua D. Drake"
Date:
On 04/06/2016 02:39 AM, Rakesh Kumar wrote:
> Hello
>
> I understand that when an update of say 100,000 rows are made, PG
> writes the updated rows as a new row. These new rows are not visible
> to any sessions except the one creating it. At commit time PG flips
> something internally to make these rows visible to all.
>
> My Q: what happens to those rows which use to contain the values
> before the update. Shouldn't something change in those rows to
> indicate that those rows are no longer valid. Who does it chain those
> rows to the new rows.

They are marked dead and at a threshold vacuum will come along
automatically and mark them reusable.

The vacuum and maintenance docs explain this pretty well.

Sincerely,

JD

>
> thanks.
>
>


--
Command Prompt, Inc.                  http://the.postgres.company/
                         +1-503-667-4564
PostgreSQL Centered full stack support, consulting and development.
Everyone appreciates your honesty, until you are honest with them.


Re: Trying to understand page structures in PG

From
rob stone
Date:
On Wed, 2016-04-06 at 10:33 +0000, Albe Laurenz wrote:

<snip>
> Every row has two system columns associated with it: xmin and xmax
>
> xmin is the transaction ID that created the row, while xmax is
> the transaction ID that removed the row.
>
> So when an update takes place, xmax of the original row and xmin
> of the new row are set to the transaction ID of the current
> transaction.
>
> 
> Yours,
> Laurenz Albe
>
</snip>

Hello Laurenz,

Out of curiosity:-
1) If you only updated a column defined as BOOLEAN, DATE or TIMESTAMP,
the tuples physical size should be the same. Is it written back to the
same page with altered xmin, xmax values?

2) If you defined a table with a FILLFACTOR of say 70%, and there is
still space left on its page for the updated tuple, does the same
apply?

I am curious because of "while xmax is the transaction ID that
*removed* the row".

Not urgent.

Cheers,
rob


Re: Trying to understand page structures in PG

From
Jeff Janes
Date:
On Wed, Apr 6, 2016 at 8:35 AM, rob stone <floriparob@gmail.com> wrote:
> On Wed, 2016-04-06 at 10:33 +0000, Albe Laurenz wrote:
>
> <snip>
>> Every row has two system columns associated with it: xmin and xmax
>>
>> xmin is the transaction ID that created the row, while xmax is
>> the transaction ID that removed the row.
>>
>> So when an update takes place, xmax of the original row and xmin
>> of the new row are set to the transaction ID of the current
>> transaction.

> Out of curiosity:-
> 1) If you only updated a column defined as BOOLEAN, DATE or TIMESTAMP,
> the tuples physical size should be the same. Is it written back to the
> same page with altered xmin, xmax values?

Being the same size doesn't matter, because it *has* to be copied.

If there is room for the copy to go onto the same page, then that is
done.  If there is not room, then it goes onto a different page.

>
> 2) If you defined a table with a FILLFACTOR of say 70%, and there is
> still space left on its page for the updated tuple, does the same
> apply?
>
> I am curious because of "while xmax is the transaction ID that
> *removed* the row".

"marked for removal" would be more accurate.  If the row were actually
physically removed, it would no longer have a xmax to set.

Cheers,

Jeff


Re: Trying to understand page structures in PG

From
Rakesh Kumar
Date:
On Wed, Apr 6, 2016 at 6:33 AM, Albe Laurenz <laurenz.albe@wien.gv.at> wrote:
> Rakesh Kumar wrote:

> Every row has two system columns associated with it: xmin and xmax
>
> xmin is the transaction ID that created the row, while xmax is
> the transaction ID that removed the row.
>
> So when an update takes place, xmax of the original row and xmin
> of the new row are set to the transaction ID of the current transaction.
>
> Furthermore, the commit log (CLOG) logs for each transaction whether
> it was committed or rolled back.
>
> Now when a backend examines a row, it first checks if the row is
> visible, i.e. xmin must be less or equal to the current transaction ID
> and xmax must be 0 or greater than the current transaction ID
> or belong to a transaction that was rolled back.
>
> To save CLOG lookups, the first reader who consults the CLOG will
> save the result in so-called "hint bits" on the row itself.

I am assuming the same must be true for the indexes also. Does PG look
up primary key
by examining the rows like you described above.


Re: Trying to understand page structures in PG

From
Albe Laurenz
Date:
Jeff Janes wrote:
>> I am curious because of "while xmax is the transaction ID that
>> *removed* the row".
> 
> "marked for removal" would be more accurate.  If the row were actually
> physically removed, it would no longer have a xmax to set.

Yes, thanks for the clarification.
I was thinking "logically removed", but I realize my wording was confusing.

Yours,
Laurenz Albe

Re: Trying to understand page structures in PG

From
Albe Laurenz
Date:
Rakesh Kumar wrote:
>> Every row has two system columns associated with it: xmin and xmax
>>
>> xmin is the transaction ID that created the row, while xmax is
>> the transaction ID that removed the row.
>>
>> So when an update takes place, xmax of the original row and xmin
>> of the new row are set to the transaction ID of the current transaction.
>>
>> Furthermore, the commit log (CLOG) logs for each transaction whether
>> it was committed or rolled back.
>>
>> Now when a backend examines a row, it first checks if the row is
>> visible, i.e. xmin must be less or equal to the current transaction ID
>> and xmax must be 0 or greater than the current transaction ID
>> or belong to a transaction that was rolled back.
>>
>> To save CLOG lookups, the first reader who consults the CLOG will
>> save the result in so-called "hint bits" on the row itself.
> 
> I am assuming the same must be true for the indexes also. Does PG look
> up primary key
> by examining the rows like you described above.

Rows remain in the index as long as they remain in the table.

What actually removes old rows from the table - VACUUM - also
removes them from the index.

The index does not contain visibility information about the indexed row,
so every row found in an index must still be checked for visibility,
either as described above, or in the case of an index-only scan
by checking the visibility map if all tuples on that heap page are
visible to all transactions.

Yours,
Laurenz Albe