Re: HOT documentation README - Mailing list pgsql-patches

From Heikki Linnakangas
Subject Re: HOT documentation README
Date
Msg-id 46DD478E.9000607@enterprisedb.com
Whole thread Raw
In response to HOT documentation README  (Bruce Momjian <bruce@momjian.us>)
List pgsql-patches
Bruce Momjian wrote:
> Heikki Linnakangas wrote:
>> Tom Lane wrote:
>>> "Pavan Deolasee" <pavan.deolasee@gmail.com> writes:
>>>> Please see the version 14 of HOT patch attached.
>>> I expected to find either a large new README, or some pretty substantial
>>> additions to existing README files, to document how this all works.
>> Here's an updated version of the README I posted earlier. It now
>> reflects the changes to how pruning works.
>
> I have taken this, and Pavan's documentation about CREATE INDEX, and
> worked up an updated README.  Comments?  Corrections?

Thanks, that's much better.

I made some corrections, patch attached. I clarified the terminology a
bit: "row", "row version" and "tuple" were mixed in some places. "Tuple"
and "row version" are synonyms in my mind, so they can be used
interchangeably, but "row" is not. Row is a higher level concept and
refers to what a user sees when he does a "SELECT * FROM foo". There can
be multiple row versions, or tuples, behind a single row.

I also changed "ctid" to "line pointer" in most places. ctid is a field
on a tuple, I don't think we use that term to refer to line pointers
anywhere.

> I plan to put this in src/backend/access/heap/README.HOT.

Sounds good. I didn't look at the create index stuff.

--
  Heikki Linnakangas
  EnterpriseDB   http://www.enterprisedb.com
*** README.HOT.Bruce    2007-09-04 12:36:19.000000000 +0100
--- ./-root-HOT    2007-09-04 12:38:20.000000000 +0100
***************
*** 14,24 ****
  --------------------

  PostgreSQL's MVCC system makes single-page vacuums (pruning) quite
! difficult because rows must remain after UPDATE or DELETE until all
  transaction snapshots that were active during the command have
  completed.  Traditionally, VACUUM must be run at a later time which
  sequentially scans the table and collects information about all obsolete
! rows.  VACUUM also removes index entries for obsolete rows.
  Unfortunately, VACUUM can be an expensive operation because of its full
  table scan and index cleanup requirement.

--- 14,24 ----
  --------------------

  PostgreSQL's MVCC system makes single-page vacuums (pruning) quite
! difficult because tuples must remain after UPDATE or DELETE until all
  transaction snapshots that were active during the command have
  completed.  Traditionally, VACUUM must be run at a later time which
  sequentially scans the table and collects information about all obsolete
! row versions.  VACUUM also removes index entries for obsolete rows.
  Unfortunately, VACUUM can be an expensive operation because of its full
  table scan and index cleanup requirement.

***************
*** 35,111 ****
  get a new index entry and is marked with the HEAP_ONLY_TUPLE flag.  (The
  old row is marked as HEAP_HOT_UPDATE.)  This allows the space taken by
  UPDATEd row versions to be reused during a single-page vacuum (pruning)
! when they is no longer visible to any running transactions.  This is
  possible because there is only one index entry for the entire UPDATE
  chain on the heap page.

  Internally, things are a bit more complicated:

      Index points to 1
!     ctid [1]  [2]

      [111111111]->[2222222222]

! In the above diagram, the index points to ctid 1, and is marked as
! HEAP_HOT_UPDATE.  Row versions 2 is a HOT UPDATE, meaning it has no
! index row pointing to it, and is marked as HEAP_HOT_UPDATE. Later, even
! if row 1 is no longer visible to any transaction, its ctid pointer
  cannot be removed by pruning because concurrent index/heap lookup
  activity might be happening on the page and removing it might interfere
! with other backends. However, the heap space for row 1 can be reused:

      Index points to 1
!     ctid [1]->[2]

      [2222222222]

! In this case the ctid pointer 1 points to ctid 2, which points to heap
! row version 2.

! If row 2 is updated to version 3, it looks like this:

      [Index points to 1]
      --------------------------------------------------------
!     ctid [1]->[2]  [3]

      [2222222222]->[3333333333]

  The arrow from 2 to 3 is part of the UPDATE chain already present on all
  update rows.

! At some later time when no transaction can see row 2 in its snapshot,
! the space taken by heap row 2 _and_ its ctid can be reused during a
  pruning, e.g.

      Index points to 1
!     ctid [1]------>[3]

      [3333333333]

! Notice that HEAP_HOT_UPDATE row 1 now points to row 3, and row 2 is now
! gone.  Again, this is possible because row 2 did not have an index
! entry.

  Pruning occurs when a row is UPDATEd and there is no free space on the
! page containing the old row.  Pruning scans the entire page looking for
! HEAP_HOT_UPDATE and HEAP_ONLY_TUPLE rows that can be removed.

  Row version 4 would look like this:

      Index points to 1
!     ctid [1]------>[3]  [4]

      [3333333333]->[4444444444]

  and when row 3 is no longer visible, this:

      Index points to 1
!     ctid [1]----------->[4]

      [4444444444]

! As you can see, ctid 1 has to remain, but the space taken by a ctid is
! small compare to a heap row.

  The requirements for doing a HOT update is that none of the indexed
  columns are changed. That is checked at execution time, comparing the
--- 35,113 ----
  get a new index entry and is marked with the HEAP_ONLY_TUPLE flag.  (The
  old row is marked as HEAP_HOT_UPDATE.)  This allows the space taken by
  UPDATEd row versions to be reused during a single-page vacuum (pruning)
! when they are no longer visible to any running transactions.  This is
  possible because there is only one index entry for the entire UPDATE
  chain on the heap page.

  Internally, things are a bit more complicated:

      Index points to 1
!     lp [1]  [2]

      [111111111]->[2222222222]

! In the above diagram, the index points to line pointer 1, and tuple 1 is marked as
! HEAP_HOT_UPDATE.  Row version 2 is a HOT tuple, meaning it has no
! index row pointing to it, and is marked as HEAP_ONLY_TUPLE. Later, even
! if tuple 1 is no longer visible to any transaction, its line pointer
  cannot be removed by pruning because concurrent index/heap lookup
  activity might be happening on the page and removing it might interfere
! with other backends. However, the space occupied by tuple 1 can be reused:

      Index points to 1
!     lp [1]->[2]

      [2222222222]

! In this case the line pointer 1 points to line pointer 2, which points to heap
! row version 2. Line pointer 1 is called a redirecting line pointer, because
! it points to another line pointer instead of a tuple.

! If the row 2 updated again, to version 3, it looks like this:

      [Index points to 1]
      --------------------------------------------------------
!     lp [1]->[2]  [3]

      [2222222222]->[3333333333]

  The arrow from 2 to 3 is part of the UPDATE chain already present on all
  update rows.

! At some later time when no transaction can see tuple 2 in its snapshot,
! the space taken by tuple 2 _and_ its line pointer can be reused during a
  pruning, e.g.

      Index points to 1
!     lp [1]------>[3]

      [3333333333]

! Notice that the line pointer 1 now points to line pointer 3, and tuple 2
! and its line pointer is now gone.  Again, this is possible because tuple
! 2 did not have an index entry.

  Pruning occurs when a row is UPDATEd and there is no free space on the
! page containing the old row. (XXX: not accurate. See "pruning" chapter
! for the rules on when we prune) Pruning scans the entire page looking for
! HEAP_ONLY_TUPLE tuples that can be removed.

  Row version 4 would look like this:

      Index points to 1
!     lp [1]------>[3]  [4]

      [3333333333]->[4444444444]

  and when row 3 is no longer visible, this:

      Index points to 1
!     lp [1]----------->[4]

      [4444444444]

! As you can see, line pointer 1 has to remain, but the space taken by a
! line pointer is small compare to a heap tuple.

  The requirements for doing a HOT update is that none of the indexed
  columns are changed. That is checked at execution time, comparing the
***************
*** 118,124 ****

  When doing an index scan, whenever we reach a non-visible tuple, we need
  to check if the tuple is HEAP_HOT_UPDATE.  If so, we need to follow the
! ctid pointer until we reach a visible one, or one that has not been
  HOT-updated.

  Sequential scans (and bitmap heap scans with a lossy bitmap) do not need
--- 120,126 ----

  When doing an index scan, whenever we reach a non-visible tuple, we need
  to check if the tuple is HEAP_HOT_UPDATE.  If so, we need to follow the
! line pointer until we reach a visible one, or one that has not been
  HOT-updated.

  Sequential scans (and bitmap heap scans with a lossy bitmap) do not need
***************
*** 134,143 ****

  Pruning removes more than just dead HOT tuples. Other dead tuples, such
  as those from DELETEs and aborted transactions, are truncated and leave
! behind only a dead line pointer.  In the illustration below, ctid 1 is
! dead and points to no heap row.

!     ctid [1D] [2]

      [2222222222]

--- 136,145 ----

  Pruning removes more than just dead HOT tuples. Other dead tuples, such
  as those from DELETEs and aborted transactions, are truncated and leave
! behind only a dead line pointer.  In the illustration below, line pointer
! 1 is dead and points to no heap row.

!     lp [1D] [2]

      [2222222222]


pgsql-patches by date:

Previous
From: Gregory Stark
Date:
Subject: Re: HOT documentation README
Next
From: Gregory Stark
Date:
Subject: Re: HOT documentation README