Re: visibility map - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: visibility map
Msg-id 4C15BBEA.500@enterprisedb.com
In response to visibility map  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: visibility map  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 14/06/10 06:08, Robert Haas wrote:
> visibilitymap.c begins with a long and useful comment - but this part
> seems to have a bit of split personality disorder.
>
>   * Currently, the visibility map is not 100% correct all the time.
>   * During updates, the bit in the visibility map is cleared after releasing
>   * the lock on the heap page. During the window between releasing the lock
>   * and clearing the bit in the visibility map, the bit in the visibility map
>   * is set, but the new insertion or deletion is not yet visible to other
>   * backends.
>   *
>   * That might actually be OK for the index scans, though. The newly inserted
>   * tuple wouldn't have an index pointer yet, so all tuples reachable from an
>   * index would still be visible to all other backends, and deletions wouldn't
>   * be visible to other backends yet.  (But HOT breaks that argument, no?)
>
> I believe that the answer to the parenthesized question here is "yes"
> (in which case we might want to just delete this paragraph).

A HOT update can only update non-indexed columns, so I think we're still
OK with HOT. To an index-only scan, it doesn't matter which tuple in a
HOT update chain you consider live, because they must all have the
same value in the indexed columns. Subtle..
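
To make the HOT argument concrete, here is a minimal sketch of it as a toy model. Nothing here is PostgreSQL's actual heap code: ToyTuple, toy_hot_update and the other names are hypothetical, and a tuple is reduced to one indexed and one non-indexed column. It only shows that if updates touching the indexed column are refused as HOT, every member of the chain yields the same value to an index-only scan.

```c
#include <stdbool.h>
#include <stddef.h>

/* Toy model only; these are not PostgreSQL's heap structures. */
typedef struct ToyTuple
{
    int indexed_col;    /* column covered by the index */
    int other_col;      /* non-indexed column */
} ToyTuple;

/* A HOT update is allowed only if no indexed column changes. */
bool
toy_hot_update_allowed(const ToyTuple *old_tup, const ToyTuple *new_tup)
{
    return old_tup->indexed_col == new_tup->indexed_col;
}

/*
 * Append new_tup to the chain if it qualifies as a HOT update of the
 * current chain tail. Returns the new chain length (unchanged on refusal).
 */
size_t
toy_hot_update(ToyTuple *chain, size_t len, size_t cap, ToyTuple new_tup)
{
    if (len == 0 || len >= cap)
        return len;
    if (!toy_hot_update_allowed(&chain[len - 1], &new_tup))
        return len;     /* would need a new index entry, so not HOT */
    chain[len] = new_tup;
    return len + 1;
}

/* Value an index-only scan would report if member i were the live tuple. */
int
toy_index_only_value(const ToyTuple *chain, size_t i)
{
    return chain[i].indexed_col;
}

/* True if every chain member gives the same index-only answer. */
bool
toy_chain_agrees(const ToyTuple *chain, size_t len)
{
    for (size_t i = 1; i < len; i++)
    {
        if (toy_index_only_value(chain, i) != toy_index_only_value(chain, 0))
            return false;
    }
    return true;
}
```

In this model toy_chain_agrees() cannot return false for a chain built only through toy_hot_update(), which is the whole point: picking the "wrong" chain member as live does not change the scan's result.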

>   * There's another hole in the way the PD_ALL_VISIBLE flag is set. When
>   * vacuum observes that all tuples are visible to all, it sets the flag on
>   * the heap page, and also sets the bit in the visibility map. If we then
>   * crash, and only the visibility map page was flushed to disk, we'll have
>   * a bit set in the visibility map, but the corresponding flag on the heap
>   * page is not set. If the heap page is then updated, the updater won't
>   * know to clear the bit in the visibility map.  (Isn't that prevented by
>   * the LSN interlock?)
>
> I *think* that the answer to this parenthesized question is "no".
> When we vacuum a page, we set the LSN on both the heap page and the
> visibility map page.  Therefore, neither of them can get written to
> disk until the WAL record is flushed, but they could get flushed in
> either order.  So the visibility map page could get flushed before the
> heap page, as the non-parenthesized portion of the comment indicates.

Right.
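
A rough sketch of why the LSN interlock doesn't help, again as a toy model with hypothetical names (SimPage, sim_write_page, sim_vm_ahead_of_heap), not real buffer manager code. The interlock only orders each page write after the WAL flush; it says nothing about the order of the two page writes relative to each other.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy page: just an LSN plus the one bit we care about. */
typedef struct SimPage
{
    uint64_t lsn;           /* LSN of the last change to the page */
    bool     all_visible;   /* PD_ALL_VISIBLE on the heap page, or the VM bit */
} SimPage;

/*
 * WAL-before-data rule: the in-memory page may be written to disk only
 * once WAL has been flushed at least up to the page's LSN.
 * Returns true if the write happened.
 */
bool
sim_write_page(const SimPage *mem, SimPage *disk, uint64_t wal_flushed_upto)
{
    if (mem->lsn > wal_flushed_upto)
        return false;       /* interlock: WAL not flushed far enough yet */
    *disk = *mem;
    return true;
}

/* The problematic on-disk state: VM bit set, heap flag not set. */
bool
sim_vm_ahead_of_heap(const SimPage *heap_disk, const SimPage *vm_disk)
{
    return vm_disk->all_visible && !heap_disk->all_visible;
}
```

If the visibility map page is written with sim_write_page() after the WAL flush and the crash comes before the heap page is written, sim_vm_ahead_of_heap() is true for the on-disk state, even though the interlock was respected for every write.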

> However, at least in theory, it seems like we could fix this up during
> redo.

Setting a bit in the visibility map is currently not WAL-logged, but yes,
once we add WAL-logging, that's straightforward to fix.
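
For illustration, the redo-time fix might look roughly like the sketch below. It assumes a WAL record for "set all-visible" that does not exist today; RedoPage, ToyVisibleRecord and the toy_redo_* functions are made-up names for this example. Replay applies the record to whichever of the two pages is older than the record, so a heap page that missed the flush gets its flag back.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy page: an LSN plus the all-visible flag (heap) or bit (VM). */
typedef struct RedoPage
{
    uint64_t lsn;
    bool     all_visible;
} RedoPage;

/* Hypothetical WAL record saying "this heap page became all-visible". */
typedef struct ToyVisibleRecord
{
    uint64_t lsn;
} ToyVisibleRecord;

/*
 * Replay the record against one page. The usual LSN check makes this
 * idempotent: a page already at or past the record's LSN is left alone.
 * Returns true if the page was modified.
 */
bool
toy_redo_one(RedoPage *page, const ToyVisibleRecord *rec)
{
    if (page->lsn >= rec->lsn)
        return false;       /* change already on the page */
    page->all_visible = true;
    page->lsn = rec->lsn;
    return true;
}

/* Replay against both pages, so neither can be left behind the other. */
void
toy_redo_visible(RedoPage *heap, RedoPage *vm, const ToyVisibleRecord *rec)
{
    toy_redo_one(heap, rec);
    toy_redo_one(vm, rec);
}
```

Starting from the crash state described in the comment (VM page flushed with the bit set, heap page not), replaying the record sets the flag on the heap page and leaves the VM page untouched.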

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

