Bug in visibility map WAL-logging - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Bug in visibility map WAL-logging
Msg-id 52AAF3CC.3090108@vmware.com
List pgsql-hackers
lazy_vacuum_page() does this:

1. START_CRIT_SECTION()
2. Remove dead tuples from the page, marking the itemids unused.
3. MarkBufferDirty
4. If all remaining tuples on the page are visible to everyone, set the
all-visible flag on the page, and call visibilitymap_set() to set the VM
bit.
5. visibilitymap_set() writes a WAL record about setting the bit in the
visibility map.
6. write the WAL record of removing the dead tuples.
7. END_CRIT_SECTION().

See the problem? Setting the VM bit is WAL-logged first, before the 
removal of the tuples. If you stop recovery between the two WAL records, 
the page's bit in the visibility map will be set, but the dead tuples are
still on the page.
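
To make the failure concrete, here is a toy model of replaying a prefix of the WAL stream. It is only a sketch, not PostgreSQL code: the record labels, the `State` class, and the `replay` and `is_consistent` helpers are all made up for illustration, and the page is assumed to start with two dead tuples.

```python
# Toy model only: none of these names exist in PostgreSQL.
from dataclasses import dataclass, field


@dataclass
class State:
    # Dead line pointers still present on the heap page (assumed starting point).
    dead_tuples: set = field(default_factory=lambda: {1, 2})
    # The page's all-visible bit in the visibility map.
    vm_bit: bool = False


def replay(records, upto):
    """Apply the first `upto` records to a fresh state, as recovery would."""
    state = State()
    for rec in records[:upto]:
        if rec == "SET_VM_BIT":
            state.vm_bit = True
        elif rec == "REMOVE_DEAD_TUPLES":
            state.dead_tuples.clear()
    return state


def is_consistent(state):
    # The VM bit may only be set when no dead tuples remain on the page.
    return not (state.vm_bit and state.dead_tuples)


# Order in which the two records are written by the steps listed above.
buggy_order = ["SET_VM_BIT", "REMOVE_DEAD_TUPLES"]

stopped_between = replay(buggy_order, upto=1)  # recovery stops after record 1
fully_replayed = replay(buggy_order, upto=2)   # recovery replays both records
```

In this model `is_consistent(stopped_between)` is false while `is_consistent(fully_replayed)` is true, which is the window described above.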

This bug was introduced in Feb 2013, by commit 
fdf9e21196a6f58c6021c967dc5776a16190f295, so it's present in 9.3 and master.

The fix seems quite straightforward: we just have to change the order of
those two records. I'll go commit that.
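
As a rough check of why swapping the records is enough, the sketch below walks through every possible stopping point of recovery for a given record order. It is a toy model under stated assumptions, not the actual patch: the record labels and the `apply` and `first_bad_stop_point` helpers are hypothetical, and a page is reduced to a (dead tuple count, VM bit) pair.

```python
# Toy model only: record names and helpers are hypothetical, not PostgreSQL APIs.
def apply(record, page):
    """Return the page state after redoing one record."""
    dead, vm_bit = page
    if record == "REMOVE_DEAD_TUPLES":
        return (0, vm_bit)
    if record == "SET_VM_BIT":
        return (dead, True)
    raise ValueError(f"unknown record: {record}")


def first_bad_stop_point(records, dead_before=2):
    """Return how many records had been replayed when the VM bit first
    covers a page that still has dead tuples, or None if that never happens."""
    page = (dead_before, False)
    for n, record in enumerate(records, start=1):
        page = apply(record, page)
        dead, vm_bit = page
        if vm_bit and dead > 0:
            return n
    return None


fixed_order = ["REMOVE_DEAD_TUPLES", "SET_VM_BIT"]
old_order = ["SET_VM_BIT", "REMOVE_DEAD_TUPLES"]
```

With the records swapped, `first_bad_stop_point(fixed_order)` finds no stopping point where the bit is set over dead tuples, whereas `first_bad_stop_point(old_order)` reports the state right after the first record.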

- Heikki


