Re: crash-safe visibility map, take three - Mailing list pgsql-hackers

From Jesper Krogh
Subject Re: crash-safe visibility map, take three
Date
Msg-id 4D24D31E.2040906@krogh.cc
Whole thread Raw
In response to crash-safe visibility map, take three  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: crash-safe visibility map, take three  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 2010-11-30 05:57, Robert Haas wrote:
> Last week, I posted a couple of possible designs for making the
> visibility map crash-safe, which did not elicit much comment.  Since
> this is an important prerequisite to index-only scans, I'm trying
> again.

The logic seems to be:

* If the visibillity map should be crash-safe if should be WAL-logged.
* PD_ALL_VISIBLE is currently not being WAL-logged when vacuum sets it.
* WAL logging the visibillity map bit is not "that" bad (small size).
* WAL-logging the PD_ALL_VISIBLE bit would can WAL-record for the entire  relation to be written out (potentially
huge).

Would the problem not be solved by not "trying to keep the two bits in 
sync" but
simply removing the PD_ALL_VISIBLE bit in the page-header in favor
for the bit in the visibillity map, that is now WAL-logged and thus safe 
to trust?

Then vacuum could emit WAL records for setting the visibillity map bits, 
combined
with changes on the page could clear it?

The question probably boils down to:

Given a crash-safe visibility map, what purpuse does the PD_ALL_VISIBLE 
bit serve?

I've probably just missed some logic?

Having index-only scans per-table ajustable would make quite some sense..

I have a couple of tables with a high turn-over rate that never get out 
of the OS-cache
anyway, the benefit of index-only scans are quite small, especially if 
they come with
additional overhead on INSERT/UPDATE/DELETE operations, whereas I also have
huge tables with a very small amount of changes. Just the saved IO of 
not having to
go to the heap in some cases would be good.

I could see some benefits in having the index-only scan work on 
tuple-level visibillity information
and not page-level, but that would require a bigger map
(allthough still less than 1% of the heap size if my calculations are 
correct), but
would enable visibillity testing of a tuple without going to the heap 
even other (unrelated)
changes happend on the same page.

Jesper

-- 
Jesper


pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: Re: [COMMITTERS] pgsql: Reduce spurious Hot Standby conflicts from never-visible records
Next
From: Josh Berkus
Date:
Subject: Re: making an unlogged table logged