Re: Logical to physical page mapping - Mailing list pgsql-hackers

From Gavin Flower
Subject Re: Logical to physical page mapping
Date
Msg-id 508C4E5B.6090606@archidevsys.co.nz
In response to Re: Logical to physical page mapping  (Heikki Linnakangas <hlinnakangas@vmware.com>)
List pgsql-hackers
On 28/10/12 07:41, Heikki Linnakangas wrote:
> On 27.10.2012 16:43, Tom Lane wrote:
>> Jan Wieck<JanWieck@Yahoo.com> writes:
>>> The reason why we need full_page_writes is that we need to guard against
>>> torn pages or partial writes. So what if smgr would manage a mapping
>>> between logical page numbers and their physical location in the relation?
>>
>>> At the moment where we today require a full page write into WAL, we
>>> would mark the buffer as "needs relocation". The smgr would then write
>>> this page into another physical location whenever it is time to write it
>>> (via the background writer, hopefully). After that page is flushed, it
>>> would update the page location pointer, or whatever we want to call it.
>>> A thus free'd physical page location can be reused, once the location
>>> pointer has been flushed to disk. This is a critical ordering of writes.
>>> First the page at the new location, second the pointer to the current
>>> location. Doing so would make write(2) appear atomic to us, which is
>>> exactly what we need for crash recovery.
>
> Hmm, aka copy-on-write.
>
>> I think you're just moving the atomic-write problem from the data pages
>> to wherever you keep these pointers.
>
> If the pointers are stored as simple 4-byte integers, you probably 
> could assume that they're atomic, and won't be torn.
>
> There's a lot of practical problems in adding another level of 
> indirection to every page access, though. It'll surely add some 
> overhead to every access, even if the data never changes. And it's not 
> at all clear to me that it would perform better than full_page_writes. 
> You're writing and flushing out roughly the same amount of data AFAICS.
>
> What exactly is the problem with full_page_writes that we're trying to 
> solve?
>
> - Heikki
>
>
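The write ordering Jan describes above can be sketched roughly as follows.
This is only an illustration, not smgr code: relocate_page, read_ptr, the
separate map file, and the little-endian 4-byte pointer layout are all
invented for the example, and os.pread/os.pwrite assume a Unix-like system.

```python
import os
import struct

PAGE_SIZE = 8192                      # assumed: PostgreSQL's default block size
PTR_FMT = "<I"                        # hypothetical: one 4-byte pointer per logical page
PTR_SIZE = struct.calcsize(PTR_FMT)


def read_ptr(map_fd, logical):
    """Return the physical slot currently recorded for a logical page."""
    raw = os.pread(map_fd, PTR_SIZE, logical * PTR_SIZE)
    return struct.unpack(PTR_FMT, raw)[0]


def relocate_page(data_fd, map_fd, logical, new_physical, page):
    """Write `page` to a fresh slot, then repoint the logical page at it.

    Returns the old physical slot, which may be reused only after this
    function returns (that is, after the pointer has been flushed).
    """
    assert len(page) == PAGE_SIZE
    old_physical = read_ptr(map_fd, logical)

    # First: the page at the new location, flushed before anything else.
    os.pwrite(data_fd, page, new_physical * PAGE_SIZE)
    os.fsync(data_fd)

    # Second: the pointer to the current location, flushed in turn.
    os.pwrite(map_fd, struct.pack(PTR_FMT, new_physical), logical * PTR_SIZE)
    os.fsync(map_fd)

    return old_physical
```

If a crash hits between the two flushes, the pointer still names the old,
intact copy of the page. Whether the 4-byte pointer write itself can be
torn is exactly the point Tom and Heikki are debating; the sketch does not
settle it.
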
Would a 4-byte pointer be adequate for a 64-bit machine with well over 
4 GB used by Postgres?
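
A quick back-of-the-envelope check, in case it helps frame the question.
The answer depends on what the pointer counts. The sketch assumes the
default 8 kB block size; addressable_bytes is just a helper name made up
for the example.

```python
PAGE_SIZE = 8192    # assumed: PostgreSQL's default block size
POINTER_BITS = 32   # the 4-byte pointer under discussion


def addressable_bytes(pointer_bits, unit_size):
    """Bytes reachable when each pointer value names one unit of unit_size bytes."""
    return (2 ** pointer_bits) * unit_size


as_byte_offset = addressable_bytes(POINTER_BITS, 1)
as_page_number = addressable_bytes(POINTER_BITS, PAGE_SIZE)

print(as_byte_offset // 2**30, "GiB if the pointer is a byte offset")
print(as_page_number // 2**40, "TiB if the pointer is a page number")
```

So a 4-byte value runs out at 4 GiB only if it is a byte address. If it is
a page number within a relation, it reaches 32 TiB under the assumed block
size.
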


Cheers,
Gavin


