Re: Logical to physical page mapping - Mailing list pgsql-hackers

From	Heikki Linnakangas
Subject	Re: Logical to physical page mapping
Date	October 27, 2012 21:56:45
Msg-id	508C2AEF.1040004@vmware.com
In response to	Re: Logical to physical page mapping (Tom Lane <tgl@sss.pgh.pa.us>)
List	pgsql-hackers

On 27.10.2012 16:43, Tom Lane wrote:
> Jan Wieck <JanWieck@Yahoo.com> writes:
>> The reason why we need full_page_writes is that we need to guard against
>> torn pages or partial writes. So what if smgr would manage a mapping
>> between logical page numbers and their physical location in the relation?
>
>> At the moment where we today require a full page write into WAL, we
>> would mark the buffer as "needs relocation". The smgr would then write
>> this page into another physical location whenever it is time to write it
>> (via the background writer, hopefully). After that page is flushed, it
>> would update the page location pointer, or whatever we want to call it.
>> A thus free'd physical page location can be reused, once the location
>> pointer has been flushed to disk. This is a critical ordering of writes.
>> First the page at the new location, second the pointer to the current
>> location. Doing so would make write(2) appear atomic to us, which is
>> exactly what we need for crash recovery.

Hmm, aka copy-on-write.
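
The write ordering Jan describes (new page image first, location pointer
second, old slot reusable only after that) can be sketched as a toy
in-memory model. This is only an illustration under stated assumptions:
RelocatingStore, write_page, read_page and crash_before_pointer are
hypothetical names, not part of smgr or any PostgreSQL API, and "flush" is
modelled simply as the assignment having happened.

```python
class RelocatingStore:
    """Toy model of a logical -> physical page map with copy-on-write."""

    def __init__(self, nslots):
        self.slots = [None] * nslots      # simulated on-disk page slots
        self.mapping = {}                 # simulated durable location pointers
        self.free = list(range(nslots))   # physical slots available for reuse

    def write_page(self, logical, data, crash_before_pointer=False):
        new_slot = self.free.pop(0)
        # Step 1: write (and flush) the page image at the NEW location.
        self.slots[new_slot] = data
        if crash_before_pointer:
            # Simulated crash: the pointer still names the old, intact copy.
            # The new slot is leaked here; a real design would have to
            # rebuild its free list during recovery.
            return
        # Step 2: only now update (and flush) the location pointer.
        old_slot = self.mapping.get(logical)
        self.mapping[logical] = new_slot
        # Step 3: the old slot may be reused once the pointer is durable.
        if old_slot is not None:
            self.free.append(old_slot)

    def read_page(self, logical):
        return self.slots[self.mapping[logical]]


store = RelocatingStore(4)
store.write_page(0, b"v1")
store.write_page(0, b"v2", crash_before_pointer=True)
# After the simulated crash, readers still see the previous complete page.
assert store.read_page(0) == b"v1"
```

A crash between step 1 and step 2 leaves the old page fully readable, which
is the property the ordering is meant to provide. The sketch says nothing
about the cost of the extra lookup on every read.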

> I think you're just moving the atomic-write problem from the data pages
> to wherever you keep these pointers.

If the pointers are stored as simple 4-byte integers, you could probably
assume that they're atomic, and won't be torn.
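
One way to see why aligned 4-byte pointers are unlikely to be torn: if the
pointer array starts on a sector boundary, no entry ever spans two sectors.
The check below is a sketch that assumes a 512-byte sector and a device that
writes each sector atomically; neither assumption comes from this thread,
and SECTOR_SIZE, pointer_offset and straddles_sector are hypothetical names.

```python
SECTOR_SIZE = 512    # assumed sector size, not taken from the thread
POINTER_SIZE = 4     # the 4-byte integer pointer mentioned above


def pointer_offset(logical_page):
    """Byte offset of a page's pointer in a densely packed pointer array."""
    return logical_page * POINTER_SIZE


def straddles_sector(offset, size=POINTER_SIZE, sector=SECTOR_SIZE):
    """True if the bytes [offset, offset + size) span more than one sector."""
    return offset // sector != (offset + size - 1) // sector


# Aligned pointers never cross a sector boundary, because 512 is a
# multiple of 4.
assert not any(straddles_sector(pointer_offset(n)) for n in range(100000))

# A misaligned 4-byte value could be split across two sectors.
assert straddles_sector(510)
```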

There are a lot of practical problems in adding another level of
indirection to every page access, though. It'll surely add some overhead
to every access, even if the data never changes. And it's not at all
clear to me that it would perform better than full_page_writes. You're
writing and flushing out roughly the same amount of data AFAICS.

What exactly is the problem with full_page_writes that we're trying to 
solve?

- Heikki
