Re: fallocate / posix_fallocate for new WAL file creation (etc...) - Mailing list pgsql-hackers

From Greg Smith
Subject Re: fallocate / posix_fallocate for new WAL file creation (etc...)
Msg-id 51A7741B.4010506@2ndQuadrant.com
In response to Re: fallocate / posix_fallocate for new WAL file creation (etc...)  (Alvaro Herrera <alvherre@2ndquadrant.com>)
List pgsql-hackers
On 5/30/13 11:21 AM, Alvaro Herrera wrote:
> Greg Smith wrote:
>
>> The messy part of extending relations in larger chunks
>> is how to communicate that back into the buffer manager usefully.
>> The extension path causing trouble is RelationGetBufferForTuple
>> calling ReadBufferBI.  All of that is passing a single buffer
>> around.  There's no simple way I can see to rewrite it to handle
>> more than one at a time.
>
> No, but we can have it create several pages and insert them into the
> FSM.  So they aren't returned to the original caller but are available
> to future users.

There's actually a code comment raising this same question for the single 
page that is created today, in src/backend/access/heap/hio.c:

"Remember the new page as our target for future insertions.
XXX should we enter the new page into the free space map immediately, or 
just keep it for this backend's exclusive use in the short run (until 
VACUUM sees it)?  Seems to depend on whether you expect the current 
backend to make more insertions or not, which is probably a good bet 
most of the time.  So for now, don't add it to FSM yet."

We have to be careful about adding too much work at that particular point, 
because the obvious spot to make a change is one where the code is holding 
a relation extension lock.

These questions also overlap in an interesting way with how files are 
extended. The same file has this comment just before the one above:

"XXX This does an lseek - rather expensive - but at the moment it is the 
only way to accurately determine how many blocks are in a relation.  Is 
it worth keeping an accurate file length in shared memory someplace, 
rather than relying on the kernel to do it for us?"
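
To make the trade-off in that comment concrete, here is a minimal sketch 
in plain POSIX C, not PostgreSQL code. blocks_in_file() asks the kernel 
with lseek on every call; the CachedLength struct and cached_extend() are 
hypothetical names for the alternative of remembering the length and 
updating it at extension time. The cache here lives in one process only; 
a real version would need shared memory and locking, which this ignores.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define BLCKSZ 8192             /* PostgreSQL's default block size */

/* Ask the kernel: one lseek per call, as the hio.c comment describes. */
static long
blocks_in_file(int fd)
{
    off_t len = lseek(fd, 0, SEEK_END);

    return (len < 0) ? -1 : (long) (len / BLCKSZ);
}

/* Hypothetical cached length; illustration only, single process. */
typedef struct CachedLength
{
    int  fd;
    long nblocks;
} CachedLength;

/* Seed the cache with one lseek. Returns 0 on success, -1 on error. */
static int
cached_length_init(CachedLength *c, int fd)
{
    long n = blocks_in_file(fd);

    if (n < 0)
        return -1;
    c->fd = fd;
    c->nblocks = n;
    return 0;
}

/*
 * Grow the file by nnew blocks and update the cached count, so later
 * readers of c->nblocks need no system call. Returns the first new
 * block number, or -1 on failure.
 */
static long
cached_extend(CachedLength *c, long nnew)
{
    long first = c->nblocks;

    assert(nnew > 0);
    if (ftruncate(c->fd, (off_t) (first + nnew) * BLCKSZ) != 0)
        return -1;
    c->nblocks = first + nnew;
    return first;
}
```

The cached count is only correct if every extension goes through 
cached_extend(), which is the hard part the comment alludes to.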

That code took the easy way forward when it was written, but it's 
obvious the harder one (also touching the FSM) was considered even then. 
The whole sequence needs to be revisited to pull off multiple page 
extension.  I wouldn't say it's hard, but it's enough work that I haven't 
been able to find a block of time to go through the whole thing.
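
As a rough illustration of the idea Alvaro describes, here is a toy 
sketch in POSIX C, not PostgreSQL code. It extends a file by several 
blocks in one posix_fallocate call, hands the first new block to the 
caller, and records the rest in a small in-memory list. ToyFsm, 
toy_fsm_put, toy_fsm_get and extend_by_blocks are all made-up names; the 
list only stands in for the free space map, and the sketch does no 
locking at all, which is exactly the part that needs care in the real 
code.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/types.h>
#include <unistd.h>

#define BLCKSZ      8192        /* PostgreSQL's default block size */
#define TOY_FSM_MAX 64

/* Toy stand-in for the free space map: a stack of free block numbers. */
typedef struct ToyFsm
{
    long blocks[TOY_FSM_MAX];
    int  count;
} ToyFsm;

static int
toy_fsm_put(ToyFsm *fsm, long blkno)
{
    if (fsm->count >= TOY_FSM_MAX)
        return -1;
    fsm->blocks[fsm->count++] = blkno;
    return 0;
}

/* Pop a free block number, or -1 if none is recorded. */
static long
toy_fsm_get(ToyFsm *fsm)
{
    return (fsm->count == 0) ? -1 : fsm->blocks[--fsm->count];
}

/*
 * Extend fd by nnew blocks. The first new block is returned to the
 * caller; the remaining nnew - 1 are advertised in the toy FSM for
 * future users. Returns -1 on failure.
 */
static long
extend_by_blocks(int fd, ToyFsm *fsm, int nnew)
{
    off_t len;
    long  first;

    if (nnew < 1 || nnew - 1 > TOY_FSM_MAX - fsm->count)
        return -1;
    len = lseek(fd, 0, SEEK_END);
    if (len < 0)
        return -1;
    first = (long) (len / BLCKSZ);
    /* posix_fallocate returns 0 on success, an error number otherwise */
    if (posix_fallocate(fd, (off_t) first * BLCKSZ,
                        (off_t) nnew * BLCKSZ) != 0)
        return -1;
    for (int i = 1; i < nnew; i++)
        toy_fsm_put(fsm, first + i);
    return first;
}
```

A later caller would try toy_fsm_get() first and only call 
extend_by_blocks() when it returns -1, so the extension cost is paid 
once per batch rather than once per page.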

-- 
Greg Smith   2ndQuadrant US    greg@2ndQuadrant.com   Baltimore, MD
PostgreSQL Training, Services, and 24x7 Support www.2ndQuadrant.com