Re: Can postgres create a file with physically continuous blocks. - Mailing list pgsql-hackers

From Rob Wultsch
Subject Re: Can postgres create a file with physically continuous blocks.
Msg-id AANLkTim-TEzDdKCEo-OstKM3B7efkiyQ_rPiCpFVaggu@mail.gmail.com
In response to Re: Can postgres create a file with physically continuous blocks.  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
Responses Re: Can postgres create a file with physically continuous blocks.  (Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>)
List pgsql-hackers
On Wed, Dec 22, 2010 at 12:15 AM, Heikki Linnakangas
<heikki.linnakangas@enterprisedb.com> wrote:
> On 22.12.2010 03:45, Rob Wultsch wrote:
>>
>> On Tue, Dec 21, 2010 at 4:49 AM, Robert Haas<robertmhaas@gmail.com>
>>  wrote:
>>>
>>> On Sun, Dec 19, 2010 at 1:10 PM, Jim Nasby<jim@nasby.net>  wrote:
>>>>
>>>> On Dec 19, 2010, at 1:10 AM, flyusa2010 fly wrote:
>>>>>
>>>>> Does postgres make an effort to create a file with physically
>>>>> continuous blocks?
>>>>
>>>> AFAIK all files are expanded as needed. I don't think there's any flags
>>>> you can pass to the filesystem to tell it "this file will eventually be 1GB
>>>> in size". So, we're basically at the mercy of the FS to try and keep things
>>>> contiguous.
>>>
>>> There have been some reports that we would do better on some
>>> filesystems if we extended the file more than a block at a time, as we
>>> do today.  However, AFAIK, no one is pursuing this ATM.
>>
>> This has been found to be the case in the MySQL world, particularly
>> when ext3 is in use:
>> http://forge.mysql.com/worklog/task.php?id=4925
>> http://www.facebook.com/note.php?note_id=194501560932
>
> These seem to be about extending the transaction log, and we already
> pre-allocate the WAL. The WAL is repeatedly fsync'd, so I can understand
> that extending that in small chunks would hurt performance a lot, as the
> filesystem needs to flush the metadata changes to disk at every commit.
> However, that's not an issue with extending data files, they are only
> fsync'd at checkpoints.
>
> It might well be advantageous to extend data files in larger chunks too, but
> it's probably nowhere near as important as with the WAL.

Agree.
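
For what it's worth, here is a rough sketch of what extending in larger
chunks looks like. It is plain Python, not PostgreSQL code: write_all,
extend_file, and count_extensions are hypothetical names, and the 8 kB
and 8 MB sizes are only illustrative (8 kB being the default PostgreSQL
block size). It just counts how many extension operations it takes to
reach a given size, and says nothing about how contiguous the result
ends up on disk, which is still up to the filesystem:

```python
import os
import tempfile

BLOCK_SIZE = 8192               # default PostgreSQL block size
LARGE_CHUNK = 8 * 1024 * 1024   # illustrative larger extension size


def write_all(fd, data, offset):
    """Write all of data at offset, retrying on short writes."""
    view = memoryview(data)
    while view:
        written = os.pwrite(fd, view, offset)
        view = view[written:]
        offset += written


def extend_file(fd, target_size, chunk_size):
    """Grow the file to target_size by appending zero-filled chunks.

    Returns the number of extension operations performed.
    """
    size = os.fstat(fd).st_size
    ops = 0
    while size < target_size:
        n = min(chunk_size, target_size - size)
        write_all(fd, bytes(n), size)  # bytes(n) is n zero bytes
        size += n
        ops += 1
    return ops


def count_extensions(target_size, chunk_size):
    """Extend a fresh temporary file and report how many steps it took."""
    with tempfile.TemporaryFile() as f:
        return extend_file(f.fileno(), target_size, chunk_size)


if __name__ == "__main__":
    target = 8 * 1024 * 1024
    print("block at a time:", count_extensions(target, BLOCK_SIZE))
    print("one large chunk:", count_extensions(target, LARGE_CHUNK))
```

Each extension is a point where the filesystem has to pick new blocks,
so fewer and larger extensions at least give it the chance to allocate
them together. Whether that actually helps will depend on the
filesystem in use.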

>> Also, InnoDB has an option for how much data should be allocated at
>> the end of a tablespace when it needs to grow:
>>
>> http://dev.mysql.com/doc/refman/5.0/en/innodb-parameters.html#sysvar_innodb_data_file_path
>
> Hmm, innodb_autoextend_increment seems more like what we're discussing here
> (http://dev.mysql.com/doc/refman/5.0/en/innodb-parameters.html#sysvar_innodb_autoextend_increment).
> If I'm reading that correctly, InnoDB defaults to extending files in 8MB
> chunks.

This is not a pure apples-to-apples comparison, as InnoDB does direct
I/O. However, doesn't the checkpoint completion target code call fsync
repeatedly in order to hit that target? And for that matter, hasn't
there been recent discussion on hackers about calling fsync more
often?

Sorry for the loopy email. I have not been getting anywhere near
enough sleep recently :(
--
Rob Wultsch
wultsch@gmail.com

