Thread: Is mdextend really safe?

Is mdextend really safe?

From

Gregory Stark

Date:

20 August 2008, 06:51:23

Earlier we saw some bug reports from someone who had a buffer flush fail due to
ENOSPC. We asserted then that that should never happen because when we extend
the relation we write out the new blocks, so any ENOSPC errors ought to happen at
that point, not when a buffer is flushed.

However, looking at mdextend, it only writes out the requested block. Any blocks
between the end of the table and the requested block are *not* written out. We
count on the OS to implicitly fill those blocks with zeros.

On Unix that creates a sparse file where the intervening blocks are not
allocated. When we later write out those blocks the filesystem then has to
allocate space for them. IIRC the bug reports were from Windows. I'm not sure
what NTFS's behaviour with sparse files is.
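
To make the sparse-file case concrete, here is a minimal sketch. It is not PostgreSQL code: the names demo_extend_past_eof, demo_block_is_zero and DEMO_BLCKSZ are made up for illustration, and it assumes a POSIX system where st_blocks counts 512-byte units. It seeks past EOF, writes a single block, and then reports both the logical size and what the filesystem says is allocated.

```c
#include <fcntl.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define DEMO_BLCKSZ 8192        /* assumed block size (PostgreSQL's default) */

/*
 * Create a fresh file and write a single zero-filled block at block number
 * blkno, seeking past EOF first. Blocks 0 .. blkno-1 are never written.
 * Returns 0 on success and -1 on error. *size receives the logical file
 * size; *allocated receives st_blocks * 512, assuming 512-byte units.
 */
static int
demo_extend_past_eof(const char *path, long blkno, off_t *size, off_t *allocated)
{
    char        block[DEMO_BLCKSZ];
    struct stat st;
    off_t       target = (off_t) blkno * DEMO_BLCKSZ;
    int         fd;
    int         ok;

    memset(block, 0, sizeof(block));
    fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    ok = lseek(fd, target, SEEK_SET) == target &&
        write(fd, block, sizeof(block)) == (ssize_t) sizeof(block) &&
        fstat(fd, &st) == 0;
    close(fd);
    if (!ok)
        return -1;

    *size = st.st_size;
    *allocated = (off_t) st.st_blocks * 512;
    return 0;
}

/*
 * Read block blkno back. Returns 1 if it is all zeros, 0 if it is not,
 * and -1 on error or short read.
 */
static int
demo_block_is_zero(const char *path, long blkno)
{
    char        block[DEMO_BLCKSZ];
    off_t       target = (off_t) blkno * DEMO_BLCKSZ;
    size_t      i;
    int         fd;
    int         ok;

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    ok = lseek(fd, target, SEEK_SET) == target &&
        read(fd, block, sizeof(block)) == (ssize_t) sizeof(block);
    close(fd);
    if (!ok)
        return -1;

    for (i = 0; i < sizeof(block); i++)
        if (block[i] != 0)
            return 0;
    return 1;
}
```

Calling demo_extend_past_eof with blkno = 10 yields a logical size of 11 * DEMO_BLCKSZ bytes, and the unwritten blocks read back as zeros. Whether the allocated figure is smaller than the logical size depends on the filesystem; where holes are supported, the space for the skipped blocks is only claimed when they are written later, which is where a late ENOSPC could come from.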

Now this only matters if we ever call mdextend on a block which isn't the
block immediately following the end of file. Is that true?

--
Gregory Stark
EnterpriseDB          http://www.enterprisedb.com
Get trained by Bruce Momjian - ask me about EnterpriseDB's PostgreSQL training!

Re: Is mdextend really safe?

From

Florian Weimer

Date:

20 August 2008, 07:49:45

* Gregory Stark:

> On Unix that creates a sparse file where the intervening blocks are
> not allocated. When we later write out those blocks the filesystem
> then has to allocate space for them.

This seems to happen relatively rarely.  Creating temporary holes like
this usually results in heavily fragmented files on the file systems I
use, and I don't see this with PostgreSQL.  (It's one of my gripes
with Berkeley DB.)

However, I looked at the code recently and couldn't figure out *why*
PostgreSQL's observed behavior is this way. 8-(

--
Florian Weimer                <fweimer@bfk.de>
BFK edv-consulting GmbH       http://www.bfk.de/
Kriegsstraße 100              tel: +49-721-96201-1
D-76133 Karlsruhe             fax: +49-721-96201-99

Re: Is mdextend really safe?

From

Zdenek Kotala

Date:

20 August 2008, 08:20:34

Gregory Stark wrote:


> On Unix that creates a sparse file where the intervening blocks are not
> allocated. When we later write out those blocks the filesystem then has to
> allocate space for them. IIRC the bug reports were from Windows. I'm not sure
> what NTFS's behaviour with sparse files is.

NTFS has a sparse file feature, but how it works ...

> Now this only matters if we ever call mdextend on a block which isn't the
> block immediately following the end of file. Is that true?

I think that it could happen only during WAL replay, but at the
end everything should be OK. Look into ReadBuffer_common; there is the
following code:

    /* Substitute proper block number if caller asked for P_NEW */
    if (isExtend)
        blockNum = smgrnblocks(smgr, forkNum);

    Zdenek

Re: Is mdextend really safe?

From

"Heikki Linnakangas"

Date:

20 August 2008, 08:24:48

Gregory Stark wrote:
> Now this only matters if we ever call mdextend on a block which isn't the
> block immediately following the end of file. Is that true?

I don't think so.

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com

Re: Is mdextend really safe?

From

Tom Lane

Date:

20 August 2008, 09:43:46

Gregory Stark <stark@enterprisedb.com> writes:
> Now this only matters if we ever call mdextend on a block which isn't the
> block immediately following the end of file. Is that true?

Only in hash indexes.
        regards, tom lane