Re: smgrzeroextend clarification - Mailing list pgsql-hackers

From Andres Freund
Subject Re: smgrzeroextend clarification
Date
Msg-id 20230510181049.ad5vw6kijqd2y274@awork3.anarazel.de
In response to smgrzeroextend clarification  (Peter Eisentraut <peter.eisentraut@enterprisedb.com>)
List pgsql-hackers
Hi,

On 2023-05-10 11:50:14 +0200, Peter Eisentraut wrote:
> I was looking at the new smgrzeroextend() function in the smgr API.  The
> documentation isn't very extensive:
> 
> /*
>  *  smgrzeroextend() -- Add new zeroed out blocks to a file.
>  *
>  *      Similar to smgrextend(), except the relation can be extended by
>  *      multiple blocks at once and the added blocks will be filled with
>  *      zeroes.
>  */
> 
> The documentation of smgrextend() is:
> 
> /*
>  *  smgrextend() -- Add a new block to a file.
>  *
>  *      The semantics are nearly the same as smgrwrite(): write at the
>  *      specified position.  However, this is to be used for the case of
>  *      extending a relation (i.e., blocknum is at or beyond the current
>  *      EOF).  Note that we assume writing a block beyond current EOF
>  *      causes intervening file space to become filled with zeroes.
>  */
> 
> So if you want to understand what smgrzeroextend() does, you need to
> mentally combine the documentation of three different functions.  Could we
> write documentation for each function that stands on its own?  And document
> the function arguments, like what does blocknum and nblocks mean?

I guess it couldn't hurt. But if we go down that route, we basically need to
rewrite all the function headers in smgr.c, I think.


> Moreover, the text "except the relation can be extended by multiple blocks
> at once and the added blocks will be filled with zeroes" doesn't make much
> sense as a differentiation, because smgrextend() does that as well.

Hm? smgrextend() writes a single block, and that block is filled from the
caller-provided buffer.
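
To make the single-block behaviour concrete, here is a minimal standalone
sketch in plain POSIX C. It is not the smgr/md.c code: sketch_extend(),
sketch_extend_demo() and SKETCH_BLCKSZ are hypothetical names, and the 8192
byte block size is an assumption. It writes one caller-provided block at a
block number past EOF and checks that the skipped blocks read back as zeroes,
which is the assumption the smgrextend() comment mentions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define SKETCH_BLCKSZ 8192		/* assumed block size, illustration only */

/*
 * Hypothetical helper: write exactly one block, taken from the caller's
 * buffer, at block number blocknum.  Returns 0 on success, -1 on failure.
 */
static int
sketch_extend(int fd, unsigned blocknum, const char *buffer)
{
	off_t		offset = (off_t) blocknum * SKETCH_BLCKSZ;

	assert(buffer != NULL);
	return pwrite(fd, buffer, SKETCH_BLCKSZ, offset) == SKETCH_BLCKSZ ? 0 : -1;
}

/*
 * Write block 3 of an empty temporary file, then check that the file is
 * four blocks long and that the skipped block 1 reads back as zeroes.
 * Returns 0 if everything matched, -1 otherwise.
 */
static int
sketch_extend_demo(void)
{
	char		path[] = "/tmp/smgr_sketch_XXXXXX";
	char		payload[SKETCH_BLCKSZ];
	char		readback[SKETCH_BLCKSZ];
	struct stat st;
	size_t		i;
	int			ok;
	int			fd = mkstemp(path);

	if (fd < 0)
		return -1;
	unlink(path);				/* file disappears once fd is closed */
	memset(payload, 'x', sizeof(payload));

	ok = sketch_extend(fd, 3, payload) == 0
		&& fstat(fd, &st) == 0
		&& st.st_size == (off_t) 4 * SKETCH_BLCKSZ
		&& pread(fd, readback, SKETCH_BLCKSZ,
				 (off_t) 1 * SKETCH_BLCKSZ) == SKETCH_BLCKSZ;

	/* The intervening space was never written, so it must read as zeroes. */
	for (i = 0; ok && i < SKETCH_BLCKSZ; i++)
		ok = (readback[i] == 0);

	close(fd);
	return ok ? 0 : -1;
}
```

Calling sketch_extend_demo() from a small main() is enough to try it out. On
most filesystems the skipped range ends up as a sparse hole, so only the last
block is backed by written data.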


> AFAICT, the differences between smgrextend() and smgrzeroextend() are:
> 
> 1. smgrextend() writes a payload block in addition to extending the file,
> smgrzeroextend() just extends the file without writing a payload.
> 
> 2. smgrzeroextend() uses various techniques (posix_fallocate() etc.) to make
> sure the extended space is actually reserved on disk, smgrextend() does not.
> 
> #1 seems fine, but the naming of the APIs does not reflect that at all.
> 
> If we think that #2 is important, maybe smgrextend() should do that as well?
> Or at least explain why it's not needed?

smgrextend() does #2 - it just does it by writing data.

The FileFallocate() path in smgrzeroextend() tries to avoid writing data when
extending by a sufficient number of blocks; not having dirty data in the
kernel page cache can substantially reduce IO usage.

The FileZero() path, by contrast, just reduces the number of syscalls (and
cache misses etc.).
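
The two paths can be sketched outside PostgreSQL roughly as follows. This is
an illustration under stated assumptions, not the md.c implementation:
sketch_zeroextend(), sketch_zeroextend_demo(), SKETCH_BLCKSZ and
SKETCH_FALLOCATE_THRESHOLD are hypothetical names, the threshold value is
picked for the example, and plain posix_fallocate()/pwrite() stand in for
FileFallocate()/FileZero().

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

#define SKETCH_BLCKSZ 8192				/* assumed block size */
#define SKETCH_FALLOCATE_THRESHOLD 8	/* assumed cutoff, illustration only */

/*
 * Hypothetical helper: add nblocks zeroed blocks starting at blocknum.
 * Large extensions reserve space without writing any data; small ones
 * write all the zeroes with a single syscall.  Returns 0 or -1.
 */
static int
sketch_zeroextend(int fd, unsigned blocknum, int nblocks)
{
	off_t		offset = (off_t) blocknum * SKETCH_BLCKSZ;
	off_t		len = (off_t) nblocks * SKETCH_BLCKSZ;

	assert(nblocks > 0);

	if (nblocks > SKETCH_FALLOCATE_THRESHOLD)
	{
		/* No data is written, so no dirty pages land in the page cache. */
		return posix_fallocate(fd, offset, len) == 0 ? 0 : -1;
	}
	else
	{
		/* One pwrite() for all blocks instead of one write per block. */
		char	   *zeros = calloc((size_t) nblocks, SKETCH_BLCKSZ);
		ssize_t		written;

		if (zeros == NULL)
			return -1;
		written = pwrite(fd, zeros, (size_t) len, offset);
		free(zeros);
		return written == (ssize_t) len ? 0 : -1;
	}
}

/*
 * Extend an empty temporary file by 4 blocks (write path) and then by 16
 * more (fallocate path); check the final size and that the last block is
 * zero-filled.  Returns 0 if everything matched, -1 otherwise.
 */
static int
sketch_zeroextend_demo(void)
{
	char		path[] = "/tmp/smgr_sketch_XXXXXX";
	char		block[SKETCH_BLCKSZ];
	struct stat st;
	size_t		i;
	int			ok;
	int			fd = mkstemp(path);

	if (fd < 0)
		return -1;
	unlink(path);				/* file disappears once fd is closed */

	ok = sketch_zeroextend(fd, 0, 4) == 0
		&& sketch_zeroextend(fd, 4, 16) == 0
		&& fstat(fd, &st) == 0
		&& st.st_size == (off_t) 20 * SKETCH_BLCKSZ
		&& pread(fd, block, SKETCH_BLCKSZ,
				 (off_t) 19 * SKETCH_BLCKSZ) == SKETCH_BLCKSZ;

	for (i = 0; ok && i < SKETCH_BLCKSZ; i++)
		ok = (block[i] == 0);

	close(fd);
	return ok ? 0 : -1;
}
```

Either way the new blocks read back as zeroes; the difference is only in how
much data has to pass through the kernel page cache to get there.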

Greetings,

Andres Freund


