Re: pgstattuple: fix free space calculation - Mailing list pgsql-hackers

From Frédéric Yhuel
Subject Re: pgstattuple: fix free space calculation
Date
Msg-id ea72b283-d11b-49f4-ac96-b6f146249645@dalibo.com
In response to Re: pgstattuple: fix free space calculation  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Hi Tom, thanks for your review.

On 9/7/24 22:10, Tom Lane wrote:
> I looked at this patch.  I agree with making the change.  However,
> I don't agree with the CF entry's marking of "target version: stable"
> (i.e., requesting back-patch).  I think this falls somewhere in the
> gray area between a bug fix and a definitional change.  Also, people
> are unlikely to be happy if they suddenly get new, not-comparable
> numbers after a minor version update.  So I think we should just fix
> it in HEAD.
>

OK, I made the change.

> As far as the patch itself goes, the one thing that is bothering me
> is this comment change
> 
>           /*
> -         * It's not safe to call PageGetHeapFreeSpace() on new pages, so we
> +         * It's not safe to call PageGetExactFreeSpace() on new pages, so we
>            * treat them as being free space for our purposes.
>            */
> 
> which looks like it wasn't made with a great deal of thought.
> Now it seems to me that the comment was already bogus when written:
> there isn't anything uncertain about what will happen if you call
> either of these functions on a "new" page.  PageIsNew checks for
> 
>          return ((PageHeader) page)->pd_upper == 0;
> 
> If pd_upper is 0, PageGet[Exact]FreeSpace is absolutely guaranteed
> to return zero, even if pd_lower contains garbage.  And then

Indeed. I failed to notice that LocationIndex is an unsigned integer type, so I 
thought that pd_upper - pd_lower could be positive with garbage in pd_lower.
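
To spell out the arithmetic, here is a minimal sketch in C. It is a cut-down model, not the PostgreSQL source: MiniPageHeader and mini_exact_free_space() are hypothetical names, and only the two header fields under discussion are kept. It assumes the real function computes the signed difference pd_upper - pd_lower and clamps negative results to zero, which is what the discussion above implies.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Unsigned 16-bit offset within a page, mirroring PostgreSQL's LocationIndex. */
typedef uint16_t LocationIndex;

/* Hypothetical stand-in for the page header: only the fields needed here. */
typedef struct
{
    LocationIndex pd_lower;     /* offset to start of free space */
    LocationIndex pd_upper;     /* offset to end of free space */
} MiniPageHeader;

/*
 * Simplified model of the free space computation: the gap between
 * pd_lower and pd_upper, clamped at zero.
 */
static size_t
mini_exact_free_space(const MiniPageHeader *hdr)
{
    int         space;

    assert(hdr != NULL);

    /* Both operands are non-negative once widened to int. */
    space = (int) hdr->pd_upper - (int) hdr->pd_lower;

    /* With pd_upper == 0 the difference can never exceed zero. */
    if (space < 0)
        return 0;
    return (size_t) space;
}
```

Because pd_lower cannot be negative, a page with pd_upper == 0 yields a difference of at most zero, whatever garbage pd_lower holds, so the result is always zero.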

> PageGetHeapFreeSpace will likewise return zero.  Perhaps there
> could be trouble if we got into the line-pointer-checking part
> of PageGetHeapFreeSpace, but we can't.  So this comment is wrong,
> and is even more obviously wrong after the above change.  I thought
> for a moment about removing the PageIsNew test altogether, but
> then I decided that it probably *is* what we want and is just
> mis-explained.  I think the comment should read more like
> 
>          /*
>           * PageGetExactFreeSpace() will return zero for a "new" page,
>           * but it's actually usable free space, so count it that way.
>           */
> 
> Now alternatively you could argue that a "new" page isn't usable free
> space yet and so we should count it as zero, just as we don't count
> dead tuples as usable free space.  You need VACUUM to turn either of
> those things into real free space.  But that'd be a bigger definitional
> change, and I'm not sure we want it.  Thoughts?
> 
> Also, do we need any documentation change for this?  I looked through
> https://www.postgresql.org/docs/devel/pgstattuple.html
> and didn't see anything that was being very specific about what
> "free space" means, so maybe it's fine as-is.

It's not easy. Maybe something like this?

"For any initialized page, free space refers to anything that isn't page 
metadata (header and special), a line pointer or a tuple pointed to by a 
valid line pointer. In particular, a dead tuple is not free space 
because there's still a valid line pointer pointing to it, until 
VACUUM or some other maintenance mechanism (e.g. page pruning) cleans up 
the page. A dead line pointer is not free space either, but the tuple it 
points to has become free space. An unused line pointer could be 
considered free space, but pgstattuple doesn't take it into account."
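
To make the proposed wording concrete, here is a toy accounting model in C. It is only a sketch of the definition above, not what pgstattuple executes: every Toy* name is hypothetical, alignment padding is ignored, and storage behind dead line pointers is assumed to have been reclaimed already. The sizes (8192-byte page, 24-byte header, 4-byte line pointer) are the usual defaults, taken here as assumptions.

```c
#include <stddef.h>

/* Assumed default sizes, for illustration only. */
#define TOY_PAGE_SIZE   ((size_t) 8192)
#define TOY_HEADER_SIZE ((size_t) 24)
#define TOY_LP_SIZE     ((size_t) 4)

/* Hypothetical line pointer states, named after PostgreSQL's LP_* flags. */
typedef enum
{
    TOY_LP_UNUSED,              /* slot exists but points to nothing */
    TOY_LP_NORMAL,              /* points to a tuple, live or dead */
    TOY_LP_REDIRECT,            /* points to another line pointer */
    TOY_LP_DEAD                 /* tuple storage already reclaimed */
} ToyLpState;

typedef struct
{
    ToyLpState  state;
    size_t      tuple_len;      /* bytes of storage, TOY_LP_NORMAL only */
} ToyLinePointer;

/*
 * Free space per the proposed wording: everything that is not page
 * metadata, a line pointer (of any state, unused ones included), or a
 * tuple still pointed to by a valid line pointer.
 */
static size_t
toy_free_space(const ToyLinePointer *lps, size_t nlps, size_t special_size)
{
    size_t      used = TOY_HEADER_SIZE + special_size + nlps * TOY_LP_SIZE;

    for (size_t i = 0; i < nlps; i++)
    {
        /* A dead tuple under a normal line pointer still counts as used. */
        if (lps[i].state == TOY_LP_NORMAL)
            used += lps[i].tuple_len;
    }

    return (used >= TOY_PAGE_SIZE) ? 0 : TOY_PAGE_SIZE - used;
}
```

In this model a dead tuple that has not been pruned consumes space exactly like a live one, a dead line pointer costs only its own 4 bytes, and an unused line pointer is still counted as occupied, matching the last sentence of the proposed text.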


