Re: Can't we give better table bloat stats easily? - Mailing list pgsql-hackers

From Masahiko Sawada
Subject Re: Can't we give better table bloat stats easily?
Date
Msg-id CAD21AoBjVq56Tw71cNfiObG+YpgEZ0=B1EF=UA+hxf+7147k-Q@mail.gmail.com
Whole thread Raw
In response to Re: Can't we give better table bloat stats easily?  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
On Sat, Aug 17, 2019 at 9:59 AM Andres Freund <andres@anarazel.de> wrote:
>
> Hi,
>
> On 2019-08-16 20:39:21 -0400, Greg Stark wrote:
> > But isn't this all just silliiness these days? We could actually sum up the
> > space recorded in the fsm and get a much more trustworthy number in
> > milliseconds.
>
> You mean like pgstattuple_approx()?
>
> https://www.postgresql.org/docs/current/pgstattuple.html
>
> Or something different?
>
> > I rigged up a quick proof of concept and the code seems super simple and
> > quick. There's one or two tables where the number is a bit suspect and
> > there's no fsm if vacuum hasn't run but that seems pretty small potatoes
> > for such a huge help in reducing user pain.
>
> Hard to comment on what you propose, without more details. But note that
> you can't just look at the FSM, because in a lot of workloads it is
> often hugely out of date. And fairly obviously it doesn't provide you
> with information about how much space is currently occupied by dead
> tuples.  What pgstattuple_approx does is to use the FSM for blocks that
> are all-visible, and look at the page otherwise.
>

It's just an idea but we could have pgstattuple_approx use sample scan
to estimate the table bloat more faster.

Regards,

--
Masahiko Sawada
NIPPON TELEGRAPH AND TELEPHONE CORPORATION
NTT Open Source Software Center



pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: refactoring - share str2*int64 functions
Next
From: Andrey Borodin
Date:
Subject: Yet another fast GiST build