Hi,
On 2020-03-17 21:58:53 -0400, James Coleman wrote:
> On Tue, Mar 17, 2020 at 9:03 PM Andres Freund <andres@anarazel.de> wrote:
> >
> > Hi,
> >
> > On 2020-03-17 20:42:07 +0100, Laurenz Albe wrote:
> > > > I think Andres was thinking this would maybe be an optimization independent of
> > > > is_insert_only (?)
> > >
> > > I wasn't sure.
> >
> > I'm not sure myself - but I'm doubtful that using a 0 min age by default
> > will be ok.
> >
> > I was trying to say (in a later email) that I think it might be a good
> > compromise to opportunistically freeze if we're dirtying the page
> > anyway, but not optimize WAL emission etc. That's a pretty simple
> > change, and it'd address a lot of the potential performance regressions,
> > while still freezing for the "first" vacuum in insert only workloads.
>
> If we have truly insert-only tables, then doesn't vacuuming with
> freezing every tuple actually decrease total vacuum cost (perhaps
> significantly) since otherwise every vacuum keeps having to scan the
> heap for dead tuples on pages where we know there are none? Those
> pages could conceptually be frozen and ignored, but are not frozen
> because of the default behavior, correct?
Yes.
> If that's all true, it seems to me that removing that part of the
> patch significantly lowers its value.
Well, perfect sometimes is the enemy of the good. We gotta get something
in, and having some automated vacuuming for insert mostly/only tables is
a huge step forward. And avoiding regressions is an important part of
doing so.
I outlined the steps we could take to allow for more aggressive
vacuuming upthread.
> If we opportunistically freeze only if we're already dirtying a page,
> would that help a truly insert-only workload?
Yes.
> E.g., are there hint bits on the page that would need to change the
> first time we vacuum a full page with no dead tuples?
Yes. HEAP_XMIN_COMMITTED.
> I would have assumed the answer was "no" (since if so I think it would
> follow that _all_ pages need updated the first time they're
> vacuumed?).
That is the case. Although they might already be set when the tuples are
accessed for other reasons.
Greetings,
Andres Freund