Re: removing PD_ALL_VISIBLE - Mailing list pgsql-hackers

From Andres Freund
Subject Re: removing PD_ALL_VISIBLE
Date
Msg-id 20130530121208.GA7466@awork2.anarazel.de
In response to Re: removing PD_ALL_VISIBLE  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: removing PD_ALL_VISIBLE  (Heikki Linnakangas <hlinnakangas@vmware.com>)
Re: removing PD_ALL_VISIBLE  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 2013-05-30 07:54:38 -0400, Robert Haas wrote:
> On Thu, May 30, 2013 at 12:06 AM, Jeff Davis <pgsql@j-davis.com> wrote:
> >> AFAICS, the main benefit of eliminating PD_ALL_VISIBLE is that we
> >> eliminate one write cycle; that is, we won't dirty the page once to
> >> hint it and then again to mark it all-visible.  But as of 9.3, that
> >> should really only be a problem in the insert-only case.  And in that
> >> case, my proposal to consider all-visible pages as frozen would be a
> >> huge win, because you'd only need to emit XLOG_HEAP_VISIBLE for every
> >> page in the heap, rather than XLOG_HEAP_FREEZE.
> >
> > Agreed.
> 
> Just to quantify that a bit more, I ran this command a couple of times:
> 
> dropdb rhaas ; sleep 5 ; createdb ; sleep 5 ; pgbench -i -s 1000 -n;
> sleep 5 ; time psql -c checkpoint ; time psql -c 'vacuum'
> 
> And also this one:
> 
> dropdb rhaas ; sleep 5 ; createdb ; sleep 5 ; pgbench -i -s 1000 -n;
> sleep 5 ; time psql -c checkpoint ; time psql -c 'vacuum freeze'
> 
> In the first one, the vacuum at the end takes about 25 seconds.  In
> the second one, it takes about 15 minutes, during which time there's
> one CPU core running at about 10%; the remainder of the time is spent
> waiting for disk I/O.  A little follow-up testing shows that the
> vacuum emits 88MB of WAL, while the vacuum freeze emits 13GB of WAL.
> 
> This is on the 16-core, 64-thread IBM POWER box with the following
> non-default configuration settings:
> 
> shared_buffers = 8GB
> maintenance_work_mem = 1GB
> synchronous_commit = off
> checkpoint_segments = 300
> checkpoint_timeout = 15min
> checkpoint_completion_target = 0.9
> log_line_prefix = '%t [%p] '
> 
> Andres' proposal for freezing at the same time we mark pages
> all-visible relies on emitting FPIs when we mark pages all-visible,
> but I hope that the test above is convincing evidence that it would be
> *really* expensive for some users.  My proposal to consider
> all-visible pages as frozen avoids that cost

I think I basically suggested treating all visible as frozen, didn't I?
If not, I had lost sync between my fingers and my thoughts which happens
too often ;).
You had noticed that my proposal was lacking a bit around when we omit
FPIs for the page while setting all-visible, but we both thought that we
might find a workaround for that - which looking at the page level flag
first basically is.

As far as I understand, the trick basically is that we can rely on an FPI
being logged when an action unsetting ALL_VISIBLE is performed. That
all-visible would then make sure the hint bits marking individual tuples
as frozen would hit disk. For that we need to add some more work though,
consider:

1) write tuples on a page
2) "freeze" page by setting ALL_VISIBLE and setting hint bits. Setting
   ALL_VISIBLE is WAL logged
3) crash
4) replay ALL_VISIBLE, set it on the page level. The individual tuples
   are *not* guaranteed to be marked frozen.
5) update tuple on the page, unsetting all visible. Emits an FPI which
   does *not* have the tuples marked as frozen.

Easy enough and fairly cheap to fix by having a function that updates
the hint bits on a page when unsetting all visible, since we can just
set them for all pre-existing tuples.
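
To make the fix for the five steps above concrete, here is a minimal
sketch of "set the frozen hint on every pre-existing tuple before
clearing all-visible". It is a toy model, not PostgreSQL code: ToyPage,
ToyTuple, the TOY_* flag values and toy_clear_all_visible() are all
hypothetical names made up for illustration, and nothing here matches
the real page or tuple header layout.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy model only: hypothetical types and flag values, NOT the real
 * PostgreSQL page or tuple-header layout. */
#define TOY_PAGE_ALL_VISIBLE 0x01u  /* stand-in for the page-level flag */
#define TOY_TUPLE_FROZEN     0x01u  /* stand-in for a "frozen" hint bit */
#define TOY_MAX_TUPLES       8

typedef struct ToyTuple
{
    uint8_t hints;
} ToyTuple;

typedef struct ToyPage
{
    uint8_t  flags;
    size_t   ntuples;
    ToyTuple tuples[TOY_MAX_TUPLES];
} ToyPage;

/*
 * Clear the all-visible flag before a modification of the page.
 *
 * If the flag is set, every tuple already on the page is known to be
 * visible to everyone, so the frozen hint can be set on all of them
 * unconditionally. Doing that before the page image is logged means the
 * image carries the hints even if they were lost in an earlier crash.
 *
 * Returns the number of tuples whose hint had to be repaired.
 */
static size_t
toy_clear_all_visible(ToyPage *page)
{
    size_t repaired = 0;

    assert(page != NULL && page->ntuples <= TOY_MAX_TUPLES);

    if (!(page->flags & TOY_PAGE_ALL_VISIBLE))
        return 0;               /* nothing is implied about the tuples */

    for (size_t i = 0; i < page->ntuples; i++)
    {
        if (!(page->tuples[i].hints & TOY_TUPLE_FROZEN))
        {
            page->tuples[i].hints |= TOY_TUPLE_FROZEN;
            repaired++;
        }
    }

    page->flags &= (uint8_t) ~TOY_PAGE_ALL_VISIBLE;
    return repaired;
}
```

In the scenario above, this would run at step 5, before the new tuple
version is added and before the FPI is taken, so the logged image has
the old tuples marked as frozen.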

> but as far as I can see, it also requires PD_ALL_VISIBLE to stick
> around.

Now, I am far from being convinced it's a good idea to get rid of
PD_ALL_VISIBLE, but I don't think it does. Except that it currently is
legal for the page level ALL_VISIBLE to be set while the corresponding
visibilitymap bit isn't, there's not much prohibiting us fundamentally
from looking in the vm when we need to know whether the page is all
visible, is there?
To the contrary, this actually seems to be a pretty good case for Jeff's
proposed behaviour, since it would allow freezing while only writing the
vm?
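
As a rough illustration of "looking in the vm" instead of at a page
level flag, the sketch below keeps one bit per heap page in a small
bitmap and answers the all-visible question from that bitmap alone. It
is only a toy under stated assumptions: ToyVisibilityMap and the
toy_vm_* functions are hypothetical names, the fixed 128-page size is
arbitrary, and buffer pinning, locking and WAL logging are left out
entirely.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy model only: a fixed-size bitmap with one bit per heap page.
 * Hypothetical names; not the real visibility map implementation. */
#define TOY_VM_BYTES 16                     /* covers 128 heap pages */
#define TOY_VM_PAGES (TOY_VM_BYTES * 8)

typedef struct ToyVisibilityMap
{
    uint8_t bits[TOY_VM_BYTES];
} ToyVisibilityMap;

/* Is the given heap page recorded as all-visible in the map? */
static bool
toy_vm_test(const ToyVisibilityMap *vm, uint32_t heap_blk)
{
    assert(heap_blk < TOY_VM_PAGES);
    return ((vm->bits[heap_blk / 8] >> (heap_blk % 8)) & 1u) != 0;
}

/* Record the heap page as all-visible; only the map is written. */
static void
toy_vm_set(ToyVisibilityMap *vm, uint32_t heap_blk)
{
    assert(heap_blk < TOY_VM_PAGES);
    vm->bits[heap_blk / 8] |= (uint8_t) (1u << (heap_blk % 8));
}

/* Clear the bit, e.g. before a tuple on that heap page is modified. */
static void
toy_vm_clear(ToyVisibilityMap *vm, uint32_t heap_blk)
{
    assert(heap_blk < TOY_VM_PAGES);
    vm->bits[heap_blk / 8] &= (uint8_t) ~(1u << (heap_blk % 8));
}
```

The point of the toy is just that marking a page touches the map and not
the heap page itself, which is what would make freezing by writing only
the vm possible.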

Greetings,

Andres Freund

-- 
Andres Freund                       http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services


