Re: getting rid of freezing - Mailing list pgsql-hackers

From Robert Haas
Subject Re: getting rid of freezing
Date
Msg-id CA+TgmoZcq9+C6FAD_R-GdTHbyoNUwMk7LaJqt7ZK8iZLAu6gLw@mail.gmail.com
In response to Re: getting rid of freezing  (Josh Berkus <josh@agliodbs.com>)
Responses Re: getting rid of freezing  (Jeff Davis <pgsql@j-davis.com>)
List pgsql-hackers
On Tue, May 28, 2013 at 12:29 PM, Josh Berkus <josh@agliodbs.com> wrote:
> On 05/28/2013 07:17 AM, Andres Freund wrote:
>> On 2013-05-26 16:58:58 -0700, Josh Berkus wrote:
>>> I was talking this over with Jeff on the plane, and we wanted to be
>>> clear on your goals here:  are you looking to eliminate the *write* cost
>>> of freezing, or just the *read* cost of re-reading already frozen pages?
>>
>> Both. The latter is what I have seen causing more hurt, but the former
>> alone is painful enough.
>
> I guess I don't see how your proposal is reducing the write cost for
> most users then?
>
> - for users with frequently, randomly updated data, pdallvisible would
> not be ever set, so they still need to be rewritten to freeze

Do these users never run vacuum?  As of 9.3, vacuum phase 2 will
typically set PD_ALL_VISIBLE on each relevant page.  The only time
that this WON'T happen is if an insert, update, or delete hits the
page after phase 1 of vacuum and before phase 2 of vacuum.  I don't
think that's going to be the common case.
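
To make the timing argument concrete, here is a toy sketch in Python. It is
not PostgreSQL code: the function name, the use of plain numbers as
timestamps, and the assumption that a concurrent modification is the only
thing that blocks the flag are all made up for illustration.

```python
def phase2_sets_all_visible(phase1_time, phase2_time, modification_times):
    """Toy model of when vacuum phase 2 can set PD_ALL_VISIBLE on a page.

    Assumption (simplified from the text above): the flag is set unless an
    insert, update, or delete hits the page after phase 1 visited it and
    before phase 2 visits it.
    """
    hit_in_window = any(
        phase1_time < t < phase2_time for t in modification_times
    )
    return not hit_in_window


# Modifications outside the phase 1 -> phase 2 window do not interfere.
quiet_page = phase2_sets_all_visible(10, 20, [5, 25])

# A modification inside the window prevents the flag from being set.
busy_page = phase2_sets_all_visible(10, 20, [15])
```

The window between the two phases is usually short compared to the gap
between vacuums, which is the reason for expecting the second case to be
uncommon.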

> - for users with append-only tables, allvisible would never be set since
> those pages don't get vacuumed

There's no good solution for append-only tables.  Eventually, they
will get vacuumed, and when that happens, PD_ALL_VISIBLE will be set,
and freezing will also happen.  I don't think anything that is being
proposed here is going to make that a whole lot better, but it
shouldn't make it any worse than it is now, either.  Since it's
probably not solvable without a rewrite of the heap AM, I'm not going
to feel too bad about that.

> - it would prevent us from getting rid of allvisible, which has a
> documented and known write overhead

Again, I think this is going to be much less of an issue with 9.3, for
the reason explained above.  In 9.2 and prior, we'd scan a page with
dead tuples, prune them to line pointers, vacuum the indexes, and then
mark the dead pointers as unused.  Then, the NEXT vacuum would revisit
the same page and dirty it again ONLY to mark it all-visible.  But in
9.3, the first vacuum will mark the page all-visible at the same time
it marks the dead line pointers unused.  So the write overhead of
PD_ALL_VISIBLE should basically be gone.  If it's not, it would be
good to know why.
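
The difference between the two behaviors can be sketched as a list of the
times vacuum modifies a page that starts out with dead tuples. This is a toy
model in Python, not PostgreSQL code; the function name and the step labels
are invented for illustration, and it ignores whether each modification
leads to a separate physical write (that depends on when the buffer gets
flushed).

```python
def vacuum_page_modifications(combined_all_visible):
    """List the vacuum steps that dirty a page until it is all-visible.

    combined_all_visible=False models the 9.2-and-prior behavior described
    above; combined_all_visible=True models the 9.3 behavior.
    """
    steps = ["vacuum 1, phase 1: prune dead tuples to line pointers"]
    if combined_all_visible:
        # 9.3: the all-visible flag is set in the same step that marks
        # the dead line pointers unused.
        steps.append(
            "vacuum 1, phase 2: mark line pointers unused, set PD_ALL_VISIBLE"
        )
    else:
        # 9.2 and prior: the NEXT vacuum dirties the page again only to
        # set the flag.
        steps.append("vacuum 1, phase 2: mark line pointers unused")
        steps.append("vacuum 2: set PD_ALL_VISIBLE")
    return steps


old_steps = vacuum_page_modifications(combined_all_visible=False)
new_steps = vacuum_page_modifications(combined_all_visible=True)
extra_modifications = len(old_steps) - len(new_steps)
```

In this model the older behavior costs exactly one extra page modification,
the one made by the second vacuum solely to set the flag.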

> If we just wanted to reduce read cost, why not just take a simpler
> approach and give the visibility map a "isfrozen" bit?  Then we'd know
> which pages didn't need rescanning without nearly as much complexity.

That would break pg_upgrade, which would have to remove visibility map
forks when upgrading.  More importantly, it would require another
round of complex changes to the write-ahead logging in this area.
It's not obvious to me that we'd end up ahead of where we are today,
although perhaps I am a pessimist.

> That would also make it more effective to do precautionary vacuum freezing.

But wouldn't it be a whole lot nicer if we just didn't have to do
vacuum freezing AT ALL?  The point here is to absorb freezing into
some other operation that we already have to do.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


