Re: Freeze avoidance of very large table. - Mailing list pgsql-hackers

From: Sawada Masahiko
Subject: Re: Freeze avoidance of very large table.
Msg-id: CAD21AoA5CHjSVecYhQzGpoXJCnWEPqkNnHkJtagxdtF=466bYA@mail.gmail.com
In response to: Re: Freeze avoidance of very large table.  (Robert Haas <robertmhaas@gmail.com>)
List: pgsql-hackers

On Thu, Apr 23, 2015 at 3:24 AM, Robert Haas <robertmhaas@gmail.com> wrote:
> On Wed, Apr 22, 2015 at 12:39 PM, Heikki Linnakangas <hlinnaka@iki.fi> wrote:
>> The thing that made me nervous about that approach is that it made the LSN
>> of each page critical information. If you somehow zeroed out the LSN, you
>> could no longer tell which pages are frozen and which are not. I'm sure it
>> could be made to work - and I got it working to some degree anyway - but
>> it's a bit scary. It's similar to the multixid changes in 9.3: multixids
>> also used to be data that you can just zap at restart, and when we changed
>> the rules so that you lose data if you lose multixids, we got trouble. Now,
>> LSNs are much simpler, and there wouldn't be anything like the
>> multioffset/member SLRUs that you'd have to keep around forever or vacuum,
>> but still..
>
> LSNs are already pretty critical.  If they're in the future, you can't
> flush those pages.  Ever.  And if they're wrong in either direction,
> crash recovery is broken.  But it's still worth thinking about ways
> that we could make this more robust.
>
> I keep coming back to the idea of treating any page that is marked as
> all-visible as frozen, and deferring freezing until the page is again
> modified.  The big downside of this is that if the page is set as
> all-visible and then immediately thereafter modified, it sucks to have
> to freeze when the XIDs in the page are still present in CLOG.  But if
> we could determine from the LSN that the XIDs in the page are new
> enough to still be considered valid, then we could skip freezing in
> those cases and only do it when the page is "old".  That way, if
> somebody zeroed out the LSN (why, oh why?) the worst that would happen
> is that we'd do some extra freezing when the page was next modified.

With your idea, the tuples in a WORM (write-once, read-many) table
would never be frozen unless we run VACUUM FREEZE, because its pages
are never modified again. In that case, the second and later VACUUM
FREEZE runs would need to scan only the pages added since the last
freeze, so we could reduce I/O, but we would still need to freeze
explicitly to prevent wraparound, as we do today. WORM tables are
generally huge and keep growing rapidly, so that freezing would still
be expensive.
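
To make the two cases concrete, here is a rough sketch of the rule as I
understand it. This is not PostgreSQL code: SketchPage,
must_freeze_before_modify(), clog_horizon_lsn and pages_to_scan() are
all hypothetical names made up for illustration, and the "horizon" LSN
is an assumed cutoff below which a page's XIDs may no longer be in CLOG.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified page state; NOT PostgreSQL's PageHeaderData. */
typedef struct
{
    uint64_t lsn;          /* LSN of the last WAL record that touched the page */
    bool     all_visible;  /* page is marked all-visible (treated as frozen) */
} SketchPage;

/*
 * Decide whether the deferred freeze must happen before modifying a page.
 *
 * clog_horizon_lsn is an assumed cutoff: pages last written at or after it
 * are taken to contain only XIDs that are still present in CLOG.
 */
static bool
must_freeze_before_modify(const SketchPage *page, uint64_t clog_horizon_lsn)
{
    /* Only all-visible pages are implicitly frozen under this scheme. */
    if (!page->all_visible)
        return false;

    /*
     * A zeroed LSN compares as "old", so the worst case is some extra
     * freezing rather than lost data.
     */
    return page->lsn < clog_horizon_lsn;
}

/*
 * For an append-only (WORM) table: number of pages an explicit
 * VACUUM FREEZE still has to visit, given the block number up to which
 * the previous freeze already ran (frozen_upto, an assumed bookmark).
 */
static uint32_t
pages_to_scan(uint32_t total_pages, uint32_t frozen_upto)
{
    return total_pages > frozen_upto ? total_pages - frozen_upto : 0;
}
```

On a WORM table must_freeze_before_modify() is never reached, since no
page is modified again, so only the explicit pass counted by
pages_to_scan() ever freezes anything.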

>
>> I would feel safer if we added a completely new "epoch" counter to the page
>> header, instead of reusing LSNs. But as we all know, changing the page
>> format is a problem for in-place upgrade, and takes some space too.
>
> Yeah.  We have a serious need to reduce the size of our on-disk
> format.  On a TPC-C-like workload Jan Wieck recently tested, our data
> set was 34% larger than another database at the beginning of the test,
> and 80% larger by the end of the test.  And we did twice the disk
> writes.  See "The Elephants in the Room.pdf" at
> https://sites.google.com/site/robertmhaas/presentations
>

Regards,

-------
Sawada Masahiko
