On Wed, Jul 22, 2015 at 12:11 PM, Jeff Janes <jeff.janes@gmail.com> wrote:
>> Urgh. So if we do this, that forever precludes having HOT pruning set
>> the all-visible bit.
>
> I wouldn't say forever, as it would be easy to revert the change if
> something more important came along that conflicted with it.
Well, the people who want this change would likely object to reverting it later.
> I don't think
> this change would grow tentacles across the code that make it hard to
> revert, you would just have to take the performance hit (and by that time,
> maybe HDD will truly be dead anyway and so we don't care anymore). But yes,
> that is definitely a downside. HOT pruning is one example, but also one
> could envision having someone (bgwriter?) set vm bits on unindexed tables.
> Or if we invent some efficient way to know that no expiring tids for a
> certain block range are stored in indexes, other jobs could also set the vm
> bit on indexed tables. Or parallel vacuums in the same table, not that I
> really see a reason to have those.
Yes, good points. (I think we will need parallel vacuum to cope with
TB-sized tables, but that's another story.)
>> At the least we'd better document that carefully
>> so that nobody breaks it later. But I wonder if there isn't some
>> better approach, because I would certainly rather that we didn't
>> foreclose the possibility of doing something like that in the future.
>
> But where do we document it (other than in-place)? README.HOT doesn't seem
> sufficient, and there is no README.vm.
visibilitymap_set?
> I guess add an "Assert(InRecovery || running_a_vacuum);" to the
> visibilitymap_set with a comment there, except that I don't know how to
> implement running_a_vacuum so that it covers manual vacs as well as autovac.
> Perhaps assert that we hold a SHARE UPDATE EXCLUSIVE on rel?
I care more about the comment than I do about the Assert().
> The advantage of the other approach, just force kernel read-ahead to work
> for us, is that it doesn't impose any of these restrictions on future
> development. The disadvantage is that I don't know how to auto-tune it, or
> auto-disable it for SSD, and it will never be as quite as efficient.
I have to say that I like the prefetch approach. I don't know what to
do about the fact that it loses to the VM-bit based approach, but I
think it's a bad bet that we will never care about setting visibility
map bits anyplace other than VACUUM.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company