Re: [HACKERS] GUC for cleanup indexes threshold. - Mailing list pgsql-hackers

From Robert Haas
Subject Re: [HACKERS] GUC for cleanup indexes threshold.
Date
Msg-id CA+Tgmoa6xNWe6+ME+ocg-BgmdwOZXoT0O6Tid7RLxm_8ztNJcA@mail.gmail.com
In response to Re: [HACKERS] GUC for cleanup indexes threshold.  (Peter Geoghegan <pg@bowt.ie>)
Responses Re: [HACKERS] GUC for cleanup indexes threshold.  (Peter Geoghegan <pg@bowt.ie>)
List pgsql-hackers
On Tue, Mar 14, 2017 at 6:10 PM, Peter Geoghegan <pg@bowt.ie> wrote:
> On Tue, Mar 14, 2017 at 2:48 PM, Peter Geoghegan <pg@bowt.ie> wrote:
>> I think that that's safe, but it is a little disappointing that it
>> does not allow us to skip work in the case that you really had in mind
>> when writing the patch. Better than nothing, though, and perhaps still
>> a good start. I would like to hear other people's opinions.
>
> Hmm. So, the SQL-callable function txid_current() exports a 64-bit
> XID, which includes an "epoch". If PostgreSQL used these 64-bit XIDs
> natively, we'd never need to freeze. Obviously we don't do that
> because the per-tuple overhead of visibility information is already
> high, and that would make it much worse. But, we can easily afford the
> extra overhead in a completely dead page. Maybe we can overcome the
> _bt_page_recyclable() limitation by being cleverer about how it
> determines if recycling is safe -- it can examine epoch, too. This
> would also be required in the similar function
> vacuumRedirectAndPlaceholder(), which is a part of SP-GiST.
>
> We already have BTPageOpaqueData.btpo, a union whose contained type
> varies based on the page being dead. We could just do the same with
> some other field in that struct, and then store epoch there. Clearly
> nobody really cares about most data that remains on the page. Index
> scans just need to be able to land on it to determine that it's dead,
> and VACUUM needs to be able to determine whether or not there could
> possibly be such an index scan at the time it considers recycling.
>
> Does anyone see a problem with that?
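
To make that concrete, here is a minimal, self-contained C sketch of the
comparison being proposed (the names are hypothetical stand-ins for
illustration, not the real nbtree structures): a deleted page records the
deleting XID together with the epoch current at that time, and the
recyclability test compares that 64-bit value against a 64-bit horizon, so
an XID from an earlier epoch can never be mistaken for one that is still in
the future.

#include <stdint.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical stand-ins for illustration only -- not the actual
 * nbtree definitions. */
typedef uint32_t TransactionId;

/* What a dead page could store: the XID that deleted it plus the
 * XID epoch at that time, which together form a 64-bit "full" XID
 * of the kind txid_current() exposes. */
typedef struct DeadPageStamp
{
    TransactionId xact;    /* 32-bit XID that deleted the page */
    uint32_t      epoch;   /* XID epoch when the page was deleted */
} DeadPageStamp;

/* Combine an epoch and a 32-bit XID into the 64-bit form. */
static inline uint64_t
full_xid(uint32_t epoch, TransactionId xid)
{
    return ((uint64_t) epoch << 32) | xid;
}

/* Epoch-aware recyclability test: the page can be reused once its
 * 64-bit stamp precedes the oldest 64-bit XID that any index scan
 * could still care about.  A plain 64-bit comparison has no
 * wraparound ambiguity. */
static bool
dead_page_recyclable(const DeadPageStamp *stamp, uint64_t oldest_visible_fxid)
{
    return full_xid(stamp->epoch, stamp->xact) < oldest_visible_fxid;
}

int
main(void)
{
    /* Page deleted by XID 100 back in epoch 5; the horizon has since
     * advanced to XID 50 of epoch 6.  A modulo-2^32 comparison of the
     * bare XIDs (100 vs. 50) would conclude the deleting XID is still
     * in the future, while the 64-bit comparison correctly says the
     * page is safe to recycle. */
    DeadPageStamp stamp = { .xact = 100, .epoch = 5 };
    uint64_t horizon = full_xid(6, 50);

    printf("recyclable: %s\n",
           dead_page_recyclable(&stamp, horizon) ? "yes" : "no");
    return 0;
}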

Wouldn't it break on-disk compatibility with existing btree indexes?

I think we're still trying to solve a problem that Simon postulated in
advance of evidence that shows how much of a problem it actually is.
Not only might that be unnecessary, but if we don't have a test
demonstrating the problem, we also don't have a test demonstrating
that a given approach fixes it.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


