On Thu, Sep 5, 2019 at 3:36 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andres Freund <andres@anarazel.de> writes:
> > Well, I still dislike making the toast chunk size configurable in a
> > halfhearted manner.
>
> It's hard to make it fully configurable without breaking our on-disk
> storage, because of the lack of any explicit representation of the chunk
> size in TOAST data. You have to "just know" how big the chunks are
> supposed to be.
There was a concrete proposal about this from Andres here, down at the
bottom of the email:
http://postgr.es/m/20190802224251.lsxw4o5ebn2ng5ey@alap3.anarazel.de
Basically, detoasting would tolerate whatever chunk size it finds, and
the slice-fetching logic would get complicated.
> However, it's reasonable to ask why we should treat it as an AM property,
> especially a fixed AM property as this has it. If somebody does
> reimplement toast logic in some other AM, they might well decide it's
> worth the storage cost to be more flexible about the chunk size ... but
> too bad, this design won't let them do it.
Fair complaint. The main reason I want to treat it as an AM property
is that TOAST_TUPLE_THRESHOLD is defined in terms of heap-specific
constants, and having other AMs include heap-specific header files
seems like a thing we should try hard to avoid. Once you're indirectly
including htup_details.h in every AM in existence, it's going to be
hard to be sure that you've got no other dependencies on the current
heap AM. But I agree that making it not a fixed value could be useful.
One benefit of it would be that you could just change the value, even
for the current heap, without breaking access to already-toasted data.
> It seems like this design throws away most of the benefit of a fixed
> chunk size (mostly, being able to do relevant modulo arithmetic with
> shifts and masks rather than full-fledged integer division) without
> getting much of anything in return.
I don't think you're really getting that particular benefit, because
TOAST_TUPLE_THRESHOLD and TOAST_TUPLE_TARGET are not going to end up
as powers of two. But you do get the benefit of working with
constants instead of a value determined at runtime.
--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company