Re: tableam vs. TOAST - Mailing list pgsql-hackers

From Robert Haas
Subject Re: tableam vs. TOAST
Date
Msg-id CA+TgmoZvoDqBKrA99f7cRGa2oMjr_efP=QJdvzf=SVqLX6X5GA@mail.gmail.com
In response to Re: tableam vs. TOAST  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Thu, Sep 5, 2019 at 3:36 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Andres Freund <andres@anarazel.de> writes:
> > Well, I still dislike making the toast chunk size configurable in a
> > halfhearted manner.
>
> It's hard to make it fully configurable without breaking our on-disk
> storage, because of the lack of any explicit representation of the chunk
> size in TOAST data.  You have to "just know" how big the chunks are
> supposed to be.

There was a concrete proposal about this from Andres here, down at the
bottom of the email:

http://postgr.es/m/20190802224251.lsxw4o5ebn2ng5ey@alap3.anarazel.de

Basically, detoasting would tolerate whatever chunk size it finds, at
the cost of making the slice-fetching logic more complicated.
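
To make the trade-off concrete, here is a minimal sketch, not code from
any patch. With a fixed chunk size, the chunks covering a slice fall out
of simple arithmetic; if detoasting tolerates whatever sizes it finds,
the same answer requires walking the chunks. The names SliceRange,
slice_range_fixed, and slice_range_scan are hypothetical, and the value
1996 for CHUNK_SIZE is assumed purely for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed chunk size, for illustration only. */
#define CHUNK_SIZE 1996

typedef struct SliceRange
{
    int     first_chunk;    /* index of the first chunk the slice touches */
    int     last_chunk;     /* index of the last chunk, or -1 if not found */
    size_t  first_offset;   /* bytes to skip inside the first chunk */
} SliceRange;

/* Fixed chunk size: the chunk range is pure arithmetic. */
static SliceRange
slice_range_fixed(size_t offset, size_t length)
{
    SliceRange  r;

    assert(length > 0);
    r.first_chunk = (int) (offset / CHUNK_SIZE);
    r.last_chunk = (int) ((offset + length - 1) / CHUNK_SIZE);
    r.first_offset = offset % CHUNK_SIZE;
    return r;
}

/* Tolerant variant: chunk sizes are whatever was stored, so accumulate them. */
static SliceRange
slice_range_scan(const size_t *chunk_sizes, int nchunks,
                 size_t offset, size_t length)
{
    SliceRange  r = {-1, -1, 0};
    size_t      start = 0;                  /* byte position where chunk i begins */
    size_t      end = offset + length;      /* one past the last byte wanted */

    assert(length > 0);
    for (int i = 0; i < nchunks; i++)
    {
        size_t  next = start + chunk_sizes[i];

        if (r.first_chunk < 0 && offset < next)
        {
            r.first_chunk = i;
            r.first_offset = offset - start;
        }
        if (end <= next)
        {
            r.last_chunk = i;
            break;
        }
        start = next;
    }
    return r;
}
```

The scanning variant cannot jump straight to the first relevant chunk,
which is the kind of extra complexity referred to above.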

> However, it's reasonable to ask why we should treat it as an AM property,
> especially a fixed AM property as this has it.  If somebody does
> reimplement toast logic in some other AM, they might well decide it's
> worth the storage cost to be more flexible about the chunk size ... but
> too bad, this design won't let them do it.

Fair complaint.  The main reason I want to treat it as an AM property
is that TOAST_TUPLE_THRESHOLD is defined in terms of heap-specific
constants, and having other AMs include heap-specific header files
seems like a thing we should try hard to avoid. Once you're indirectly
including htup_details.h in every AM in existence, it's going to be
hard to be sure that you've got no other dependencies on the current
heap AM. But I agree that making it not a fixed value could be useful.
One benefit of it would be that you could just change the value, even
for the current heap, without breaking access to already-toasted data.
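
As a rough sketch of the difference between a fixed AM property and a
value the AM can decide per relation, consider the toy structures below.
Everything here (ToyRelation, ToyTableAm, toy_chunk_size, and the
chunk_size_option field) is hypothetical and is not PostgreSQL's actual
table AM interface; the default of 1996 is likewise only an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a relation with a per-table setting. */
typedef struct ToyRelation
{
    size_t  chunk_size_option;  /* 0 means "not set" */
} ToyRelation;

/* Hypothetical AM descriptor; not the real TableAmRoutine. */
typedef struct ToyTableAm
{
    const char *name;
    size_t      fixed_chunk_size;                       /* fixed AM property */
    size_t    (*chunk_size_for) (const ToyRelation *);  /* optional override */
} ToyTableAm;

/* Example callback: honor the per-relation option when one is set. */
static size_t
toy_chunk_size_from_option(const ToyRelation *rel)
{
    return rel->chunk_size_option > 0 ? rel->chunk_size_option : 1996;
}

/* Callers ask the AM rather than including heap-specific headers. */
static size_t
toy_chunk_size(const ToyTableAm *am, const ToyRelation *rel)
{
    if (am->chunk_size_for != NULL)
        return am->chunk_size_for(rel);
    return am->fixed_chunk_size;
}
```

With only fixed_chunk_size, an AM that wants more flexibility has no way
to express it; the callback form leaves that decision to the AM.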

> It seems like this design throws away most of the benefit of a fixed
> chunk size (mostly, being able to do relevant modulo arithmetic with
> shifts and masks rather than full-fledged integer division) without
> getting much of anything in return.

I don't think you're really getting that particular benefit, because
TOAST_TUPLE_THRESHOLD and TOAST_TUPLE_TARGET are not going to end up
as powers of two.  But you do get the benefit of working with
constants instead of a value determined at runtime.
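
A small sketch of the distinction, with assumed values: shifts and masks
only apply when the chunk size is a power of two, but a compile-time
constant divisor still lets an optimizing compiler avoid a general
division (typically by using a multiply and shift), whereas a runtime
value does not. The constants and function names below are hypothetical;
1996 and 2048 are chosen only for illustration.

```c
#include <assert.h>
#include <stdint.h>

#define CHUNK_SIZE_CONST 1996u  /* assumed; not a power of two */
#define CHUNK_SIZE_POW2  2048u  /* hypothetical power-of-two size, 2^11 */

/* Constant, non-power-of-two divisor: no shift/mask, but still a constant. */
static inline uint32_t
chunk_no_const(uint32_t off)
{
    return off / CHUNK_SIZE_CONST;
}

/* Power-of-two size: division and modulo become a shift and a mask. */
static inline uint32_t
chunk_no_pow2(uint32_t off)
{
    return off >> 11;
}

static inline uint32_t
chunk_off_pow2(uint32_t off)
{
    return off & (CHUNK_SIZE_POW2 - 1);
}

/* Runtime chunk size: the divisor is unknown at compile time. */
static inline uint32_t
chunk_no_runtime(uint32_t off, uint32_t chunk_size)
{
    assert(chunk_size > 0);
    return off / chunk_size;
}
```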

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


