Re: Shared detoast Datum proposal - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Shared detoast Datum proposal
Date
Msg-id fea81484-09b5-499d-80d0-acb8df43d047@enterprisedb.com
Whole thread Raw
In response to Re: Shared detoast Datum proposal  (Andy Fan <zhihuifan1213@163.com>)
Responses Re: Shared detoast Datum proposal
Re: Shared detoast Datum proposal
List pgsql-hackers
On 2/26/24 14:22, Andy Fan wrote:
> 
>...
>
>> Also, toasted values
>> are not always being used immediately and as a whole, i.e. jsonb values are fully
>> detoasted (we're working on this right now) to extract the smallest value from
>> big json, and these values are not worth keeping in memory. For text values too,
>> we often do not need the whole value to be detoasted.
> 
> I'm not sure how often this is true, espeically you metioned text data
> type. I can't image why people acquire a piece of text and how can we
> design a toasting system to fulfill such case without detoast the whole
> as the first step. But for the jsonb or array data type, I think it is
> more often. However if we design toasting like that, I'm not sure if it
> would slow down other user case. for example detoasting the whole piece
> use case. I am justing feeling both way has its user case, kind of heap
> store and columna store.  
> 

Any substr/starts_with call benefits from this optimization, and such
calls don't seem that uncommon. I certainly can imagine people doing
this fairly often. In any case, it's a long-standing behavior /
optimization, and I don't think we can just dismiss it this quickly.

Is there a way to identify cases that are likely to benefit from this
slicing, and disable the detoasting for them? We already disable it for
other cases, so maybe we can do this here too?

Or maybe there's a way to do the detoasting lazily, and only keep the
detoasted value when it's requesting the whole value. Or perhaps even
better, remember what part we detoasted, and maybe it'll be enough for
future requests?

I'm not sure how difficult would this be with the proposed approach,
which eagerly detoasts the value into tts_values. I think it'd be easier
to implement with the TOAST cache I briefly described ...


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company



pgsql-hackers by date:

Previous
From: Bertrand Drouvot
Date:
Subject: Re: Fix race condition in InvalidatePossiblyObsoleteSlot()
Next
From: Masahiko Sawada
Date:
Subject: Re: Improve eviction algorithm in ReorderBuffer