Re: Shared detoast Datum proposal - Mailing list pgsql-hackers

From Andy Fan
Subject Re: Shared detoast Datum proposal
Date
Msg-id 87h6i2djf9.fsf@163.com
Whole thread Raw
In response to Re: Shared detoast Datum proposal  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: Shared detoast Datum proposal
List pgsql-hackers

I see your reply when I started to write a more high level
document. Thanks for the step by step help!

Tomas Vondra <tomas.vondra@enterprisedb.com> writes:

> On 2/20/24 19:38, Andy Fan wrote:
>> 
>> ...
>>
>>> I think it should be done the other
>>> way around, i.e. the patch should introduce the main feature first
>>> (using the traditional Bitmapset), and then add Bitset on top of that.
>>> That way we could easily measure the impact and see if it's useful.
>> 
>> Acutally v4 used the Bitmapset, and then both perf and pgbench's tps
>> indicate it is too expensive. and after talk with David at [2], I
>> introduced bitset and use it here. the test case I used comes from [1].
>> IRCC, there were 5% performance difference because of this.
>> 
>> create table w(a int, b numeric);
>> insert into w select i, i from generate_series(1, 1000000)i;
>> select b from w where b > 0;
>> 
>> To reproduce the difference, we can replace the bitset_clear() with
>> 
>> bitset_free(slot->pre_detoasted_attrs);
>> slot->pre_detoasted_attrs = bitset_init(slot->tts_tupleDescriptor->natts);
>> 
>> in ExecFreePreDetoastDatum. then it works same as Bitmapset.
>> 
>
> I understand the bitset was not introduced until v5, after noticing the
> bitmapset is not quite efficient. But I still think the patches should
> be the other way around, i.e. the main feature first, then the bitset as
> an optimization.
>
> That allows everyone to observe the improvement on their own (without
> having to tweak the code), and it also doesn't require commit of the
> bitset part before it gets actually used by anything.

I start to think this is a better way rather than the opposite. The next
version will be:

0001: shared detoast datum feature with high level introduction.
0002: introduce bitset and use it shared-detoast-datum feature, with the
test case to show the improvement. 

>> I will do more test on the memory leak stuff, since there are so many
>> operation aginst slot like ExecCopySlot etc, I don't know how to test it
>> fully. the method in my mind now is use TPCH with 10GB data size, and
>> monitor the query runtime memory usage.
>> 
>
> I think this is exactly the "high level design" description that should
> be in a comment, somewhere.

Got it.

>>> Of course, expanded datums are
>>> not meant to be long-lived, while "shared detoasted values" are meant to
>>> exist (potentially) for the query duration.
>> 
>> hmm, acutally the "shared detoast value" just live in the
>> TupleTableSlot->tts_values[*], rather than the whole query duration. The
>> simple case is:
>> 
>> SELECT * FROM t WHERE a_text LIKE 'abc%';
>> 
>> when we scan to the next tuple, the detoast value for the previous tuple
>> will be relased.
>> 
>
> But if the (detoasted) value is passed to the next executor node, it'll
> be kept, right?

Yes and only one copy for all the executor nodes.

> Unrelated question - could this actually end up being too aggressive?
> That is, could it detoast attributes that we end up not needing? For
> example, what if the tuple gets eliminated for some reason (e.g. because
> of a restriction on the table, or not having a match in a join)? Won't
> we detoast the tuple only to throw it away?

The detoast datum will have the exactly same lifespan with other
tts_values[*]. If the tuple get eliminated for any reason, those detoast
datum still exist until the slot is cleared for storing the next tuple.

>> typedef struct Join
>> {
>> ..
>>     /*
>>      * Records of var's varattno - 1 where the Var is accessed indirectly by
>>      * any expression, like a > 3.  However a IS [NOT] NULL is not included
>>      * since it doesn't access the tts_values[*] at all.
>>      *
>>      * This is a essential information to figure out which attrs should use
>>      * the pre-detoast-attrs logic.
>>      */
>>     Bitmapset  *outer_reference_attrs;
>>     Bitmapset  *inner_reference_attrs;
>> } Join;
>> 
>
> Is it actually necessary to add new fields to these nodes? Also, the
> names are not particularly descriptive of the purpose - it'd be better
> to have "detoast" in the name, instead of generic "reference".

Because of the way of the data transformation, I think we must add the
fields to keep such inforamtion. Then these information will be used
initilize the necessary information in PlanState. maybe I am having a
fixed mindset, I can't think out a way to avoid that right now.

I used 'reference' rather than detoast is because some implementaion
issues. In the createplan.c and setref.c, I can't check the atttyplen
effectively, so even a Var with int type is still hold here which may
have nothing with detoast.

>> 
>> When I worked with the UniqueKey feature, I maintained a
>> UniqueKey.README to summaried all the dicussed topics in threads, the
>> README is designed to save the effort for more reviewer, I think I
>> should apply the same logic for this feature.
>> 
>
> Good idea. Either that (a separate README), or a comment in a header of
> some suitable .c/.h file (I prefer that, because that's kinda obvious
> when reading the code, I often not notice a README exists next to it).

Great, I'd try this from tomorrow. 

-- 
Best Regards
Andy Fan




pgsql-hackers by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: Shared detoast Datum proposal
Next
From: Bharath Rupireddy
Date:
Subject: Re: LogwrtResult contended spinlock