Re: detoast datum into the given buffer as a optimization. - Mailing list pgsql-hackers

From Andy Fan
Subject Re: detoast datum into the given buffer as a optimization.
Date
Msg-id 87cyl0xyzl.fsf@163.com
Whole thread Raw
In response to detoast datum into the given buffer as a optimization.  (Andy Fan <zhihuifan1213@163.com>)
List pgsql-hackers
Jubilee Young <workingjubilee@gmail.com> writes:


> On Wed, Sep 18, 2024 at 2:23 PM Nathan Bossart <nathandbossart@gmail.com> wrote:
>>
>> On Wed, Sep 18, 2024 at 05:35:56PM +0800, Andy Fan wrote:
>> > Currently detoast_attr always detoast the data into a palloc-ed memory
>> > and then if user wants the detoast data in a different memory, user has to
>> > copy them, I'm thinking if we could provide a buf as optional argument for
>> > detoast_attr to save such wastage.
>> >
>> > [...]
>> >
>> > What do you think?
>>
>> My first thought is that this seems reasonable if there are existing places
>> where we are copying the data out of the palloc'd memory, but otherwise it
>> might be more of a prerequisite patch for the other things you mentioned.
>>
>
> This would also simplify data copying patterns many extensions have to do.
> For instance, often they have to move the data from Postgres memory into
> another language's allocation types. Or just for custom data structures,
> of course.

I thought we have, but did't check it so far. If we figured out an API,
we can use them for optimizing some existing code.

> I would suggest that this be added as new API, however, instead of a change
> to `detoast_attr`.

I agree that new API would usually be clearer than adding a more
argument, just that I don't want copy-paste too much existing
code. Actually the thing in my mind is:

struct varlena *
detoast_attr(struct varlena *attr)
{
   int rawsize = toast_raw_datum_size(attr);
   char *buffer = palloc(rawsize)
   return detoast_attr_buffer(attr, buffer);
}

struct varlena *
detoast_attr_buffer(struct varlena *attr, char *buffer)
{
  ...
}

In this case:
- there is no existing code need to be changed.
- detoast_attr_buffer is tested sufficiently automatically.

> This would make different return types more sensical,
> as there is no need to implicitly allocate. It could return an error
> type?

I can't understand the error here.

>
>> *   Note if caller provides a non-NULL buffer, it is the duty of caller
>> * to make sure it has enough room for the detoasted format (Usually
>> * they can use toast_raw_datum_size to get the size)
>
> I'm not entirely sure why the caller is being given the burden of checking,
> but I suppose they probably did check? I can imagine scenarios where they
> are not interested, however, and the callee always has to obtain the data
> for len written anyways. So I would probably make writable length a
> third arg.

I didn't follow up here as well:(,  do you mind to explain a bit?

--
Best Regards
Andy Fan




pgsql-hackers by date:

Previous
From: Andy Fan
Date:
Subject: Re: detoast datum into the given buffer as a optimization.
Next
From: Nathan Bossart
Date:
Subject: Re: First draft of PG 17 release notes