Thread: detoast datum into the given buffer as a optimization.
Hi, Currently detoast_attr always detoast the data into a palloc-ed memory and then if user wants the detoast data in a different memory, user has to copy them, I'm thinking if we could provide a buf as optional argument for detoast_attr to save such wastage. current format: /* ---------- * detoast_attr - * * Public entry point to get back a toasted value from compression * or external storage. The result is always non-extended varlena form. * * Note some callers assume that if the input is an EXTERNAL or COMPRESSED * datum, the result will be a pfree'able chunk. * ---------- */ struct varlena * detoast_attr(struct varlena *attr) new format: /* ---------- * detoast_attr - * ... * * Note if caller provides a non-NULL buffer, it is the duty of caller * to make sure it has enough room for the detoasted format (Usually * they can use toast_raw_datum_size to get the size) Or else a * palloced memory under CurrentMemoryContext is used. */ struct varlena * detoast_attr(struct varlena *attr, char *buffer) There are 2 user cases at least: 1. The shared detoast datum patch at [1], where I want to avoid the duplicated detoast effort for the same datum, for example: SELECT f(big_toast_col) FROM t WHERE g(big_toast_col); Current master detoast it twice now. In that patch, I want to detoast the datum into a MemoryContext where the lifespan is same as slot->tts_values[*] rather than CurrentMemoryContext so that the result can be reused in the different expression. Within the proposal here, we can detoast the datum into the desired MemoryContext directly (just allocating the buffer in the desired MemoryContext is OK). 2. make printtup function a bit faster [2]. That patch already removed some palloc, memcpy effort, but it still have some chances to optimize further. for example in text_out function, it is still detoast the datum into a palloc memory and then copy them into a StringInfo. One of the key point is we can always get the varlena rawsize cheaply without any real detoast activity in advance, thanks to the existing varlena design. If this can be accepted, it would reduce the size of patch [2] at some extend, and which part was disliked by Thomas (and me..) [3]. What do you think? [1] https://commitfest.postgresql.org/49/4759/ [2] https://www.postgresql.org/message-id/87wmjzfz0h.fsf%40163.com [3] https://www.postgresql.org/message-id/6718759c-2dac-48e4-bf18-282de4d82204%40enterprisedb.com -- Best Regards Andy Fan
On Wed, Sep 18, 2024 at 05:35:56PM +0800, Andy Fan wrote: > Currently detoast_attr always detoast the data into a palloc-ed memory > and then if user wants the detoast data in a different memory, user has to > copy them, I'm thinking if we could provide a buf as optional argument for > detoast_attr to save such wastage. > > [...] > > What do you think? My first thought is that this seems reasonable if there are existing places where we are copying the data out of the palloc'd memory, but otherwise it might be more of a prerequisite patch for the other things you mentioned. -- nathan
On Wed, Sep 18, 2024 at 2:23 PM Nathan Bossart <nathandbossart@gmail.com> wrote: > > On Wed, Sep 18, 2024 at 05:35:56PM +0800, Andy Fan wrote: > > Currently detoast_attr always detoast the data into a palloc-ed memory > > and then if user wants the detoast data in a different memory, user has to > > copy them, I'm thinking if we could provide a buf as optional argument for > > detoast_attr to save such wastage. > > > > [...] > > > > What do you think? > > My first thought is that this seems reasonable if there are existing places > where we are copying the data out of the palloc'd memory, but otherwise it > might be more of a prerequisite patch for the other things you mentioned. > This would also simplify data copying patterns many extensions have to do. For instance, often they have to move the data from Postgres memory into another language's allocation types. Or just for custom data structures, of course. I would suggest that this be added as new API, however, instead of a change to `detoast_attr`. This would make different return types more sensical, as there is no need to implicitly allocate. It could return an error type? > * Note if caller provides a non-NULL buffer, it is the duty of caller > * to make sure it has enough room for the detoasted format (Usually > * they can use toast_raw_datum_size to get the size) I'm not entirely sure why the caller is being given the burden of checking, but I suppose they probably did check? I can imagine scenarios where they are not interested, however, and the callee always has to obtain the data for len written anyways. So I would probably make writable length a third arg.
Andy Fan <zhihuifan1213@163.com> writes: > * Note if caller provides a non-NULL buffer, it is the duty of caller > * to make sure it has enough room for the detoasted format (Usually > * they can use toast_raw_datum_size to get the size) This is a pretty awful, unsafe API design. It puts it on the caller to know how to get the detoasted length, and it implies double decoding of the toast datum. > One of the key point is we can always get the varlena rawsize cheaply > without any real detoast activity in advance, thanks to the existing > varlena design. This is not an assumption I care to wire into the API design. How about a variant like struct varlena * detoast_attr_cxt(struct varlena *attr, MemoryContext cxt) which promises to allocate the result in the specified context? That would cover most of the practical use-cases, I think. regards, tom lane
Thank you all for the double check. > Andy Fan <zhihuifan1213@163.com> writes: >> * Note if caller provides a non-NULL buffer, it is the duty of caller >> * to make sure it has enough room for the detoasted format (Usually >> * they can use toast_raw_datum_size to get the size) > > ..., It puts it on the caller to know how to get the detoasted length Yes. > and it implies double decoding of the toast datum. Yes, We need to decoding the toast datum to know the rawsize as what we did in toast_raw_datum_size, this is an extra effrot. But I want to highlight that this "decoding" is different from "detoast", the later one need to scan toast_relation or decompression the data so it is a heavy work, but the former one just decoding some existing memory at hand which should be very cheap. >> One of the key point is we can always get the varlena rawsize cheaply >> without any real detoast activity in advance, thanks to the existing >> varlena design. > > This is not an assumption I care to wire into the API design. OK. (I just was excited to find out we can get the rawsize so cheaply, so we can find out an API to satify the both user cases.) > How about a variant like > > struct varlena * > detoast_attr_cxt(struct varlena *attr, MemoryContext cxt) > > which promises to allocate the result in the specified context? > That would cover most of the practical use-cases, I think. I think this works for my user case 1 but doesn't work for my user case 2 which requires the detoasted data is writen into a given memory buffer (not only a certain MemoryContext). IIUC the user cases Jubilee provided is more like user case 2. """ (user case 2) 2. make printtup function a bit faster [2]. The patch there already removed some palloc, memcpy effort, but it still have some chances to optimize further. for example text_out function, it is still detoast the datum into a palloc memory and then copy them into a StringInfo. """ I really want to make some progress in this direction, so thank you for the feedback. -- Best Regards Andy Fan
Jubilee Young <workingjubilee@gmail.com> writes: > On Wed, Sep 18, 2024 at 2:23 PM Nathan Bossart <nathandbossart@gmail.com> wrote: >> >> On Wed, Sep 18, 2024 at 05:35:56PM +0800, Andy Fan wrote: >> > Currently detoast_attr always detoast the data into a palloc-ed memory >> > and then if user wants the detoast data in a different memory, user has to >> > copy them, I'm thinking if we could provide a buf as optional argument for >> > detoast_attr to save such wastage. >> > >> > [...] >> > >> > What do you think? >> >> My first thought is that this seems reasonable if there are existing places >> where we are copying the data out of the palloc'd memory, but otherwise it >> might be more of a prerequisite patch for the other things you mentioned. >> > > This would also simplify data copying patterns many extensions have to do. > For instance, often they have to move the data from Postgres memory into > another language's allocation types. Or just for custom data structures, > of course. I thought we have, but did't check it so far. If we figured out an API, we can use them for optimizing some existing code. > I would suggest that this be added as new API, however, instead of a change > to `detoast_attr`. I agree that new API would usually be clearer than adding a more argument, just that I don't want copy-paste too much existing code. Actually the thing in my mind is: struct varlena * detoast_attr(struct varlena *attr) { int rawsize = toast_raw_datum_size(attr); char *buffer = palloc(rawsize) return detoast_attr_buffer(attr, buffer); } struct varlena * detoast_attr_buffer(struct varlena *attr, char *buffer) { ... } In this case: - there is no existing code need to be changed. - detoast_attr_buffer is tested sufficiently automatically. > This would make different return types more sensical, > as there is no need to implicitly allocate. It could return an error > type? I can't understand the error here. > >> * Note if caller provides a non-NULL buffer, it is the duty of caller >> * to make sure it has enough room for the detoasted format (Usually >> * they can use toast_raw_datum_size to get the size) > > I'm not entirely sure why the caller is being given the burden of checking, > but I suppose they probably did check? I can imagine scenarios where they > are not interested, however, and the callee always has to obtain the data > for len written anyways. So I would probably make writable length a > third arg. I didn't follow up here as well:(, do you mind to explain a bit? -- Best Regards Andy Fan
Hi Tom, > Andy Fan <zhihuifan1213@163.com> writes: >> * Note if caller provides a non-NULL buffer, it is the duty of caller >> * to make sure it has enough room for the detoasted format (Usually >> * they can use toast_raw_datum_size to get the size) > > This is a pretty awful, unsafe API design. Sorry that I expressed my thoughts incorrectly. I'm not going to refactor "detoast_attr", I'm going to add a new API named "detoast_attr_buffer", which very similar with text_to_cstring and text_to_string_buffer. Most of the user still can using detoast_attr, only the user who care about the MemoryContext or memcpy issue, they can the deotast_attr_buffer variant. > It puts it on the caller > to know how to get the detoasted length, and it implies double > decoding of the toast datum. I really nearly give up this idea today because of this, later I realized something which was known when I was writing the first message is forgotten by me now. Since it is really confused, I want to highlight it here for double check from more people. I thought 'toast_raw_datum_size' is the existing function to get the "detoasted length" for the caller of detoast_attr_buffer, so it looks reasonable for me to assume the caller knows how to get the detoast length. What really confused me suddenly is it is really correct to use "toast_raw_datum_size". In "toast_raw_datum_size": if (VARATT_IS_EXTERNAL_ONDISK(attr)) { /* va_rawsize is the size of the original datum -- including header */ struct varatt_external toast_pointer; VARATT_EXTERNAL_GET_POINTER(toast_pointer, attr); result = toast_pointer.va_rawsize; } We just return the va_rawsize directly. Does it work for a datum which is compressed first and then stored on external ondisk? After some more research, it is correct since the rawsize is the size of uncompressed data. and we also use the fact in 1b393f4e5db4fd6bbc86a4e88785b6945a1541d0. This is why I said: > One of the key point is we can always get the varlena rawsize cheaply > without any real detoast activity in advance, thanks to the existing > varlena design. This is the thing I forget today. Since user can know the size easily now, is it cheap to use it? By looking the code in toast_raw_datum_size, I really hard to say it is expensive. > > How about a variant like > > struct varlena * > detoast_attr_cxt(struct varlena *attr, MemoryContext cxt) > > which promises to allocate the result in the specified context? > That would cover most of the practical use-cases, I think. Yes, it work for some use case, and it is similar with what I did in [1] (search detoast_attr_ext). However it can't support the case where user want to detoast the data into the given buffer (to avoid the later memcpy), so detoast_attr_buffer is the favorite API right now. If it is not doable, detoast_attr_ctx is also works for me. Would you still think detoast_attr_buffer is not a acceptable API now at the high level design. I'm working on an implementation, but I want have some high-level designment agreement first. [1] https://www.postgresql.org/message-id/attachment/160491/v10-0001-shared-detoast-datum.patch -- Best Regards Andy Fan
Hi Andy,
For EXPANDED attributes va_rawsize is the size of the compressed attribute, not original size.
You can check toast_save_datum for that.
This thread looks like the second take of the shared detoast datum patch. Have you checked
my proposals in that thread?
Nikita Malakhov <hukutoc@gmail.com> writes: Hello Nikita, > > For EXPANDED attributes va_rawsize is the size of the compressed attribute, not original size. > You can check toast_save_datum for that. Here is some quota from toast_save_datum, which is different from what you are saying here. * Get the data pointer and length, and compute va_rawsize and va_extinfo. * * va_rawsize is the size of the equivalent fully uncompressed datum, so * we have to adjust for short headers. * * va_extinfo stored the actual size of the data payload in the toast * records and the compression method in first 2 bits if data is * compressed. I might be wrong anyway, but if you want to feedback on the va_rawsize, could you quota the part I said for that? otherwise I am feeling all of my effort is ignored, and I'm not sure if you really read my post carefully. I think this would cause some inefficient communication. """ I thought 'toast_raw_datum_size' is the existing function to get the "detoasted length" for the caller of detoast_attr_buffer, so it looks reasonable for me to assume the caller knows how to get the detoast length. What really confused me suddenly is it is really correct to use "toast_raw_datum_size". In "toast_raw_datum_size": if (VARATT_IS_EXTERNAL_ONDISK(attr)) { /* va_rawsize is the size of the original datum -- including header */ struct varatt_external toast_pointer; VARATT_EXTERNAL_GET_POINTER(toast_pointer, attr); result = toast_pointer.va_rawsize; } We just return the va_rawsize directly. Does it work for a datum which is compressed first and then stored on external ondisk? After some more research, it is correct since the rawsize is the size of uncompressed data. and we also use the fact in 1b393f4e5db4fd6bbc86a4e88785b6945a1541d0. This is why I said... """ > This thread looks like the second take of the shared detoast datum > patch. Have you checked my proposals in that thread? At the start of this thread, I wrote the user case as below: Qutoa here again: """ There are 2 user cases at least: 1. The shared detoast datum patch at [1], where I want to avoid the duplicated detoast effort for the same datum. 2.make printtup function a bit faster [2]. """ If you check it carefully, you would find they have different needs and detoast_attr_buffer can meets the two needs at the same time. As to your prososal, I remember you want to redesign the storage for detoast, but then there is no detailed design or code later, which makes me confused about the state of this direction. But if you want to talk about this, could we discuss on its own thread? Thanks -- Best Regards Andy Fan
Hi!
-- Sorry for misguiding you, I've overlooked va_rawsize with va_extinfo.
You're right, va_rawsize holds uncompressed size, and extinfo actual
storage size. This was not intentional.
I'd better not count on caller's do know detoasted data length,
and much more the buffer is correctly initialized, because we cannot
check that inside and must rely on the caller, which would for sure
lead to unexpected segfaults, I agree with Tom Lane's proposal above.
Other options seem to be more crude and error-prone here. This is
an internal data fetching function and it should not generate new kinds
of errors, I think.
In case you're playing with this part of the code - I was always
confused with detoast_attr and detoast_external_attr functions
both marked as entry points of retrieving toasted data and both
look a lot the same. Have you ever thought of making a single
entry point by slightly redesigning this part?
Nikita Malakhov <hukutoc@gmail.com> writes: > Hi! > > Sorry for misguiding you, I've overlooked va_rawsize with va_extinfo. > You're right, va_rawsize holds uncompressed size, and extinfo actual > storage size. This was not intentional. That's OK, so we are in the same page here. > I'd better not count on caller's do know detoasted data length, > and much more the buffer is correctly initialized, because we cannot > check that inside and must rely on the caller, which would for sure > lead to unexpected segfaults, I agree with Tom Lane's proposal above. > Other options seem to be more crude and error-prone here. This is > an internal data fetching function and it should not generate new kinds > of errors, I think. Tom'sreply depends on the fact I was going to changing the "detoast_attr" to "detoast_attr_buffer", as I have expalined in my previous post. I don't think a [new] user provided buffer function is so harmful. What do you think about function "text_to_string_buffer"? This is also a part in my previous reply, but is ignored by your reply? > In case you're playing with this part of the code - I was always > confused with detoast_attr and detoast_external_attr functions > both marked as entry points of retrieving toasted data and both > look a lot the same. Have you ever thought of making a single > entry point by slightly redesigning this part? You can check [1] for a indepent improvements for this topic. [1] https://www.postgresql.org/message-id/874j4vcspl.fsf%40163.com -- Best Regards Andy Fan
Hi!
Andy, I'm truly sorry, I removed quoted messages just to not
make replies to walls of text.
On detoast_attr_buffer topic - and again agree with reply above
that supposes adding required data length as parameter, at least
by doing so we should explicitly make the user of this API pay
attention to the length of the buffer and data size, as it is done
in text_to_cstring_buffer you mentioned.
About printtup: I agree direct detoasting to the buffer could give
us some performance improvement, it seems not too difficult
to check, but as you can see there is a function call via fmgr,
it is better to try to find out why such a decision was made.
On Wed, Oct 30, 2024 at 2:47 AM Andy Fan <zhihuifan1213@163.com> wrote:
Nikita Malakhov <hukutoc@gmail.com> writes:
> Hi!
>
> Sorry for misguiding you, I've overlooked va_rawsize with va_extinfo.
> You're right, va_rawsize holds uncompressed size, and extinfo actual
> storage size. This was not intentional.
That's OK, so we are in the same page here.
> I'd better not count on caller's do know detoasted data length,
> and much more the buffer is correctly initialized, because we cannot
> check that inside and must rely on the caller, which would for sure
> lead to unexpected segfaults, I agree with Tom Lane's proposal above.
> Other options seem to be more crude and error-prone here. This is
> an internal data fetching function and it should not generate new kinds
> of errors, I think.
Tom'sreply depends on the fact I was going to changing the "detoast_attr"
to "detoast_attr_buffer", as I have expalined in my previous post. I
don't think a [new] user provided buffer function is so harmful. What
do you think about function "text_to_string_buffer"? This is also a
part in my previous reply, but is ignored by your reply?
> In case you're playing with this part of the code - I was always
> confused with detoast_attr and detoast_external_attr functions
> both marked as entry points of retrieving toasted data and both
> look a lot the same. Have you ever thought of making a single
> entry point by slightly redesigning this part?
You can check [1] for a indepent improvements for this topic.
[1] https://www.postgresql.org/message-id/874j4vcspl.fsf%40163.com
--
Best Regards
Andy Fan