Re: Support for 8-byte TOAST values (aka the TOAST infinite loop problem) - Mailing list pgsql-hackers
From | Michael Paquier |
---|---|
Subject | Re: Support for 8-byte TOAST values (aka the TOAST infinite loop problem) |
Date | |
Msg-id | aG2nv0Iw9Te7c6en@paquier.xyz Whole thread Raw |
In response to | Re: Support for 8-byte TOAST values (aka the TOAST infinite loop problem) ("Burd, Greg" <greg@burd.me>) |
List | pgsql-hackers |
On Tue, Jul 08, 2025 at 12:58:26PM -0400, Burd, Greg wrote: > All that aside, I think you're right to tackle this one step at a > time and try not to boil too much of the ocean at once (the patch > set is already large enough). With that in mind I've read once or > twice over your changes and have a few basic comments/questions. > > > v2-0001 Refactor some TOAST value ID code to use uint64 instead of Oid > > This set of changes make sense and as you say are mechanical in > nature, no real comments other than I think that using uint64 rather > than Oid is the right call and addresses #2 on Tom's list. > > > v2-0002 Minimize footprint of TOAST_MAX_CHUNK_SIZE in heap TOAST code > > I like this as well, clarifies the code and reduces repetition. Thanks. These are independent pieces if you want to link the code less to TOAST, assuming that an area of 8 bytes would be good enough for any TOAST "value" concept. TIDs were mentioned as well on a different part of the thread, ItemPointerData is 6 bytes. > v2-0003 Refactor external TOAST pointer code for better pluggability > > + * For now there are only two types, all defined in this file. For now this > + * is the maximum value of vartag_external, which is a historical choice. > > This provides a bridge for compatibility, but doesn't open the door > to a truly pluggable API. I'm guessing the goal is incremental > change rather than wholesale rewrite. Nope, it does not introduce a pluggable thing, but it does untangle the fact that one needs to change 15-ish code paths when they want to add a new type of external TOAST pointer, without showing an actual impact AFAIK when we insert a TOAST value or fetch it, as long as we know that we're dealing with an on-disk thing that requires an external lookup. > + * The different kinds of on-disk external TOAST pointers. divided by > + * vartag_external. > > Extra '.' in "TOAST pointers. divided" I'm guessing. Indeed, thanks. > v2-0007 Introduce global 64-bit TOAST ID counter in control file > > Do you have any concern that this might become a bottleneck when > there are many relations and many backends all contending for a new > id? I'd imagine that this would show up in a flame graph, but I > think your test focused on the read side detoast_attr_slice() rather > than insert/update and contention on the shared counter. Would this > be even worse on NUMA systems? That may be possible, see below. > Thanks for the flame graphs examining a heavy detoast_attr_slice() > workload. I agree that there is little or no difference between > them which is nice. Cool. Yes. I was wondering why detoast_attr_slice() does not show up in the profile on HEAD, perhaps it just got optimized away (I was under -O2 for these profiles). > I think the only call out Tom made [4] that isn't addressed was the > ask for localized ID selection. That may make sense at some point, > especially if there is contention on GetNewToastId(). I think that > case is worth a separate performance test, something with a large > number of relations and backends all performing a lot of updates > generating a lot of new IDs. What do you think? Yeah, I need to do more benchmark for the int8 part, I was holding on such evaluations because this part of the patch does not fly if we don't do the refactoring pieces first. Anyway, I cannot get excited about the extra workload that this would require in the catalogs, because we would need one TOAST sequence tracked in there, linked to the TOAST relation so it would not be free. Or we invent a new facility just for this purpose, meaning that we get far away even more from being able to resolve the original problem with the values and compression IDs. We're talking about two instructions. Well, I guess that we could optimize it more some atomics or even cache a range of values to save in ToastIdGenLock acquisitions in a single backend. I suspect that the bottleneck is going to be the insertion of the TOAST entries in toast_save_datum() anyway with the check for conflicting values, even if your relation is unlogged or running-on-scissors in memory. > [2] "Yes, the idea is to put the tid pointer directly in the varlena > external header and have a tid array in the toast table as an extra > column. If all of the TOAST fits in the single record, this will be > empty, else it will have an array of tids for all the pages for this > toasted field." - Hannu Krosing in an email to me after > PGConf.dev/2025 Sure, you could do that as well, but I suspect that we'll need the steps of at least up to 0003 to be able to handle more easily multiple external TOAST pointer types, or the code will be messier than it currently is. :D -- Michael
Attachment
pgsql-hackers by date: