On Tue, Jul 08, 2025 at 08:38:41AM +0900, Michael Paquier wrote:
> Please note that I still need to look at perf profiles and some flame
> graphs with the refactoring done in 0003 with the worst case I've
> mentioned upthread with detoasting and values stored uncompressed in
> the TOAST relation.

So, the worst case I could think of for the slice detoast path is
something like this:
create table toasttest_bytea (f1 bytea);
alter table toasttest_bytea alter column f1 set storage external;
insert into toasttest_bytea values(decode(repeat('1234567890',10000),'escape'));

And then use a query like the following, which retrieves a small
substring many times to force as many detoast_attr_slice() calls as
possible, and check the effect of toast_external_info_get_data():
select length(string_agg(substr(f1, 2, 3), '')) from
toasttest_bytea, lateral generate_series(1,1000000) as a (id);

I ran this query repeatedly with \watch and took 10s samples with
perf record, ending up with the attached graphs (the runtime does not
show any difference):
- detoast_master.svg, for the graph on HEAD.
- detoast_patch.svg with the patch set up to 0003 and the external
TOAST pointer refactoring, where detoast_attr_slice() shows up.
- master_patch_diff.svg as the difference between both, with
difffolded.pl from [1].
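
For anyone wanting to reproduce this kind of comparison, here is a rough
sketch of a perf + FlameGraph workflow. It is not necessarily the exact
set of commands used for the attached graphs. The backend PID (which
can be found with pg_backend_pid() in the session running the query),
the sampling frequency, the intermediate file names and the location of
the FlameGraph checkout are all assumptions. Commands are only printed
unless DRY_RUN=0 is set.

```shell
#!/bin/sh
# Sketch only: PID, sampling rate, file names and FG path are assumptions.
# By default the commands are printed; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"
FG="${FG:-./FlameGraph}"   # assumed local checkout of the FlameGraph repo
PID="${PID:-12345}"        # hypothetical PID of the backend running the query

# Print a command line, and execute it unless in dry-run mode.
run() {
    echo "+ $*"
    [ "$DRY_RUN" = "1" ] || sh -c "$*"
}

# Record 10s of stack samples from the backend, fold them, and render.
profile() {
    label="$1"
    run "perf record -F 99 -g -p $PID -o $label.data -- sleep 10"
    run "perf script -i $label.data | $FG/stackcollapse-perf.pl > $label.folded"
    run "$FG/flamegraph.pl $label.folded > detoast_$label.svg"
}

profile master   # while the query loops on a HEAD build
profile patch    # while the query loops on a patched build

# Differential graph between the two folded profiles.
run "$FG/difffolded.pl master.folded patch.folded | $FG/flamegraph.pl > master_patch_diff.svg"
```

Each profile has to be taken against the matching build, so in practice
the two profile calls would be run separately, with PID pointing to the
backend of the relevant server.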

I don't see a difference in the time spent in these stacks, as most of
the run is spent retrieving the slices from the TOAST relation in
fetch_datum_slice(). Opinions and/or comments are welcome.

[1]: https://github.com/brendangregg/FlameGraph
--
Michael