Re: Detoasting optionally to make Explain-Analyze less misleading - Mailing list pgsql-hackers

From stepan rutz
Subject Re: Detoasting optionally to make Explain-Analyze less misleading
Date
Msg-id 9126b7da-204b-4701-8e2b-37fb824a0cf4@gmx.de
Whole thread Raw
In response to Re: Detoasting optionally to make Explain-Analyze less misleading  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: Detoasting optionally to make Explain-Analyze less misleading
List pgsql-hackers
Hi Thomas,

you are right of course. Thanks!

I have attached a new version of the patch that supports the syntax like
suggested. The previous patch was insonsistent in style indeed.

explain (analyze, serialize)

and

explain (analyze, serialize binary)

That doesn't make too much of a difference for most scenarios I am
certain. However the the seralize option itself does. Mostly because it
performs the detoasting and that was a trap for me in the past with just
plain analyze.


Eg this scenario really is not too far fetched in a world where people
have large JSONB values.


db1=# create table test(id bigint, val text);

db1=# insert into test(val) select string_agg(s::text, ',') from (select
generate_series(1, 10_000_000) as s) as a1;

now we have a cell that has roughly 80Mb in it. A large detoasting that
will happen in reallife but in explain(analyze).

and then...

db1=# explain (analyze) select * from test;
                                             QUERY PLAN
---------------------------------------------------------------------------------------------------
  Seq Scan on test  (cost=0.00..22.00 rows=1200 width=40) (actual
time=0.018..0.020 rows=1 loops=1)
  Planning Time: 0.085 ms
  Execution Time: 0.044 ms
(3 rows)

db1=# explain (analyze, serialize) select * from test;
                                             QUERY PLAN
---------------------------------------------------------------------------------------------------
  Seq Scan on test  (cost=0.00..22.00 rows=1200 width=40) (actual
time=0.023..0.027 rows=1 loops=1)
  Planning Time: 0.077 ms
  Execution Time: 303.281 ms
  Serialized Bytes: 78888953 Bytes. Mode Text. Bandwidth 248.068 MB/sec
(4 rows)

db1=#

So the explain(analyze) does not process the ~80 MB in 0.044ms in any
way of course.

Actually I could print the serialized bytes using 1. grouping-separators
(eg 78_888_953) or 2. in the way pg_size_pretty does it.

If doing it the pg_size_pretty way I am uncertain if it would be ok to
query the actual pg_size_pretty function via its (certainly frozen) oid
of 3166 and do OidFunctionCall1(3166...) to invoke it. Otherwise I'd say
it would be nice if the code from that function would be made available
as a utility function for all c-code.  Any suggestions on this topic?

Regards,

/Stepan


On 02.11.23 18:49, Tomas Vondra wrote:
> Hi,
>
> On 9/15/23 22:09, stepan rutz wrote:
>> Hi,
>>
>> please see a revised version yesterday's mail. The patch attached now
>> provides the following:
>>
>> EXPLAIN(ANALYZE,SERIALIZE)
>>
>> and
>>
>> EXPLAIN(ANALYZE,SERIALIZEBINARY)
>>
> I haven't looked at the patch in detail yet, but this option name looks
> a bit strange/inconsistent. Either it should be SERIALIZE_BINARY (to
> match other multi-word options), or maybe there should be just SERIALIZE
> with a parameter to determine text/binary (like FORMAT, for example).
>
> So we'd do either
>
>      EXPLAIN (SERIALIZE)
>      EXPLAIN (SERIALIZE TEXT)
>
> to get serialization to text (which I guess 99% of people will do), or
>
>      EXPLAIN (SERIALIZE BINARY)
>
> to get binary.
>
>
> regards
>

Attachment

pgsql-hackers by date:

Previous
From: John Morris
Date:
Subject: Re: Including Doxyfile and Meson script for docs into main source tree
Next
From: Bharath Rupireddy
Date:
Subject: Re: Add new option 'all' to pg_stat_reset_shared()