Custom explain options - Mailing list pgsql-hackers

From Konstantin Knizhnik
Subject Custom explain options
Date
Msg-id 6cd5caa7-06e1-4460-bf35-00a59da3f677@garret.ru
Whole thread Raw
Responses Re: Custom explain options
Re: Custom explain options
Re: Custom explain options
Re: Custom explain options
List pgsql-hackers

Hi hackers,

EXPLAIN statement has a list of options (i.e. ANALYZE, BUFFERS, COST,...) which help to provide useful details of query execution.
In Neon we have added PREFETCH option which shows information about page prefetching during query execution (prefetching is more critical for Neon
architecture because of separation of compute and storage, so it is implemented not only for bitmap heap scan as in Vanilla Postgres, but also for seqscan, indexscan and indexonly scan). Another possible candidate  for explain options is local file cache (extra caching layer above shared buffers which is used to somehow replace file system cache in standalone Postgres).

I think that it will be nice to have a generic mechanism which allows extensions to add its own options to EXPLAIN.
I have attached the patch with implementation of such mechanism (also available as PR:  https://github.com/knizhnik/postgres/pull/1 )

I have demonstrated this mechanism using Bloom extension - just to report number of Bloom matches.

Not sure that it is really useful information but it is used mostly as example:

explain (analyze,bloom) select * from t where pk=2000;                                                       QUERY PLAN                                                        
------------------------------------------------------------------------------------------------------------------------- Bitmap Heap Scan on t  (cost=15348.00..15352.01 rows=1 width=4) (actual time=25.244..25.939 rows=1 loops=1)   Recheck Cond: (pk = 2000)   Rows Removed by Index Recheck: 292   Heap Blocks: exact=283   Bloom: matches=293   ->  Bitmap Index Scan on t_pk_idx  (cost=0.00..15348.00 rows=1 width=0) (actual time=25.147..25.147 rows=293 loops=1)         Index Cond: (pk = 2000)         Bloom: matches=293 Planning:   Bloom: matches=0 Planning Time: 0.387 ms Execution Time: 26.053 ms
(12 rows)

There are two known issues with this proposal:

1. I have to limit total size of all custom metrics - right now it is limited by 128 bytes. It is done to keep Instrumentation and some other data structures fixes size. Otherwise maintaining varying parts of this structure is ugly, especially in shared memory

2. Custom extension is added by means of RegisterCustomInsrumentation function which is called from _PG_init
But _PG_init is called when extension is loaded and it is loaded on demand when some of extension functions is called (except when extension is included
in shared_preload_libraries list), Bloom extension doesn't require it. So if your first statement executed in your session is:
 explain (analyze,bloom) select * from t where pk=2000;
...you will get error:
ERROR:  unrecognized EXPLAIN option "bloom"
LINE 1: explain (analyze,bloom) select * from t where pk=2000;
It happens because at the moment when explain statement parses options, Bloom index is not yet selected and so bloom extension is not loaded and RegisterCustomInsrumentation is not yet called. If we repeat the query, then proper result will be displayed (see above).

Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Remove last traces of HPPA support
Next
From: Gurjeet Singh
Date:
Subject: Re: Remove extraneous break condition in logical slot advance function