Re: Parallel Bitmap Heap Scan reports per-worker stats in EXPLAIN ANALYZE - Mailing list pgsql-hackers

From David Geier
Subject Re: Parallel Bitmap Heap Scan reports per-worker stats in EXPLAIN ANALYZE
Date
Msg-id 290b9d46-5ba3-613f-a51f-4a7886878242@gmail.com
Whole thread Raw
In response to Re: Parallel Bitmap Heap Scan reports per-worker stats in EXPLAIN ANALYZE  (Dmitry Dolgov <9erthalion6@gmail.com>)
Responses Re: Parallel Bitmap Heap Scan reports per-worker stats in EXPLAIN ANALYZE  (Dmitry Dolgov <9erthalion6@gmail.com>)
List pgsql-hackers
Hi Dmitry,

Thanks for looking at the patch and sorry for the long line.

On 3/17/23 21:14, Dmitry Dolgov wrote:
> The idea sounds reasonable to me, but I have to admit snapshot_and_stats
> implementation looks awkward. Maybe it would be better to have a
> separate structure field for both stats and snapshot, which will be set
> to point to a corresponding place in the shared FAM e.g. when the worker
> is getting initialized? shm_toc_allocate mentions BUFFERALIGN to handle
> possibility of some atomic operations needing it, so I guess that would
> have to be an alignment in this case as well.
>
> Probably another option would be to allocate two separate pieces of
> shared memory, which resolves questions like proper alignment, but
> annoyingly will require an extra lookup and a new key.

I considered the other options and it seems to me none of them is 
particularly superior. All of them have pros and cons with the cons 
mostly outweighing the pros. Let me quickly elaborate:

1. Use multiple shm_toc entries: Shared state is split into multiple 
pieces. Extra pointers in BitmapHeapScanState needed to point at the 
split out data. BitmapHeapScanState has already a shared_info member, 
which is not a pointer to the shared memory but a pointer to the leader 
local data allocated used to store the instrumentation data from the 
workers. This is confusing but at least consistent with how its done in 
other places (e.g. execSort.c, nodeHash.c, nodeIncrementalSort.c). 
Having another pointer there which points to the shared memory makes it 
even more confusing. If we go this way we would have e.g. 
shared_info_copy and shared_info members in BitmapHeapScanState.

2. Store two extra pointers to the shared FAM entries in 
BitmapHeapScanState: IMHO, that is the better alternative of (1) as it 
doesn't need an extra TOC entry but comes with the same confusion of 
multiple pointers to SharedBitmapHeapScanInfo in BitmapHeapScanState. 
But maybe that's not too bad?

3. Solution in initial patch (use two functions to obtain pointers 
where/when needed): Avoids the need for another pointer in 
BitmapHeapScanState at the cost / ugliness of having to call the helper 
functions.

Another, not yet discussed, option I can see work is:

4. Allocate a fixed amount of memory for the instrumentation stats based 
on MAX_PARALLEL_WORKER_LIMIT: MAX_PARALLEL_WORKER_LIMIT is 1024 and used 
as the limit of the max_parallel_workers GUC. This way 
MAX_PARALLEL_WORKER_LIMIT * sizeof(BitmapHeapScanInstrumentation) = 1024 
* 8 = 8192 bytes would be allocated. To cut this down in half we could 
additionally change the type of lossy_pages and exact_pages from long to 
uint32. Only possibly needed memory would have to get initialized, the 
remaining unused memory would remain untouched to not waste cycles.

My first preference is the new option (4). My second preference is 
option (1). What's your take?

-- 
David Geier
(ServiceNow)




pgsql-hackers by date:

Previous
From: Ashutosh Bapat
Date:
Subject: Re: Infinite Interval
Next
From: Ashutosh Bapat
Date:
Subject: Re: Infinite Interval