Re: pg_stat_io_histogram - Mailing list pgsql-hackers
| From | Jakub Wartak |
|---|---|
| Subject | Re: pg_stat_io_histogram |
| Date | |
| Msg-id | CAKZiRmx3Z+fxunEc4CDGrpZVFwD4MAEDfGVM_v-CW-=xTLeF_Q@mail.gmail.com Whole thread |
| In response to | Re: pg_stat_io_histogram (Jakub Wartak <jakub.wartak@enterprisedb.com>) |
| List | pgsql-hackers |
On Thu, Mar 19, 2026 at 11:16 AM Jakub Wartak <jakub.wartak@enterprisedb.com> wrote: > > On Wed, Mar 18, 2026 at 2:29 PM Jakub Wartak > <jakub.wartak@enterprisedb.com> wrote: > > > > On Tue, Mar 17, 2026 at 3:17 PM Andres Freund <andres@anarazel.de> wrote: > > > On 2026-03-17 13:13:59 +0100, Jakub Wartak wrote: > > > > 1. Concerns about memory use. With v7 I had couple of ideas, and with those > > > > the memory use is really minimized as long as the code is still simple > > > > (so nothing fancy, just some ideas to trim stuff and dynamically allocate > > > > memory). I hope those reduce memory footprint to acceptable levels, see my > > > > earlier description for v7. > > > > > > Personally I unfortunately continue to think that storing lots of values that > > > are never anything but zero isn't a good idea once you have more than a > > > handful of kB. Storing pointless data is something different than increasing > > > memory usage with actual information. > > > > > > I still think you should just count the number of histograms needed, have an > > > array [object][context][op] with the associated histogram "offset" and then > > > increment the associated offset. It'll add an indirection at count time, but > > > no additional branches. > > > > Great idea, thanks, I haven't thought about that! Attached v9 attempts to do > > that for pending backend I/O struct, which minimizes the (backend) memory > > footprint for client backends to just about ~5kB. > > > > I have been pulling my hair trying to achieve the same for shared-memory, but I > > have failed to do that w/o sinking into complexity [..] > > OK, I've made it done too with indirect offset on shared memory, it wasn't easy > at least for me, but now we have two approaches/patchsets: > [..] > v9b: with more code and build complexity but that should address concern of not > used memory > > 'Shared Memory Stats' allocated size: > master - uses ~308kB for shm > v9a-000[12]: 578kB shm > v9a-000[123]: 507kB shm > v9a-000[1234]: 471kB shm (+~163kB more) > > v9b-000[123]: 361kB shm > > v9a-000[12] are identical to v9b-00[12], but included just for > patchset completeness. > > In v9b meson/autoconf (for adding pgstat_io_genstats) build most of > the time what > they need, but probably that needs some cleanups and better dependency > tracking. I'm > not sure about correctnes of those changes as especially > autoconf/Makefile is a lot > like brainf**k to me and that area would need some help... > > I think now we could even increase max resolution of buckets to cover > max those maximum > of 32s+ (at the cost of one extra 64-byte cacheline for pending IO > stats, so go with > PGSTAT_IO_HIST_BUCKETS from 16 to 24) Good morning all, Ok here comes v10, which is bit like earlier v9b (so has reduced shared memory footprint using Yours idea about indirect offsets idea), but now with shm memory sized and allocated on startup by postmaster. There are 3 patches: - 0001, one to introduce view and bucketting, no changes since quite some time - 0002, saves some private (backend) memory - 0003, main meat, saving shared memory (main problem raised earlier), now switched to simply dynamically size shared memory based on those pgstat_track_io*() logic The problem with the 0003 earlier was that I wanted to absolutley avoid further complexiy/alterations in struct PgStat_IO related to dynamic shared memory allocation for hist_time_buckets_slots[PGSTAT_IO_HIST_BUCKET_SLOTS] [PGSTAT_IO_HIST_BUCKETS] (I was afraid to touch that shm code, it looks complex), so I had to come out with something that would tell us how many slots (PGSTAT_IO_HIST_BUCKET_SLOTS) we need, I wish we had C++'s `constexpr` that would do all of that. I've tried three aproaches (like in v9b but that hit some serious cross-compiling obstacles, also had perl doing that, but that had lots of code duplication), so in the end I had to alter the pgstat_io shm allocation which is now in 0003. Summary of changes in 0003 since v9b / earlier post: - Fixed potential race condition (touch via memset/memcpy() only histogram slots under LWLock) - Fixed/removed the PGSTAT_IO_HIST_BUCKET_SLOTS macro - Removed pgstat_io_genslots.c (first idea, above) and abandonded attempt to fixup some cross compilation woes on MSVC/mingw - Bumped PGSTAT_FILE_FORMAT_ID - Move/optimize pending_off in pgstat_io_flush_cb out of hot loop - Document that hist_time_buckets_offsets should be the last member of PgStat_BktypeIO - Be defensive - added some asserts() - Adjust _bucket_offsets from uint64 to just int to save memory (offsets are low numbers) - and finally moved to dynamic shm allocation of PgStat_IO stuff during startup At the end of the day, I'll squeze 000[123] into just one, but wanted to ease the review first a bit. Of course this is material for PG20. -J.
Attachment
pgsql-hackers by date: