Re: Add tracking of backend memory allocated to pg_stat_activity - Mailing list pgsql-hackers

From Reid Thompson
Subject Re: Add tracking of backend memory allocated to pg_stat_activity
Date
Msg-id 403cf0b8e25609393789509dcbd4d1acf970555c.camel@crunchydata.com
Whole thread Raw
In response to Re: Add tracking of backend memory allocated to pg_stat_activity  (Andres Freund <andres@anarazel.de>)
Responses Re: Add tracking of backend memory allocated to pg_stat_activity
Re: Add tracking of backend memory allocated to pg_stat_activity
List pgsql-hackers
Hi Andres,
Thanks for looking at this and for the feedback. Responses inline below.

On Fri, 2022-11-04 at 19:41 -0700, Andres Freund wrote:
> Hi,
>
> On 2022-11-04 08:56:13 -0400, Reid Thompson wrote:
>
> I'm not convinced that counting DSM values this way is quite right.
> There are a few uses of DSMs that track shared resources, with the biggest
> likely being the stats for relations etc.  I suspect that tracking that via
> backend memory usage will be quite confusing, because fairly random backends that
> had to grow the shared state end up being blamed with the memory usage in
> perpituity - and then suddenly that memory usage will vanish when that backend exits,
> despite the memory continuing to exist.

Ok, I'll make an attempt to identify these allocations and manage them
elsewhere.

>
>
> > @@ -734,6 +747,7 @@ AllocSetAlloc(MemoryContext context, Size size)
> >                         return NULL;
> >  
> >                 context->mem_allocated += blksize;
> > +               pgstat_report_backend_allocated_bytes_increase(blksize);
> >  
> >                 block->aset = set;
> >                 block->freeptr = block->endptr = ((char *) block) + blksize;
> > @@ -944,6 +958,7 @@ AllocSetAlloc(MemoryContext context, Size size)
> >                         return NULL;
> >  
> >                 context->mem_allocated += blksize;
> > +               pgstat_report_backend_allocated_bytes_increase(blksize);
> >  
> >                 block->aset = set;
> >                 block->freeptr = ((char *) block) + ALLOC_BLOCKHDRSZ;
> > @@ -1043,6 +1058,7 @@ AllocSetFree(void *pointer)
> >                         block->next->prev = block->prev;
> >  
> >                 set->header.mem_allocated -= block->endptr - ((char *) block);
> > +               pgstat_report_backend_allocated_bytes_decrease(block->endptr - ((char *) block));
> >  
> >  #ifdef CLOBBER_FREED_MEMORY
> >                 wipe_mem(block, block->freeptr - ((char *) block));
>
> I suspect this will be noticable cost-wise. Even though these paths aren't the
> hottest memory allocation paths, by nature of going down into malloc, adding
> an external function call that then does a bunch of branching etc. seems
> likely to add up to some.  See below for how I think we can deal with
> that...
>
>
> > +
> > +/* --------
> > + * pgstat_report_backend_allocated_bytes_increase() -
> > + *
> > + * Called to report increase in memory allocated for this backend
> > + * --------
> > + */
> > +void
> > +pgstat_report_backend_allocated_bytes_increase(uint64 allocation)
> > +{
> > +       volatile PgBackendStatus *beentry = MyBEEntry;
> > +
> > +       if (!beentry || !pgstat_track_activities)
> > +       {
> > +               /*
> > +                * Account for memory before pgstats is initialized. This will be
> > +                * migrated to pgstats on initialization.
> > +                */
> > +               backend_allocated_bytes += allocation;
> > +
> > +               return;
> > +       }
> > +
> > +       /*
> > +        * Update my status entry, following the protocol of bumping
> > +        * st_changecount before and after.  We use a volatile pointer here to
> > +        * ensure the compiler doesn't try to get cute.
> > +        */
> > +       PGSTAT_BEGIN_WRITE_ACTIVITY(beentry);
> > +       beentry->backend_allocated_bytes += allocation;
> > +       PGSTAT_END_WRITE_ACTIVITY(beentry);
> > +}
>
> This is quite a few branches, including write/read barriers.
>
> It doesn't really make sense to use the PGSTAT_BEGIN_WRITE_ACTIVITY() pattern
> here - you're just updating a single value, there's nothing to be gained by
> it. The point of PGSTAT_BEGIN_*ACTIVITY() stuff is to allow readers to get a
> consistent view of multiple values - but there aren't multiple values
> here!

I'll remove the barriers - initially I copied how prior functions were
coded as my template ala
pgstat_report_query_id, pgstat_report_xact_timestamp.

>
> To avoid the overhead of checking (!beentry || !pgstat_track_activities) and
> the external function call, I think you'd be best off copying the trickery I
> introduced for pgstat_report_wait_start(), in 225a22b19ed.
>
> I.e. make pgstat_report_backend_allocated_bytes_increase() a static inline
> function that unconditionally updates something like
> *my_backend_allocated_memory.  To deal with the case of (!beentry ||
> !pgstat_track_activities), that variable initially points to some backend
> local state and is set to the shared state in pgstat_bestart().
>
> This additionally has the nice benefit that you can track memory usage from
> before pgstat_bestart(), it'll be in the local variable.

OK, I think I can mimic the code you reference.

>
> > +void
> > +pgstat_report_backend_allocated_bytes_decrease(uint64
> > deallocation)
> > +{
> > +       volatile PgBackendStatus *beentry = MyBEEntry;
> > +
> > +       /*
> > +        * Cases may occur where shared memory from a previous postmaster
> > +        * invocation still exist. These are cleaned up at startup by
> > +        * dsm_cleanup_using_control_segment. Limit decreasing memory allocated to
> > +        * zero in case no corresponding prior increase exists or decrease has
> > +        * already been accounted for.
> > +        */
>
> I don't really follow - postmaster won't ever have a backend status
> array, so how would they be tracked here?

On startup, a check is made for leftover dsm control segments in the
DataDir. It appears possible that in certain situations on startup we
may find and destroy stale segments and thus decrement the allocation
variable.

I based this off of:
/ipc/dsm.c

dsm_postmaster_startup:
 150 dsm_postmaster_startup(PGShmemHeader *shim)
 {
...snip...
 158     /*
                 
 159     ¦* If we're using the mmap implementations, clean up any leftovers.
                 
 160     ¦* Cleanup isn't needed on Windows, and happens earlier in startup for
                 
 161     ¦* POSIX and System V shared memory, via a direct call to
                 
 162     ¦* dsm_cleanup_using_control_segment.
                 
 163     ¦*/
...snip... }


dsm_cleanup_using_control_segment:
 206 /*
                 
 207  * Determine whether the control segment from the previous postmaster
                 
 208  * invocation still exists.  If so, remove the dynamic shared memory
                 
 209  * segments to which it refers, and then the control segment itself.
                 
 210  */
                 
 211 void
                 
 212 dsm_cleanup_using_control_segment(dsm_handle old_control_handle)
                 
 213 {
 ...snip...
 270         /* Destroy the referenced segment. */
                 
 271         dsm_impl_op(DSM_OP_DESTROY, handle, 0, &junk_impl_private,
                 
 272                     &junk_mapped_address, &junk_mapped_size, LOG);
                 
 273     }
                 
 274
                 
 275     /* Destroy the old control segment, too. */
                 
 276     elog(DEBUG2,
                 
 277         ¦"cleaning up dynamic shared memory control segment with ID %u",
                 
 278         ¦old_control_handle);
                 
 279     dsm_impl_op(DSM_OP_DESTROY, old_control_handle, 0, &impl_private,
                 
 280                 &mapped_address, &mapped_size, LOG);
 281 }

>
> Greetings,
>
> Andres Freund

Thanks again,
Reid

--
Reid Thompson
Senior Software Engineer
Crunchy Data, Inc.

reid.thompson@crunchydata.com
www.crunchydata.com





pgsql-hackers by date:

Previous
From: Bharath Rupireddy
Date:
Subject: Re: [PATCH] Teach pg_waldump to extract FPIs from the WAL
Next
From: Aleksander Alekseev
Date:
Subject: Re: HOT chain validation in verify_heapam()