Re: Get memory contexts of an arbitrary backend process - Mailing list pgsql-hackers

From torikoshia
Subject Re: Get memory contexts of an arbitrary backend process
Date
Msg-id 0b0657d5febd0e46565a6bc9c62ba3f6@oss.nttdata.com
Whole thread Raw
In response to Re: Get memory contexts of an arbitrary backend process  (Kyotaro Horiguchi <horikyota.ntt@gmail.com>)
Responses Re: Get memory contexts of an arbitrary backend process  (Fujii Masao <masao.fujii@oss.nttdata.com>)
List pgsql-hackers
On 2021-03-23 17:24, Kyotaro Horiguchi wrote:

Thanks for reviewing and suggestions!

> At Mon, 22 Mar 2021 15:09:58 +0900, torikoshia
> <torikoshia@oss.nttdata.com> wrote in
>> >> If MemoryContextStatsPrint(), i.e. showing 100 children at most is
>> >> enough, this hard limit may be acceptable.
>> > Can't this number be passed via shared memory?
>> 
>> The attached patch uses static shared memory to pass the number.
> 
> "pg_print_backend_memory_contexts"
> 
> That name looks like as if it returns the result as text when used on
> command-line. We could have pg_get_backend_memory_context(bool
> dump_to_log (or where to dump), int limit).  Or couldn't we name it
> differently even in the ase we add a separate function?

Redefined pg_get_backend_memory_contexts() as
pg_get_backend_memory_contexts(pid, int max_children).

When pid equals 0, pg_get_backend_memory_contexts() prints local memory
contexts as original pg_get_backend_memory_contexts() does.
In this case, 'max_children' is ignored.

When 'pid' does not equal 0 and it is the PID of the client backend,
  memory contexts are logged through elog().

> 
> +/*
> + * MaxChildrenPerContext
> + *           Max number of children to print per one parent context.
> + */
> +int  *MaxChildrenPerContext = NULL;
> 
> Perhaps it'd be better to have a struct even if it consists only of
> one member.  (Aligned) C-int values are atomic so we can omit the
> McxtPrintLock. (I don't think it's a problem even if it is modifed
> while reading^^:)

Fixed them.

> +     if(max_children <= 0)
> +     {
> +             ereport(WARNING,
> +                             (errmsg("%d is invalid value", 
> max_children),
> +                              errhint("second parameter is the number 
> of context and it must
> be set to a value greater than or equal to 1")));
> 
> It's annoying to choose a number large enough when I want to dump
> children unlimitedly.  Couldn't we use 0 to specify "unlimited"?

Modified as you suggested.

> +                             (errmsg("%d is invalid value", 
> max_children),
> +                              errhint("second parameter is the number 
> of context and it must
> be set to a value greater than or equal to 1")));
> 
> For the main message, (I think) we usually spell the "%d is invalid
> value" as "maximum number of children must be positive" or such.  For
> the hint, we don't need a copy of the primary section of the
> documentation here.

Modified it to "The maximum number of children must be greater than 0".

> 
> I think we should ERROR out for invalid parameters, at least for
> max_children.  I'm not sure about pid since we might call it based on
> pg_stat_activity..

Changed to ERROR out when the 'max_children' is less than 0.

Regarding pid, I left it untouched considering the consistency with 
other
signal sending functions such as pg_cancel_backend().

> +     if(!SendProcSignal(pid, PROCSIG_PRINT_MEMORY_CONTEXT, 
> InvalidBackendId))
> 
> We know the backendid of the process here.

Added it.

> 
> +                     if (is_dst_stderr)
> +                     {
> +                             for (i = 0; i <= level; i++)
> +                                     fprintf(stderr, "  ");
> 
> The fprintf path is used nowhere in the patch at all. It can be used
> while attaching debugger but I'm not sure we need that code.  The
> footprint of this patch is largely shrinked by removing it.

According to the past discussion[1], people wanted MemoryContextStats
as it was, so I think it's better that MemoryContextStats can be used
as before.

> 
> +             strcat(truncated_ident, delimiter);
> 
> strcpy is sufficient here.  And we don't need the delimiter to be a
> variable.  (we can copy a string literal into truncate_ident, then
> count the length of truncate_ident, instead of the delimiter
> variable.)

True.

> +             $current_logfiles = slurp_file($node->data_dir . 
> '/current_logfiles');
> ...
> +my $lfname = $current_logfiles;
> +$lfname =~ s/^stderr //;
> +chomp $lfname;
> 
> $node->logfile is the current log file name.
> 
> +     'target PID is not PostgreSQL server process');
> 
> Maybe "check if PID check is working" or such?  And, we can do
> something like the following to exercise in a more practical way.
> 
>  select pg_print_backend...(pid,) from pg_stat_activity where
> backend_type = 'checkpointer';

It seems better.

>> As documented, the current implementation allows that when multiple
>> pg_print_backend_memory_contexts() called in succession or
>> simultaneously, max_children can be the one of another
>> pg_print_backend_memory_contexts().
>> I had tried to avoid this by adding some state information and using
>> before_shmem_exit() in case of process termination for cleaning up the
>> state information as in the patch I presented earlier, but since
>> kill()
>> returns success before the dumper called signal handler, it seemed
>> there were times when we couldn't clean up the state.
>> Since this happens only when multiple
>> pg_print_backend_memory_contexts()
>> are called and their specified number of children are different, and
>> the
>> effect is just the not intended number of children to print, it might
>> be
>> acceptable.
> 
> I see it as a non-issue. Even though the behavior looks somewhat
> strange, that usage is stranger than the behavior.

Thanks for your comments!

>> Or it might be better to wait for some seconds if num_chilren on
>> shared
>> memory is not the initialized value(meaning some other process is
>> requesting to print memory contexts).
>> 
>> >> Only superusers can call pg_print_backend_memory_contexts().
>> > +  /* Only allow superusers to signal superuser-owned backends. */
>> > +  if (superuser_arg(proc->roleId) && !superuser())
>> > The patch seems to allow even non-superuser to request to print the
>> > memory
>> > contexts if the target backend is owned by non-superuser. Is this
>> > intentional?
>> > I think that only superuser should be allowed to execute
>> > pg_print_backend_memory_contexts() whoever owns the target backend.
>> > Because that function can cause lots of log messages.
>> 
>> Thanks, it's not intentional, modified it.
> 
> By the way we can send a procsig to other than a client backend. And
> most of the postgres processes are using the standard signal handler
> and intializes the procsig facility. So some of such kind of processes
> can dump the memory context information as-is. Otherwise we can add
> CHECK_FOR_INTERRUPT to appropriate place to allow that.  I'm not sure
> how it is useful for other kind of processes, but it might be useful
> for autovacuum workers.

Yeah, I also think it's possible to get memory contexts other than the
client backend.
But I'm not sure whether people want to do it...


[1] https://www.postgresql.org/message-id/906.1513707472%40sss.pgh.pa.us

regards.
Attachment

pgsql-hackers by date:

Previous
From: Robert Haas
Date:
Subject: Re: [HACKERS] Custom compression methods
Next
From: Julien Rouhaud
Date:
Subject: Re: Feature improvement: can we add queryId for pg_catalog.pg_stat_activity view?