From: Tomas Vondra
Subject: Re: Enhancing Memory Context Statistics Reporting
Msg-id: 7d4db5cf-8d72-4ee1-8f2e-f88ef9615f27@vondra.me
In response to: Re: Enhancing Memory Context Statistics Reporting (Alvaro Herrera <alvherre@alvh.no-ip.org>)
List: pgsql-hackers

Hi,

I took a quick look at the patch today. Overall, I think this would be
very useful; I've repeatedly needed to inspect why a backend uses so
much memory, and I've ended up triggering MemoryContextStats() from
gdb. This would be more convenient / safer. So +1 to the patch intent.


A couple review comments:

1) I read through the thread, and in general I agree with the reasoning
for removing the file part - it seems perfectly fine to just dump as
much as we can fit into a buffer, and then summarize the rest. But do we
need to invent a "new" limit here? The other places that log memory
contexts do something like this:

   MemoryContextStatsDetail(TopMemoryContext, 100, 100, false);

Which means we print only the top 100 memory contexts, and that's it.
Wouldn't that give us a reasonable memory limit too?
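
Something like this sketch, maybe - the emit_* helpers and the function
name are made up, just to illustrate reusing the same max_children cap
and rolling everything beyond it into one summary line, the way
MemoryContextStatsInternal() does:

  static void
  dump_stats_capped(MemoryContext context, int level, int max_children)
  {
      MemoryContext child;
      int           nchildren = 0;

      emit_context_stats(context, level);

      for (child = context->firstchild; child != NULL;
           child = child->nextchild)
      {
          if (nchildren++ < max_children)
              dump_stats_capped(child, level + 1, max_children);
      }

      /* roll the contexts beyond the cap into a single summary entry */
      if (nchildren > max_children)
          emit_children_summary(context, nchildren - max_children);
  }

That way the memory needed is bounded by the same knob we already use
for logging.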


2) I see the function got renamed to pg_get_process_memory_contexts(),
but the comment still says pg_get_remote_backend_memory_contexts().


3) I don't see any SGML docs for this new function. I was a bit unsure
what the "summary" argument is meant to do, and the comment doesn't
explain that either.


4) I wonder if the function needs to return PID. I mean, the caller
knows which PID it is for, so it seems rather unnecessary.


5) In the "summary" mode, it might be useful to include info about how
many child contexts were aggregated. It's useful to know whether there
was 1 child or 10000 children. In the regular (non-summary) mode it'd
always be "1", probably, but maybe it'd interact with the limit in (1).
Not sure.
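
For example, each returned entry could carry an explicit count next to
the usual counters - a sketch, with the struct and field names made up
(MemoryContextCounters is the existing struct from memnodes.h):

  typedef struct MemoryContextStatsEntry
  {
      MemoryContextCounters counters;       /* totalspace, nblocks, ... */
      int           num_agg_contexts;       /* contexts folded into this
                                             * entry; 1 in regular mode */
  } MemoryContextStatsEntry;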


6) I feel a bit uneasy about the whole locking / communication scheme.
In particular, I'm worried about lockups, deadlocks, etc. So I decided
to do a trivial stress-test - just run the new function through pgbench
with many clients.

The memstats.sql script does just this:

  SELECT * FROM pg_get_process_memory_contexts(
    (SELECT pid FROM pg_stat_activity
      WHERE pid != pg_backend_pid()
      ORDER BY random() LIMIT 1)
    , false);

where the inner query just picks a PID for some other backend, and asks
for memory context stats for that.

And just run it like this on a scale 1 pgbench database:

  pgbench -n -f memstats.sql -c 10 test

And it gets stuck *immediately*. I've seen it wait for other client
backends and for auxiliary processes like the autovacuum launcher.

This is on an absolutely idle system; there's no reason why a process
would not respond almost immediately. I wonder if e.g. the autovacuum
launcher may not be handling these requests, or whether client backends
can end up waiting on each other in a cycle. IIRC condition variables
are not covered by the deadlock detector, so that would be an issue. But
maybe I remember wrong?
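
If it really is the CV wait, maybe the requesting side should use a
bounded sleep and bail out, so the worst case is a WARNING rather than a
stuck backend. A rough sketch - memctx_cv, stats_ready() and the wait
event are made-up names, not from the patch:

  ConditionVariablePrepareToSleep(&memctx_cv);
  while (!stats_ready(target_pid))
  {
      /* returns true iff the timeout elapsed without a broadcast */
      if (ConditionVariableTimedSleep(&memctx_cv, 5000 /* ms */,
                                      WAIT_EVENT_MEM_CTX_PUBLISH))
      {
          ConditionVariableCancelSleep();
          ereport(WARNING,
                  (errmsg("timed out waiting for memory context stats from PID %d",
                          target_pid)));
          return;
      }
  }
  ConditionVariableCancelSleep();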


7) I've also seen this error:

  pgbench: error: client 6 script 0 aborted in command 0 query 0: \
  ERROR:  can't attach the same segment more than once

I haven't investigated it, but it seems like a problem with error
handling, where we fail to detach from a segment after a timeout. I may
be wrong, but it might be related to this:

  > I opted for DSAs over DSMs to enable memory reuse by freeing
  > segments for subsequent statistics copies of the same backend,
  > without needing to recreate DSMs for each request.

I feel like this might be a premature optimization - I don't have a
clear idea how expensive it is to create a DSM per request, but my
intuition is that it's cheaper than processing the contexts and
generating the info.

I'd just remove that, unless someone demonstrates it really matters. I
don't really worry about how expensive it is to process a request
(within reason, of course) - it will happen only very rarely. It's more
important to make sure there's no overhead when no one asks the backend
for memory context info, and to keep the mechanism simple.
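
To be concrete, the per-request lifecycle on the requesting side might
look like this sketch (the handle passing is hand-waved, and the
PG_FINALLY guarantees we always detach - which might also be the fix
for the "attach the same segment more than once" error above):

  /* requester, once the target backend has published its stats */
  dsm_segment *seg = dsm_attach(handle);

  PG_TRY();
  {
      /* ... read the statistics from dsm_segment_address(seg) ... */
  }
  PG_FINALLY();
  {
      /* detach even on error, so a later dsm_attach() in this backend
       * can't fail with "can't attach the same segment more than once" */
      dsm_detach(seg);
  }
  PG_END_TRY();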

Also, how expensive is it to just keep the DSA around "just in case"?
Imagine someone asks for the memory context info once - isn't it a
waste to still keep the DSA? I don't recall how many resources that
might pin.

I don't have a clear opinion on that, I'm more asking for opinions.


8) Two minutes seems pretty arbitrary, and also quite high. If a timeout
is necessary, I think it should not be hard-coded.
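
If we keep a timeout at all, it could be an ordinary GUC rather than a
constant - a sketch of a guc_tables.c entry, with the name and default
being placeholders:

  {
      {"memory_context_stats_timeout", PGC_USERSET, STATS_MONITORING,
          gettext_noop("Sets how long to wait for another process to "
                       "publish its memory context statistics."),
          NULL,
          GUC_UNIT_MS
      },
      &memory_context_stats_timeout,
      120000, 0, INT_MAX,
      NULL, NULL, NULL
  },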


regards

-- 
Tomas Vondra



