Re: Make MemoryContextMemAllocated() more precise - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: Make MemoryContextMemAllocated() more precise
Date
Msg-id 883362f227978fa09de9b89ed02667d3d3e61daf.camel@j-davis.com
Whole thread Raw
In response to Re: Make MemoryContextMemAllocated() more precise  (David Rowley <dgrowleyml@gmail.com>)
Responses Re: Make MemoryContextMemAllocated() more precise  (David Rowley <dgrowleyml@gmail.com>)
List pgsql-hackers
On Mon, 2020-04-06 at 23:39 +1200, David Rowley wrote:
> 1. The comment mentions "passthru", but you've removed that
> parameter.

Fixed, thank you.

> 2. I don't think MemoryContextCount is the best name for this
> function. When I saw:

I've gone back and forth on naming a bit. The right name, in my
opinion, is MemoryContextStats(), but that's taken by something that
should be called MemoryContextReport(). But I didn't want to change
that as it would probably annoy a lot of people who are used to calling
MemoryContextStats() from gdb.

I changed the new function to be called MemoryContextGetCounters(),
which is more directly what it's doing. "Telemetry" makes me think more
of a stream of information rather than a particular point in time.

> 3. I feel like it would be nicer if you didn't change the "count"
> methods to return a MemoryContextCounters. It means you may need to
> zero a struct for each level, assign the values, then add them to the
> total.  If you were just to zero the struct in MemoryContextCount()
> then pass it into the count function, then you could just have it do
> all the += work. It would reduce the code in MemoryContextCount()
> too.

I changed it to use a pointer out parameter, but I don't think an
in/out parameter is quite right there. MemoryContextStats() ends up
using both the per-context counters as well as the totals, so it's not
helpful to return just the totals.

> 4. Do you think it would be better to have two separate functions for
> MemoryContextCount(), a recursive version and a non-recursive
> version.

I could, but right now the only caller passes recurse=true, so I'm not
really eliminating any code in that path by specializing the functions.
Are you thinking about performance or you just think it would be better
to have two entry points?

> 5. For performance testing, I tried using the following table with
> 1MB
> work_mem then again with 1GB work_mem.  I wondered if making the
> accounting more complex would cause a slowdown in nodeAgg.c, as I see
> we're calling this function each time we get a tuple that belongs in
> a
> new group. The 1MB test is likely not such a great way to measure
> this
> since we do spill to disk in that case and the changing in accounting
> means we do that at a slightly different time, but the 1GB test does
> not spill and it's a bit slower.

I think this was because I was previously returning a struct. By just
passing a pointer as an out param, it seems to have mitigated it, but
not completely eliminated the cost (< 2% in my tests). I was using an
OFFSET 100000000 instead of EXPLAIN ANALYZE in my test measurements
because it was less noisy, and I focused on the 1GB test for the reason
you mention.

I also addressed Andres's comments:

* changed the name of the flags from MCXT_STAT to MCXT_COUNT
* changed ((1<<6)-1) to ~0
* removed unnecessary branches from the GetCounters method
* expanded some comments

Regards,
    Jeff Davis


Attachment

pgsql-hackers by date:

Previous
From: Anna Akenteva
Date:
Subject: Re: [HACKERS] make async slave to wait for lsn to be replayed
Next
From: Amit Kapila
Date:
Subject: Re: User Interface for WAL usage data