Re: Trim the heap free memory - Mailing list pgsql-hackers
From | shawn wang |
---|---|
Subject | Re: Trim the heap free memory |
Date | |
Msg-id | CA+T=_GVcz2=vcFdms1S5qm69+znreUgMhHnyvj6XXFNOEYdBbQ@mail.gmail.com Whole thread Raw |
In response to | Re: Trim the heap free memory (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
Thank you very much for your response and suggestions.
As you mentioned, the patch here is actually designed for glibc's ptmalloc2 andis not applicable to other platforms. I will consider supporting it only on the Linux platform in the future. In the memory management strategy of ptmalloc2, there is a certain amount of non-garbage-collected memory, which is closely related to the order and method of memory allocation and release. To reduce the performance overhead caused by frequent allocation and release of small blocks of memory, ptmalloc2 intentionally retains this part of the memory. The malloc_trim function locks, traverses memory blocks, and uses madvise to release this part of the memory, but this process may also have a negative impact on performance. In the process of exploring solutions, I also considered a variety of strategies, including scheduling malloc_trim to be executed at regular intervals or triggering malloc_trim after a specific number of free operations. However, we found that these methods are not optimal solutions.
We can see that out of about 43K test queries, 32K saved nothing
whatever, and in only four was more than a couple of meg saved.
That's pretty discouraging IMO. It might be useful to look closer
at the behavior of those top four though. I see them as
In addition, I have considered the following optimization strategies:
Adjust the configuration of ptmalloc2 through the mallopt function to use mmap rather than sbrk for memory allocation. This can immediately return the memory to the operating system when it is released, but it may affect performance due to the higher overhead of mmap.
Use other memory allocators such as jemalloc or tcmalloc, and adjust relevant parameters to reduce the generation of non-garbage-collected memory. However, these allocators are designed for multi-threaded and may lead to increased memory usage per process.
Build a set of memory context (memory context) allocation functions based on mmap, delegating the responsibility of memory management entirely to the database level. Although this solution can effectively control memory allocation, it requires a large-scale engineering implementation.
I look forward to further discussing these solutions with you and exploring the best memory management practices together.
Best regards, Shawn
I wrote:
> The single test case you showed suggested that maybe we could
> usefully prod glibc to free memory at query completion, but we
> don't need all this interrupt infrastructure to do that. I think
> we could likely get 95% of the benefit with about a five-line
> patch.
To try to quantify that a little, I wrote a very quick-n-dirty
patch to apply malloc_trim during finish_xact_command and log
the effects. (I am not asserting this is the best place to
call malloc_trim; it's just one plausible possibility.) Patch
attached, as well as statistics collected from a run of the
core regression tests followed by
grep malloc_trim postmaster.log | sed 's/.*LOG:/LOG:/' | sort -k4n | uniq -c >trim_savings.txt
We can see that out of about 43K test queries, 32K saved nothing
whatever, and in only four was more than a couple of meg saved.
That's pretty discouraging IMO. It might be useful to look closer
at the behavior of those top four though. I see them as
2024-09-15 14:58:06.146 EDT [960138] LOG: malloc_trim saved 7228 kB
2024-09-15 14:58:06.146 EDT [960138] STATEMENT: ALTER TABLE delete_test_table ADD PRIMARY KEY (a,b,c,d);
2024-09-15 14:58:09.861 EDT [960949] LOG: malloc_trim saved 12488 kB
2024-09-15 14:58:09.861 EDT [960949] STATEMENT: with recursive search_graph(f, t, label, is_cycle, path) as (
select *, false, array[row(g.f, g.t)] from graph g
union distinct
select g.*, row(g.f, g.t) = any(path), path || row(g.f, g.t)
from graph g, search_graph sg
where g.f = sg.t and not is_cycle
)
select * from search_graph;
2024-09-15 14:58:09.866 EDT [960949] LOG: malloc_trim saved 12488 kB
2024-09-15 14:58:09.866 EDT [960949] STATEMENT: with recursive search_graph(f, t, label) as (
select * from graph g
union distinct
select g.*
from graph g, search_graph sg
where g.f = sg.t
) cycle f, t set is_cycle to 'Y' default 'N' using path
select * from search_graph;
2024-09-15 14:58:09.853 EDT [960949] LOG: malloc_trim saved 12616 kB
2024-09-15 14:58:09.853 EDT [960949] STATEMENT: with recursive search_graph(f, t, label) as (
select * from graph0 g
union distinct
select g.*
from graph0 g, search_graph sg
where g.f = sg.t
) search breadth first by f, t set seq
select * from search_graph order by seq;
I don't understand why WITH RECURSIVE queries might be more prone
to leave non-garbage-collected memory behind than other queries,
but maybe that is worth looking into.
regards, tom lane
pgsql-hackers by date: