Thread: Bug fix. autovacuum.c do_worker_start() associates memory allocations with TopMemoryContext rather than 'Autovacuum start worker (tmp)'

Hi Hackers,

This patch ensures get_database_list() switches back to the memory
context in use upon entry rather than returning with TopMemoryContext
as the context.

This will address memory allocations in autovacuum.c being associated
with TopMemoryContext when they should not be.

autovacuum.c do_start_worker() with current context
'Autovacuum start worker (tmp)' invokes get_database_list(). Upon
return, the current context has been changed to TopMemoryContext by
AtCommit_Memory() as part of an internal transaction. Further down
in the do_start_worker(), pgstat_fetch_stat_dbentry() is invoked.
Previously this didn't pose a issue, however recent changes altered
how pgstat_fetch_stat_dbentry() is implemented. The new
implementation has a branch utilizing palloc. The patch ensures these
allocations are associated with the 'Autovacuum start worker (tmp)'
context rather than the TopMemoryContext. Prior to the change,
leaving an idle laptop PG instance running just shy of 3 days saw the
autovacuum launcher process grow to 42MB with most of that growth in
TopMemoryContext due to the palloc allocations issued with autovacuum
worker startup.



--
Reid Thompson
Senior Software Engineer
Crunchy Data, Inc.

reid.thompson@crunchydata.com
www.crunchydata.com



--
Reid Thompson
Senior Software Engineer
Crunchy Data, Inc.

reid.thompson@crunchydata.com
www.crunchydata.com

Attachment
Reid Thompson <reid.thompson@crunchydata.com> writes:
> This patch ensures get_database_list() switches back to the memory
> context in use upon entry rather than returning with TopMemoryContext
> as the context.

> This will address memory allocations in autovacuum.c being associated
> with TopMemoryContext when they should not be.

> autovacuum.c do_start_worker() with current context
> 'Autovacuum start worker (tmp)' invokes get_database_list(). Upon
> return, the current context has been changed to TopMemoryContext by
> AtCommit_Memory() as part of an internal transaction. Further down
> in the do_start_worker(), pgstat_fetch_stat_dbentry() is invoked.
> Previously this didn't pose a issue, however recent changes altered
> how pgstat_fetch_stat_dbentry() is implemented. The new
> implementation has a branch utilizing palloc. The patch ensures these
> allocations are associated with the 'Autovacuum start worker (tmp)'
> context rather than the TopMemoryContext. Prior to the change,
> leaving an idle laptop PG instance running just shy of 3 days saw the
> autovacuum launcher process grow to 42MB with most of that growth in
> TopMemoryContext due to the palloc allocations issued with autovacuum
> worker startup.

Yeah, I can reproduce noticeable growth within a couple minutes
after setting autovacuum_naptime to 1s, and I confirm that the
launcher's space consumption stays static after applying this.

Even if there's only a visible leak in v15/HEAD, I'm inclined
to back-patch this all the way, because it's certainly buggy
on its own terms.  It's just the merest accident that neither
caller is leaking other stuff into TopMemoryContext, because
they both think they are using a short-lived context.

Thanks for the report!

            regards, tom lane



Hi,

On 2022-08-31 16:05:03 -0400, Tom Lane wrote:
> Even if there's only a visible leak in v15/HEAD, I'm inclined
> to back-patch this all the way, because it's certainly buggy
> on its own terms.  It's just the merest accident that neither
> caller is leaking other stuff into TopMemoryContext, because
> they both think they are using a short-lived context.

+1

> Thanks for the report!

+1

Greetings,

Andres Freund



On Wed, Aug 31, 2022 at 01:09:22PM -0700, Andres Freund wrote:
> On 2022-08-31 16:05:03 -0400, Tom Lane wrote:
>> Even if there's only a visible leak in v15/HEAD, I'm inclined
>> to back-patch this all the way, because it's certainly buggy
>> on its own terms.  It's just the merest accident that neither
>> caller is leaking other stuff into TopMemoryContext, because
>> they both think they are using a short-lived context.
>
> +1

Ouch.  Thanks for the quick fix and the backpatch.
--
Michael

Attachment