Thread: Avoid memory leaks during base backups
Hi, Postgres currently can leak memory if a failure occurs during base backup in do_pg_backup_start() or do_pg_backup_stop() or perform_base_backup(). The palloc'd memory such as backup_state or tablespace_map in xlogfuncs.c or basebackup.c or tablespaceinfo or the memory that gets allocated by bbsink_begin_backup() in perform_base_backup() or any other, is left-out which may cause memory bloat on the server eventually. To experience this issue, run pg_basebackup with --label name longer than 1024 characters and observe memory with watch command, the memory usage goes up. It looks like the memory leak issue has been there for quite some time, discussed in [1]. I'm proposing a patch that leverages the error callback mechanism and memory context. The design of the patch is as follows: 1) pg_backup_start() and pg_backup_stop() - the error callback frees up the backup_state and tablespace_map variables allocated in TopMemoryContext. We don't need a separate memory context here because do_pg_backup_start() and do_pg_backup_stop() don't return any dynamically created memory for now. We can choose to create a separate memory context for the future changes that may come, but now it is not required. 2) perform_base_backup() - a new memory context has been created that gets deleted by the callback upon error. The error callbacks are typically called for all the elevels, but we need to free up the memory only when elevel is >= ERROR or == COMMERROR. The COMMERROR is a common scenario because the server can close the connection to the client or vice versa in which case the base backup fails. For all other elevels like WARNING, NOTICE, DEBUGX, INFO etc. we don't free up the memory. I'm attaching v1 patch herewith. Thoughts? [1] https://www.postgresql.org/message-id/Yyq15ekNzjZecwMW%40paquier.xyz -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> writes: > Postgres currently can leak memory if a failure occurs during base > backup in do_pg_backup_start() or do_pg_backup_stop() or > perform_base_backup(). The palloc'd memory such as backup_state or > tablespace_map in xlogfuncs.c or basebackup.c or tablespaceinfo or the > memory that gets allocated by bbsink_begin_backup() in > perform_base_backup() or any other, is left-out which may cause memory > bloat on the server eventually. To experience this issue, run > pg_basebackup with --label name longer than 1024 characters and > observe memory with watch command, the memory usage goes up. > It looks like the memory leak issue has been there for quite some > time, discussed in [1]. > I'm proposing a patch that leverages the error callback mechanism and > memory context. This ... seems like inventing your own shape of wheel. The normal mechanism for preventing this type of leak is to put the allocations in a memory context that can be reset or deallocated in mainline code at the end of the operation. I do not think that having an errcontext callback with side-effects like deallocating memory is even remotely safe, and it's certainly a first-order abuse of that mechanism. regards, tom lane
On Mon, Sep 26, 2022 at 7:34 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > I'm proposing a patch that leverages the error callback mechanism and > > memory context. > > This ... seems like inventing your own shape of wheel. The > normal mechanism for preventing this type of leak is to put the > allocations in a memory context that can be reset or deallocated > in mainline code at the end of the operation. Yes, that's the typical way and the patch attached does it for perform_base_backup(). What happens if we allocate some memory in the new memory context and error-out before reaching the end of operation? How do we deallocate such memory? Backup related code has simple-to-generate-error paths in between and memory can easily be leaked. Are you suggesting to use sigsetjmp or some other way to prevent memory leaks? > I do not think that > having an errcontext callback with side-effects like deallocating > memory is even remotely safe, and it's certainly a first-order > abuse of that mechanism. Are you saying that the error callback might deallocate the memory that may be needed later in the error processing? -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
At Tue, 27 Sep 2022 11:33:56 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in > On Mon, Sep 26, 2022 at 7:34 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > This ... seems like inventing your own shape of wheel. The > > normal mechanism for preventing this type of leak is to put the > > allocations in a memory context that can be reset or deallocated > > in mainline code at the end of the operation. > > Yes, that's the typical way and the patch attached does it for > perform_base_backup(). What happens if we allocate some memory in the > new memory context and error-out before reaching the end of operation? > How do we deallocate such memory? Whoever directly or indirectly catches the exception can do that. For example, SendBaseBackup() seems to catch execptions from perform_base_backup(). bbsinc_cleanup() is already resides there. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Tue, Sep 27, 2022 at 05:32:26PM +0900, Kyotaro Horiguchi wrote: > At Tue, 27 Sep 2022 11:33:56 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in > > On Mon, Sep 26, 2022 at 7:34 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > This ... seems like inventing your own shape of wheel. The > > > normal mechanism for preventing this type of leak is to put the > > > allocations in a memory context that can be reset or deallocated > > > in mainline code at the end of the operation. > > > > Yes, that's the typical way and the patch attached does it for > > perform_base_backup(). What happens if we allocate some memory in the > > new memory context and error-out before reaching the end of operation? > > How do we deallocate such memory? > > Whoever directly or indirectly catches the exception can do that. For > example, SendBaseBackup() seems to catch execptions from > perform_base_backup(). bbsinc_cleanup() is already resides there. Even with that, what's the benefit in using an extra memory context in basebackup.c? backup_label and tablespace_map are mentioned upthread, but we have a tight control of these, and they should be allocated in the memory context created for replication commands (grep for "Replication command context") anyway. Using a dedicated memory context for the SQL backup functions under TopMemoryContext could be interesting, on the other hand.. -- Michael
Attachment
On Wed, Sep 28, 2022 at 9:46 AM Michael Paquier <michael@paquier.xyz> wrote: > > On Tue, Sep 27, 2022 at 05:32:26PM +0900, Kyotaro Horiguchi wrote: > > At Tue, 27 Sep 2022 11:33:56 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in > > > On Mon, Sep 26, 2022 at 7:34 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > > This ... seems like inventing your own shape of wheel. The > > > > normal mechanism for preventing this type of leak is to put the > > > > allocations in a memory context that can be reset or deallocated > > > > in mainline code at the end of the operation. > > > > > > Yes, that's the typical way and the patch attached does it for > > > perform_base_backup(). What happens if we allocate some memory in the > > > new memory context and error-out before reaching the end of operation? > > > How do we deallocate such memory? > > > > Whoever directly or indirectly catches the exception can do that. For > > example, SendBaseBackup() seems to catch execptions from > > perform_base_backup(). bbsinc_cleanup() is already resides there. > > Even with that, what's the benefit in using an extra memory context in > basebackup.c? backup_label and tablespace_map are mentioned upthread, > but we have a tight control of these, and they should be allocated in > the memory context created for replication commands (grep for > "Replication command context") anyway. Using a dedicated memory > context for the SQL backup functions under TopMemoryContext could be > interesting, on the other hand.. I had the same opinion. Here's what I think - for backup functions, we can have the new memory context child of TopMemoryContext and for perform_base_backup(), we can have the memory context child of CurrentMemoryContext. With PG_TRY()-PG_FINALLY()-PG_END_TRY(), we can delete those memory contexts upon ERRORs. This approach works for us since backup-related code doesn't have any FATALs. Thoughts? -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> writes: > I had the same opinion. Here's what I think - for backup functions, we > can have the new memory context child of TopMemoryContext and for > perform_base_backup(), we can have the memory context child of > CurrentMemoryContext. With PG_TRY()-PG_FINALLY()-PG_END_TRY(), we can > delete those memory contexts upon ERRORs. This approach works for us > since backup-related code doesn't have any FATALs. Not following your last point here? A process exiting on FATAL does not especially need to clean up its memory allocations first. Which is good, because "backup-related code doesn't have any FATALs" seems like an assertion with a very short half-life. regards, tom lane
On Wed, Sep 28, 2022 at 10:19 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> writes: > > I had the same opinion. Here's what I think - for backup functions, we > > can have the new memory context child of TopMemoryContext and for > > perform_base_backup(), we can have the memory context child of > > CurrentMemoryContext. With PG_TRY()-PG_FINALLY()-PG_END_TRY(), we can > > delete those memory contexts upon ERRORs. This approach works for us > > since backup-related code doesn't have any FATALs. > > Not following your last point here? A process exiting on FATAL > does not especially need to clean up its memory allocations first. > Which is good, because "backup-related code doesn't have any FATALs" > seems like an assertion with a very short half-life. You're right. My bad. For FATALs, we don't need to clean the memory as the process itself exits. * Note: an ereport(FATAL) will not be caught by this construct; control will * exit straight through proc_exit(). /* * Perform error recovery action as specified by elevel. */ if (elevel == FATAL) { /* * Do normal process-exit cleanup, then return exit code 1 to indicate * FATAL termination. The postmaster may or may not consider this * worthy of panic, depending on which subprocess returns it. */ proc_exit(1); } -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
At Wed, 28 Sep 2022 13:16:36 +0900, Michael Paquier <michael@paquier.xyz> wrote in > On Tue, Sep 27, 2022 at 05:32:26PM +0900, Kyotaro Horiguchi wrote: > > At Tue, 27 Sep 2022 11:33:56 +0530, Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote in > > > On Mon, Sep 26, 2022 at 7:34 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > > This ... seems like inventing your own shape of wheel. The > > > > normal mechanism for preventing this type of leak is to put the > > > > allocations in a memory context that can be reset or deallocated > > > > in mainline code at the end of the operation. > > > > > > Yes, that's the typical way and the patch attached does it for > > > perform_base_backup(). What happens if we allocate some memory in the > > > new memory context and error-out before reaching the end of operation? > > > How do we deallocate such memory? > > > > Whoever directly or indirectly catches the exception can do that. For > > example, SendBaseBackup() seems to catch execptions from > > perform_base_backup(). bbsinc_cleanup() is already resides there. > > Even with that, what's the benefit in using an extra memory context in > basebackup.c? backup_label and tablespace_map are mentioned upthread, > but we have a tight control of these, and they should be allocated in > the memory context created for replication commands (grep for > "Replication command context") anyway. Using a dedicated memory > context for the SQL backup functions under TopMemoryContext could be > interesting, on the other hand.. If I understand you correctly, my point was the usage of error callbacks. I meant that we can release that tangling memory blocks in SendBaseBackup() even by directly pfree()ing then NULLing the pointer, if the pointer were file-scoped static. regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Wed, Sep 28, 2022 at 10:09 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Here's what I think - for backup functions, we > can have the new memory context child of TopMemoryContext and for > perform_base_backup(), we can have the memory context child of > CurrentMemoryContext. With PG_TRY()-PG_FINALLY()-PG_END_TRY(), we can > delete those memory contexts upon ERRORs. This approach works for us > since backup-related code doesn't have any FATALs. > > Thoughts? I'm attaching the v2 patch designed as described above. Please review it. I've added an entry in CF - https://commitfest.postgresql.org/40/3915/ -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
On Wed, Sep 28, 2022 at 5:30 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > I'm attaching the v2 patch designed as described above. Please review it. > > I've added an entry in CF - https://commitfest.postgresql.org/40/3915/ This looks odd to me. In the case of a regular backend, the sigsetjmp() handler in src/backend/tcop/postgres.c is responsible for cleaning up memory. It calls AbortCurrentTransaction() which will call CleanupTransaction() which will call AtCleanup_Memory() which will block away TopTransactionContext. I think we ought to do something analogous here, and we almost do already. Some walsender commands are going to be SQL commands and some aren't. For those that aren't, the same block calls WalSndErrorCleanup() which does similar kinds of cleanup, including in some situations calling WalSndResourceCleanup() which cleans up the resource owner in cases where we have a resource owner without a transaction. I feel like we ought to be trying to tie the cleanup into WalSndErrorCleanup() or WalSndResourceCleanup() based on having the memory context that we ought to be blowing away stored in a global variable, rather than using a try/catch block. Like, maybe there's a function EstablishWalSenderMemoryContext() that commands can call before allocating memory that shouldn't survive an error. And it's deleted after each command if it exists, or if an error occurs then WalSndErrorCleanup() deletes it. -- Robert Haas EDB: http://www.enterprisedb.com
On Wed, Sep 28, 2022 at 8:46 PM Robert Haas <robertmhaas@gmail.com> wrote: > > I feel like we ought to be trying to tie > the cleanup into WalSndErrorCleanup() or WalSndResourceCleanup() based > on having the memory context that we ought to be blowing away stored > in a global variable, rather than using a try/catch block. Okay, I got rid of the try-catch block. I added two clean up callbacks (one for SQL backup functions or on-line backup, another for base backup) that basically delete the respective memory contexts and reset the file-level variables, they get called from PostgresMain()'s error handling code. > Like, maybe there's a function EstablishWalSenderMemoryContext() that > commands can call before allocating memory that shouldn't survive an > error. And it's deleted after each command if it exists, or if an > error occurs then WalSndErrorCleanup() deletes it. I don't think we need any of the above. I've used file-level variables to hold memory contexts, allocating them whenever needed and cleaning them up either at the end of backup operation or upon error. Please review the attached v3 patch. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
On Thu, Sep 29, 2022 at 4:29 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > Please review the attached v3 patch. template1=# select * from pg_backup_start('sdgkljsdgkjdsg', true); pg_backup_start ----------------- 0/2000028 (1 row) template1=# select 1/0; ERROR: division by zero template1=# select * from pg_backup_stop(); server closed the connection unexpectedly This probably means the server terminated abnormally before or while processing the request. The connection to the server was lost. Attempting reset: Failed. The connection to the server was lost. Attempting reset: Failed. !?> -- Robert Haas EDB: http://www.enterprisedb.com
On Thu, Sep 29, 2022 at 7:05 PM Robert Haas <robertmhaas@gmail.com> wrote: > > On Thu, Sep 29, 2022 at 4:29 AM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > Please review the attached v3 patch. > > template1=# select * from pg_backup_start('sdgkljsdgkjdsg', true); > pg_backup_start > ----------------- > 0/2000028 > (1 row) > > template1=# select 1/0; > ERROR: division by zero > template1=# select * from pg_backup_stop(); > server closed the connection unexpectedly > This probably means the server terminated abnormally > before or while processing the request. > The connection to the server was lost. Attempting reset: Failed. > The connection to the server was lost. Attempting reset: Failed. > !?> Thanks! I used a variable to define the scope to clean up the backup memory context for SQL functions/on-line backup. We don't have this problem in case of base backup because we don't give control in between start and stop backup in perform_base_backup(). Please review the v4 patch. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
On Thu, Sep 29, 2022 at 10:38 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Please review the v4 patch. I used valgrind for testing. Without patch, there's an obvious memory leak [1], with patch no memory leak. I used ALLOCSET_START_SMALL_SIZES instead of ALLOCSET_DEFAULT_SIZES for backup memory context so that it can start small and grow if required. I'm attaching v5 patch, please review it further. [1] ==00:00:01:36.306 145709== VALGRINDERROR-BEGIN ==00:00:01:36.306 145709== 24 bytes in 1 blocks are still reachable in loss record 122 of 511 ==00:00:01:36.306 145709== at 0x98E501: palloc (mcxt.c:1170) ==00:00:01:36.306 145709== by 0x9C1795: makeStringInfo (stringinfo.c:45) ==00:00:01:36.306 145709== by 0x2DE22A: pg_backup_start (xlogfuncs.c:96) ==00:00:01:36.306 145709== by 0x4D2DB6: ExecMakeTableFunctionResult (execSRF.c:234) ==00:00:01:36.306 145709== by 0x4F08DA: FunctionNext (nodeFunctionscan.c:95) ==00:00:01:36.306 145709== by 0x4D48EA: ExecScanFetch (execScan.c:133) ==00:00:01:36.306 145709== by 0x4D4963: ExecScan (execScan.c:182) ==00:00:01:36.306 145709== by 0x4F0C84: ExecFunctionScan (nodeFunctionscan.c:270) ==00:00:01:36.306 145709== by 0x4D0255: ExecProcNodeFirst (execProcnode.c:464) ==00:00:01:36.306 145709== by 0x4C32D4: ExecProcNode (executor.h:259) ==00:00:01:36.306 145709== by 0x4C619C: ExecutePlan (execMain.c:1636) ==00:00:01:36.306 145709== by 0x4C3A0F: standard_ExecutorRun (execMain.c:363) ==00:00:01:36.306 145709== ==00:00:01:36.306 145709== VALGRINDERROR-END ==00:00:01:36.334 145709== VALGRINDERROR-BEGIN ==00:00:01:36.334 145709== 1,024 bytes in 1 blocks are still reachable in loss record 426 of 511 ==00:00:01:36.334 145709== at 0x98E501: palloc (mcxt.c:1170) ==00:00:01:36.334 145709== by 0x9C17CF: initStringInfo (stringinfo.c:63) ==00:00:01:36.334 145709== by 0x9C17A5: makeStringInfo (stringinfo.c:47) ==00:00:01:36.334 145709== by 0x2DE22A: pg_backup_start (xlogfuncs.c:96) ==00:00:01:36.334 145709== by 0x4D2DB6: ExecMakeTableFunctionResult (execSRF.c:234) ==00:00:01:36.334 145709== by 0x4F08DA: FunctionNext (nodeFunctionscan.c:95) ==00:00:01:36.334 145709== by 0x4D48EA: ExecScanFetch (execScan.c:133) ==00:00:01:36.334 145709== by 0x4D4963: ExecScan (execScan.c:182) ==00:00:01:36.334 145709== by 0x4F0C84: ExecFunctionScan (nodeFunctionscan.c:270) ==00:00:01:36.334 145709== by 0x4D0255: ExecProcNodeFirst (execProcnode.c:464) ==00:00:01:36.334 145709== by 0x4C32D4: ExecProcNode (executor.h:259) ==00:00:01:36.334 145709== by 0x4C619C: ExecutePlan (execMain.c:1636) ==00:00:01:36.334 145709== ==00:00:01:36.334 145709== VALGRINDERROR-END ==00:00:01:36.335 145709== VALGRINDERROR-BEGIN ==00:00:01:36.335 145709== 1,096 bytes in 1 blocks are still reachable in loss record 431 of 511 ==00:00:01:36.335 145709== at 0x98E766: palloc0 (mcxt.c:1201) ==00:00:01:36.335 145709== by 0x2DE152: pg_backup_start (xlogfuncs.c:81) ==00:00:01:36.335 145709== by 0x4D2DB6: ExecMakeTableFunctionResult (execSRF.c:234) ==00:00:01:36.335 145709== by 0x4F08DA: FunctionNext (nodeFunctionscan.c:95) ==00:00:01:36.335 145709== by 0x4D48EA: ExecScanFetch (execScan.c:133) ==00:00:01:36.335 145709== by 0x4D4963: ExecScan (execScan.c:182) ==00:00:01:36.335 145709== by 0x4F0C84: ExecFunctionScan (nodeFunctionscan.c:270) ==00:00:01:36.335 145709== by 0x4D0255: ExecProcNodeFirst (execProcnode.c:464) ==00:00:01:36.335 145709== by 0x4C32D4: ExecProcNode (executor.h:259) ==00:00:01:36.335 145709== by 0x4C619C: ExecutePlan (execMain.c:1636) ==00:00:01:36.335 145709== by 0x4C3A0F: standard_ExecutorRun (execMain.c:363) ==00:00:01:36.335 145709== by 0x4C37FA: ExecutorRun (execMain.c:307) ==00:00:01:36.335 145709== ==00:00:01:36.335 145709== VALGRINDERROR-END -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
The following review has been posted through the commitfest application: make installcheck-world: tested, passed Implements feature: tested, passed Spec compliant: not tested Documentation: not tested Hello I applied your v5 patch on the current master and run valgrind on it while doing a basebackup with simulated error. No memoryleak related to backup is observed. Regression is also passing thank you Cary Huang HighGo Software Canada
On Fri, Oct 14, 2022 at 09:56:31PM +0000, Cary Huang wrote: > I applied your v5 patch on the current master and run valgrind on it > while doing a basebackup with simulated error. No memory leak > related to backup is observed. Regression is also passing. Echoing with what I mentioned upthread in [1], I don't quite understand why this patch needs to touch basebackup.c, walsender.c and postgres.c. In the case of a replication command processed by a WAL sender, memory allocations happen in the memory context created for replication commands, which is itself, as far as I understand, the message memory context when we get a 'Q' message for a simple query. Why do we need more code for a cleanup that should be already happening? Am I missing something obvious? xlogfuncs.c, by storing stuff in the TopMemoryContext of the process running the SQL commands pg_backup_start/stop() is different, of course. Perhaps the point of centralizing the base backup context in xlogbackup.c makes sense, but my guess that it makes more sense to keep that with the SQL functions as these are the only ones in need of a cleanup, coming down to the fact that the start and stop functions happen in different queries, aka these are not bind to a message context. [1]: https://www.postgresql.org/message-id/YzPKpKEk/JMjhWEz@paquier.xyz -- Michael
Attachment
On Mon, Oct 17, 2022 at 1:09 PM Michael Paquier <michael@paquier.xyz> wrote: > > On Fri, Oct 14, 2022 at 09:56:31PM +0000, Cary Huang wrote: > > I applied your v5 patch on the current master and run valgrind on it > > while doing a basebackup with simulated error. No memory leak > > related to backup is observed. Regression is also passing. > > Echoing with what I mentioned upthread in [1], I don't quite > understand why this patch needs to touch basebackup.c, walsender.c > and postgres.c. In the case of a replication command processed by a > WAL sender, memory allocations happen in the memory context created > for replication commands, which is itself, as far as I understand, the > message memory context when we get a 'Q' message for a simple query. > Why do we need more code for a cleanup that should be already > happening? Am I missing something obvious? > > [1]: https://www.postgresql.org/message-id/YzPKpKEk/JMjhWEz@paquier.xyz My bad, I missed that. You are right. We have "Replication command context" as a child of "MessageContext" memory context for base backup that gets cleaned upon error in PostgresMain() [1]. > xlogfuncs.c, by storing stuff in the TopMemoryContext of the process > running the SQL commands pg_backup_start/stop() is different, of > course. Perhaps the point of centralizing the base backup context in > xlogbackup.c makes sense, but my guess that it makes more sense to > keep that with the SQL functions as these are the only ones in need of > a cleanup, coming down to the fact that the start and stop functions > happen in different queries, aka these are not bind to a message > context. Yes, they're not related to "MessageContext" memory context. Please see the attached v6 patch that deals with memory leaks for backup SQL-callable functions. [1] (gdb) bt #0 MemoryContextDelete (context=0x558b7cd0de50) at mcxt.c:378 #1 0x0000558b7c655733 in MemoryContextDeleteChildren (context=0x558b7ccda8c0) at mcxt.c:430 #2 0x0000558b7c65546d in MemoryContextReset (context=0x558b7ccda8c0) at mcxt.c:309 #3 0x0000558b7c43b5cd in PostgresMain (dbname=0x558b7cd11fb8 "", username=0x558b7ccd6298 "ubuntu") at postgres.c:4358 #4 0x0000558b7c364a88 in BackendRun (port=0x558b7cd09620) at postmaster.c:4482 #5 0x0000558b7c36431b in BackendStartup (port=0x558b7cd09620) at postmaster.c:4210 #6 0x0000558b7c3603be in ServerLoop () at postmaster.c:1804 #7 0x0000558b7c35fb1b in PostmasterMain (argc=3, argv=0x558b7ccd4200) at postmaster.c:1476 #8 0x0000558b7c229a0e in main (argc=3, argv=0x558b7ccd4200) at main.c:197 -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
On Wed, Oct 19, 2022 at 3:04 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > Echoing with what I mentioned upthread in [1], I don't quite > > understand why this patch needs to touch basebackup.c, walsender.c > > and postgres.c. In the case of a replication command processed by a > > WAL sender, memory allocations happen in the memory context created > > for replication commands, which is itself, as far as I understand, the > > message memory context when we get a 'Q' message for a simple query. > > Why do we need more code for a cleanup that should be already > > happening? Am I missing something obvious? > > My bad, I missed that. You are right. We have "Replication command > context" as a child of "MessageContext" memory context for base backup > that gets cleaned upon error in PostgresMain() [1]. Well this still touches postgres.c. And I still think it's an awful lot of machinery for a pretty trivial problem. As a practical matter, nobody is going to open a connection and sit there and try to start a backup over and over again on the same connection. And even if someone wrote a client that did that -- why? -- they'd have to be awfully persistent to leak any amount of memory that would actually matter. So it is not insane to just think of ignoring this problem entirely. But if we want to fix it, I think we should do it in some more localized way. One option is to just have do_pg_start_backup() blow away any old memory context before it allocates any new memory, and forget about releasing anything in PostgresMain(). That means memory could remain allocated after a failure until you next retry the operation, but I don't think that really matters. It's not a lot of memory; we just don't want it to accumulate across many repetitions. Another option, perhaps, is to delete some memory context from within the TRY/CATCH block if non-NULL, although that wouldn't work nicely if it might blow away the data we need to generate the error message. A third option is to do something useful inside WalSndErrorCleanup() or WalSndResourceCleanup() as I suggested previously. I'm not exactly sure what the right solution is here, but I think you need to put more thought into how to make the code look simple, elegant, and non-invasive. -- Robert Haas EDB: http://www.enterprisedb.com
On Wed, Oct 19, 2022 at 8:10 PM Robert Haas <robertmhaas@gmail.com> wrote: > > Well this still touches postgres.c. And I still think it's an awful > lot of machinery for a pretty trivial problem. As a practical matter, > nobody is going to open a connection and sit there and try to start a > backup over and over again on the same connection. And even if someone > wrote a client that did that -- why? -- they'd have to be awfully > persistent to leak any amount of memory that would actually matter. So > it is not insane to just think of ignoring this problem entirely. I understand that the amount of memory allocated by pg_backup_start() is small compared to the overall RAM, however, I don't think we can ignore the problem and let postgres cause memory leaks. > But if we want to fix it, I think we should do it in some more > localized way. Agreed. > One option is to just have do_pg_start_backup() blow > away any old memory context before it allocates any new memory, and > forget about releasing anything in PostgresMain(). That means memory > could remain allocated after a failure until you next retry the > operation, but I don't think that really matters. It's not a lot of > memory; we just don't want it to accumulate across many repetitions. This seems reasonable to me. > Another option, perhaps, is to delete some memory context from within > the TRY/CATCH block if non-NULL, although that wouldn't work nicely if > it might blow away the data we need to generate the error message. Right. > A third option is to do something useful inside WalSndErrorCleanup() or > WalSndResourceCleanup() as I suggested previously. These functions will not be called for SQL-callable backup functions pg_backup_start() and pg_backup_stop(). And the memory leak problem we're trying to solve is for SQL-callable functions, but not for basebaskups as they already have a memory context named "Replication command context" that gets deleted in PostgresMain(). -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
On Wed, Oct 19, 2022 at 9:23 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > > On Wed, Oct 19, 2022 at 8:10 PM Robert Haas <robertmhaas@gmail.com> wrote: > > > One option is to just have do_pg_start_backup() blow > > away any old memory context before it allocates any new memory, and > > forget about releasing anything in PostgresMain(). That means memory > > could remain allocated after a failure until you next retry the > > operation, but I don't think that really matters. It's not a lot of > > memory; we just don't want it to accumulate across many repetitions. > > This seems reasonable to me. I tried implementing this, please see the attached v7 patch. Currently, memory allocated in the new memory context is around 4KB [1]. In the extreme and rarest of the rare cases where somebody executes select pg_backup_start(repeat('foo', 1024)); or a failure occurs before reaching pg_backup_stop() on all of the sessions (max_connections) at once, the maximum/peak memory bloat/leak is around max_connections*4KB, which will still be way less than the total amount of RAM. Hence, I think this approach seems very reasonable and non-invasive yet can solve the memory leak problem. Thoughts? [1] (gdb) p *backupcontext $4 = {type = T_AllocSetContext, isReset = false, allowInCritSection = false, mem_allocated = 4232, methods = 0x55c925b81f90 <mcxt_methods+240>, parent = 0x55c92766d2a0, firstchild = 0x0, prevchild = 0x0, nextchild = 0x55c92773f1f0, name = 0x55c9258be05c "on-line backup context", ident = 0x0, reset_cbs = 0x0} -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
On Thu, Oct 20, 2022 at 6:47 AM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > I tried implementing this, please see the attached v7 patch. I haven't checked this in detail but it looks much more reasonable in terms of code footprint. However, we should, I think, set backup_state = NULL and tablespace_map = NULL before deleting the memory context. As you have it, I believe that if backup_state = (BackupState *) palloc0(sizeof(BackupState)) fails -- say due to running out of memory -- then those variables could end up pointing to garbage because the context had already been reset before initializing them. I don't know whether it's possible for that to cause any concrete harm, but nulling out the pointers seems like cheap insurance. -- Robert Haas EDB: http://www.enterprisedb.com
On Thu, Oct 20, 2022 at 9:48 PM Robert Haas <robertmhaas@gmail.com> wrote: > > On Thu, Oct 20, 2022 at 6:47 AM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > I tried implementing this, please see the attached v7 patch. > > I haven't checked this in detail but it looks much more reasonable in > terms of code footprint. However, we should, I think, set backup_state > = NULL and tablespace_map = NULL before deleting the memory context. > As you have it, I believe that if backup_state = (BackupState *) > palloc0(sizeof(BackupState)) fails -- say due to running out of memory > -- then those variables could end up pointing to garbage because the > context had already been reset before initializing them. I don't know > whether it's possible for that to cause any concrete harm, but nulling > out the pointers seems like cheap insurance. I think elsewhere in the code we reset dangling pointers either ways - before or after deleting/resetting memory context. But placing them before would give us extra safety in case memory context deletion/reset fails. Not sure what's the best way. However, I'm nullifying the dangling pointers after deleting/resetting memory context. MemoryContextDelete(Conf->buildCxt); MemoryContextDelete(PostmasterContext); MemoryContextDelete(rulescxt); Please see the attached v8 patch. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
On 2022-Oct-20, Bharath Rupireddy wrote: > I think elsewhere in the code we reset dangling pointers either ways - > before or after deleting/resetting memory context. But placing them > before would give us extra safety in case memory context > deletion/reset fails. Not sure what's the best way. However, I'm > nullifying the dangling pointers after deleting/resetting memory > context. I agree that's a good idea, and the patch looks good to me, but I don't think asserting that they are null afterwards is useful. -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
On Thu, Oct 20, 2022 at 1:35 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: > I think elsewhere in the code we reset dangling pointers either ways - > before or after deleting/resetting memory context. But placing them > before would give us extra safety in case memory context > deletion/reset fails. Not sure what's the best way. I think it's OK to assume that deallocating memory will always succeed, so it doesn't matter whether you do it just before or just after that. But it's not OK to assume that *allocating* memory will always succeed. -- Robert Haas EDB: http://www.enterprisedb.com
On Thu, Oct 20, 2022 at 11:17 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > On 2022-Oct-20, Bharath Rupireddy wrote: > > > I think elsewhere in the code we reset dangling pointers either ways - > > before or after deleting/resetting memory context. But placing them > > before would give us extra safety in case memory context > > deletion/reset fails. Not sure what's the best way. However, I'm > > nullifying the dangling pointers after deleting/resetting memory > > context. > > I agree that's a good idea, and the patch looks good to me, but I don't > think asserting that they are null afterwards is useful. +1. Removed those assertions. Please see the attached v9 patch. On Fri, Oct 21, 2022 at 12:21 AM Robert Haas <robertmhaas@gmail.com> wrote: > > On Thu, Oct 20, 2022 at 1:35 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: > > I think elsewhere in the code we reset dangling pointers either ways - > > before or after deleting/resetting memory context. But placing them > > before would give us extra safety in case memory context > > deletion/reset fails. Not sure what's the best way. > > I think it's OK to assume that deallocating memory will always > succeed, so it doesn't matter whether you do it just before or just > after that. But it's not OK to assume that *allocating* memory will > always succeed. Right. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
On Thu, Oct 20, 2022 at 02:51:21PM -0400, Robert Haas wrote: > On Thu, Oct 20, 2022 at 1:35 PM Bharath Rupireddy > <bharath.rupireddyforpostgres@gmail.com> wrote: >> I think elsewhere in the code we reset dangling pointers either ways - >> before or after deleting/resetting memory context. But placing them >> before would give us extra safety in case memory context >> deletion/reset fails. Not sure what's the best way. > > I think it's OK to assume that deallocating memory will always > succeed, so it doesn't matter whether you do it just before or just > after that. But it's not OK to assume that *allocating* memory will > always succeed. AFAIK, one of the callbacks associated to a memory context could fail, see comments before MemoryContextCallResetCallbacks() in MemoryContextDelete(). I agree that it should not matter here, but I think that it is better to reset the pointers before attempting the deletion of the memory context in this case. -- Michael
Attachment
On Fri, Oct 21, 2022 at 11:34:27AM +0530, Bharath Rupireddy wrote: > On Fri, Oct 21, 2022 at 12:21 AM Robert Haas <robertmhaas@gmail.com> wrote: >> On Thu, Oct 20, 2022 at 1:35 PM Bharath Rupireddy <bharath.rupireddyforpostgres@gmail.com> wrote: >>> I think elsewhere in the code we reset dangling pointers either ways - >>> before or after deleting/resetting memory context. But placing them >>> before would give us extra safety in case memory context >>> deletion/reset fails. Not sure what's the best way. >> >> I think it's OK to assume that deallocating memory will always >> succeed, so it doesn't matter whether you do it just before or just >> after that. But it's not OK to assume that *allocating* memory will >> always succeed. > > Right. To be exact, it seems to me that tablespace_map and backup_state should be reset before deleting backupcontext, but the reset of backupcontext should happen after the fact. + backup_state = NULL; tablespace_map = NULL; These two in pg_backup_start() don't matter, do they? They are reallocated a couple of lines down. + * across. We keep the memory allocated in this memory context less, What does "We keep the memory allocated in this memory context less" mean here? -- Michael
Attachment
At Thu, 20 Oct 2022 19:47:07 +0200, Alvaro Herrera <alvherre@alvh.no-ip.org> wrote in > I agree that's a good idea, and the patch looks good to me, but I don't > think asserting that they are null afterwards is useful. +1 for this direction. And the patch is fine to me. > oldcontext = MemoryContextSwitchTo(backupcontext); > Assert(backup_state == NULL); > Assert(tablespace_map == NULL); > backup_state = (BackupState *) palloc0(sizeof(BackupState)); > tablespace_map = makeStringInfo(); > MemoryContextSwitchTo(oldcontext); We can use MemoryContextAllocZero() for this purpose, but of couse not mandatory. + * across. We keep the memory allocated in this memory context less, + * because any error before reaching pg_backup_stop() can leak the memory + * until pg_backup_start() is called again. While this is not smart, it + * helps to keep things simple. I think the "less" is somewhat obscure. I feel we should be more explicitly. And we don't need to put emphasis on "leak". I recklessly propose this as the draft. "The context is intended to be used by this function to store only session-lifetime values. It is, if left alone, reset at the next call to blow away orphan memory blocks from the previous failed call. While this is not smart, it helps to keep things simple." regards. -- Kyotaro Horiguchi NTT Open Source Software Center
On Fri, Oct 21, 2022 at 2:18 AM Michael Paquier <michael@paquier.xyz> wrote: > AFAIK, one of the callbacks associated to a memory context could > fail, see comments before MemoryContextCallResetCallbacks() in > MemoryContextDelete(). I agree that it should not matter here, but I > think that it is better to reset the pointers before attempting the > deletion of the memory context in this case. I think this is nitpicking. There's no real danger here, and if there were, the error handling would have to take it into account somehow, which it doesn't. I'd probably do it before resetting the context as a matter of style, to make it clear that there's no window in which the pointers are set but referencing invalid memory. But I do not think it makes any practical difference at all. -- Robert Haas EDB: http://www.enterprisedb.com
On Fri, Oct 21, 2022 at 11:58 AM Michael Paquier <michael@paquier.xyz> wrote: > > To be exact, it seems to me that tablespace_map and backup_state > should be reset before deleting backupcontext, but the reset of > backupcontext should happen after the fact. > > + backup_state = NULL; > tablespace_map = NULL; > These two in pg_backup_start() don't matter, do they? They are > reallocated a couple of lines down. After all, that is what is being discussed here; what if palloc down below fails and they're not reset to NULL after MemoryContextReset()? > + * across. We keep the memory allocated in this memory context less, > What does "We keep the memory allocated in this memory context less" > mean here? We try to keep it less because we don't want to allocate more memory and leak it unless pg_start_backup() is called again. Please read the description. I'll leave it to the committer's discretion whether to have that part or remove it. On Fri, Oct 21, 2022 at 12:11 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: > > > + * across. We keep the memory allocated in this memory context less, > + * because any error before reaching pg_backup_stop() can leak the memory > + * until pg_backup_start() is called again. While this is not smart, it > + * helps to keep things simple. > > I think the "less" is somewhat obscure. I feel we should be more > explicitly. And we don't need to put emphasis on "leak". I recklessly > propose this as the draft. I tried to put it simple, please see the attached v10. I'll leave it to the committer's discretion for better wording. On Fri, Oct 21, 2022 at 7:47 PM Robert Haas <robertmhaas@gmail.com> wrote: > > On Fri, Oct 21, 2022 at 2:18 AM Michael Paquier <michael@paquier.xyz> wrote: > > AFAIK, one of the callbacks associated to a memory context could > > fail, see comments before MemoryContextCallResetCallbacks() in > > MemoryContextDelete(). I agree that it should not matter here, but I > > think that it is better to reset the pointers before attempting the > > deletion of the memory context in this case. > > I think this is nitpicking. There's no real danger here, and if there > were, the error handling would have to take it into account somehow, > which it doesn't. > > I'd probably do it before resetting the context as a matter of style, > to make it clear that there's no window in which the pointers are set > but referencing invalid memory. But I do not think it makes any > practical difference at all. Please see the attached v10. -- Bharath Rupireddy PostgreSQL Contributors Team RDS Open Source Databases Amazon Web Services: https://aws.amazon.com
Attachment
On Fri, Oct 21, 2022 at 09:02:04PM +0530, Bharath Rupireddy wrote: > After all, that is what is being discussed here; what if palloc down > below fails and they're not reset to NULL after MemoryContextReset()? It does not seem to matter much to me for that, so left these as proposed. > On Fri, Oct 21, 2022 at 12:11 PM Kyotaro Horiguchi <horikyota.ntt@gmail.com> wrote: >> I think the "less" is somewhat obscure. I feel we should be more >> explicitly. And we don't need to put emphasis on "leak". I recklessly >> propose this as the draft. > > I tried to put it simple, please see the attached v10. I'll leave it > to the committer's discretion for better wording. I am still not sure what "less" means when referring to a "memory context". Anyway, I have gone through the comments and finished with something much more simplified, and applied the whole. -- Michael