Remove_temp_files_after_crash and significant recovery/startup time - Mailing list pgsql-hackers

From McCoy, Shawn
Subject Remove_temp_files_after_crash and significant recovery/startup time
Date
Msg-id E7573D54-A8C9-40A8-89D7-0596A36ED124@amazon.com
Responses Re: Remove_temp_files_after_crash and significant recovery/startup time  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Remove_temp_files_after_crash and significant recovery/startup time  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Re: Remove_temp_files_after_crash and significant recovery/startup time  ("Euler Taveira" <euler@eulerto.com>)
List pgsql-hackers

I noticed that the new parameter remove_temp_files_after_crash currently defaults to "on" in the version 14 release. It seems this was discussed in thread [1], but it doesn't look to me like there has been much stress testing of this feature.

In our fleet we have seen cases where hundreds of thousands of temp files were generated. In one case we helped a customer who had a little over 2.2 million temp files. Single-threaded cleanup of that many files takes a significant amount of time and delays recovery. In RDS, we mitigated this by moving the pgsql_tmp directory aside, starting the engine, and then removing the old temp files separately.
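
To make the workaround concrete, here is a rough sketch of the "move aside, start, clean up later" idea. It is only an illustration under stated assumptions, not the actual RDS tooling: the function names and the ".old.<timestamp>" suffix are hypothetical, the server is assumed to be stopped when the rename happens, and the temp directory is assumed to be a path such as base/pgsql_tmp under the data directory.

```python
import os
import shutil
import time
from typing import Optional


def set_aside_temp_dir(tmp_dir: str) -> Optional[str]:
    """Rename tmp_dir out of the way so startup does not have to scan it.

    Hypothetical helper. Returns the new path, or None if tmp_dir is absent.
    Assumes the server is stopped while this runs.
    """
    if not os.path.isdir(tmp_dir):
        return None
    # Hypothetical naming scheme: same parent directory, so the rename stays
    # on one filesystem and costs one operation regardless of file count.
    aside = f"{tmp_dir}.old.{int(time.time() * 1000)}"
    os.rename(tmp_dir, aside)
    # The directory is deliberately not recreated here, on the assumption
    # that the server creates it again when it next needs a temp file.
    return aside


def remove_set_aside(aside: str) -> None:
    """Delete the set-aside directory; meant to run after the server is up."""
    shutil.rmtree(aside, ignore_errors=True)
```

The point of the split is that the rename is cheap even with millions of files, while the expensive per-file unlinking in remove_set_aside() can run in the background after the engine is already accepting connections.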

After noticing the current plans to default this GUC to "on" in v14, I thought I'd raise the question of whether this should get a little more discussion, or testing with higher numbers of temp files.

Regards,

Shawn McCoy

Database Engineer

Amazon Web Services

[1] https://www.postgresql.org/message-id/CAH503wDKdYzyq7U-QJqGn%3DGm6XmoK%2B6_6xTJ-Yn5WSvoHLY1Ww%40mail.gmail.com
