Thread: Re: Improve cleaning files on Postgres crashes

Re: Improve cleaning files on Postgres crashes

From: Tom Lane
Ranier Vilela <ranier.vf@gmail.com> writes:
> There are some reports that Postgres does not correctly clean up the
> files it uses when it crashes. [1]
> I think that the function *fcloseall* can help a little bit.
> Mainly on Windows.

I doubt that this is a good thing to try to do during a panic exit.
In the first place, we don't know to what extent the process's
internal data structures may be corrupted, possibly causing
fcloseall itself to malfunction.  In the second place, we might
wish to have a look at those temp files for debugging purposes.

I won't bother to address such points as whether fcloseall exists
everywhere or whether you put the calls in sane places.

            regards, tom lane



Re: Improve cleaning files on Postgres crashes

From: Ranier Vilela

On Tue, Feb 18, 2025 at 13:21, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Ranier Vilela <ranier.vf@gmail.com> writes:
>> There are some reports that Postgres does not correctly clean up the
>> files it uses when it crashes. [1]
>> I think that the function *fcloseall* can help a little bit.
>> Mainly on Windows.

Thanks for the answer, Tom.

> I doubt that this is a good thing to try to do during a panic exit.
> In the first place, we don't know to what extent the process's
> internal data structures may be corrupted, possibly causing
> fcloseall itself to malfunction.

Well, I put it after fflush(NULL), so if the data structures are corrupted, fflush will already fail anyway.

> In the second place, we might
> wish to have a look at those temp files for debugging purposes.

Keeping them around for dev/debug is great, but in production, where disk space costs money, I think cleaning them up is a good idea.

Maybe guarded by ifdefs?
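For illustration, a minimal sketch of how such an ifdef-guarded call could sit right after the existing fflush(NULL). This is only a sketch of the idea, not the attached v1 patch; the helper name and the HAVE_FCLOSEALL symbol are hypothetical:

#include <stdio.h>

/*
 * Hypothetical sketch only: flush and then close all stdio streams on
 * the panic-exit path, guarded as suggested above.  HAVE_FCLOSEALL
 * stands in for a configure check that does not exist today.
 */
static void
close_all_stdio_streams(void)
{
	fflush(NULL);		/* flush whatever stdio buffers we still can */
#if defined(WIN32)
	_fcloseall();		/* Windows CRT spelling; closes every open stream
						 * and releases its file handle */
#elif defined(HAVE_FCLOSEALL)
	fcloseall();		/* glibc extension (needs _GNU_SOURCE) */
#endif
}

In a debug build the extra close could simply be compiled out, which keeps the temp files around for inspection as Tom prefers.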

Best regards,
Ranier Vilela 

Re: Improve cleaning files on Postgres crashes

From: Ranier Vilela
On Tue, Feb 18, 2025 at 13:29, Ranier Vilela <ranier.vf@gmail.com> wrote:

> On Tue, Feb 18, 2025 at 13:21, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> Ranier Vilela <ranier.vf@gmail.com> writes:
>>> There are some reports that Postgres does not correctly clean up the
>>> files it uses when it crashes. [1]
>>> I think that the function *fcloseall* can help a little bit.
>>> Mainly on Windows.
>
> Thanks for the answer, Tom.
>
>> I doubt that this is a good thing to try to do during a panic exit.
>> In the first place, we don't know to what extent the process's
>> internal data structures may be corrupted, possibly causing
>> fcloseall itself to malfunction.
>
> Well, I put it after fflush(NULL), so if the data structures are corrupted, fflush will already fail anyway.
>
>> In the second place, we might
>> wish to have a look at those temp files for debugging purposes.
>
> Keeping them around for dev/debug is great, but in production, where disk space costs money, I think cleaning them up is a good idea.
>
> Maybe guarded by ifdefs?

I did a quick and dirty test on 64-bit Windows.

I added a PANIC in bufpage.c:

diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
index 91da73dda8..c8056df662 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -1475,6 +1475,8 @@ PageIndexTupleOverwrite(Page page, OffsetNumber offnum,
 	tupid->lp_len = newsize;
 
 	/* Copy new tuple data onto page */
+	if (newsize > sizeof(newtup))
+		elog(PANIC, "out-of-bound access: %lu", newsize - sizeof(newtup));
 	memcpy(PageGetItem(page, tupid), newtup, newsize);
 
 	return true;

Here are the results after the ninja test command, measured with dir *.* /s:

head:
30938 file(s)  2,735,487,655 bytes

patched with v1:
16233 file(s)  1,540,519,373 bytes

best regards,
Ranier Vilela

Re: Improve cleaning files on Postgres crashes

From: Ranier Vilela
On Wed, Feb 19, 2025 at 14:48, Ranier Vilela <ranier.vf@gmail.com> wrote:

> On Tue, Feb 18, 2025 at 13:29, Ranier Vilela <ranier.vf@gmail.com> wrote:
>
>> On Tue, Feb 18, 2025 at 13:21, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> Ranier Vilela <ranier.vf@gmail.com> writes:
>>>> There are some reports that Postgres does not correctly clean up the
>>>> files it uses when it crashes. [1]
>>>> I think that the function *fcloseall* can help a little bit.
>>>> Mainly on Windows.
>>
>> Thanks for the answer, Tom.
>>
>>> I doubt that this is a good thing to try to do during a panic exit.
>>> In the first place, we don't know to what extent the process's
>>> internal data structures may be corrupted, possibly causing
>>> fcloseall itself to malfunction.
>>
>> Well, I put it after fflush(NULL), so if the data structures are corrupted, fflush will already fail anyway.
>>
>>> In the second place, we might
>>> wish to have a look at those temp files for debugging purposes.
>>
>> Keeping them around for dev/debug is great, but in production, where disk space costs money, I think cleaning them up is a good idea.
>>
>> Maybe guarded by ifdefs?
>
> I did a quick and dirty test on 64-bit Windows.
>
> I added a PANIC in bufpage.c:
>
> diff --git a/src/backend/storage/page/bufpage.c b/src/backend/storage/page/bufpage.c
> index 91da73dda8..c8056df662 100644
> --- a/src/backend/storage/page/bufpage.c
> +++ b/src/backend/storage/page/bufpage.c
> @@ -1475,6 +1475,8 @@ PageIndexTupleOverwrite(Page page, OffsetNumber offnum,
>  	tupid->lp_len = newsize;
> 
>  	/* Copy new tuple data onto page */
> +	if (newsize > sizeof(newtup))
> +		elog(PANIC, "out-of-bound access: %lu", newsize - sizeof(newtup));
>  	memcpy(PageGetItem(page, tupid), newtup, newsize);
> 
>  	return true;
>
> Here are the results after the ninja test command, measured with dir *.* /s:
>
> head:
> 30938 file(s)  2,735,487,655 bytes
>
> patched with v1:
> 16233 file(s)  1,540,519,373 bytes

Oops, now with v1 attached.

best regards,
Ranier Vilela
Attachment