Improve cleaning files on Postgres crashes

Started by Ranier Vilela11 months ago5 messages
#1Ranier Vilela
ranier.vf@gmail.com
1 attachment(s)

Hi.

There are some reports that Postgres does not handle correctly cleaning the
files used when it crashes. [1]/messages/by-id/CAE9k0Pno=Mns7J5HA4+bbXzb=yCZnCtSF_wf1ZipCQxardKDjA@mail.gmail.com

I think that function *fcloseall* can help a little bit.
Mainly on Windows.

/*
* Closes all of the calling process's open streams.
* On Windows, close and remove all temporary files created by tmpfile
function.
*/

patch attached.

best regards,
Ranier Vilela

[1]: /messages/by-id/CAE9k0Pno=Mns7J5HA4+bbXzb=yCZnCtSF_wf1ZipCQxardKDjA@mail.gmail.com
/messages/by-id/CAE9k0Pno=Mns7J5HA4+bbXzb=yCZnCtSF_wf1ZipCQxardKDjA@mail.gmail.com

Attachments:

improve-cleaning-on-fatal-and-panic-elog.patchapplication/octet-stream; name=improve-cleaning-on-fatal-and-panic-elog.patchDownload
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 860bbd40d4..a67938ac44 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -584,6 +584,12 @@ errfinish(const char *filename, int lineno, const char *funcname)
 		 * worthy of panic, depending on which subprocess returns it.
 		 */
 		proc_exit(1);
+
+		/*
+		 * Closes all of the calling process's open streams.
+		 * On Windows, close and remove all temporary files created by tmpfile function.
+		 */
+		fcloseall();
 	}
 
 	if (elevel >= PANIC)
@@ -596,6 +602,13 @@ errfinish(const char *filename, int lineno, const char *funcname)
 		 * children...
 		 */
 		fflush(NULL);
+
+		/*
+		 * Closes all of the calling process's open streams.
+		 * On Windows, close and remove all temporary files created by tmpfile function.
+		 */
+		fcloseall();
+
 		abort();
 	}
 
#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ranier Vilela (#1)
Re: Improve cleaning files on Postgres crashes

Ranier Vilela <ranier.vf@gmail.com> writes:

There are some reports that Postgres does not handle correctly cleaning the
files used when it crashes. [1]
I think that function *fcloseall* can help a little bit.
Mainly on Windows.

I doubt that this is a good thing to try to do during a panic exit.
In the first place, we don't know to what extent the process's
internal data structures may be corrupted, possibly causing
fcloseall itself to malfunction. In the second place, we might
wish to have a look at those temp files for debugging purposes.

I won't bother to address such points as whether fcloseall exists
everywhere or whether you put the calls in sane places.

regards, tom lane

#3Ranier Vilela
ranier.vf@gmail.com
In reply to: Tom Lane (#2)
Re: Improve cleaning files on Postgres crashes

Em ter., 18 de fev. de 2025 13:21, Tom Lane <tgl@sss.pgh.pa.us> escreveu:

Ranier Vilela <ranier.vf@gmail.com> writes:

There are some reports that Postgres does not handle correctly cleaning

the

files used when it crashes. [1]
I think that function *fcloseall* can help a little bit.
Mainly on Windows.

Thanks for answear Tom.

I doubt that this is a good thing to try to do during a panic exit.

In the first place, we don't know to what extent the process's
internal data structures may be corrupted, possibly causing
fcloseall itself to malfunction.

Well, I put after fflush(NULL), so if data structures are corrupted, fflush
will fail in any way.

In the second place, we might

wish to have a look at those temp files for debugging purposes.

On dev/debug are great, but on production, where disk spaces cost money, I
think that a good idea.

Maybe guarded by ifdefs?

Best regards,
Ranier Vilela

#4Ranier Vilela
ranier.vf@gmail.com
In reply to: Ranier Vilela (#3)
Re: Improve cleaning files on Postgres crashes

Em ter., 18 de fev. de 2025 às 13:29, Ranier Vilela <ranier.vf@gmail.com>
escreveu:

Em ter., 18 de fev. de 2025 13:21, Tom Lane <tgl@sss.pgh.pa.us> escreveu:

Ranier Vilela <ranier.vf@gmail.com> writes:

There are some reports that Postgres does not handle correctly cleaning

the

files used when it crashes. [1]
I think that function *fcloseall* can help a little bit.
Mainly on Windows.

Thanks for answear Tom.

I doubt that this is a good thing to try to do during a panic exit.

In the first place, we don't know to what extent the process's
internal data structures may be corrupted, possibly causing
fcloseall itself to malfunction.

Well, I put after fflush(NULL), so if data structures are corrupted,
fflush will fail in any way.

In the second place, we might

wish to have a look at those temp files for debugging purposes.

On dev/debug are great, but on production, where disk spaces cost money, I
think that a good idea.

Maybe guarded by ifdefs?

I did a dirty test on Windows 64 bits.

I added a PANIC, in bufpage.c:

diff --git a/src/backend/storage/page/bufpage.c
b/src/backend/storage/page/bufpage.c
index 91da73dda8..c8056df662 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -1475.6 +1475.8 @@ PageIndexTupleOverwrite(Page page, OffsetNumber
offnum,
 tupid->lp_len = newsize;
 /* Copy new tuple data onto page */
+ if (newsize > sizeof(newtup))
+ elog(PANIC, "out-of-bound access: %lu", newsize - sizeof(newtup));
 memcpy(PageGetItem(page, tupid), newtup, newsize);

return true;

Here the results, after ninja test command:
dir *.* /s

head:
30938 arquivo(s) 2.735.487.655 bytes

patched with v1:
16233 arquivo(s) 1.540.519.373 bytes

best regards,
Ranier Vilela

#5Ranier Vilela
ranier.vf@gmail.com
In reply to: Ranier Vilela (#4)
1 attachment(s)
Re: Improve cleaning files on Postgres crashes

Em qua., 19 de fev. de 2025 às 14:48, Ranier Vilela <ranier.vf@gmail.com>
escreveu:

Em ter., 18 de fev. de 2025 às 13:29, Ranier Vilela <ranier.vf@gmail.com>
escreveu:

Em ter., 18 de fev. de 2025 13:21, Tom Lane <tgl@sss.pgh.pa.us> escreveu:

Ranier Vilela <ranier.vf@gmail.com> writes:

There are some reports that Postgres does not handle correctly

cleaning the

files used when it crashes. [1]
I think that function *fcloseall* can help a little bit.
Mainly on Windows.

Thanks for answear Tom.

I doubt that this is a good thing to try to do during a panic exit.

In the first place, we don't know to what extent the process's
internal data structures may be corrupted, possibly causing
fcloseall itself to malfunction.

Well, I put after fflush(NULL), so if data structures are corrupted,
fflush will fail in any way.

In the second place, we might

wish to have a look at those temp files for debugging purposes.

On dev/debug are great, but on production, where disk spaces cost money,
I think that a good idea.

Maybe guarded by ifdefs?

I did a dirty test on Windows 64 bits.

I added a PANIC, in bufpage.c:

diff --git a/src/backend/storage/page/bufpage.c
b/src/backend/storage/page/bufpage.c
index 91da73dda8..c8056df662 100644
--- a/src/backend/storage/page/bufpage.c
+++ b/src/backend/storage/page/bufpage.c
@@ -1475.6 +1475.8 @@ PageIndexTupleOverwrite(Page page, OffsetNumber
offnum,
tupid->lp_len = newsize;
/* Copy new tuple data onto page */
+ if (newsize > sizeof(newtup))
+ elog(PANIC, "out-of-bound access: %lu", newsize - sizeof(newtup));
memcpy(PageGetItem(page, tupid), newtup, newsize);

return true;

Here the results, after ninja test command:
dir *.* /s

head:
30938 arquivo(s) 2.735.487.655 bytes

patched with v1:
16233 arquivo(s) 1.540.519.373 bytes

ops, now with v1 attached.

best regards,
Ranier Vilela

Attachments:

v1-improve-cleaning-on-fatal-and-panic-elog.patchapplication/octet-stream; name=v1-improve-cleaning-on-fatal-and-panic-elog.patchDownload
diff --git a/src/backend/utils/error/elog.c b/src/backend/utils/error/elog.c
index 860bbd40d4..7f6e64e78f 100644
--- a/src/backend/utils/error/elog.c
+++ b/src/backend/utils/error/elog.c
@@ -584,6 +584,14 @@ errfinish(const char *filename, int lineno, const char *funcname)
 		 * worthy of panic, depending on which subprocess returns it.
 		 */
 		proc_exit(1);
+
+		/*
+		 * Closes all of the calling process's open streams.
+		 * On Windows, close and remove all temporary files created by tmpfile function.
+		 */
+		#ifdef NDEBUG
+		fcloseall();
+		#endif
 	}
 
 	if (elevel >= PANIC)
@@ -596,6 +604,15 @@ errfinish(const char *filename, int lineno, const char *funcname)
 		 * children...
 		 */
 		fflush(NULL);
+
+		/*
+		 * Closes all of the calling process's open streams.
+		 * On Windows, close and remove all temporary files created by tmpfile function.
+		 */
+		#ifdef NDEBUG
+		fcloseall();
+		#endif
+
 		abort();
 	}