Out of memory error handling in frontend code

Started by Frédéric Yhuelover 2 years ago3 messageshackers
Jump to latest
#1Frédéric Yhuel
frederic.yhuel@dalibo.com

Hello,

One of our customers recently complained that his pg_dump stopped
abruptly with the message "out of memory".

After some time, we understood that the 20 million of large objects were
responsible for the huge memory usage (more than 10 GB) by pg_dump.

I think a more useful error message would help for such cases. Indeed,
it's not always possible to ask the client to run pg_dump with
"valgrind --tool=massif" on its server.

Now, I understand that we don't want to add too much to the frontend
code, it would be a lot of pain for not much gain.

But I wonder if we could add some checks in a few strategic places, as
in the attached patch.

The idea would be to print a more useful error message in most of the
cases, and keep the basic "out of memory" for the remaining ones.

I haven't try to get the patch ready for review, I know that the format
of the messages isn't right, I'd like to know what do you think of the
idea, first.

Best regards,
Frédéric

Attachments:

0001-pg_dump-fix-OOM-handling.patchtext/x-patch; charset=UTF-8; name=0001-pg_dump-fix-OOM-handling.patchDownload+22-8
#2Daniel Gustafsson
daniel@yesql.se
In reply to: Frédéric Yhuel (#1)
Re: Out of memory error handling in frontend code

On 28 Sep 2023, at 10:14, Frédéric Yhuel <frederic.yhuel@dalibo.com> wrote:

After some time, we understood that the 20 million of large objects were responsible for the huge memory usage (more than 10 GB) by pg_dump.

This sounds like a known issue [0]/messages/by-id/7da8823d83a2b66bdd917aa6cb2c5c2619d86011.camel@credativ.de which has been reported several times, and
one we should get around to fixing sometime.

I think a more useful error message would help for such cases.

Knowing that this is case that pops up, I agree that we could do better around
the messaging here.

I haven't try to get the patch ready for review, I know that the format of the messages isn't right, I'd like to know what do you think of the idea, first.

I don't think adding more details is a bad idea, but it shouldn't require any
knowledge about internals so I think messages like the one below needs to be
reworded to be more helpful.

+	if (loinfo == NULL)
+	{
+		pg_fatal("getLOs: out of memory");
+	}

--
Daniel Gustafsson

[0]: /messages/by-id/7da8823d83a2b66bdd917aa6cb2c5c2619d86011.camel@credativ.de

#3Frédéric Yhuel
frederic.yhuel@dalibo.com
In reply to: Daniel Gustafsson (#2)
Re: Out of memory error handling in frontend code

Hi Daniel,

Thank you for your answer.

On 9/28/23 14:02, Daniel Gustafsson wrote:

On 28 Sep 2023, at 10:14, Frédéric Yhuel <frederic.yhuel@dalibo.com> wrote:

After some time, we understood that the 20 million of large objects were responsible for the huge memory usage (more than 10 GB) by pg_dump.

This sounds like a known issue [0] which has been reported several times, and
one we should get around to fixing sometime.

Indeed, I saw some of these reports afterwards :)

I think a more useful error message would help for such cases.

Knowing that this is case that pops up, I agree that we could do better around
the messaging here.

I haven't try to get the patch ready for review, I know that the format of the messages isn't right, I'd like to know what do you think of the idea, first.

I don't think adding more details is a bad idea, but it shouldn't require any
knowledge about internals so I think messages like the one below needs to be
reworded to be more helpful.

+	if (loinfo == NULL)
+	{
+		pg_fatal("getLOs: out of memory");
+	}

OK, here is a second version of the patch.

I didn't try to fix the path getLOs -> AssignDumpId -> catalogid_insert
-> [...] -> catalogid_allocate, but that's annoying because it amounts
to 11% of the memory allocations from valgrind's output.

Best regards,
Frédéric

Attachments:

0001-pg_dump-fix-OOM-handling_v2.patchtext/x-patch; charset=UTF-8; name=0001-pg_dump-fix-OOM-handling_v2.patchDownload+80-18