Improving pg_dump performance when handling large numbers of LOBs
Hi,
We've inherited a series of legacy PG 12 clusters that each contain a
database that we need to migrate to a PG 15 cluster. Each database contains
about 150 million large objects totaling about 250GB. When using pg_dump
we've found that it takes a couple of weeks to dump out this much data.
We've tried using the jobs option with the directory format but that seems
to save each LOB separately which makes moving the resulting dump to
another location unwieldy. Has anyone else had to deal with dumping a
database with these many LOBs? Are there any suggestions for how to
improve performance?
Thanks,
Wyatt
On Mon, Feb 5, 2024 at 2:01 PM Wyatt Tellis <wyatt.tellis@gmail.com> wrote:
Hi,
We've inherited a series of legacy PG 12 clusters that each contain a
database that we need to migrate to a PG 15 cluster. Each database contains
about 150 million large objects totaling about 250GB.
250*10^9 / (150*10^6) = 1667 bytes. That's *tiny*.
Am I misunderstanding you?
Show quoted text
On 2/5/24 11:35 AM, Ron Johnson wrote:
On Mon, Feb 5, 2024 at 2:01 PM Wyatt Tellis <wyatt.tellis@gmail.com>
wrote:Hi,
We've inherited a series of legacy PG 12 clusters that each
contain a database that we need to migrate to a PG 15 cluster.
Each database contains about 150 million large objects totaling
about 250GB.250*10^9 / (150*10^6) = 1667 bytes. That's *tiny*.
Am I misunderstanding you?
I think it less about the size of the individual objects then the number
(150 million) of them.
AFAIK that can't be handled by COPY, therefore they have to be
transferred individually.
--
Adrian Klaver
adrian.klaver@aklaver.com
Yes, the LOBs themselves are tiny, but there are a lot of them (~150
million) which seem to be slowing down pg_dump. Note, we did not
design/build this system and agree that use of LOBs for this purpose was
not necessary.
Wyatt
On Mon, Feb 5, 2024 at 11:36 AM Ron Johnson <ronljohnsonjr@gmail.com> wrote:
Show quoted text
On Mon, Feb 5, 2024 at 2:01 PM Wyatt Tellis <wyatt.tellis@gmail.com>
wrote:Hi,
We've inherited a series of legacy PG 12 clusters that each contain a
database that we need to migrate to a PG 15 cluster. Each database contains
about 150 million large objects totaling about 250GB.250*10^9 / (150*10^6) = 1667 bytes. That's *tiny*.
Am I misunderstanding you?
På mandag 05. februar 2024 kl. 20:55:43, skrev Wyatt Tellis <
wyatt.tellis@gmail.com <mailto:wyatt.tellis@gmail.com>>:
Yes, the LOBs themselves are tiny, but there are a lot of them (~150 million)
which seem to be slowing down pg_dump. Note, we did not design/build this
system and agree that use of LOBs for this purpose was not necessary.
Well, the data is there nonetheless, is it an option to convert it to bytea
before migration?
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andreas@visena.com <mailto:andreas@visena.com>
www.visena.com <https://www.visena.com>
<https://www.visena.com>
No, we don't have the ability to make schema changes and the schema in the
PG15 copy needs to match what's in the PG 12 versions
Wyatt
On Mon, Feb 5, 2024 at 12:05 PM Andreas Joseph Krogh <andreas@visena.com>
wrote:
Show quoted text
På mandag 05. februar 2024 kl. 20:55:43, skrev Wyatt Tellis <
wyatt.tellis@gmail.com>:Yes, the LOBs themselves are tiny, but there are a lot of them (~150
million) which seem to be slowing down pg_dump. Note, we did not
design/build this system and agree that use of LOBs for this purpose was
not necessary.Well, the data is there nonetheless, is it an option to convert it to
bytea before migration?--
*Andreas Joseph Krogh*
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andreas@visena.com
www.visena.com
<https://www.visena.com>
Attachments:
Just checking... and I sympathize with your situation.
On Mon, Feb 5, 2024 at 2:56 PM Wyatt Tellis <wyatt.tellis@gmail.com> wrote:
Show quoted text
Yes, the LOBs themselves are tiny, but there are a lot of them (~150
million) which seem to be slowing down pg_dump. Note, we did not
design/build this system and agree that use of LOBs for this purpose was
not necessary.Wyatt
On Mon, Feb 5, 2024 at 11:36 AM Ron Johnson <ronljohnsonjr@gmail.com>
wrote:On Mon, Feb 5, 2024 at 2:01 PM Wyatt Tellis <wyatt.tellis@gmail.com>
wrote:Hi,
We've inherited a series of legacy PG 12 clusters that each contain a
database that we need to migrate to a PG 15 cluster. Each database contains
about 150 million large objects totaling about 250GB.250*10^9 / (150*10^6) = 1667 bytes. That's *tiny*.
Am I misunderstanding you?
På mandag 05. februar 2024 kl. 21:10:53, skrev Wyatt Tellis <
wyatt.tellis@gmail.com <mailto:wyatt.tellis@gmail.com>>:
No, we don't have the ability to make schema changes and the schema in the
PG15 copy needs to match what's in the PG 12 versions
Well then, I guess it boils down to how badly the ones in charge wants this
migration…
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andreas@visena.com <mailto:andreas@visena.com>
www.visena.com <https://www.visena.com>
<https://www.visena.com>
On Mon, Feb 05, 2024 at 09:17:53PM +0100, Andreas Joseph Krogh wrote:
På mandag 05. februar 2024 kl. 21:10:53, skrev Wyatt Tellis <
wyatt.tellis@gmail.com <mailto:wyatt.tellis@gmail.com>>:No, we don't have the ability to make schema changes and the schema in the
PG15 copy needs to match what's in the PG 12 versionsWell then, I guess it boils down to how badly the ones in charge wants this
migration…--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andreas@visena.com <mailto:andreas@visena.com>
www.visena.com <https://www.visena.com>
<https://www.visena.com>
Hi,
Can you use logical replication? Then you aren't under the gun to make
a fast copy.
Regards,
Ken
Wyatt Tellis <wyatt.tellis@gmail.com> writes:
Yes, the LOBs themselves are tiny, but there are a lot of them (~150
million) which seem to be slowing down pg_dump. Note, we did not
design/build this system and agree that use of LOBs for this purpose was
not necessary.
I don't know of anything much you can do at the user level. But there
is a patchset under development to improve pg_dump's behavior with
tons of blobs:
/messages/by-id/842242.1706287466@sss.pgh.pa.us
That's intended for v17, and I'm not sure how tough it might be to
adapt to v15 pg_dump, but maybe you could make that happen. In any
case, more eyeballs reviewing that patchset would be welcome.
regards, tom lane
On 2/5/24 12:32, Ken Marshall wrote:
On Mon, Feb 05, 2024 at 09:17:53PM +0100, Andreas Joseph Krogh wrote:
På mandag 05. februar 2024 kl. 21:10:53, skrev Wyatt Tellis <
wyatt.tellis@gmail.com <mailto:wyatt.tellis@gmail.com>>:No, we don't have the ability to make schema changes and the schema in the
PG15 copy needs to match what's in the PG 12 versionsWell then, I guess it boils down to how badly the ones in charge wants this
migration…--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andreas@visena.com <mailto:andreas@visena.com>
www.visena.com <https://www.visena.com>
<https://www.visena.com>Hi,
Can you use logical replication? Then you aren't under the gun to make
a fast copy.
https://www.postgresql.org/docs/current/logical-replication-restrictions.html
"Large objects (see Chapter 35) are not replicated. There is no
workaround for that, other than storing data in normal tables."
Regards,
Ken
--
Adrian Klaver
adrian.klaver@aklaver.com
Might it be worth a modest amount of time using some basic profiling to see
where the time is going? A week is a looonnngg time, even for 150e6
operations. For example, if there an unexpectedly high IO load, some
temporary M.2 storage might help?
On Tue, 6 Feb 2024, 01:36 Ron Johnson, <ronljohnsonjr@gmail.com> wrote:
Show quoted text
On Mon, Feb 5, 2024 at 2:01 PM Wyatt Tellis <wyatt.tellis@gmail.com>
wrote:Hi,
We've inherited a series of legacy PG 12 clusters that each contain a
database that we need to migrate to a PG 15 cluster. Each database contains
about 150 million large objects totaling about 250GB.250*10^9 / (150*10^6) = 1667 bytes. That's *tiny*.
Am I misunderstanding you?