Updating large postgresql database with blobs

Started by CAJ CAJ, almost 19 years ago, 4 messages
#1 CAJ CAJ
pguser@gmail.com

Hello,

I didn't get any response on the GENERAL list, so I'm escalating this ...

We have several independent database servers with ~50GB+ databases running
postgres 8.0.x. We are planning to upgrade these databases to postgres
8.2.x over the weekend.

We plan to use the following steps to upgrade each server,

1. Dump the 8.0.x database cluster using 8.2.x pg_dumpall
% ./pg_dumpall > pgdumpall_backup.sql

2. Dump the 8.0.x database, including large objects, in compressed custom
format using 8.2.x pg_dump
% ./pg_dump -Fc -b -Z9 dbname > pgdump_lobs_backup

Restoring the database
1. Initialize the 8.2.x database
% initdb -D /data/pgdata

2. Restore template1 database from cluster dump
% ./psql -d template1 < pgdumpall_backup.sql

3. Delete database dbname, otherwise the restore will give an error about the
existing dbname
% dropdb dbname

4. Create fresh dbname
% createdb -O dbowner dbname

5. Restore database with lobs
% ./pg_restore -v -Fc -d dbname -e -U dbowner < pgdump_lobs_backup

Some of the problems we have are:
1. We are not sure that all of the data will be available after a dump/restore
with the above process.
2. The dump and restore process is far too slow to complete over the
weekend (it takes approx. 1GB/hr to dump on a dual G5 PPC 2GHz with 1GB RAM
and RAID 1 disks).

What is the fastest way to upgrade postgres for large databases that have
binary objects?

Thanks for all your help.

#2 Andrew Dunstan
andrew@dunslane.net
In reply to: CAJ CAJ (#1)
Re: Updating large postgresql database with blobs

CAJ CAJ wrote:

Hello,

I didn't get any response on the GENERAL list, so I'm escalating this ...

We have several independent database servers with ~50GB+ databases
running postgres 8.0.x. We are planning to upgrade these databases to
postgres 8.2.x over the weekend.

We plan to use the following steps to upgrade each server,

1. Dump the 8.0.x database cluster using 8.2.x pg_dumpall
% ./pg_dumpall > pgdumpall_backup.sql

2. Dump the 8.0.x database, including large objects, in compressed
custom format using 8.2.x pg_dump
% ./pg_dump -Fc -b -Z9 dbname > pgdump_lobs_backup

Restoring the database
1. Initialize the 8.2.x database
% initdb -D /data/pgdata

2. Restore template1 database from cluster dump
% ./psql -d template1 < pgdumpall_backup.sql

3. Delete database dbname, otherwise the restore will give an error about
the existing dbname
% dropdb dbname

4. Create fresh dbname
% createdb -O dbowner dbname

5. Restore database with lobs
% ./pg_restore -v -Fc -d dbname -e -U dbowner < pgdump_lobs_backup

Some of the problems we have are:
1. We are not sure that all of the data will be available after a
dump/restore with the above process.
2. The dump and restore process is far too slow to complete over the
weekend (it takes approx. 1GB/hr to dump on a dual G5 PPC 2GHz with
1GB RAM and RAID 1 disks).

What is the fastest way to upgrade postgres for large databases that
have binary objects?

Thanks for all your help.

Your procedure dumps and restores the databases twice. This seems less
than sound. My prediction is that you could get a 50% speed improvement
by fixing that ...

The only thing you really need pg_dumpall for is the global tables. I
would just use pg_dumpall -g to get those, and then use pg_dump -F c +
pg_restore for each actual database.
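
Roughly, per database, it would look something like this (dbname, dbowner, and
the file names here are just placeholders):

1. Dump the globals only
% ./pg_dumpall -g > globals.sql

2. Dump each database once, in custom format (data and large objects together)
% ./pg_dump -Fc -Z9 dbname > dbname.dump

3. On the 8.2.x cluster, restore the globals and then each database
% ./psql -d template1 -f globals.sql
% ./createdb -O dbowner dbname
% ./pg_restore -v -d dbname -U dbowner dbname.dump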

Another thing is to make sure that pg_dump/pg_restore are not competing
with postgres for access to the same disk(s). One way to do that is to
run them from a different machine - they don't have to be run on the
server machine - of course then the network can become a bottleneck, so
YMMV.
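
For instance, both ends can be driven from a third box over the network
(the host names below are made up):

% ./pg_dump -h oldserver -U dbowner -Fc -Z9 dbname > dbname.dump
% ./pg_restore -h newserver -U dbowner -v -d dbname dbname.dump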

cheers

andrew

#3 CAJ CAJ
pguser@gmail.com
In reply to: Andrew Dunstan (#2)
Re: Updating large postgresql database with blobs

<snip>

What is the fastest way to upgrade postgres for large databases that
have binary objects?

Your procedure dumps and restores the databases twice. This seems less
than sound. My prediction is that you could get a 50% speed improvement
by fixing that ...

Thanks for the response. This'd be wonderful if I can get my process right.
My assumption (probably incorrect) is that pg_dump has to be executed twice
on a database with blobs: once to get the data and once to get the blobs
(using the -b flag).

The only thing you really need pg_dumpall for is the global tables. I
would just use pg_dumpall -g to get those, and then use pg_dump -F c +
pg_restore for each actual database.

This makes sense :) I assume that running pg_dump with -b will get all of
the data including the blobs?

Another thing is to make sure that pg_dump/pg_restore are not competing
with postgres for access to the same disk(s). One way to do that is to
run them from a different machine - they don't have to be run on the
server machine - of course then the network can become a bottleneck, so
YMMV.

We are using separate servers for dump and restore.

Thanks again for your suggestions. This helps immensely.

#4 Andrew Dunstan
andrew@dunslane.net
In reply to: CAJ CAJ (#3)
Re: Updating large postgresql database with blobs

CAJ CAJ wrote:

Thanks for the response. This'd be wonderful if I can get my process
right. My assumption (probably incorrect) is that pg_dump has to be
executed twice on a database with blobs: once to get the data and once
to get the blobs (using the -b flag).

Why do you assume that? The pg_dump manual says:

-b
--blobs

Include large objects in the dump. This is the default behavior
except when --schema, --table, or --schema-only is specified, so the
-b switch is only useful to add large objects to selective dumps.

IOW, in most cases you will just get the large objects for free, and
don't even need to use this flag to get them. No need to run twice.
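
So for a whole-database dump, something like this single pass should already
carry the large objects along (the file name is just a placeholder; -b would
only matter for selective dumps with --table or --schema):

% ./pg_dump -Fc -Z9 dbname > dbname.dump
% ./pg_restore -v -d dbname -U dbowner dbname.dump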

cheers

andrew