dealing with file size when archiving databases

Started by Andrew L. Gould · June 20, 2005 · 7 messages · general
#1 Andrew L. Gould
algould@datawok.com

I've been backing up my databases by piping pg_dump into gzip and
burning the resulting files to a DVD-R. Unfortunately, FreeBSD has
problems dealing with very large files (>1GB?) on DVD media. One of my
compressed database backups is greater than 1GB; and the result of a
gzipped pg_dumpall is approximately 3.5GB. The processes for creating
the iso image and burning the image to DVD-R finish without any
problems; but the resulting file is unreadable/unusable.

My proposed solution is to modify my python script to:

1. use pg_dump to dump each database's tables individually, including
both the database and table name in the file name;
2. use 'pg_dumpall -g' to dump the global information; and
3. burn the backup directories, files and a recovery script to DVD-R.

The script will pipe pg_dump into gzip to compress the files.
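
Roughly, the per-table loop would look something like this (just a sketch;
the database/table list and output directory below are placeholders, not my
real names):

    import gzip
    import subprocess

    # Placeholders for illustration -- the real script will build this list
    # by querying the system catalogs in each database.
    tables = [("mydb", "customers"), ("mydb", "orders")]
    backup_dir = "/backups/pgsql"

    for db, table in tables:
        outfile = "%s/%s.%s.sql.gz" % (backup_dir, db, table)
        dump = subprocess.Popen(["pg_dump", "-t", table, db],
                                stdout=subprocess.PIPE)
        out = gzip.open(outfile, "wb")
        for chunk in iter(lambda: dump.stdout.read(65536), b""):
            out.write(chunk)
        out.close()
        if dump.wait() != 0:
            raise RuntimeError("pg_dump failed for %s.%s" % (db, table))

    # Globals (roles, tablespaces) are not covered by pg_dump itself.
    with open("%s/globals.sql" % backup_dir, "wb") as f:
        subprocess.check_call(["pg_dumpall", "-g"], stdout=f)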

My questions are:

1. Will 'pg_dumpall -g' dump everything not dumped by pg_dump? Will I
be missing anything?
2. Does anyone foresee any problems with the solution above?

Thanks,

Andrew Gould

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Andrew L. Gould (#1)
Re: dealing with file size when archiving databases

"Andrew L. Gould" <algould@datawok.com> writes:

I've been backing up my databases by piping pg_dump into gzip and
burning the resulting files to a DVD-R. Unfortunately, FreeBSD has
problems dealing with very large files (>1GB?) on DVD media. One of my
compressed database backups is greater than 1GB; and the result of a
gzipped pg_dumpall is approximately 3.5GB. The processes for creating
the iso image and burning the image to DVD-R finish without any
problems; but the resulting file is unreadable/unusable.

Yech. However, I think you are reinventing the wheel in your proposed
solution. Why not just use split(1) to divide the output of pg_dump or
pg_dumpall into slices that the DVD software won't choke on? See
notes at
http://developer.postgresql.org/docs/postgres/backup.html#BACKUP-DUMP-LARGE

regards, tom lane

#3 Alvaro Herrera
alvherre@surnet.cl
In reply to: Andrew L. Gould (#1)
Re: dealing with file size when archiving databases

On Mon, Jun 20, 2005 at 09:28:51PM -0500, Andrew L. Gould wrote:

I've been backing up my databases by piping pg_dump into gzip and
burning the resulting files to a DVD-R. Unfortunately, FreeBSD has
problems dealing with very large files (>1GB?) on DVD media. One of my
compressed database backups is greater than 1GB; and the result of a
gzipped pg_dumpall is approximately 3.5GB. The processes for creating
the iso image and burning the image to DVD-R finish without any
problems; but the resulting file is unreadable/unusable.

Tom's response is certainly something to consider; also, note that if
you "pg_dump -t" each table separately, the dumps are not necessarily
consistent with one another, meaning that you could end up with an
unrecoverable backup if a transaction modifying two (foreign key-)
dependent tables happens to run after backing up one but before backing
up the other.

--
Alvaro Herrera (<alvherre[a]surnet.cl>)
"Las cosas son buenas o malas segun las hace nuestra opini�n" (Lisias)

#4 Andrew L. Gould
algould@datawok.com
In reply to: Tom Lane (#2)
Re: dealing with file size when archiving databases

On Monday 20 June 2005 09:53 pm, Tom Lane wrote:

"Andrew L. Gould" <algould@datawok.com> writes:

I've been backing up my databases by piping pg_dump into gzip and
burning the resulting files to a DVD-R. Unfortunately, FreeBSD has
problems dealing with very large files (>1GB?) on DVD media. One
of my compressed database backups is greater than 1GB; and the
result of a gzipped pg_dumpall is approximately 3.5GB. The
processes for creating the iso image and burning the image to DVD-R
finish without any problems; but the resulting file is
unreadable/unusable.

Yech. However, I think you are reinventing the wheel in your
proposed solution. Why not just use split(1) to divide the output of
pg_dump or pg_dumpall into slices that the DVD software won't choke
on? See notes at
http://developer.postgresql.org/docs/postgres/backup.html#BACKUP-DUMP-LARGE

regards, tom lane

Thanks, Tom! The split option fixes the problem, whereas my "solution"
only delays it until a table gets too large. Of course, at that point, I
should probably use something other than DVDs.
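
In the script, I'm thinking of something along these lines (just a sketch;
the database name, slice size and file prefix are placeholders):

    import subprocess

    # Equivalent of: pg_dump mydb | gzip | split -b 1024m - mydb.sql.gz.
    dump = subprocess.Popen(["pg_dump", "mydb"], stdout=subprocess.PIPE)
    gz = subprocess.Popen(["gzip"], stdin=dump.stdout, stdout=subprocess.PIPE)
    dump.stdout.close()   # let pg_dump see SIGPIPE if gzip exits early
    split = subprocess.Popen(["split", "-b", "1024m", "-", "mydb.sql.gz."],
                             stdin=gz.stdout)
    gz.stdout.close()
    split.wait()

    # Restore later with: cat mydb.sql.gz.* | gunzip | psql mydb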

Andrew Gould

#5 Andrew L. Gould
algould@datawok.com
In reply to: Alvaro Herrera (#3)
Re: dealing with file size when archiving databases

On Monday 20 June 2005 10:14 pm, Alvaro Herrera wrote:

On Mon, Jun 20, 2005 at 09:28:51PM -0500, Andrew L. Gould wrote:

I've been backing up my databases by piping pg_dump into gzip and
burning the resulting files to a DVD-R. Unfortunately, FreeBSD has
problems dealing with very large files (>1GB?) on DVD media. One
of my compressed database backups is greater than 1GB; and the
result of a gzipped pg_dumpall is approximately 3.5GB. The
processes for creating the iso image and burning the image to DVD-R
finish without any problems; but the resulting file is
unreadable/unusable.

Tom's response is certainly something to consider; also, note that if
you "pg_dump -t" each table separately, the dumps are not necessarily
consistent with one another, meaning that you could end up with an
unrecoverable backup if a transaction modifying two (foreign key-)
dependent tables happens to run after backing up one but before
backing up the other.

Ouch! I hadn't thought of that. (Isn't it wonderful that I've only had
to restore dumps for upgrades?!)

Thanks.

Andrew Gould

#6 Tino Wildenhain
tino@wildenhain.de
In reply to: Andrew L. Gould (#1)
Re: dealing with file size when archiving databases

On Monday, 20.06.2005 at 21:28 -0500, Andrew L. Gould wrote:

I've been backing up my databases by piping pg_dump into gzip and
burning the resulting files to a DVD-R. Unfortunately, FreeBSD has
problems dealing with very large files (>1GB?) on DVD media. One of my
compressed database backups is greater than 1GB; and the result of a
gzipped pg_dumpall is approximately 3.5GB. The processes for creating
the iso image and burning the image to DVD-R finish without any
problems; but the resulting file is unreadable/unusable.

My proposed solution is to modify my python script to:

1. use pg_dump to dump each database's tables individually, including
both the database and table name in the file name;
2. use 'pg_dumpall -g' to dump the global information; and
3. burn the backup directories, files and a recovery script to DVD-R.

The script will pipe pg_dump into gzip to compress the files.

I'd use pg_dump -Fc instead. It is compressed, and you get some more
restore options for free (selective restore, for example).
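
For example (database and table names here are only placeholders):

    import subprocess

    # Custom-format dump: compressed by default, and pg_restore can later
    # pull individual tables back out of the archive.
    subprocess.check_call(["pg_dump", "-Fc", "-f", "mydb.dump", "mydb"])

    # Selective restore of a single table from that archive:
    subprocess.check_call(["pg_restore", "-d", "mydb", "-t", "some_table",
                           "mydb.dump"])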

My questions are:

1. Will 'pg_dumpall -g' dump everything not dumped by pg_dump? Will I
be missing anything?
2. Does anyone foresee any problems with the solution above?

Yes, an individual dump file might still be too big to fit on one DVD.

#7 Vick Khera
vivek@khera.org
In reply to: Andrew L. Gould (#1)
Re: dealing with file size when archiving databases

On Jun 20, 2005, at 10:28 PM, Andrew L. Gould wrote:

compressed database backups is greater than 1GB; and the result of a
gzipped pg_dumpall is approximately 3.5GB. The processes for creating
the iso image and burning the image to DVD-R finish without any
problems; but the resulting file is unreadable/unusable.

I ran into this as well. Apparently FreeBSD will not read a large
file on an ISO file system even though on a standard UFS or UFS2 fs
it will read files larger than you can make :-).

What I used to do was "split -b 1024m my.dump my.dump-split-" to
create multiple files and burn those to the DVD. To restore, you
"cat my.dump.split.?? | pg_restore" with appropriate options to
pg_restore.

My ultimate fix was to start burning and reading the DVDs on my
MacOS desktop instead, which can read/write these large files just
fine :-)

Vivek Khera, Ph.D.
+1-301-869-4449 x806