pg_dumpall

Started by Steve Clark · about 18 years ago · 12 messages · general
#1 Steve Clark
sclark@netwolves.com

Hello List,

the man page for pg_dump says:
pg_dump is a utility for backing up a PostgreSQL database. It makes
consistent backups even if the database is being used
concurrently.

does pg_dumpall make consistent backups if the database is being used
concurrently?
Even though the man page doesn't say it does.

Thanks,
Steve

#2 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Steve Clark (#1)
Re: pg_dumpall

Steve Clark <sclark@netwolves.com> writes:

does pg_dumpall make consistent backups if the database is being used
concurrently?
Even though the man page doesn't say it does.

That's intentional, because it doesn't. What you get is a pg_dump
snapshot of each database in sequence; those snapshots don't all
correspond to the same time instant. There isn't any good way to
guarantee time coherence of dumps across two databases.

regards, tom lane

#3 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Tom Lane (#2)
Re: pg_dumpall

Tom Lane wrote:

Steve Clark <sclark@netwolves.com> writes:

does pg_dumpall make consistent backups if the database is being used
concurrently?
Even though the man page doesn't say it does.

That's intentional, because it doesn't. What you get is a pg_dump
snapshot of each database in sequence; those snapshots don't all
correspond to the same time instant. There isn't any good way to
guarantee time coherence of dumps across two databases.

The fine point possibly being missed is that each database's dump
produced by pg_dumpall is, of course, self-consistent.

--
Alvaro Herrera http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.

#4 Glyn Astill
glynastill@yahoo.co.uk
In reply to: Tom Lane (#2)
Re: pg_dumpall

Out of interest, how does pg_dump manage to do a snapshot of a
database at an instant in time?

My mental picture of pg_dump was just a series of queries dumping out
the tables...

--- Tom Lane <tgl@sss.pgh.pa.us> wrote:

Steve Clark <sclark@netwolves.com> writes:

does pg_dumpall make consistent backups if the database is being used
concurrently?
Even though the man page doesn't say it does.

That's intentional, because it doesn't. What you get is a pg_dump
snapshot of each database in sequence; those snapshots don't all
correspond to the same time instant. There isn't any good way to
guarantee time coherence of dumps across two databases.

regards, tom lane


#5 Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Glyn Astill (#4)
Re: pg_dumpall

Glyn Astill wrote:

Out of interest, how does pg_dump manage to do a snapshot of a
database at an instant in time?

My mental picture of pg_dump was just a series of queries dumping out
the tables...

begin;
set transaction isolation level serializable;

-- ... dump stuff ...
commit;

--
Alvaro Herrera http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support

#6 Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#3)
Re: pg_dumpall

Alvaro Herrera <alvherre@commandprompt.com> writes:

Tom Lane wrote:

That's intentional, because it doesn't. What you get is a pg_dump
snapshot of each database in sequence; those snapshots don't all
correspond to the same time instant. There isn't any good way to
guarantee time coherence of dumps across two databases.

The fine point possibly being missed is that each database's dump
produced by pg_dumpall is, of course, self-consistent.

Right, but Steve already knew that.

Hmm ... it suddenly strikes me that Simon's "transaction snapshot
cloning" idea could provide a way to get exactly coherent dumps from
multiple databases in the same cluster. Maybe he already realized that,
but I didn't.

regards, tom lane

#7 Greg Smith
gsmith@gregsmith.com
In reply to: Tom Lane (#2)
Re: pg_dumpall

On Thu, 17 Jan 2008, Tom Lane wrote:

There isn't any good way to guarantee time coherence of dumps across two
databases.

Whether there's a good way depends on what you're already doing. If
you're going to the trouble of making a backup using PITR anyway, it's not
hard to stop applying new logs to that replica and dump from it to get a
point in time backup across all the databases. That's kind of painful now
because you have to start the server to run pg_dumpall, so resuming
recovery is difficult, but you can play filesystem tricks to make that
easier.
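
As a sketch of those filesystem tricks (every path, port, and the copy
method below are illustrative assumptions, not a tested recipe):

```shell
# Pause recovery and dump from a throwaway copy of the standby's data
# directory, so the real standby can resume applying WAL afterwards.
pg_ctl -D /srv/standby stop -m fast          # stop the recovering postmaster
cp -a /srv/standby /srv/dump_copy            # or take an LVM/ZFS snapshot
pg_ctl -D /srv/dump_copy -o "-p 5499" start  # throwaway server on a spare port
pg_dumpall -p 5499 > all_databases.sql       # point-in-time dump of every DB
pg_ctl -D /srv/dump_copy stop
rm -rf /srv/dump_copy                        # discard; restart the real standby
```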

--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD

#8 Glyn Astill
glynastill@yahoo.co.uk
In reply to: Alvaro Herrera (#5)
Re: pg_dumpall

Alvaro Herrera <alvherre@commandprompt.com> wrote:
Glyn Astill wrote:

Out of interest, how does pg_dump manage to do a snapshot of a
database at an instant in time?

My mental picture of pg_dump was just a series of queries dumping out
the tables...

begin;
set transaction isolation level serializable;

-- ... dump stuff ...
commit;

Wouldn't that just lock everything so nothing could be updated? Or
just the table it is outputting?

I'm guessing I need to go off and school myself on the different
isolation levels to understand, but say I have two tables, "sales"
and "sold", and users are selling items with inserts into the sales
table and a count updated manually in sold. Wouldn't these end up
inconsistent in the dump?


#9 Martijn van Oosterhout
kleptog@svana.org
In reply to: Glyn Astill (#8)
Re: pg_dumpall

On Thu, Jan 17, 2008 at 11:14:22AM -0800, Glyn Astill wrote:

begin;
set transaction isolation level serializable;

--- begin dumping stuff;

Wouldn't that just lock everything so nothing could be updated? Or
just the table it is outputting?

PostgreSQL uses MVCC, which means readers don't block writers: the dump
simply reads the row versions that were visible when its snapshot was
taken. It just requires more disk space, to keep the older versions
around.
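
A sketch using the hypothetical sales/sold tables from your example
(the table names and row counts are illustrative):

```sql
-- Session A: the dump's transaction (the snapshot is taken at the
-- first query, and every later query sees that same snapshot)
BEGIN;
SET TRANSACTION ISOLATION LEVEL SERIALIZABLE;
SELECT count(*) FROM sales;      -- say this sees 100 rows

-- Session B commits concurrently, without being blocked:
--   INSERT INTO sales ...; UPDATE sold SET total = total + 1; COMMIT;

SELECT total FROM sold;          -- still reflects 100 sales, not 101
COMMIT;
```

So sales and sold stay mutually consistent in the dump even while
other sessions write to both.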

Have a nice day,
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/

Those who make peaceful revolution impossible will make violent revolution inevitable.
-- John F Kennedy

#10 Erik Jones
erik@myemma.com
In reply to: Greg Smith (#7)
Re: pg_dumpall

On Jan 17, 2008, at 1:08 PM, Greg Smith wrote:

On Thu, 17 Jan 2008, Tom Lane wrote:

There isn't any good way to guarantee time coherence of dumps
across two databases.

Whether there's a good way depends on what you're already doing.
If you're going to the trouble of making a backup using PITR
anyway, it's not hard to stop applying new logs to that replica and
dump from it to get a point in time backup across all the
databases. That's kind of painful now because you have to start
the server to run pg_dumpall, so resuming recovery is difficult,
but you can play filesystem tricks to make that easier.

Actually, this exact scenario brings up a question I was thinking of
last night. If you stop a PITR standby server and bring it up to
dump from, will all of the database files have something written to
them at some point during the dump? I'd assume it would be
transactional information, if anything, but I'm not really sure of
the low-level details there.

Erik Jones

DBA | Emma®
erik@myemma.com
800.595.4401 or 615.292.5888
615.292.0777 (fax)

Emma helps organizations everywhere communicate & market in style.
Visit us online at http://www.myemma.com

#11 Steve Clark
sclark@netwolves.com
In reply to: Erik Jones (#10)
Re: pg_dumpall

Erik Jones wrote:

On Jan 17, 2008, at 1:08 PM, Greg Smith wrote:

On Thu, 17 Jan 2008, Tom Lane wrote:

There isn't any good way to guarantee time coherence of dumps
across two databases.

Whether there's a good way depends on what you're already doing.
If you're going to the trouble of making a backup using PITR
anyway, it's not hard to stop applying new logs to that replica and
dump from it to get a point in time backup across all the
databases. That's kind of painful now because you have to start
the server to run pg_dumpall, so resuming recovery is difficult,
but you can play filesystem tricks to make that easier.

Actually, this exact scenario brings up a question I was thinking of
last night. If you stop a PITR standby server and bring it up to
dump from, will all of the database files have something written to
them at some point during the dump? I'd assume it would be
transactional information, if anything, but I'm not really sure of
the low-level details there.


Thanks to everyone who replied to my query about pg_dumpall.

Now another question/issue - any time I use createdb the resulting db
ends up with UTF-8 encoding unless I use the -E switch. Is there a way
to make the default be sql_ascii? postgres version is 8.2.5

Thanks again
Steve

#12 Aarni Ruuhimäki
aarni@kymi.com
In reply to: Steve Clark (#11)
Re: pg_dumpall

On Friday 18 January 2008 14:38, Steve Clark wrote:

Thanks to everyone who replied to my query about pg_dumpall.

Now another question/issue - any time I use createdb the resulting db
ends up with UTF-8 encoding unless I use the -E switch. Is there a way
to make the default be sql_ascii? postgres version is 8.2.5

Thanks again
Steve

Hi Steve,

http://www.postgresql.org/docs/8.2/static/app-initdb.html
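
In short, the cluster-wide default encoding is fixed when the cluster is
created with initdb, so making sql_ascii the default means re-initdb'ing.
A sketch (the data directory path is illustrative; note SQL_ASCII does
no encoding validation at all):

```shell
# Set the cluster-wide default encoding at cluster creation time
initdb -D /var/lib/pgsql/data -E SQL_ASCII --locale=C

# After that, plain createdb inherits sql_ascii; individual databases
# can still override the default, as you're already doing:
createdb -E UTF8 utf8_db
```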

Best regards,

--
Aarni