PG_RESTORE/DUMP Question
Hi,
I have a test system that is set up the same as a production system, and I
would like to frequently copy the database over.
pg_dump takes a few hours and sometimes even hangs.
Are there any reasons not to simply copy the entire data directory
over to the test system? I could not find any postings on the net
suggesting otherwise. Is there anything to pay attention to?
Thanks for any advice
Alex
Alex <alex@meerkatsoft.com> writes:
Hi,
I have a test system that is set up the same as a production system and
would like to frequently copy the database over.
pg_dump takes a few hours and sometimes even hangs. Are there any reasons
not to simply copy the entire data directory over to the test system? I
could not find any postings on the net suggesting otherwise. Is there
anything to pay attention to?
If the two systems are the same architecture and OS, this can work,
but in order to get a consistent copy, you need to either:
a) Stop (completely shut down) the source database while the copy
runs, or
b) Use volume management and take a snapshot of the source database,
then copy the snapshot over. This will lose open transactions but
will be otherwise consistent.
-Doug
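Option (a) might be sketched as a short script; the data directory path, the
host name, and the shutdown mode here are illustrative and will differ per
installation:

```shell
# Stop the source cluster so the files on disk are quiescent
pg_ctl -D /var/lib/pgsql/data stop -m fast

# Copy the whole data directory to the test host; -a preserves
# permissions, ownership and symlinks, which the cluster relies on
rsync -a /var/lib/pgsql/data/ testhost:/var/lib/pgsql/data/

# Restart the source cluster
pg_ctl -D /var/lib/pgsql/data start
```

Note that the copy must include the entire data directory (including pg_xlog
and pg_clog), not just the table files.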
Alex wrote:
Hi,
I have a test system that is set up the same as a production system and
would like to frequently copy the database over.
pg_dump takes a few hours and sometimes even hangs. Are there any reasons
not to simply copy the entire data directory over to the test system? I
could not find any postings on the net suggesting otherwise. Is there
anything to pay attention to?
Yes. Just shut down the production postmaster, then copy the entire data
directory over to the test system.
The two systems should be absolutely identical: same architecture,
preferably same OS, same PostgreSQL client and server versions, etc.
Or investigate some of the asynchronous replication systems. That would
save you some time but will affect production performance a bit.
HTH
Shridhar
Alex wrote:
Hi,
I have a test system that is set up the same as a production system and
would like to frequently copy the database over.
pg_dump takes a few hours and sometimes even hangs. Are there any reasons
not to simply copy the entire data directory over to the test system? I
could not find any postings on the net suggesting otherwise. Is there
anything to pay attention to? Thanks for any advice
Alex
Probably a point for debate, but sure, why not.
I would create the database in its own directory so as not to
mix things up on both machines, i.e. export PGDATA2=/usr/local/database
Then just make sure you stop postgres when copying from or to on each
machine.
If someone doesn't think this will work, I'd like to know.
One of my backup routines depends on this kind of procedure.
Of course I've got pg_dumps as well. :)
In the following situation:
You do a large transaction where lots of rows are updated.
All of your tables/indexes are cached in memory.
When are the updated rows written out to disk? When they are updated inside
the transaction, or when the transaction is completed?
On Wed, 29 Oct 2003, Rick Gigger wrote:
In the following situation:
You do a large transaction where lots of rows are updated.
All of your tables/indexes are cached in memory.
When are the updated rows written out to disk? When they are updated inside
the transaction, or when the transaction is completed?
The data is written out but not made real, so to speak, during each
update. I.e. the updates individually add all these rows. At the end of
the transaction, if we rollback, all the tuples that were written out are
simply not committed, and therefore the last version of that record
remains the last one in the chain.
If the transaction is committed then each tuple becomes the last in its
chain (could it be second to last because of other transactions? I'm not
sure.)
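The tuple-versioning behaviour described above can be observed through
PostgreSQL's hidden system columns; the table and values below are made up
purely for illustration:

```sql
-- Each UPDATE writes a new tuple version rather than overwriting in
-- place; xmin records the inserting transaction ID and xmax the
-- superseding one.
CREATE TABLE demo (id int PRIMARY KEY, val text);
INSERT INTO demo VALUES (1, 'old');

BEGIN;
UPDATE demo SET val = 'new' WHERE id = 1;
-- The new version already exists at this point, but it only becomes
-- the visible, committed version of the row if we COMMIT.
SELECT xmin, xmax, val FROM demo WHERE id = 1;
COMMIT;
```

On a rollback, the new tuple is simply never marked committed, and the
previous version remains the live one.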
In the following situation:
You do a large transaction where lots of rows are updated.
All of your tables/indexes are cached in memory.
When are the updated rows written out to disk? When they are updated
inside the transaction, or when the transaction is completed?
The data is written out but not made real, so to speak, during each
update. I.e. the updates individually add all these rows. At the end of
the transaction, if we rollback, all the tuples that were written out are
simply not committed, and therefore the last version of that record
remains the last one in the chain.
If the transaction is committed then each tuple becomes the last in its
chain (could it be second to last because of other transactions? I'm not
sure.)
I realize how committing the transaction works from the user's perspective; I
am thinking here about the internal implementation. For instance, if I do an
update inside a transaction, postgres could, in order to make sure data was
not lost, make sure that the data was flushed out to disk and fsynced. That
way it could tell me if there was a problem writing that data out to disk.
But if it is in the middle of a transaction, I would think that you could
update the tuples cached in memory and return, then start sending the tuples
out to disk in the background. When you issue the commit, of course,
everything would need to be flushed out to disk and fsynced, and any errors
with it could be reported before the transaction was finished, so it could
still be rolled back.
It seems like if I had to update say 39,000 rows, all with separate update
statements, it would be a lot faster if each update statement could just
update memory and then return, flushing out to disk in the background while
I continue processing the other updates. Maybe it does this already, or
maybe it is a bad idea for some reason. I don't understand the inner
workings of postgres well enough to say. That is why I'm asking.
Also, is there any way to issue a whole bunch of updates together like this
faster than just issuing 39,000 individual update statements?
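Two common ways to speed up a batch like this are to wrap all the updates in
a single transaction (one WAL flush at COMMIT instead of one per statement),
or to bulk-load the new values and apply them with one joined UPDATE. The
table names, file path, and values below are hypothetical:

```sql
-- One transaction: a single fsync at COMMIT instead of one per statement.
BEGIN;
UPDATE accounts SET balance = 100.00 WHERE id = 1;
UPDATE accounts SET balance = 250.00 WHERE id = 2;
-- ... thousands more ...
COMMIT;

-- Or bulk-load the new values and apply them in a single statement.
CREATE TEMP TABLE new_balances (id int, balance numeric);
COPY new_balances FROM '/tmp/new_balances.dat';  -- hypothetical data file
UPDATE accounts SET balance = n.balance
  FROM new_balances n
 WHERE accounts.id = n.id;
```

Server-side COPY reads the file with the server's permissions; psql's \copy
is the client-side alternative.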
Is it enough to just copy the global and the base directory ?
Is there any reason the db would not come up if the data is copied from
Solaris to Linux or vice versa, as long as the db version is the same?
Shridhar Daithankar wrote:
Alex wrote:
Hi,
I have a test system that is set up the same as a production system
and would like to frequently copy the database over.
pg_dump takes a few hours and sometimes even hangs. Are there any
reasons not to simply copy the entire data directory over to the test
system? I could not find any postings on the net suggesting otherwise.
Is there anything to pay attention to?
Yes. Just shut down the production postmaster, then copy the entire data
directory over to the test system.
The two systems should be absolutely identical: same architecture,
preferably same OS, same PostgreSQL client and server versions, etc.
Or investigate some of the asynchronous replication systems. That would
save you some time but will affect production performance a bit.
HTH
Shridhar
---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org
The data in the data directory is binary data and is not intended to work
even across different installations on the same machine. To copy the binary
data you'd need at least global, base, pg_xlog, pg_clog and a few others.
The only thing you might be able to skip is the subdirectories under base
belonging to databases you don't need. And even then you're risking a lot.
Between Linux and Solaris I'd expect various byte boundaries to move, so
forget transportability.
pg_dump is the only supported way of transporting data around.
On Sat, Nov 01, 2003 at 10:20:42PM +0900, Alex wrote:
Is it enough to just copy the global and the base directory?
Is there any reason the db would not come up if the data is copied from
Solaris to Linux or vice versa, as long as the db version is the same?
Shridhar Daithankar wrote:
Alex wrote:
Hi,
I have a test system that is set up the same as a production system
and would like to frequently copy the database over.
pg_dump takes a few hours and sometimes even hangs. Are there any
reasons not to simply copy the entire data directory over to the test
system? I could not find any postings on the net suggesting otherwise.
Is there anything to pay attention to?
Yes. Just shut down the production postmaster, then copy the entire data
directory over to the test system.
The two systems should be absolutely identical: same architecture,
preferably same OS, same PostgreSQL client and server versions, etc.
Or investigate some of the asynchronous replication systems. That would
save you some time but will affect production performance a bit.
HTH
Shridhar
--
Martijn van Oosterhout <kleptog@svana.org> http://svana.org/kleptog/
"All that is needed for the forces of evil to triumph is for enough good
men to do nothing." - Edmund Burke
"The penalty good people pay for not being interested in politics is to be
governed by people worse than themselves." - Plato