Ways to speed up dump&reload

Started by Markus Wollnyabout 21 years ago10 messagesgeneral
Jump to latest
#1Markus Wollny
Markus.Wollny@computec.de

Hello!

Finally the time has come for us to upgrade our PostgreSQL 7.4.5-servers to 8.0.1 - and though I'm very much looking forward to some of the new features, the dump&reload process is worrying me a bit this time. I've got one cluster in particular which is roughly 9GB in size and features some dreadfully large indices of the GiST-type for Tsearch2.

I have already scheduled a nightshift for this upgrade, but I'm not all too certain that I'll be up and running again in the morning, so I wondered if there might be some preparations that would allow for some speed-up during the reload process? What kind of tuning could be done in postgresql.conf in respect to just this particular workload (COPY and finally the CREATE INDEX stuff) with no other concurrent access going on?

The machine in question features a RAID10 disk array which hosts the DB-cluster, some 2GB RAM and four processors and I've tuned postgresql.conf for our ordinary everyday workload with lots and lots of concurrent reads and writes, but I've got the feeling that this configuration might not be the optimum for dump&reload, so it might make some sense to use some sort of "maintenance"-configuration. What kind of tuning would you recommend?

Thank you very much for your advice and lots of cheers to the developers for 8.0!

Kind regards

Markus

#2Csaba Nagy
nagy@ecircle-ag.com
In reply to: Markus Wollny (#1)
Re: Ways to speed up dump&reload

Hi Markus,

Have you considered installing slony and replicate from the old version
to the new one ? It can do that, and when the replica is up to date, the
switchover will take minutes I guess.
Note that I have never done that so please ask others about that too, or
research for yourself ;-)

Cheers,
Csaba.

Show quoted text

On Mon, 2005-02-21 at 07:11, Markus Wollny wrote:

Hello!

Finally the time has come for us to upgrade our PostgreSQL 7.4.5-servers to 8.0.1 - and though I'm very much looking forward to some of the new features, the dump&reload process is worrying me a bit this time. I've got one cluster in particular which is roughly 9GB in size and features some dreadfully large indices of the GiST-type for Tsearch2.

I have already scheduled a nightshift for this upgrade, but I'm not all too certain that I'll be up and running again in the morning, so I wondered if there might be some preparations that would allow for some speed-up during the reload process? What kind of tuning could be done in postgresql.conf in respect to just this particular workload (COPY and finally the CREATE INDEX stuff) with no other concurrent access going on?

The machine in question features a RAID10 disk array which hosts the DB-cluster, some 2GB RAM and four processors and I've tuned postgresql.conf for our ordinary everyday workload with lots and lots of concurrent reads and writes, but I've got the feeling that this configuration might not be the optimum for dump&reload, so it might make some sense to use some sort of "maintenance"-configuration. What kind of tuning would you recommend?

Thank you very much for your advice and lots of cheers to the developers for 8.0!

Kind regards

Markus

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

#3Oleg Bartunov
oleg@sai.msu.su
In reply to: Markus Wollny (#1)
Re: Ways to speed up dump&reload

Markus,

have you read updated http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/tsearch-V2-intro.html
about regprocedure_7.4.patch.gz and regprocedure_update.sql ?

Oleg
On Mon, 21 Feb 2005, Markus Wollny wrote:

Hello!

Finally the time has come for us to upgrade our PostgreSQL 7.4.5-servers to 8.0.1 - and though I'm very much looking forward to some of the new features, the dump&reload process is worrying me a bit this time. I've got one cluster in particular which is roughly 9GB in size and features some dreadfully large indices of the GiST-type for Tsearch2.

I have already scheduled a nightshift for this upgrade, but I'm not all too certain that I'll be up and running again in the morning, so I wondered if there might be some preparations that would allow for some speed-up during the reload process? What kind of tuning could be done in postgresql.conf in respect to just this particular workload (COPY and finally the CREATE INDEX stuff) with no other concurrent access going on?

The machine in question features a RAID10 disk array which hosts the DB-cluster, some 2GB RAM and four processors and I've tuned postgresql.conf for our ordinary everyday workload with lots and lots of concurrent reads and writes, but I've got the feeling that this configuration might not be the optimum for dump&reload, so it might make some sense to use some sort of "maintenance"-configuration. What kind of tuning would you recommend?

Thank you very much for your advice and lots of cheers to the developers for 8.0!

Kind regards

Markus

---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?

http://www.postgresql.org/docs/faq

Regards,
Oleg
_____________________________________________________________
Oleg Bartunov, sci.researcher, hostmaster of AstroNet,
Sternberg Astronomical Institute, Moscow University (Russia)
Internet: oleg@sai.msu.su, http://www.sai.msu.su/~megera/
phone: +007(095)939-16-83, +007(095)939-23-83

#4Magnus Hagander
magnus@hagander.net
In reply to: Oleg Bartunov (#3)
Re: Ways to speed up dump&reload

Hello!

Finally the time has come for us to upgrade our PostgreSQL
7.4.5-servers to 8.0.1 - and though I'm very much looking
forward to some of the new features, the dump&reload process
is worrying me a bit this time. I've got one cluster in
particular which is roughly 9GB in size and features some
dreadfully large indices of the GiST-type for Tsearch2.

I have already scheduled a nightshift for this upgrade, but
I'm not all too certain that I'll be up and running again in
the morning, so I wondered if there might be some
preparations that would allow for some speed-up during the
reload process? What kind of tuning could be done in
postgresql.conf in respect to just this particular workload
(COPY and finally the CREATE INDEX stuff) with no other
concurrent access going on?

If you'll be around to babysit the system during the reload, turn off
fsync during the load.
If you have the RAM, consider temporarily moving the WAL logs to a
ramdrive.
Increase sort_mem and maintenance_sort_mem, since you will only have a
single connection.
If you're on hyperthreading CPUs, disable hyperthreading. Since you load
in a single connectino, only one CPU will be used.

Be very careful to remember to turn this back on after the load!

Not sure if there is something specific you can do for tsearch, but this
should help with the general stuff

//Magnus

#5Chris Browne
cbbrowne@acm.org
In reply to: Markus Wollny (#1)
Re: Ways to speed up dump&reload

Quoth Markus.Wollny@computec.de ("Markus Wollny"):

Finally the time has come for us to upgrade our PostgreSQL
7.4.5-servers to 8.0.1 - and though I'm very much looking forward to
some of the new features, the dump&reload process is worrying me a
bit this time. I've got one cluster in particular which is roughly
9GB in size and features some dreadfully large indices of the
GiST-type for Tsearch2.

I have already scheduled a nightshift for this upgrade, but I'm not
all too certain that I'll be up and running again in the morning, so
I wondered if there might be some preparations that would allow for
some speed-up during the reload process? What kind of tuning could
be done in postgresql.conf in respect to just this particular
workload (COPY and finally the CREATE INDEX stuff) with no other
concurrent access going on?

You might consider using Slony-I to minimize the downtime.

I haven't done an upgrade using, but know that others have, and I've
done the same using eRServer...

The idea is that you set up an 8.0.1 backend, and set up replication
well ahead of time. Replicate from the 7.4 system to the 8.0 one.

It might take several days to get replication up to date if the
databases are large enough, but once they are relatively in sync, they
should stay there pretty easily.

Switching versions is then as easy as using MOVE SET to switch the
origin from the 7.4 system to the 8.0 one. That ought to take mere
seconds, so you wouldn't need a long time to do the upgrade. See the
URL below for more information...
--
output = reverse("moc.liamg" "@" "enworbbc")
http://linuxdatabases.info/info/versionupgrade.html
"Did you ever walk in a room and forget why you walked in? I think
that's how dogs spend their lives." -- Sue Murphy

#6Markus Wollny
Markus.Wollny@computec.de
In reply to: Chris Browne (#5)
Re: Ways to speed up dump&reload

Hello Oleg,

Yes, thanks - I have installed this patch as soon as it became available. Concerning my largest database, I haven't yet migrated that to tsearch2 - it's still tsearch1-based, so I intend to drop ts1 before the last dump, reload the database with PostgreSQL 8.0.1 and deploy tsearch2 after that. I was thinking more in terms of what would be the optimum settings for things like workmem during reload and creation of the indices - as there's not much else going on at this point of time, it might be worthwhile to use some settings which allocate all available resources to just this task.

Kind regards

Markus

-----Original Message-----
From: Oleg Bartunov [mailto:oleg@sai.msu.su]
Sent: Mon 2/21/2005 11:15
To: Markus Wollny
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Ways to speed up dump&reload
Markus,

have you read updated http://www.sai.msu.su/~megera/postgres/gist/tsearch/V2/docs/tsearch-V2-intro.html
about regprocedure_7.4.patch.gz and regprocedure_update.sql ?

Oleg

#7Markus Wollny
Markus.Wollny@computec.de
In reply to: Markus Wollny (#6)
Re: Ways to speed up dump&reload

Hi!

Thanks very much, this was exactly the kind of advice I was hoping for! I'll give the WAL to ramdisk thing a try; fsync is off by default anyways - the data in this cluster is not so valuable as not to risk one day of rollback for the performance gain of having fsync turned off and I'm doing nightly dumps anyway AND the machine as both a dedicated UPS and redundant PSUs, so the likelyhood of things turning snafu is considerably small. But having WAL in RAM might actually speed up the whole lot quite considerably.

Thank you!

Kind regards

Markus

-----Original Message-----
From: Magnus Hagander [mailto:mha@sollentuna.net]
Sent: Mon 2/21/2005 13:04
To: Markus Wollny; pgsql-general@postgresql.org
Cc:
Subject: RE: [GENERAL] Ways to speed up dump&reload

Hello!

Finally the time has come for us to upgrade our PostgreSQL
7.4.5-servers to 8.0.1 - and though I'm very much looking
forward to some of the new features, the dump&reload process
is worrying me a bit this time. I've got one cluster in
particular which is roughly 9GB in size and features some
dreadfully large indices of the GiST-type for Tsearch2.

I have already scheduled a nightshift for this upgrade, but
I'm not all too certain that I'll be up and running again in
the morning, so I wondered if there might be some
preparations that would allow for some speed-up during the
reload process? What kind of tuning could be done in
postgresql.conf in respect to just this particular workload
(COPY and finally the CREATE INDEX stuff) with no other
concurrent access going on?

If you'll be around to babysit the system during the reload, turn off
fsync during the load.
If you have the RAM, consider temporarily moving the WAL logs to a
ramdrive.
Increase sort_mem and maintenance_sort_mem, since you will only have a
single connection.
If you're on hyperthreading CPUs, disable hyperthreading. Since you load
in a single connectino, only one CPU will be used.

Be very careful to remember to turn this back on after the load!

Not sure if there is something specific you can do for tsearch, but this
should help with the general stuff

//Magnus

#8Chris Browne
cbbrowne@acm.org
In reply to: Markus Wollny (#1)
Re: Ways to speed up dump&reload

You might consider using Slony-I to minimize the downtime.

I haven't done an upgrade using, but know that others have, and I've
done the same using eRServer...

The idea is that you set up an 8.0.1 backend, and set up replication
well ahead of time. Replicate from the 7.4 system to the 8.0 one.

It might take several days to get replication up to date if the
databases are large enough, but once they are relatively in sync, they
should stay there pretty easily.

Switching versions is then as easy as using MOVE SET to switch the
origin from the 7.4 system to the 8.0 one. That ou
--
output = reverse("moc.liamg" "@" "enworbbc")
http://linuxdatabases.info/info/versionupgrade.html
The quickest way to a man's heart is through his chest, with an axe.

#9Markus Wollny
Markus.Wollny@computec.de
In reply to: Chris Browne (#8)
Re: Ways to speed up dump&reload

Hi!

Thanks, I'll try Slony-I next time - I currently lack the time to test it beforehand on my non-production system; I'd be a bit worried about the additional load a second PG-instance as replication slave would impose on the machine, so at the moment I'm more confident going along the same tracks I've gone before during the previous switches since 7.1.

Kind regards

Markus

#10Vivek Khera
khera@yertle.int.kciLink.com
In reply to: Markus Wollny (#1)
Re: Ways to speed up dump&reload

max out your checkpoint_segments. 128 or 256 is good if you have
plenty of spare space for the pg_xlog directory. you will also want
to increase checkpoint_timeout to something large. I like 900 seconds
personally. if you can put pg_xlog on a separate physical RAID you're
best off.

increase the amount of ram your index operations can use (sort_mem or
work_mem depending on PG version) to a very large number and then
reset that to a normal value when you're done.

then just let pg_dump + pg_restore do its work.

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Vivek Khera, Ph.D. Khera Communications, Inc.
Internet: khera@kciLink.com Rockville, MD +1-301-869-4449 x806
AIM: vivekkhera Y!: vivek_khera http://www.khera.org/~vivek/