8.3 / 8.2.6 restore comparison
Hello,
I have been testing a migration for a week now, trying to get it into a
reasonable state. This is what we have:
Restore file: 220GB
8.2.6 and 8.3.0 are configured identically:
shared_buffers = 8000MB
work_mem = 32MB
maintenance_work_mem = 512MB
fsync = off
full_page_writes = off
checkpoint_segments = 300
synchronous_commit = off (8.3)
wal_writer_delay = off (8.3)
autovacuum = off
8.2.6 after 2 hours has restored 41GB.
8.3.0 after 2.5 hours had restored 38GB.
Originally I was thinking that 8.2.6 was stomping 8.3. However, I now
suspect the reduction in tuple header size in 8.3 means that although I
restored only 38GB, it is actually *more* data than 8.2.6's 41GB. Does
that seem accurate to everyone else? If so, what can we do to speed this
up? We are certainly *not* saturating the disks (16 SCSI spindles).
I am thinking the way we are going to need to do this is to have an
extended outage and write a custom script to do a concurrent dump and
load. (No, in this case Slony is not an option.)
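For what it's worth, the concurrent dump-and-load idea can be sketched as one pg_dump | psql pipeline per table, driven from a small worker pool. This is only an illustrative sketch, not the actual script: the connection strings, table list, and worker count below are all hypothetical, and schema and index creation would have to be handled separately around it.

```python
import shlex
import subprocess
from concurrent.futures import ThreadPoolExecutor

# Hypothetical connection settings; substitute the real source and target.
SRC = "-h oldbox -p 5432 prod"
DST = "-h newbox -p 5433 prod"

def copy_table_cmd(table):
    """Build a 'pg_dump | psql' pipeline moving one table's data."""
    return "pg_dump %s --data-only -t %s | psql %s" % (
        SRC, shlex.quote(table), DST)

def concurrent_load(tables, workers=4, run=subprocess.call):
    """Run one pipeline per table, `workers` at a time; returns exit codes."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # shell=True so the pipe is honoured by /bin/sh.
        return list(pool.map(lambda cmd: run(cmd, shell=True),
                             [copy_table_cmd(t) for t in tables]))
```

With per-table pipelines, several COPY streams parse input at once, so more than one CPU does the work instead of a single backend.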
Sincerely,
Joshua D. Drake
--
The PostgreSQL Company since 1997: http://www.commandprompt.com/
PostgreSQL Community Conference: http://www.postgresqlconference.org/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate
PostgreSQL SPI Liaison | SPI Director | PostgreSQL political pundit
Joshua D. Drake wrote:
I just tested a ~110GB load. On our modest backup server, 8.2 yesterday
did the data load (i.e. the COPY steps) in 1h57m. Today, 8.3 on
identical data and settings took 1h42m. Relation size is down by about
10% too, which is very nice, and probably accounts for the load time
improvement.
cheers
andrew
On Thu, 07 Feb 2008 09:47:08 -0500
Andrew Dunstan <andrew@dunslane.net> wrote:
Ergghh, o.k. I am definitely missing something in the environment. By
your numbers I should be well over 100GB restored at 2.5 hours. I am
not; I am only 38GB in.
What type of I/O do you have on that machine? What type of CPU and RAM?
Sincerely,
Joshua D. Drake
* Joshua D. Drake (jd@commandprompt.com) wrote:
Ergghh o.k. I am definitely missing something in the environment. By
your numbers I should be well over 100GB restored at 2.5 hours. I am
not. I am only 38GB in.
I'm guessing you've checked this, so don't shoot me if you have, but...
How was the "restore file" built? Does it create the indexes, primary
keys, FKs, whatever, before loading the data? That'd slow things down
tremendously. Or if it's creating them while loading the data, there
would be large pauses while it builds the indexes...
Enjoy,
Stephen
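Stephen's check can be automated against a plain-text dump: scan it and see whether any index or constraint DDL appears before the COPY sections. A minimal sketch (the function name and the line-scanning approach are mine, not anything from the thread, and the heuristic is deliberately crude):

```python
def indexes_before_data(dump_lines):
    """Return True if index/constraint DDL precedes any COPY statement."""
    seen_copy = False
    for line in dump_lines:
        s = line.strip().upper()
        if s.startswith("COPY "):
            seen_copy = True
        elif (s.startswith("CREATE INDEX")
              or "ADD CONSTRAINT" in s) and not seen_copy:
            # Index or constraint built before any data arrived.
            return True
    return False
```

If this returns True on a dump, every subsequent COPY pays for index maintenance row by row, which matches the slowdown Stephen describes.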
Joshua D. Drake wrote:
What type of IO do you have on that machine? What type of CPU and RAM?
2GHz Xeon, dual core
16GB RAM
HW RAID0 data store - not sure how many spindles
I didn't touch any of the 8.3-only config stuff. I have work_mem set a
lot higher than you do, though - not sure if that makes any difference
to a bulk load.
This is not a very heavy duty box.
Note: a full restore takes much longer than this - it is almost all
taken up building indexes. I will be testing that over the weekend,
probably.
cheers
andrew
On Thu, 07 Feb 2008 11:20:49 -0500
Andrew Dunstan <andrew@dunslane.net> wrote:
This is not a very heavy duty box.
Depends on how you look at it. Restores are traditionally CPU-bound if
you have any kind of I/O, and with RAID 0 you have some I/O available
to you. It bothers me that the environment I am testing, which has 16
spindles (8 per volume), is garnering such miserable performance in
comparison.
In my environment we aren't able to push the I/O at all (wait times <
3%), and yet these are Opterons, which have great memory bandwidth. 8.2
did use more I/O than 8.3, but I think that is just architectural with
the changes in 8.3 in general.
Joshua D. Drake
On Thu, 7 Feb 2008 11:11:32 -0500
Stephen Frost <sfrost@snowman.net> wrote:
I'm guessing you've checked this, so don't shoot me if you have, but...
Uhh yeah :)
How was the "restore file" built? Does it create the indexes, primary
keys, FKs, whatever, before loading the data?
Absolutely correct, it would. This dump was created using pg_dumpall.
Sincerely,
Joshua D. Drake
On Wed, 6 Feb 2008, Joshua D. Drake wrote:
8.2.6 after 2 hours has restored 41GB.
I've been doing a long bulk import job recently (COPY) on a box with more
spindles than yours (but with a dumb controller) and I too am stuck at
that speed; I calculate a consistent 19.6GB/hour. The actual disk I/O is
very low, as that works out to only 5.7MB/s of progress. Mine was
bottlenecked by the capacity of a single CPU (4X Opteron system). I think
this is one of those barriers that is hard to crack without a caching
controller, for reasons I haven't figured out completely yet.
I am thinking the way we are going to need to do this is to have an
extended outage and write a custom script to do a concurrent dump and
load.
If you look at the -performance list this week I've been yelping about
this issue and trying to figure out how to setup a useful multi-CPU loader
for cases like these.
--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
On Thu, 7 Feb 2008 11:54:07 -0500 (EST)
Greg Smith <gsmith@gregsmith.com> wrote:
I've been doing a long bulk import job recently (COPY) on a box with
more spindles than yours (but with a dumb controller) and I too am
stuck at that speed; I calculate a consistent 19.6GB/hour.
Bingo! O.k., that is about what I am pushing. So I am not alone in this world.
Mine was bottlenecked by the capacity of a single CPU (4X Opteron
system).
O.k. we have 8X but it is also Opteron.
If you look at the -performance list this week I've been yelping about
this issue and trying to figure out how to set up a useful multi-CPU
loader for cases like these.
I would think that for the custom format this would be reasonably
simple. It is certainly something we need to figure out. The idea that
PostgreSQL can only restore 20GB an hour on a box that can actually
write 120+ MB/s (random) is pathetic.
I know Luke has mentioned some issues in the past as well around
CPU-boundedness, with an upper limit of 300MB/s (IIRC), but even that
doesn't explain what is going on here, as we are not getting anywhere
near that.
Joshua D. Drake
"Joshua D. Drake" <jd@commandprompt.com> writes:
I know Luke has mentioned some issues in the past as well around CPU
boundness with an upper limit of 300M/s (IIRC) but even that doesn't
equate to what is going on here as we are not getting anywhere near
that.
Some vmstat and oprofile investigation seems called for. Right now
we're just guessing about what the bottleneck is.
regards, tom lane
On Thu, 07 Feb 2008 13:47:22 -0500
Tom Lane <tgl@sss.pgh.pa.us> wrote:
"Joshua D. Drake" <jd@commandprompt.com> writes:
I know Luke has mentioned some issues in the past as well around CPU
boundness with an upper limit of 300M/s (IIRC) but even that doesn't
equate to what is going on here as we are not getting anywhere near
that.Some vmstat and oprofile investigation seems called for. Right now
we're just guessing about what the bottleneck is.
vmstat -adSK 5
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free inact active si so bi bo in cs us sy id wa
0 2 800 15853104 7150484 8581708 0 0 1571 387 0 0 10 7 71 12
2 0 800 15668016 7335564 8581456 0 0 12982 24382 2699 1421 11 2 82 5
1 0 800 15488688 7513572 8583016 0 0 13006 24351 2601 1322 10 2 82 6
1 0 800 15296024 7704488 8584160 0 0 13494 23324 2430 1135 11 2 84 3
1 0 800 15017240 7984204 8582308 0 0 17219 29910 2632 1427 10 3 86 1
1 2 800 14754224 8239160 8583344 0 0 13710 41932 2666 1371 8 2 80 9
1 3 800 14590216 8390680 8594228 0 0 8112 33920 2361 1351 5 2 72 20
1 0 800 14435000 8562800 8585540 0 0 9138 23773 3114 1461 6 2 75 17
2 0 800 14147160 8848424 8587072 0 0 14967 35206 3156 2675 10 4 76 10
1 0 800 13850560 9141480 8588632 0 0 15454 42181 3047 2633 10 4 72 14
1 1 800 13765824 9212296 8603744 0 0 4805 33795 2903 6312 4 9 68 20
1 0 800 13552088 9436156 8594284 0 0 11916 20255 2505 1695 7 3 79 11
1 0 800 13375152 9612124 8595336 0 0 11829 26774 2644 1359 11 2 83 4
1 0 800 13301232 9684104 8596356 0 0 4747 21765 2173 829 12 1 82 5
1 0 800 13294680 9693060 8597360 0 0 386 1786 1732 587 12 0 87 0
1 0 800 13284448 9702212 8598452 0 0 410 1871 1939 878 13 0 86 1
1 0 800 13273056 9711180 8599452 0 0 386 1870 1996 907 12 0 87 0
1 0 800 13235488 9748400 8600492 0 0 1824 1862 2128 1107 12 1 86 1
1 0 800 13019720 9962004 8602264 0 0 17773 19770 2632 1596 11 2 81 6
1 0 800 12832288 10143020 8603912 0 0 18359 19553 2627 1586 11 2 78 8
1 0 800 12665952 10313812 8605088 0 0 17102 17550 2587 1582 11 2 83 5
1 0 800 12491488 10486568 8606292 0 0 16795 17964 2514 1488 11 2 85 2
1 0 800 12317592 10659160 8607484 0 0 17103 18222 2543 1531 11 2 82 5
2 0 800 12137840 10836760 8608636 0 0 17562 17909 2526 1496 11 2 82 6
This is 8.2.6, 64bit.
Sincerely,
Joshua D. Drake
"Joshua D. Drake" <jd@commandprompt.com> writes:
Tom Lane <tgl@sss.pgh.pa.us> wrote:
Some vmstat and oprofile investigation seems called for. Right now
we're just guessing about what the bottleneck is.
vmstat -adSK 5
...
Looks to me like you're disk-bound (and your kernel is pretty bad at
distinguishing "idle" from "disk wait" states). Plain COPY into an
index-less table ought to be writing fairly linearly, so I'm surprised
you aren't able to get more out of the disk. Have you got WAL on
a separate spindle from the database?
regards, tom lane
On Thu, 07 Feb 2008 19:20:26 -0500
Tom Lane <tgl@sss.pgh.pa.us> wrote:
Have you got WAL on a separate spindle from the database?
I didn't, but do now and am testing. The way this is currently designed
is:
/data1 (8 disks RAID 10)
/data2 (8 disks RAID 10)
/pg_xlogs (2 disks RAID 1)
/data1 is what is primarily written against for the first couple of
hours, and then it will switch to /data2 because of tablespaces. Either
way, though, we should expect (I would think) to see at least 100MB/s
on an 8-disk RAID 10. It is SCSI.
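(For reference, relocating pg_xlog onto its own volume is normally done with the cluster stopped, by moving the directory and symlinking it back. A sketch with throwaway placeholder paths so the commands themselves can be exercised safely; in real use you would operate on the actual $PGDATA, with the postmaster shut down:)

```shell
#!/bin/sh
set -e
# Throwaway stand-in directories; in real use these are the actual
# $PGDATA and the dedicated RAID 1 volume, with the server stopped.
PGDATA=$(mktemp -d)/pgdata
XLOGVOL=$(mktemp -d)/pg_xlogs
mkdir -p "$PGDATA/pg_xlog" "$XLOGVOL"

# Move the WAL directory to the dedicated spindles, then symlink it
# back so PostgreSQL still finds it under $PGDATA/pg_xlog.
mv "$PGDATA/pg_xlog" "$XLOGVOL/pg_xlog"
ln -s "$XLOGVOL/pg_xlog" "$PGDATA/pg_xlog"
```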
vmstat -adSK 5
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free inact active si so bi bo in cs us sy id wa
1 1 160 6534048 23711184 1434532 0 0 1571 387 0 0 10 7 71 12
1 1 160 6206560 23898644 1573680 0 0 18168 23851 2684 1647 11 3 84 2
1 2 160 5887072 24080840 1711060 0 0 17655 58395 3322 1762 11 3 74 11
2 2 160 5546160 24275268 1856108 0 0 18681 61834 3302 1838 11 3 74 11
1 2 160 5363432 24383836 1933028 0 0 13043 52353 3247 1578 12 2 74 11
1 1 160 5295336 24423764 1963808 0 0 4997 39778 2865 1310 12 1 75 12
1 0 160 5284184 24426444 1971608 0 0 385 6967 2087 955 13 1 84 3
1 0 160 5274608 24429180 1978236 0 0 384 1009 2030 958 13 0 87 0
1 0 160 5265408 24431976 1984688 0 0 410 946 1780 656 12 0 87 0
1 1 160 5224912 24453296 2003700 0 0 1998 23168 2419 1163 12 1 79 8
1 2 160 5005048 24605300 2069236 0 0 18270 41342 2942 1721 12 2 76 10
1 2 160 4816016 24748120 2117208 0 0 19014 37165 3014 1702 12 2 72 15
1 1 160 4618064 24894476 2167412 0 0 19091 15226 2517 1626 12 2 83 3
1 1 160 4423376 25039072 2218324 0 0 18603 10398 2585 1644 12 2 84 3
1 1 160 4223568 25187592 2268816 0 0 19475 10326 2511 1666 12 2 84 2
1 1 160 4023056 25337080 2319844 0 0 19450 10601 2462 1543 12 2 84 2
1 1 160 3821328 25487664 2370328 0 0 19834 10657 2454 1540 12 2 84 2
1 1 160 3617616 25640832 2420820 0 0 20398 11202 2536 1678 12 2 84 2
1 2 160 3410040 25795488 2473800 0 0 20090 29063 2756 1680 12 2 76 10
1 1 160 3189064 25957688 2531996 0 0 20321 13745 2523 1595 12 2 83 4
Sincerely,
Joshua D. Drake
On Thu, 7 Feb 2008 16:37:39 -0800
"Joshua D. Drake" <jd@commandprompt.com> wrote:
Following up on this... 22GB in one hour, with xlogs on a different
partition. Just looking at it, we are averaging 3-5% IOWait; further, we
are only writing ~2MB/s.
This is frustrating.
Sincerely,
Joshua D. Drake
On Thu, 07 Feb 2008 13:47:22 -0500
Tom Lane <tgl@sss.pgh.pa.us> wrote:
"Joshua D. Drake" <jd@commandprompt.com> writes:
I know Luke has mentioned some issues in the past as well around CPU
boundness with an upper limit of 300M/s (IIRC) but even that doesn't
equate to what is going on here as we are not getting anywhere near
that.Some vmstat and oprofile investigation seems called for. Right now
we're just guessing about what the bottleneck is.
I also recommend looking into partition alignment on the disks. At
least to quantify how much the storage subsystem suffers from
mis-aligned partitions. It could be that it doesn't.
Regards,
Mark
"Joshua D. Drake" <jd@commandprompt.com> writes:
Following up on this... 22GB in one hour, with xlogs on a different
partition. Just looking at it, we are averaging 3-5% IOWait; further,
we are only writing ~2MB/s.
It might be interesting to capture some strace output and get a sense of
what's actually being read and written. It seems to me that the bad
I/O rate has to indicate that we're seeking all over the place. We
should try to determine exactly what's being touched in what order.
regards, tom lane
On Thu, 07 Feb 2008 21:04:44 -0500
Tom Lane <tgl@sss.pgh.pa.us> wrote:
"Joshua D. Drake" <jd@commandprompt.com> writes:
Following up with this... 22G in one hour, with xlogs on a different
partition. Just looking at we are averaging 3-5% IOWait, further we
are only writing ~ 2Megs a second.It might be interesting to capture some strace output and get a sense
of what's actually being read and written. It seems to me that the
bad I/O rate has to indicate that we're seeking all over the place.
We should try to determine exactly what's being touched in what order.
Do you want counts or actual output?
Joshua D. Drake
On Thu, 7 Feb 2008 18:32:41 -0800
"Joshua D. Drake" <jd@commandprompt.com> wrote:
Do you want counts or actual output?
% time seconds usecs/call calls errors syscall
------ ----------- ----------- --------- --------- ----------------
58.22 1.597638 33 47795 write
31.86 0.874104 23 38024 recvfrom
4.43 0.121559 92 1319 munmap
4.17 0.114445 3 37906 lseek
0.70 0.019341 13 1440 brk
0.56 0.015402 12 1316 mmap
0.04 0.000991 26 38 open
0.01 0.000298 8 38 close
0.01 0.000156 4 38 time
0.00 0.000008 8 1 semop
------ ----------- ----------- --------- --------- ----------------
100.00 2.743942 127915 total
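One observation worth making about that summary: the entire traced interval spent only ~2.7 seconds inside syscalls, so the backend is spending its time in user space rather than waiting on the kernel. Recomputing the percentages from the seconds column above:

```python
# (syscall, seconds) pairs copied from the strace -c summary above.
SYSCALL_SECONDS = [
    ("write", 1.597638), ("recvfrom", 0.874104), ("munmap", 0.121559),
    ("lseek", 0.114445), ("brk", 0.019341), ("mmap", 0.015402),
    ("open", 0.000991), ("close", 0.000298), ("time", 0.000156),
    ("semop", 0.000008),
]

total = sum(sec for _, sec in SYSCALL_SECONDS)
shares = {name: 100.0 * sec / total for name, sec in SYSCALL_SECONDS}
# write dominates the syscall time, but the absolute total is tiny,
# consistent with a user-space (CPU) bottleneck rather than I/O waits.
```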
On Thu, 7 Feb 2008 18:40:37 -0800
"Joshua D. Drake" <jd@commandprompt.com> wrote:
On Thu, 7 Feb 2008 18:32:41 -0800
"Joshua D. Drake" <jd@commandprompt.com> wrote:
Do you want counts or actual output?
Were there any thoughts on this? I am also finding that backing up this
database is rudely slow, with the same type of odd metrics (almost zero
(or zero) iowait). We can saturate a CPU, but the CPU is certainly not
pushing the data to disk as fast as it could.
Joshua D. Drake
On Tue, 19 Feb 2008 13:21:46 -0800
"Joshua D. Drake" <jd@commandprompt.com> wrote:
Were there any thoughts on this?
Further on this: we have tested on RHEL4 and RHEL5. There are two
machines, each with 32GB of RAM. I have four of these in the RHEL 5
machine:
processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 65
model name : Dual-Core AMD Opteron(tm) Processor 8216
stepping : 2
cpu MHz : 2411.132
cache size : 1024 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt rdtscp lm 3dnowext 3dnow pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy
bogomips : 4823.59
TLB size : 1024 4K pages
clflush size : 64
cache_alignment : 64
address sizes : 40 bits physical, 48 bits virtual
power management: ts fid vid ttp tm stc
Here are the mini specs:
# 5U Q524 - Quad Opteron 24 SCSI
# Tyan S4885G3NR 800 Series CPU Opteron 800
# AMD Opteron 880 - 2.4GHz 2Core x 4
# 32GB - DDR ECC REG 400MHz (16 x 2GB 3200 ECC REG Smart Modular)
# HD 1: 73GB SCSI 15K RPM x 24
# PCI 1: MegaRaid SCSI 320-2E - 2ch, U320, PCI Express, 128MB
# MegaRaid LSIiTBBU01 Battery - Order #: LSI00009
# PCI 2: MegaRaid SCSI 320-2E - 2ch, U320, PCI Express, 128MB
# MegaRaid LSIiTBBU01 Battery - Order #: LSI00009
# DVD-ROM/Sony 3.5 Inch Floppy Drive
The configuration is:
/ RAID 1
/xlogs RAID 1
/data1 10 drives RAID 10
/data2 10 drives RAID 10
The thing that is frustrating here is that it appears PostgreSQL just
can't utilize the hardware. I *know* it can, because we have larger
machines in production that use PostgreSQL happily. However, when I
have 220GB backups taking 8 hours and restores taking 11 hours, I begin
to wonder where the bottleneck is.
Assuming 25MB/s per drive (random write) on data1 and data2, we should
be pushing 250MB/s. Just to be insanely conservative, let's cut that in
half to 125MB/s. That means every 10 seconds we should do ~1GB, every
minute ~6GB, which means 360GB an hour.
Granted, we have index creation and everything else going on, but 11
hours and no IO chewing?
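Checking that arithmetic (the 25MB/s-per-drive figure is the post's own assumption; exact math actually gives a bit more than the rounded 360GB/hour, so the estimate is conservative either way):

```python
# Figures from the post above: 25MB/s random write per drive, summed
# to 250MB/s, then halved to be "insanely conservative".
conservative_mb_s = 250 / 2.0                          # 125 MB/s
per_10s_gb = conservative_mb_s * 10 / 1024.0           # ~1.22 GB per 10s
exact_gb_per_hour = conservative_mb_s * 3600 / 1024.0  # ~439 GB/hour
rounded_gb_per_hour = 1 * 6 * 60                       # post's ~1GB/10s rounding: 360
observed_gb_per_hour = 20                              # roughly what the restores hit
shortfall = rounded_gb_per_hour / observed_gb_per_hour # ~18x short of expectation
```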
As a note, these results are reproducible on both machines, regardless
of RHEL5 or RHEL4.
I know there are not many stats here, but I have provided them in
previous posts on this thread. Perhaps someone sees a red flag in the
hardware?
Sincerely,
Joshua D. Drake