Dell Hardware Recommendations
We have a 30 GB database (according to pg_database_size) running nicely
on a single Dell PowerEdge 2850 right now. This represents data
specific to 1 US state. We are in the process of planning a deployment
that will service all 50 US states.
If 30 GB is an accurate number per state, that means the database size is
about to explode to 1.5 TB. About 1 TB of this amount would be OLAP
data that is read-heavy but only updated or inserted in batch. It is
also largely isolated to a single table partitioned on state. This
portion of the data will grow very slowly after the initial loading.
The remaining 500 GB has frequent individual writes performed against
it. 500 GB is a high estimate and it will probably start out closer to
100 GB and grow steadily up to and past 500 GB.
I am trying to figure out an appropriate hardware configuration for such
a database. Currently I am considering the following:
PowerEdge 1950 paired with a PowerVault MD1000
2 x Quad Core Xeon E5310
16 GB 667MHz RAM (4 x 4GB leaving room to expand if we need to)
PERC 5/E Raid Adapter
2 x 146 GB SAS in Raid 1 for OS + logs.
A bunch of disks in the MD1000 configured in Raid 10 for Postgres data.
The MD1000 holds 15 disks, so 14 disks + a hot spare is the max. With
12 250GB SATA drives to cover the 1.5TB we would be able to add another
250GB of usable space for future growth before needing to get a bigger
set of disks. 500GB drives would leave a lot more room and could allow
us to run the MD1000 in split mode and use its remaining disks for other
purposes in the meantime. I would greatly appreciate any feedback with
respect to drive count vs. drive size and SATA vs. SCSI/SAS. The price
difference makes SATA awfully appealing.
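The usable-space arithmetic above can be sketched quickly (the helper function and its name are illustrative, not part of the original post):

```python
# Usable capacity of a RAID 10 set filling an MD1000-style tray.
# RAID 10 mirrors pairs of drives, so usable space is half the active drives.

def raid10_usable_gb(total_bays, hot_spares, drive_gb):
    data_drives = total_bays - hot_spares
    if data_drives % 2:        # RAID 10 needs an even number of drives
        data_drives -= 1
    return (data_drives // 2) * drive_gb

# 14 active 250GB drives + 1 hot spare: 1.75 TB usable,
# i.e. 250GB of headroom past the 1.5TB estimate
print(raid10_usable_gb(15, 1, 250))   # 1750
# The 500GB option doubles that
print(raid10_usable_gb(15, 1, 500))   # 3500
```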
We plan to involve outside help in getting this database tuned and
configured, but want to get some hardware ballparks in order to get
quotes and potentially request a trial unit.
Any thoughts or recommendations? We are running openSUSE 10.2 with
kernel 2.6.18.2-34.
Regards,
Joe Uhl
joeuhl@gmail.com
On Thu, Aug 09, 2007 at 03:47:09PM -0400, Joe Uhl wrote:
We have a 30 GB database (according to pg_database_size) running nicely
on a single Dell PowerEdge 2850 right now. This represents data
specific to 1 US state. We are in the process of planning a deployment
that will service all 50 US states.
If 30 GB is an accurate number per state that means the database size is
about to explode to 1.5 TB. About 1 TB of this amount would be OLAP
data that is heavy-read but only updated or inserted in batch. It is
also largely isolated to a single table partitioned on state. This
portion of the data will grow very slowly after the initial loading.
The remaining 500 GB has frequent individual writes performed against
it. 500 GB is a high estimate and it will probably start out closer to
100 GB and grow steadily up to and past 500 GB.
What kind of transaction rate are you looking at?
I am trying to figure out an appropriate hardware configuration for such
a database. Currently I am considering the following:
PowerEdge 1950 paired with a PowerVault MD1000
2 x Quad Core Xeon E5310
16 GB 667MHz RAM (4 x 4GB leaving room to expand if we need to)
16GB for 500GB of active data is probably a bit light.
PERC 5/E Raid Adapter
2 x 146 GB SAS in Raid 1 for OS + logs.
A bunch of disks in the MD1000 configured in Raid 10 for Postgres data.
The MD1000 holds 15 disks, so 14 disks + a hot spare is the max. With
12 250GB SATA drives to cover the 1.5TB we would be able add another
250GB of usable space for future growth before needing to get a bigger
set of disks. 500GB drives would leave alot more room and could allow
us to run the MD1000 in split mode and use its remaining disks for other
purposes in the mean time. I would greatly appreciate any feedback with
respect to drive count vs. drive size and SATA vs. SCSI/SAS. The price
difference makes SATA awfully appealing.
Well, how does this compare with what you have right now? And do you
expect your query rate to be 50x what it is now, or higher?
We plan to involve outside help in getting this database tuned and
configured, but want to get some hardware ballparks in order to get
quotes and potentially request a trial unit.
You're doing a very wise thing by asking for information before
purchasing (unfortunately, many people put that cart before the horse).
This list is a great resource for information, but there's no real
substitute for working directly with someone and being able to discuss
your actual system in detail, so I'd suggest getting outside help
involved before actually purchasing or even evaluating hardware. There's
a lot to think about beyond just drives and memory with the kind of
expansion you're looking at. For example, what ability do you have to
scale past one machine? Do you have a way to control your growth rate?
How well will the existing design scale out? (Oftentimes what is a good
design for a smaller set of data is sub-optimal for a large set of
data.)
Something else that might be worth looking at is having your existing
workload modeled; that allows building a pretty accurate estimate of
what kind of hardware would be required to hit a different workload.
--
Decibel!, aka Jim Nasby decibel@decibel.org
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
On 8/9/07, Joe Uhl <joeuhl@gmail.com> wrote:
We have a 30 GB database (according to pg_database_size) running nicely
on a single Dell PowerEdge 2850 right now. This represents data
specific to 1 US state. We are in the process of planning a deployment
that will service all 50 US states.
If 30 GB is an accurate number per state that means the database size is
about to explode to 1.5 TB. About 1 TB of this amount would be OLAP
data that is heavy-read but only updated or inserted in batch. It is
also largely isolated to a single table partitioned on state. This
portion of the data will grow very slowly after the initial loading.
The remaining 500 GB has frequent individual writes performed against
it. 500 GB is a high estimate and it will probably start out closer to
100 GB and grow steadily up to and past 500 GB.
I am trying to figure out an appropriate hardware configuration for such
a database. Currently I am considering the following:
PowerEdge 1950 paired with a PowerVault MD1000
2 x Quad Core Xeon E5310
16 GB 667MHz RAM (4 x 4GB leaving room to expand if we need to)
PERC 5/E Raid Adapter
2 x 146 GB SAS in Raid 1 for OS + logs.
A bunch of disks in the MD1000 configured in Raid 10 for Postgres data.
The MD1000 holds 15 disks, so 14 disks + a hot spare is the max. With
12 250GB SATA drives to cover the 1.5TB we would be able add another
250GB of usable space for future growth before needing to get a bigger
set of disks. 500GB drives would leave alot more room and could allow
us to run the MD1000 in split mode and use its remaining disks for other
purposes in the mean time. I would greatly appreciate any feedback with
respect to drive count vs. drive size and SATA vs. SCSI/SAS. The price
difference makes SATA awfully appealing.
I'm getting an MD1000 tomorrow to play with for just this type of
analysis as it happens. First of all, move the o/s drives to the
backplane and get the cheapest available.
I might consider picking up an extra perc 5/e, since the MD1000 is
active/active, and do either raid 10 or 05 with one of the raid levels
in software. For example, two raid 5 volumes (hardware raid 5)
striped in software as raid 0. A 15k SAS drive is worth at least two
SATA drives (unless they are raptors) for OLTP performance loads.
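The "worth at least two SATA drives" rule of thumb falls out of rotational latency plus seek time. A back-of-envelope sketch (the seek figures are typical numbers for drives of this era, assumed here rather than taken from the thread):

```python
# Rough random-I/O rate per drive: one average seek plus half a rotation.

def est_iops(rpm, avg_seek_ms):
    rotational_latency_ms = 0.5 * 60_000 / rpm   # half a revolution, in ms
    return 1000 / (avg_seek_ms + rotational_latency_ms)

print(round(est_iops(7200, 8.5)))    # 7200 RPM SATA: ~79 IOPS
print(round(est_iops(15000, 3.5)))   # 15k RPM SAS:  ~182 IOPS
```

That is better than 2x per drive for random OLTP-style I/O, before controller and caching effects are taken into account.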
Where the extra controller especially pays off is if you have to
expand to a second tray. It's easy to add trays but installing
controllers on a production server is scary.
Raid 10 is usually better for databases but in my experience it's a
roll of the dice. If you factor cost into the matrix a SAS raid 05
might outperform a SATA raid 10 because you are getting better storage
utilization out of the drives (n - 2 vs. n / 2). Then again, you
might not.
merlin
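Merlin's utilization point (n - 2 usable drives for two RAID 5 sets striped together vs. n / 2 for RAID 10) works out like this; a minimal sketch with illustrative drive counts:

```python
# Two RAID 5 volumes striped as RAID 0 lose one drive to parity per
# volume (n - 2 usable); RAID 10 loses half to mirroring (n / 2 usable).

def raid50_usable_gb(n_drives, drive_gb):
    return (n_drives - 2) * drive_gb

def raid10_usable_gb(n_drives, drive_gb):
    return (n_drives // 2) * drive_gb

n, size = 14, 250
print(raid50_usable_gb(n, size))   # 3000 GB usable
print(raid10_usable_gb(n, size))   # 1750 GB usable
```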
On Thu, Aug 09, 2007 at 05:50:10PM -0400, Merlin Moncure wrote:
Raid 10 is usually better for databases but in my experience it's a
roll of the dice. If you factor cost into the matrix a SAS raid 05
might outperform a SATA raid 10 because you are getting better storage
utilization out of the drives (n - 2 vs. n / 2). Then again, you
might not.
It's going to depend heavily on the controller and the workload.
Theoretically, if most of your writes are to stripes that the controller
already has cached then you could actually out-perform RAID10. But
that's a really, really big IF, because if the stripe isn't in cache you
have to read the entire thing in before you can do the write... and that
costs *a lot*.
Also, a good RAID controller can spread reads out across both drives in
each mirror on a RAID10. Though, there is an argument for not doing
that... it makes it much less likely that both drives in a mirror will
fail close enough to each other that you'd lose that chunk of data.
Speaking of failures, keep in mind that a normal RAID5 puts you only 2
drive failures away from data loss, while with RAID10 you can
potentially lose half the array without losing any data. If you do RAID5
with multiple parity copies that does change things; I'm not sure which
is better at that point (I suspect it matters how many drives are
involved).
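The failure-mode difference can be made concrete: after one drive has already failed, any second failure is fatal on RAID 5, while on RAID 10 only the dead drive's mirror partner is. An illustrative model assuming the second failure hits a uniformly random surviving drive:

```python
# Chance that a second random drive failure causes data loss.

def second_failure_loss_odds(level, n_drives):
    if level == "raid5":
        return 1.0                    # any second failure breaks the array
    if level == "raid10":
        return 1 / (n_drives - 1)     # only the mirror partner is fatal
    raise ValueError(level)

print(second_failure_loss_odds("raid5", 14))              # 1.0
print(round(second_failure_loss_odds("raid10", 14), 3))   # 0.077
```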
The comment about the extra controller isn't a bad idea, although I
would hope that you'll have some kind of backup server available, which
makes an extra controller much less useful.
--
Decibel!, aka Jim Nasby decibel@decibel.org
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
On 9-8-2007 23:50 Merlin Moncure wrote:
Where the extra controller especially pays off is if you have to
expand to a second tray. It's easy to add trays but installing
controllers on a production server is scary.
For connectivity-sake that's not a necessity. You can either connect
(two?) extra MD1000's to your first MD1000 or you can use the second
external SAS-port on your controller. Obviously it depends on the
controller whether it's good enough to just add the disks to it, rather
than adding another controller for the second tray. Whether the perc5/e
is good enough for that, I don't know, we've only equipped ours with a
single MD1000 holding 15x 15k rpm drives, but in our benchmarks it
scaled pretty well going from a few to all 14 disks (+1 hotspare).
Best regards,
Arjen
Thanks for the input. Thus far we have used Dell but I would certainly
be willing to explore other options.
I found a "Reference Guide" for the MD1000 from April, 2006 that
includes info on the PERC 5/E at:
http://www.dell.com/downloads/global/products/pvaul/en/pvaul_md1000_solutions_guide.pdf
To answer the questions below:
How many users do you expect to hit the db at the same time?
There are 2 types of users. For roughly every 5000 active accounts, 10
or fewer of those will have additional privileges. Only those more
privileged users interact substantially with the OLAP portion of the
database. For 1 state 10 concurrent connections was about the max, so
if that holds for 50 states we are looking at 500 concurrent users as a
top end, with a very small fraction of those users interacting with the
OLAP portion.
How big of a dataset will each one be grabbing at the same time?
For the OLTP data it is mostly single object reads and writes and
generally touches only a few tables at a time.
Will your Perc RAID controller have a battery backed cache on board?
If so (and it better!) how big of a cache can it hold?
According to the above link, it has a 256 MB cache that is battery
backed.
Can you split this out onto two different machines, one for the OLAP
load and the other for what I'm assuming is OLTP?
Can you physically partition this out by state if need be?
Right now this system isn't in production so we can explore any option.
We are looking into splitting the OLAP and OLTP portions right now and I
imagine physically splitting the partitions on the big OLAP table is an
option as well.
Really appreciate all of the advice. Before we pull the trigger on
hardware we probably will get some external advice from someone but I
knew this list would provide some excellent ideas and feedback to get us
started.
Joe Uhl
joeuhl@gmail.com
On Thu, 9 Aug 2007 16:02:49 -0500, "Scott Marlowe"
<scott.marlowe@gmail.com> said:
On 8/9/07, Joe Uhl <joeuhl@gmail.com> wrote:
We have a 30 GB database (according to pg_database_size) running nicely
on a single Dell PowerEdge 2850 right now. This represents data
specific to 1 US state. We are in the process of planning a deployment
that will service all 50 US states.
If 30 GB is an accurate number per state that means the database size is
about to explode to 1.5 TB. About 1 TB of this amount would be OLAP
data that is heavy-read but only updated or inserted in batch. It is
also largely isolated to a single table partitioned on state. This
portion of the data will grow very slowly after the initial loading.
The remaining 500 GB has frequent individual writes performed against
it. 500 GB is a high estimate and it will probably start out closer to
100 GB and grow steadily up to and past 500 GB.
I am trying to figure out an appropriate hardware configuration for such
a database. Currently I am considering the following:
PowerEdge 1950 paired with a PowerVault MD1000
2 x Quad Core Xeon E5310
16 GB 667MHz RAM (4 x 4GB leaving room to expand if we need to)
PERC 5/E Raid Adapter
2 x 146 GB SAS in Raid 1 for OS + logs.
A bunch of disks in the MD1000 configured in Raid 10 for Postgres data.
The MD1000 holds 15 disks, so 14 disks + a hot spare is the max. With
12 250GB SATA drives to cover the 1.5TB we would be able add another
250GB of usable space for future growth before needing to get a bigger
set of disks. 500GB drives would leave alot more room and could allow
us to run the MD1000 in split mode and use its remaining disks for other
purposes in the mean time. I would greatly appreciate any feedback with
respect to drive count vs. drive size and SATA vs. SCSI/SAS. The price
difference makes SATA awfully appealing.
We plan to involve outside help in getting this database tuned and
configured, but want to get some hardware ballparks in order to get
quotes and potentially request a trial unit.
Any thoughts or recommendations? We are running openSUSE 10.2 with
kernel 2.6.18.2-34.
Some questions:
How many users do you expect to hit the db at the same time?
How big of a dataset will each one be grabbing at the same time?
Will your Perc RAID controller have a battery backed cache on board?
If so (and it better!) how big of a cache can it hold?
Can you split this out onto two different machines, one for the OLAP
load and the other for what I'm assuming is OLTP?
Can you physically partition this out by state if need be?
A few comments:
I'd go with the bigger drives. Just as many, so you have spare
storage as you need it. You never know when you'll need to migrate
your whole data set from one pg db to another for testing etc...
Extra space comes in REAL handy when things aren't quite going right.
With 10krpm 500 and 750 Gig drives you can use smaller partitions on
the bigger drives to short stroke them and often outrun supposedly
faster drives.
The difference between SAS and SATA drives is MUCH less important than
the difference between one RAID controller and the next. It's not
likely the Dell is gonna come with the fastest RAID controllers
around, as they seem to still be selling Adaptec (buggy and
unreliable, avoid like the plague) and LSI (stable, moderately fast).
I.e. I'd rather have 24 SATA disks plugged into a couple of big Areca
or 3ware (now escalade I think?) controllers than 8 SAS drives plugged
into any Adaptec controller.
oops
On 8/9/07, Decibel! <decibel@decibel.org> wrote:
You forgot the list. :)
On Thu, Aug 09, 2007 at 05:29:18PM -0500, Scott Marlowe wrote:
On 8/9/07, Decibel! <decibel@decibel.org> wrote:
Also, a good RAID controller can spread reads out across both drives in
each mirror on a RAID10. Though, there is an argument for not doing
that... it makes it much less likely that both drives in a mirror will
fail close enough to each other that you'd lose that chunk of data.
I'd think that kind of failure mode is pretty uncommon, unless you're
in an environment where physical shocks are common. which is not a
typical database environment. (tell that to the guys writing a db for
a modern tank fire control system though :) )
Speaking of failures, keep in mind that a normal RAID5 puts you only 2
drive failures away from data loss,
Not only that, but the first drive failure puts you way down the list
in terms of performance, where a single failed drive in a large
RAID-10 only marginally affects performance.
while with RAID10 you can
potentially lose half the array without losing any data.
Yes, but the RIGHT two drives can kill EITHER RAID 5 or RAID10.
If you do RAID5
with multiple parity copies that does change things; I'm not sure which
is better at that point (I suspect it matters how many drives are
involved).
That's RAID6. The primary advantages of RAID6 over RAID10 or RAID5
are twofold:
1: A single drive failure has no negative effect on performance, so
the array is still pretty fast, especially for reads, which just suck
under RAID 5 with a missing drive.
2: No two drive failures can cause loss of data. Admittedly, by the
time the second drive fails, you're now running on the equivalent of a
degraded RAID5, unless you've configured >2 drives for parity.
On very large arrays (100s of drives), RAID6 with 2, 3, or 4 drives
for parity makes some sense, since having that many extra drives means
the RAID controller (SW or HW) can now have elections to decide which
drive might be lying if you get data corruption.
Note that you can also look into RAID10 with 3 or more drives per
mirror. I.e. build 3 RAID-1 sets of 3 drives each, then you can lose
any two drives and still stay up. Plus, on a mostly read database,
where users might be reading the same drives but in different places,
multi-disk RAID-1 makes sense under RAID-10.
While I agree with Merlin that for OLTP a faster drive is a must, for
OLAP, more drives is often the real key. The high aggregate bandwidth
of a large array of SATA drives is an amazing thing to watch when
running a reporting server with otherwise unimpressive specs.
--
Decibel!, aka Jim C. Nasby, Database Architect decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828
On Thu, Aug 09, 2007 at 08:58:19PM -0500, Scott Marlowe wrote:
On Thu, Aug 09, 2007 at 05:29:18PM -0500, Scott Marlowe wrote:
On 8/9/07, Decibel! <decibel@decibel.org> wrote:
Also, a good RAID controller can spread reads out across both drives in
each mirror on a RAID10. Though, there is an argument for not doing
that... it makes it much less likely that both drives in a mirror will
fail close enough to each other that you'd lose that chunk of data.
I'd think that kind of failure mode is pretty uncommon, unless you're
in an environment where physical shocks are common. which is not a
typical database environment. (tell that to the guys writing a db for
a modern tank fire control system though :) )
You'd be surprised. I've seen more than one case of a bunch of drives
failing within a month, because they were all bought at the same time.
while with RAID10 you can
potentially lose half the array without losing any data.
Yes, but the RIGHT two drives can kill EITHER RAID 5 or RAID10.
Sure, but the odds of that with RAID5 are 100%, while they're much less
in a RAID10.
While I agree with Merlin that for OLTP a faster drive is a must, for
OLAP, more drives is often the real key. The high aggregate bandwidth
of a large array of SATA drives is an amazing thing to watch when
running a reporting server with otherwise unimpressive specs.
True. In this case, the OP will probably want to have one array for the
OLTP stuff and one for the OLAP stuff.
--
Decibel!, aka Jim Nasby decibel@decibel.org
EnterpriseDB http://enterprisedb.com 512.569.9461 (cell)
On Thu, 9 Aug 2007, Joe Uhl wrote:
The MD1000 holds 15 disks, so 14 disks + a hot spare is the max. With
12 250GB SATA drives to cover the 1.5TB we would be able add another
250GB of usable space for future growth before needing to get a bigger
set of disks. 500GB drives would leave alot more room and could allow
us to run the MD1000 in split mode and use its remaining disks for other
purposes in the mean time. I would greatly appreciate any feedback with
respect to drive count vs. drive size and SATA vs. SCSI/SAS. The price
difference makes SATA awfully appealing.
The SATA II drives in the MD1000 all run at 7200 RPM, and are around
$0.80/GB (just grabbed a random quote from the configurator on their site
for all these) for each of the 250GB, 500GB, and 750GB capacities. If you
couldn't afford to fill the whole array with 500GB models, then it might
make sense to get the 250GB ones instead just to spread the load out over
more spindles; if you're filling it regardless, surely the reduction in
stress over capacity issues of the 500GB models makes more sense. Also,
using the 500 GB models would make it much easier to only ever use 12
active drives and have 3 hot spares, with less pressure to convert spares
into active storage; drives die in surprisingly correlated batches far too
often to only have 1 spare IMHO.
The two SAS options that you could use are both 300GB, and you can have
10K RPM for $2.3/GB or 15K RPM for $3.0/GB. So relative to the SATA
options, you're paying about 3X as much to get a 40% faster spin rate, or
around 4X as much to get over a 100% faster spin. There's certainly other
things that factor into performance than just that, but just staring at
the RPM gives you a gross idea how much higher of a raw transaction rate
the drives can support.
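Working those ratios out from the quoted configurator figures (the per-GB prices are from the paragraphs above; the layout and names are mine):

```python
# Cost-per-GB and spin-rate comparison of the three drive options quoted.

options = {
    "sata_7200rpm": (0.80, 7200),    # ($/GB, RPM)
    "sas_10krpm":   (2.30, 10000),
    "sas_15krpm":   (3.00, 15000),
}

base_price, base_rpm = options["sata_7200rpm"]
for name, (price, rpm) in options.items():
    cost_ratio = price / base_price      # multiple of SATA's cost per GB
    spin_gain = rpm / base_rpm - 1       # fractional spin-rate increase
    print(name, round(cost_ratio, 2), f"{spin_gain:+.0%}")
```

Roughly 2.9x the cost per GB for a ~40% faster spin, or 3.75x for a bit over 100% faster.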
The question you have to ask yourself is how much actual I/O are you
dealing with. The tiny 256MB cache on the PERC 5/E isn't going to help
much with buffering writes in particular, so the raw disk performance may
be critical for your update intensive workload. If the combination of
transaction rate and total bandwidth are low enough that the 7200 RPM
drives can keep up with your load, by all means save yourself a lot of
cash and get the SATA drives.
In your situation, I'd be spending a lot of my time measuring the
transaction and I/O bandwidth rates on the active system very carefully to
figure out which way to go here. You're in a better position than most
people buying new hardware to estimate what you need with the existing
system in place, take advantage of that by drilling into the exact numbers
for what you're pushing through your disks now. Every dollar spent on
work to quantify that early will easily pay for itself in helping guide
your purchase and future plans; that's what I'd be bringing people in
right now to do if I were you, if that's not something you're already
familiar with measuring.
--
* Greg Smith gsmith@gregsmith.com http://www.gregsmith.com Baltimore, MD
On Thu, 9 Aug 2007, Decibel! wrote:
On Thu, Aug 09, 2007 at 08:58:19PM -0500, Scott Marlowe wrote:
On Thu, Aug 09, 2007 at 05:29:18PM -0500, Scott Marlowe wrote:
On 8/9/07, Decibel! <decibel@decibel.org> wrote:
Also, a good RAID controller can spread reads out across both drives in
each mirror on a RAID10. Though, there is an argument for not doing
that... it makes it much less likely that both drives in a mirror will
fail close enough to each other that you'd lose that chunk of data.
I'd think that kind of failure mode is pretty uncommon, unless you're
in an environment where physical shocks are common. which is not a
typical database environment. (tell that to the guys writing a db for
a modern tank fire control system though :) )
You'd be surprised. I've seen more than one case of a bunch of drives
failing within a month, because they were all bought at the same time.
while with RAID10 you can
potentially lose half the array without losing any data.
Yes, but the RIGHT two drives can kill EITHER RAID 5 or RAID10.
Sure, but the odds of that with RAID5 are 100%, while they're much less
in a RAID10.
so you go with Raid6, not Raid5.
While I agree with Merlin that for OLTP a faster drive is a must, for
OLAP, more drives is often the real key. The high aggregate bandwidth
of a large array of SATA drives is an amazing thing to watch when
running a reporting server with otherwise unimpressive specs.
True. In this case, the OP will probably want to have one array for the
OLTP stuff and one for the OLAP stuff.
one thing that's interesting is that the I/O throughput on the large SATA
drives can actually be higher than the faster, but smaller SCSI drives.
the SCSI drives can win on seeking, but how much seeking you need to do
depends on how large the OLTP database ends up being
David Lang
On 8/10/07, Arjen van der Meijden <acmmailing@tweakers.net> wrote:
On 9-8-2007 23:50 Merlin Moncure wrote:
Where the extra controller especially pays off is if you have to
expand to a second tray. It's easy to add trays but installing
controllers on a production server is scary.
For connectivity-sake that's not a necessity. You can either connect
(two?) extra MD1000's to your first MD1000 or you can use the second
external SAS-port on your controller. Obviously it depends on the
controller whether its good enough to just add the disks to it, rather
than adding another controller for the second tray. Whether the perc5/e
is good enough for that, I don't know, we've only equipped ours with a
single MD1000 holding 15x 15k rpm drives, but in our benchmarks it
scaled pretty well going from a few to all 14 disks (+1 hotspare).
completely correct...I was suggesting this in performance
terms...I've never done it with the Perc/5, but have done it with some
active/active SANs and it works really well.
merlin
On 8/10/07, Decibel! <decibel@decibel.org> wrote:
On Thu, Aug 09, 2007 at 05:50:10PM -0400, Merlin Moncure wrote:
Raid 10 is usually better for databases but in my experience it's a
roll of the dice. If you factor cost into the matrix a SAS raid 05
might outperform a SATA raid 10 because you are getting better storage
utilization out of the drives (n - 2 vs. n / 2). Then again, you
might not.
It's going to depend heavily on the controller and the workload.
Theoretically, if most of your writes are to stripes that the controller
already has cached then you could actually out-perform RAID10. But
that's a really, really big IF, because if the strip isn't in cache you
have to read the entire thing in before you can do the write... and that
costs *a lot*.
Also, a good RAID controller can spread reads out across both drives in
each mirror on a RAID10. Though, there is an argument for not doing
that... it makes it much less likely that both drives in a mirror will
fail close enough to each other that you'd lose that chunk of data.
Speaking of failures, keep in mind that a normal RAID5 puts you only 2
drive failures away from data loss, while with RAID10 you can
potentially lose half the array without losing any data. If you do RAID5
with multiple parity copies that does change things; I'm not sure which
is better at that point (I suspect it matters how many drives are
involved).
When making hardware recommendations I always suggest buying two
servers and rigging PITR with warm standby. This allows you to adjust the
system a little bit for performance over fault tolerance.
Regarding raid controllers, I've found performance to be quite
variable as stated, especially with regards to RAID 5. I've also
unfortunately found bonnie++ to not be very reflective of actual
performance in high stress environments. We have a IBM DS4200 that
bangs out some pretty impressive numbers with our app using sata while
the bonnie++ numbers fairly suck.
merlin
On 8/9/07, Arjen van der Meijden <acmmailing@tweakers.net> wrote:
On 9-8-2007 23:50 Merlin Moncure wrote:
Where the extra controller especially pays off is if you have to
expand to a second tray. It's easy to add trays but installing
controllers on a production server is scary.
For connectivity-sake that's not a necessity. You can either connect
(two?) extra MD1000's to your first MD1000 or you can use the second
external SAS-port on your controller. Obviously it depends on the
controller whether its good enough to just add the disks to it, rather
than adding another controller for the second tray. Whether the perc5/e
is good enough for that, I don't know, we've only equipped ours with a
single MD1000 holding 15x 15k rpm drives, but in our benchmarks it
scaled pretty well going from a few to all 14 disks (+1 hotspare).
As it happens, I will have an opportunity to test the dual-controller
theory. In about a week we are picking up another MD1000 and will
attach it in an active/active configuration with various
hardware/software RAID configurations, and run a battery of
database-centric tests. Results will follow.
By the way, the recent Dell servers I have seen are well built in my
opinion...better and cheaper than comparable IBM servers. I've also
tested the IBM EXP3000, and the MD1000 is cheaper and comes standard
with a second ESM. In my opinion, the Dell 1U 1950 is extremely well
organized in terms of layout and cooling...dual power supplies, dual
PCI-E slots (one low profile), plus a third custom slot for the
optional PERC 5/i which drives the backplane.
merlin
On Aug 9, 2007, at 3:47 PM, Joe Uhl wrote:
PowerEdge 1950 paired with a PowerVault MD1000
2 x Quad Core Xeon E5310
16 GB 667MHz RAM (4 x 4GB leaving room to expand if we need to)
PERC 5/E Raid Adapter
2 x 146 GB SAS in Raid 1 for OS + logs.
A bunch of disks in the MD1000 configured in Raid 10 for Postgres
data.
I'd avoid Dell disk systems if at all possible. I know, I've been
through the pain. You really want someone else providing your RAID
card and disk array, especially if the 5/E card is based on the
Adaptec devices.
On 8/10/07, Vivek Khera <vivek@khera.org> wrote:
I'd avoid Dell disk systems if at all possible. I know, I've been
through the pain. You really want someone else providing your RAID
card and disk array, especially if the 5/E card is based on the
Adaptec devices.
I'm not so sure I agree. They are using LSI firmware now (and so is
everyone else). The servers are well built (highly subjective, I
admit) and configurable. I have had some bad experiences with IBM
gear (adaptec controller though), and white box parts 3ware, etc. I
can tell you that Dell got us the storage and the server in record
time. I do agree on Adaptec, however.
merlin
I know we bought the 4-proc Opteron unit with the SAS JBOD from Dell,
and it has been excellent in terms of performance -- about 3 times
faster than our old 4-proc Dell, which had Xeon processors.
The newer one has had a few issues. I am running RedHat AS4 since Dell
supports it, and I have had one kernel failure (though it had been up
for about a year). Other than that, no issues: a reboot fixed whatever
caused the failure, I have not seen it happen again, and it's been a
few months.
I am definitely going Dell for any other server needs. Their pricing is
so competitive now, and the machines I bought, both the 1U 2-proc and
the larger 4-proc, have been very good.
Joel Fradkin
Wazagua, Inc.
2520 Trailmate Dr
Sarasota, Florida 34243
Tel. 941-753-7111 ext 305
jfradkin@wazagua.com
www.wazagua.com
On Aug 10, 2007, at 4:36 PM, Merlin Moncure wrote:
I'm not so sure I agree. They are using LSI firmware now (and so is
everyone else). The servers are well built (highly subjective, I
admit) and configurable. I have had some bad experiences with IBM
gear (adaptec controller though), and white box parts 3ware, etc. I
can tell you that Dell got us the storage and the server in record
time. I do agree on Adaptec, however.
Ok, perhaps you got luckier... I have two PowerVault 220 rack mounts
with U320 SCSI drives in them. With an LSI 320-2X controller, it
*refuses* to recognize some of the drives (channel 1 on either
array). Dell blames LSI, LSI blames Dell's backplane. This is
consistent across multiple controllers we tried, and two different
Dell disk arrays. Dropping the SCSI speed to 160 is the only way to
make them work. I tend to believe LSI here.
The Adaptec 2230SLP controller recognizes the arrays fine, but tends
to "drop" devices at inopportune moments. Re-seating dropped devices
starts a rebuild, but the speed is recognized as "1" and the rebuild
takes two lifetimes to complete unless you insert a reboot of the
system in there. Totally unacceptable. Again, dropping the SCSI
rate to 160 seems to make it more stable.
On 13-Aug-07, at 9:50 AM, Vivek Khera wrote:
Dell blames LSI, LSI blames Dell's backplane. [...] I tend to
believe LSI here.
This is the crux of the argument here. Perc/5 is a dell trademark.
They can ship any hardware they want and call it a Perc/5.
Dave