Best options for new PG instance
Hi:
I'm going to be requesting a PG DB instance (v9.6.7) from an IT dept in a
large corp setting. I was wondering if anyone could comment on the
pros/cons of getting this put on a virtual machine vs hard metal ? Locally
mounted disk vs nfs ?
Thanks !
On Mar 5, 2018, at 8:53 AM, David Gauthier <davegauthierpg@gmail.com> wrote:
I've been running postgresql instances on ESXi VMs for years with no issues. I've not benchmarked them, but performance has been good enough despite their running on fairly wimpy hardware. Performance relative to bare metal is probably going to be dominated by disk IO, and depending on how you're hosting VMs that can be anywhere between pretty good and terrible - in a large corporation I'd expect it to be pretty good. Just don't skimp on RAM - having your hot data in the filesystem cache is always good and can make high latency storage tolerable.
If performance isn't critical then a VM is great. If it is, you'll want to plan and maybe benchmark a bit to decide whether bare metal is going to be significantly better for what you're doing.
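To make the "don't skimp on RAM" point concrete: a common starting rule of thumb (only a rule of thumb, not an official recommendation - benchmark before settling on anything) is shared_buffers around 25% of RAM and effective_cache_size around 75%. A quick sizing sketch on Linux:

```shell
# Derive common starting values for the memory settings from total RAM.
# Rules of thumb only - the right values depend on your workload.
mem_kb=$(awk '/MemTotal/ {print $2}' /proc/meminfo)
mem_mb=$(( mem_kb / 1024 ))
echo "shared_buffers = $(( mem_mb / 4 ))MB"            # ~25% of RAM
echo "effective_cache_size = $(( mem_mb * 3 / 4 ))MB"  # ~75% of RAM
```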
I wouldn't let NFS anywhere near it. I'd ideally want something that looks to the VM like a locally mounted disk, whether that be really local or served from a SAN or iSCSI or ...
https://www.slideshare.net/jkshah/best-practices-of-running-postgresql-in-virtual-environments has some hints on VM-specific things to consider.
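If someone else provisions the storage, it's worth checking what the data directory actually sits on before going to production. A quick sketch (the path is illustrative; substitute your own data directory):

```shell
# Check the filesystem type backing the data directory (path is an example).
# If this reports "nfs", push back on the provisioning.
df -T /var/lib/postgresql/9.6/main
# stat -f prints just the filesystem type:
stat -f -c %T /var/lib/postgresql/9.6/main
```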
Cheers,
Steve
David Gauthier <davegauthierpg@gmail.com> writes:
This is a hard question to answer as there are too many
variables. However, I would say that my experience has been that most
large organisations are pretty much using VMs for everything, so asking
for something on a 'bare metal' basis is likely to result in your
request needing special attention and justification. On the other hand,
if you make a standard request, it will likely be fulfilled more
rapidly. Sys admins are likely to be resistant to a bare metal setup if
their infrastructure is based around VMs due to the additional work and
maintenance overheads.
All our databases are running on VMs. Some of them are storing fairly
large amounts of data (e.g. one application stores large amounts of
weather data - adding 650 million records a day with a consolidation
after every 100 days. Currently, the DB is using about 6Tb, so not huge,
but not insignificant). Disk storage is via SAN.
Getting the right performance will require tweaking of memory, cpus
etc. The good news is that adding additional memory and CPUs is
relatively trivial.
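As a concrete illustration of how trivial the adjustment side is: once the VM team has grown the instance, the matching PostgreSQL settings can be changed without hand-editing postgresql.conf, since 9.6 supports ALTER SYSTEM. Values and paths below are purely illustrative:

```shell
# After a RAM/CPU bump, adjust matching settings (illustrative values).
# ALTER SYSTEM writes to postgresql.auto.conf.
psql -U postgres -c "ALTER SYSTEM SET effective_cache_size = '12GB';"
psql -U postgres -c "ALTER SYSTEM SET shared_buffers = '4GB';"
psql -U postgres -c "SELECT pg_reload_conf();"  # picks up reloadable settings
# shared_buffers itself only takes effect after a full restart:
pg_ctl restart -D /var/lib/postgresql/9.6/main
```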
For our situation, VMs have been fine and there have been some advantages
with SAN storage infrastructure, such as fast snapshots for backups
etc. In general, I usually find it best to work with the system admins
and follow their recommendations. Provide them with details of your
performance requirements and where you feel resource demands may peak
and let them propose what they feel would be best suited given whatever
infrastructure they have.
Tim
--
Tim Cross
On Mon, Mar 05, 2018 at 09:51:53AM -0800, Steve Atkins wrote:
I've been running postgresql instances on ESXi VMs for years with no
issues. I've not benchmarked them, but performance has been good
enough despite their running on fairly wimpy hardware. Performance
relative to bare metal is probably going to be dominated by disk IO,
and depending on how you're hosting VMs that can be anywhere between
pretty good and terrible - in a large corporation I'd expect it to be
pretty good. Just don't skimp on RAM - having your hot data in the
filesystem cache is always good and can make high latency storage
tolerable.
One thing to be very careful about is the backup strategy of your
PostgreSQL instances. I would recommend primarily using PostgreSQL
in-core tools like pg_basebackup to do the work and make sure that
things are consistent. Users tend to rely a lot on VM snapshots,
particularly quiesced snapshots without a memory footprint, but those
can cause data corruption unless appropriate pre-freeze and post-thaw
scripts freeze the file systems while the snapshot is taken (and use
separate partitions for the data folder, pg_wal and the logs!), so this
requires extra work on your side. I am talking about VMware technology
here, but you will find plenty of supposedly useful VM-level backup
technologies elsewhere. Be careful with those as well when it comes to
database backups. You may think your backups are safe, until you see a
corruption which has been hidden for weeks.
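A minimal pg_basebackup invocation along those lines (host, user and destination are placeholders; assumes a replication-capable role on a running 9.6 server) might look like:

```shell
# Consistent physical backup with the in-core tool (9.6 syntax).
# -X stream ships the WAL needed to make the backup self-consistent;
# --checkpoint=fast avoids waiting for the next scheduled checkpoint.
pg_basebackup -h db.example.com -U replicator \
    -D /backups/base_$(date +%Y%m%d) \
    -X stream -P --checkpoint=fast
```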
--
Michael
First - NEVER USE NFS TO STORE DATA YOU DON'T WANT TO LOSE. That said, what
you want to host on depends a lot on whether your system is typically CPU
bound or I/O bound. A VM for the computational side is generally quite
fine. If you're seriously CPU bound then you're likely to want to cluster
the thing and/or use PG10 if you can take advantage of parallel requests.
Once you get I/O bound things get trickier. AWS has horrible I/O
characteristics compared to any "bare metal" solution out there for
example. Yes, you can buy IOPS, but then you have incredibly expensive slow
I/O characteristics. If you're I/O bound your best solution is to host
elsewhere if possible. We have clients who cannot and they're paying a lot
more as a result sadly.
A great way to host PG is inside Docker containers, and there are some
excellent Kubernetes solutions coming along. It is best if you can mount
your data on a host file system rather than a data volume container. The
reasons for that may be less strong than before (that was one area where
early Docker had defects), but we still see better I/O performance when
pushed. That said, I am aware of people happy with their deployments using
volume containers, although I don't know their I/O profiles well. Anyway
- Docker can be run within VMs or directly on bare metal quite easily and
is a great way to compare the impact of the two.
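A sketch of the host-filesystem approach described above, using the official postgres image (tag, password and host path are illustrative):

```shell
# Run PostgreSQL 9.6 in Docker with the data directory bind-mounted
# from the host filesystem rather than kept in a volume container.
docker run -d --name pg96 \
    -e POSTGRES_PASSWORD=changeme \
    -v /srv/pgdata:/var/lib/postgresql/data \
    -p 5432:5432 \
    postgres:9.6
```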
Oh - and lots of memory is always good no matter what as others have said.
Good luck,
-- Ben
On Mon, Mar 5, 2018 at 11:53 PM, David Gauthier <davegauthierpg@gmail.com>
wrote:
scherrey@proteus-tech.com wrote:
Oh - and lots of memory is always good no matter what as others have said.
I'm probably one of "the others" here. I have already seen really large
installations, with something like 6TB of RAM. Dealing with that is a
completely different universe of problems because of NUMA - you cannot
really have that much RAM without multiple sockets, since every processor
has a maximum memory capacity. On top of that, those processors need to
communicate with each other, and the hardware and algorithms for that
aren't perfect yet (I would rather say they are underdeveloped).
So - more memory is a good rule of thumb, but the sky isn't the limit :)
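On a multi-socket box you can see this topology for yourself, assuming the numactl package is installed:

```shell
# Show NUMA topology: nodes, the CPUs attached to each, memory per node,
# and the relative access distances between nodes.
numactl --hardware
# Per-node allocation statistics (local-node hits vs. remote misses):
numastat
```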