PostgreSQL scalability concerns
Hi,
We are currently planning the deployment of our next generation enterprise database and we are wondering whether or not PostgreSQL could do the heavy lifting that would be required. My post is a little bit long but I hope it will provide you with information to allow someone to provide a definitive answer.
First, a little history about our current setup. We are currently running SQL Server 2000 on Windows 2000 Advanced Server on HA-clustered Dell boxes with 4 CPUs and 8 gigabytes of RAM. This is attached via fiber to an EMC Clariion solution. Our current database is around 250 gigabytes (including the index files) and has averaged about 60 gigabytes of growth per year. We have around 200 concurrent users who use the database heavily.
We are currently in the planning stages of our next generation database system. We expect dramatic growth in the coming years and would like to design a database solution that will last at least 5 years. Within two years, we estimate that we will have around 500 concurrent users and estimate that our database will grow to around 500 gigabytes. Within five years, we estimate that we will have around 1000 concurrent users and estimate that our database will grow to around 1 terabyte.
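A quick sanity check of these figures, as a rough sketch only: the constants come from the numbers stated above, while the function names and the choice of a compound-growth model are illustrative assumptions. It compares a straight-line projection at the historical 60 gigabytes per year with the annual growth rate that the 500 gigabyte and 1 terabyte targets would imply.

```python
# Figures taken from the post; everything else here is an illustrative assumption.
CURRENT_GB = 250
HISTORICAL_GROWTH_GB_PER_YEAR = 60


def linear_projection(years):
    """Size in GB if growth simply continues at the historical rate."""
    return CURRENT_GB + HISTORICAL_GROWTH_GB_PER_YEAR * years


def implied_annual_rate(target_gb, years):
    """Compound annual growth rate needed to reach target_gb from CURRENT_GB."""
    return (target_gb / CURRENT_GB) ** (1 / years) - 1


for years, target_gb in [(2, 500), (5, 1000)]:
    print(
        f"{years} years: linear projection {linear_projection(years)} GB, "
        f"planning target {target_gb} GB, "
        f"implied growth {implied_annual_rate(target_gb, years):.1%} per year"
    )
```

The targets sit well above the straight-line projection, which matches the expectation stated here that growth will accelerate rather than continue at the historical rate.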
The major concern we have is that we expect database activity to increase dramatically over the current utilization. Besides the planned increase in the number of employees, there will also be increased database resource utilization per employee, as management is pushing to increase performance per employee and to use more data analysis to measure the success of the business. So it is very important that we implement a solution that can scale well.
This will be an enterprise-quality solution. On the hardware side, we are leaning toward either an EMC or a NetApp SAN solution. For the operating system, we plan to deploy either RHEL, as it provides migration from AMD64 to Itanium/Power, or Solaris, as it provides migration from AMD64 to SPARC. On the database end, the possible options include Oracle, DB2, Sybase, or PostgreSQL. We would prefer to go with PostgreSQL due to the dramatic cost savings we can achieve. We have come to discover that as expensive as the hardware/OS solution is going to be, the commercial database costs will dwarf those costs.
As this will be our core database and all of our worldwide branches will be completely dependent on it, we need to make sure that it can perform, scale upwards, and provide high availability features. I already know that PostgreSQL provides high availability. About the other two, performance and scalability, I am uncertain. Will PostgreSQL be able to handle this job? What do we need to look out for if we are to do such a deployment? What is the largest database someone has deployed in production? Largest table? Any help with this situation will be greatly appreciated.
On Wednesday 15 March 2006 18:14, Alen Garia - IT wrote:
Hi,
We are currently planning the deployment of our next generation
enterprise database and we are wondering whether or not PostgreSQL could do
the heavy lifting that would be required. My post is a little bit long but
I hope it will provide you with information to allow someone to provide a
definitive answer.
The definitive answer is yes, PostgreSQL can handle this. You'll need to make
sure you have good hardware that matches the nature of your app (oltp/olap
and/or web/desktop). You'll probably want something that can do connection
pooling. You can get more help on the -performance list too, just make sure
you provide specifics. You might also want to look into getting commercial
support, though choice questions to the mailing list might be enough to steer
you on the right path.
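To illustrate the connection pooling idea, here is a minimal sketch, not a recommendation of any particular tool. The `ConnectionPool` and `FakeConnection` names are hypothetical, and `FakeConnection` merely stands in for a real database connection so the example runs on its own. The point is that many clients can share a small, fixed number of server connections.

```python
import queue
from contextlib import contextmanager


class ConnectionPool:
    """Hypothetical fixed-size pool: opens `size` connections up front and reuses them."""

    def __init__(self, connect, size):
        self._idle = queue.Queue(maxsize=size)
        for _ in range(size):
            self._idle.put(connect())

    @contextmanager
    def connection(self, timeout=None):
        # Blocks until a connection is free (or raises queue.Empty on timeout).
        conn = self._idle.get(timeout=timeout)
        try:
            yield conn
        finally:
            # Always hand the connection back, even if the caller raised.
            self._idle.put(conn)


class FakeConnection:
    """Stand-in for a real database connection (assumption for this sketch)."""

    opened = 0

    def __init__(self):
        FakeConnection.opened += 1

    def execute(self, sql):
        return f"ran: {sql}"


# 1000 requests are served by only 5 underlying connections.
pool = ConnectionPool(FakeConnection, size=5)
results = []
for _ in range(1000):
    with pool.connection() as conn:
        results.append(conn.execute("SELECT 1"))

print(FakeConnection.opened, len(results))
```

In practice an existing pooler (for example pgpool-II or PgBouncer) or a driver-level pool would normally be used instead of hand-written code like this.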
--
Robert Treat
Build A Brighter Lamp :: Linux Apache {middleware} PostgreSQL
On Thu, 2006-16-03 at 13:51 -0500, Robert Treat wrote:
On Wednesday 15 March 2006 18:14, Alen Garia - IT wrote:
Hi,
We are currently planning the deployment of our next generation
enterprise database and we are wondering whether or not PostgreSQL could do
the heavy lifting that would be required. My post is a little bit long but
I hope it will provide you with information to allow someone to provide a
definitive answer.
The definitive answer is yes, PostgreSQL can handle this. You'll need to make
sure you have good hardware that matches the nature of your app (oltp/olap
and/or web/desktop). You'll probably want something that can do connection
pooling. You can get more help on the -performance list too, just make sure
you provide specifics. You might also want to look into getting commercial
support, though choice questions to the mailing list might be enough to steer
you on the right path.
Yes, this also looks like a good fit for Slony or the
Java clustering implementations. Both could provide failover recovery
and load-sharing capabilities.
If you'd like to get some more detailed information, there is a Postgres-HA
webinar on March 23rd that looks very interesting and should answer your
questions.
Direct Link: http://www.postgresql.org/about/event.347
-Daniel
On 3/17/06, Guy Fraser <guy@incentre.net> wrote: