PostgreSQL, clusters and load-balance

Started by Bill Wordsworthabout 18 years ago7 messagesgeneral

bill.wordsworth@gmail.com

about 18 years ago

When traffic goes up, my webserver creates multiple instances of
postgresql.exe. At some basic level, aren't they similar to Oracle's RAC
"clusters", except that they are not aware of each other? Does this mean
that if I were to create copies of postgresql.exe beforehand and somehow
split traffic to them, traffic could be handled better?

Also, if I install postgresql on multiple boxes, how can I load-balance-
configure traffic to be split depending on load, at application or webserver
level?

All this for mostly read-only data.

http://www.postgresql.org/docs/8.3/interactive/high-availability.html
Cheers, Bill

Joshua D. Drake

jd@commandprompt.com

about 18 years ago

In reply to: Bill Wordsworth (#1)

Re: PostgreSQL, clusters and load-balance

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On Tue, 25 Mar 2008 14:16:37 -0400
"Bill Wordsworth" <bill.wordsworth@gmail.com> wrote:

When traffic goes up, my webserver creates multiple instances of
postgresql.exe. At some basic level, aren't they similar to Oracle's
RAC "clusters", except that they are not aware of each other? Does
this mean that if I were to create copies of postgresql.exe
beforehand and somehow split traffic to them, traffic could be
handled better?

Ehh no :).

Each of the postgresql.exe is a forked connection to a specific
database within PostgreSQL. They may be one database (if you only have
one) or many.

You could benefit from connection pooling but I have no idea if we have
a connection pooler that runs on Windows.

Also, if I install postgresql on multiple boxes, how can I
load-balance- configure traffic to be split depending on load, at
application or webserver level?

That is a *long* conversation. It entirely depends on your business
requirements.

All this for mostly read-only data.

http://www.postgresql.org/docs/8.3/interactive/high-availability.html

That is correct, you can not have multiple write nodes.

Sincerely,

Joshua D. Drake
- --
The PostgreSQL Company since 1997: http://www.commandprompt.com/
PostgreSQL Community Conference: http://www.postgresqlconference.org/
United States PostgreSQL Association: http://www.postgresql.us/
Donate to the PostgreSQL Project: http://www.postgresql.org/about/donate

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.6 (GNU/Linux)

iD8DBQFH6UMGATb/zqfZUUQRAoT0AJ44/gXekfNVfcjl6FhiO03GnqOj9gCgqYRy
cmWK72yX/YHFJMqt/6RIZNY=
=kM04
-----END PGP SIGNATURE-----

Thomas Kellerer

spam_eater@gmx.net

about 18 years ago

In reply to: Bill Wordsworth (#1)

Re: PostgreSQL, clusters and load-balance

Bill Wordsworth wrote on 25.03.2008 19:16:

When traffic goes up, my webserver creates multiple instances of
postgresql.exe. At some basic level, aren't they similar to Oracle's RAC
"clusters", except that they are not aware of each other?

No, absolutely not. Each client request is handled by a single postgres process
which is spawned by the postmaster upon connection.

It has nothing to do with "instances" or high-availibility

Thomas

Bill Wordsworth

bill.wordsworth@gmail.com

about 18 years ago

In reply to: Thomas Kellerer (#3)

Re: PostgreSQL, clusters and load-balance

On Tue, Mar 25, 2008 at 2:24 PM, Thomas Kellerer <spam_eater@gmx.net> wrote:

Bill Wordsworth wrote on 25.03.2008 19:16:

When traffic goes up, my webserver creates multiple instances of
postgresql.exe. At some basic level, aren't they similar to Oracle's RAC
"clusters", except that they are not aware of each other?

No, absolutely not. Each client request is handled by a single postgres
process
which is spawned by the postmaster upon connection.

Thanks Joshua and Thomas. I guess my ignorance is showing :). Anyway, is
this spawning being done by postmaster or webserver or both? If postmaster,
does an application-level persistent connection request communicate itself
directly to the postmaster, and can the postmaster keep track of its
spawning?

Also, at some crude level, if I were to direct every alternate connection to
a different install box of postgresql, won't that help with *some*
load-balance?
Cheers, Bill

Rodrigo Gonzalez

rjgonzale@gmail.com

about 18 years ago

In reply to: Bill Wordsworth (#4)

Re: PostgreSQL, clusters and load-balance

Bill Wordsworth escribió:

On Tue, Mar 25, 2008 at 2:24 PM, Thomas Kellerer <spam_eater@gmx.net
<mailto:spam_eater@gmx.net>> wrote:

Bill Wordsworth wrote on 25.03.2008 19:16:

When traffic goes up, my webserver creates multiple instances of
postgresql.exe. At some basic level, aren't they similar to

Oracle's RAC

"clusters", except that they are not aware of each other?

No, absolutely not. Each client request is handled by a single
postgres process
which is spawned by the postmaster upon connection.

Thanks Joshua and Thomas. I guess my ignorance is showing :). Anyway,
is this spawning being done by postmaster or webserver or both? If
postmaster, does an application-level persistent connection request
communicate itself directly to the postmaster, and can the
postmaster keep track of its spawning?

Also, at some crude level, if I were to direct every alternate
connection to a different install box of postgresql, won't that help
with *some* load-balance?
Cheers, Bill

I dont know the first answer, but maybe you are needing pgpool, check
it, I think that is what you are needing...

Thomas Kellerer

spam_eater@gmx.net

about 18 years ago

In reply to: Bill Wordsworth (#4)

Re: PostgreSQL, clusters and load-balance

Bill Wordsworth wrote on 25.03.2008 21:04:

On Tue, Mar 25, 2008 at 2:24 PM, Thomas Kellerer <spam_eater@gmx.net> wrote:

Bill Wordsworth wrote on 25.03.2008 19:16:

When traffic goes up, my webserver creates multiple instances of
postgresql.exe. At some basic level, aren't they similar to Oracle's RAC
"clusters", except that they are not aware of each other?

No, absolutely not. Each client request is handled by a single postgres
process
which is spawned by the postmaster upon connection.

Thanks Joshua and Thomas. I guess my ignorance is showing :). Anyway, is
this spawning being done by postmaster or webserver or both?

By the postmaster - for every connection initiated by the "client" (connection
pool in the app server, richt client, ...)

Thomas

Shane Ambler

pgsql@Sheeky.Biz

about 18 years ago

In reply to: Bill Wordsworth (#4)

Re: PostgreSQL, clusters and load-balance

Bill Wordsworth wrote:

On Tue, Mar 25, 2008 at 2:24 PM, Thomas Kellerer <spam_eater@gmx.net> wrote:

Bill Wordsworth wrote on 25.03.2008 19:16:

When traffic goes up, my webserver creates multiple instances of
postgresql.exe. At some basic level, aren't they similar to Oracle's RAC
"clusters", except that they are not aware of each other?

No, absolutely not. Each client request is handled by a single postgres
process
which is spawned by the postmaster upon connection.

Thanks Joshua and Thomas. I guess my ignorance is showing :). Anyway, is
this spawning being done by postmaster or webserver or both? If postmaster,
does an application-level persistent connection request communicate itself
directly to the postmaster, and can the postmaster keep track of its
spawning?

In simplified terms - you have one backend postgres process that handles
the data storage and caching etc. Then you have one postgres process
running for each client connected to the server at any given time. This
client process handles all requests to and from the client and talks to
the backend process to get the data required for the request. You will
have one postgres client connection running for each concurrent db
connection required by the web server.

With the scripting used for building your web pages - each time you open
a connection you start a postgres client process running as you have
seen happen. Then when you close the connection the client process for
that will finish.

If you are using persistant connections - then when you close a
connection the web server will keep the client process running and use
it again for the next new connection saving time in starting the process up.

Also, at some crude level, if I were to direct every alternate connection to
a different install box of postgresql, won't that help with *some*
load-balance?
Cheers, Bill

All of these postgres processes will be running on the one machine -
this may be the same machine as the web server or a separate one. You
can use replication to store the same data on more than one server and
use all of them for responding to selects for the web server.

Most replication options go for only using one of these servers for
updates and the others for selects only. You can then use pooling
options such as pgpool (or code it into your scripting if you wish) to
distribute your connection requests between these replicated servers.

Shane Ambler
pgSQL (at) Sheeky (dot) Biz

Get Sheeky @ http://Sheeky.Biz