Restricting Postgres
Is there a way to restrict how much load a PostgreSQL server can take
before dropping queries in order to safeguard the server? I was
looking at the login.conf (5) man page and while it allows me to limit
by processor time this seems to not fit my specific needs.
Essentially, I am looking for a sort of functionality similar to what
Sendmail and Apache have. Once the load of the system reaches a
certain defined limit the daemon drops tasks until such a time that it
can resume normal operation.
While not necessarily common on my servers I have witnessed some fairly
high load averages which may have led to the machine dropping outright.
Any help on this matter would be appreciated.
--
Martin Foster
Creator/Designer Ethereal Realms
martin@ethereal-realms.org
On Tue, Nov 02, 2004 at 11:52:12PM +0000, Martin Foster wrote:
Is there a way to restrict how much load a PostgreSQL server can take
before dropping queries in order to safeguard the server? I was
Well, you could limit the number of concurrent connections, and set
the query timeout to a relatively low level. What that ought to mean
is that, under heavy load, some queries will abort.
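In postgresql.conf terms, that combination might look something like this (values are purely illustrative, not recommendations):

```
# postgresql.conf -- illustrative values only
max_connections = 50         # cap the number of concurrent backends
statement_timeout = 30000    # abort any statement running longer than 30s (value in ms)
```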
A
--
Andrew Sullivan | ajs@crankycanuck.ca
When my information changes, I alter my conclusions. What do you do sir?
--attr. John Maynard Keynes
On Tue, 2004-11-02 at 23:52, Martin Foster wrote:
Is there a way to restrict how much load a PostgreSQL server can take
before dropping queries in order to safeguard the server? I was
looking at the login.conf (5) man page and while it allows me to limit
by processor time this seems to not fit my specific needs.
Essentially, I am looking for a sort of functionality similar to what
Sendmail and Apache have. Once the load of the system reaches a
certain defined limit the daemon drops tasks until such a time that it
can resume normal operation.
Sounds great... could you give more shape to the idea, so people can
comment on it?
What limit? Measured how? Normal operation is what?
Drop what? How to tell?
While not necessarily common on my servers I have witnessed some fairly
high load averages which may have led to the machine dropping outright.
Any help on this matter would be appreciated.
You can limit the number of connections overall?
--
Best Regards, Simon Riggs
Simon Riggs wrote:
On Tue, 2004-11-02 at 23:52, Martin Foster wrote:
[...]
Sounds great... could you give more shape to the idea, so people can
comment on it? What limit? Measured how? Normal operation is what?
Drop what? How to tell?
Let's use an example from Apache: there is the Apache::LoadAvgLimit
mod_perl module, which allows one to limit based on the system load
averages. Here is an example of the configuration one would find:
<Location /perl>
PerlInitHandler Apache::LoadAvgLimit
PerlSetVar LoadAvgLimit_1 3.00
PerlSetVar LoadAvgLimit_5 2.00
PerlSetVar LoadAvgLimit_15 1.50
PerlSetVar LoadAvgRetryAfter 120
</Location>
The end state is simple: once the one-minute load average moves above
3.00, the web server will not process the CGI scripts or mod_perl
applications under that directory. Instead it will return a 503 error
and save the system from being crushed by ever-increasing load
averages.
Only once the load average is below the defined limits will the server
process requests as normal. This is not necessarily the nicest or
cleanest way of doing things, but it does allow the Apache web server to
prevent a collapse.
There are ways of restricting the size of files, number of concurrent
processes and even memory being used by a daemon. This can be done
through ulimit or the login.conf file if your system supports it.
However, there is no way to restrict based on load averages, only
processor time, which is ineffective for a perpetually running daemon
like PostgreSQL.
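To give the idea shape, here is a minimal sketch in Python of the kind of load-average gate Apache::LoadAvgLimit applies; the thresholds and function names are invented for illustration, and a real implementation would sit in front of whatever actually handles the queries:

```python
import os

# Illustrative ceilings mirroring the LoadAvgLimit_1/_5/_15 settings above.
LIMITS = (3.00, 2.00, 1.50)

def over_limit(loadavg=None):
    """True if any of the 1/5/15-minute load averages exceeds its ceiling."""
    if loadavg is None:
        loadavg = os.getloadavg()  # (1min, 5min, 15min)
    return any(avg > limit for avg, limit in zip(loadavg, LIMITS))

def handle(request, loadavg=None):
    """Shed work with a 503-style refusal while the box is overloaded."""
    if over_limit(loadavg):
        return (503, "Service Unavailable: load too high, retry later")
    return (200, "processed: %s" % request)
```

Once the averages drop back under the limits, handle() resumes normal processing, just as the Apache module does.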
While not necessarily common on my servers I have witnessed some fairly
high load averages which may have led to the machine dropping outright.
Any help on this matter would be appreciated.
You can limit the number of connections overall?
Limiting concurrent connections is not always the solution to the
problem. Problems can occur when there is a major spike in activity
that would be considered abnormal, due to outside conditions.
For example, using Apache::DBI or pgpool, the DBMS may be required to
spawn a great deal of child processes in short order. This in turn can
cause a major spike in processor load, and if high-demand queries keep
running unchecked the system can literally increase in load until the
server buckles.
I've seen this behavior before when restarting the web server during
heavy loads. Apache goes from zero connections to a solid 120,
causing PostgreSQL to spawn that many children in short order just to
keep up with the demand.
PostgreSQL pays a penalty when spawning a new child and accepting a
connection; this slowdown takes resources at every level to accomplish.
Meanwhile, clients on the web server are hitting the server at an
accelerated rate because of the slowed response, leading to even more
demand being placed on both machines.
In most cases the processor will be taxed and the load average high
enough to cause a noticeable delay even when using a console; however, it
will generally recover... slowly, or in rare cases crash outright. In
such a circumstance, having the database server refuse queries when the
sanity of the system is at stake might come in handy.
Of course, I am not blaming PostgreSQL; there are probably some
instabilities in the AMD64 port of FreeBSD 5.2.1 for dual-processor
systems that lead to an increased chance of failure instead of recovery.
However, if there was a way to prevent the process from reaching
those limits, it may avoid the problem altogether.
Martin Foster
Creator/Designer Ethereal Realms
martin@ethereal-realms.org
Martin Foster wrote:
Simon Riggs wrote:
On Tue, 2004-11-02 at 23:52, Martin Foster wrote:
[...]
I've seen this behavior before when restarting the web server during
heavy loads. Apache goes from zero connections to a solid 120,
causing PostgreSQL to spawn that many children in a short order of time
just to keep up with the demand.
But wouldn't limiting the number of concurrent connections do this at
the source? If you tell it "You can have at most 20 connections",
you would never have postgres spawn 120 children.
I'm not sure what apache does if it can't get a DB connection, but it
seems exactly like what you want.
Now, if you expected to have 50 clients that all like to just sit on
open connections, you could leave the number of concurrent connections high.
But if your only connections are from the webserver, where all of them are
designed to be short connections, then leave the max low.
The other possibility is having the webserver use connection pooling, so
it uses a few long lived connections. But even then, you could limit it
to something like 10-20, not 120.
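John's cap can be sketched as a semaphore-guarded pool; a hedged illustration only, with MAX_CONN and run_query invented here rather than taken from any real pooling library:

```python
import threading

MAX_CONN = 20  # hard ceiling, well below the 120 Apache children

_slots = threading.BoundedSemaphore(MAX_CONN)

def run_query(connect, sql, timeout=5.0):
    """Open a connection and run sql, but only if one of MAX_CONN
    slots is free; otherwise give up so the caller can shed load."""
    if not _slots.acquire(timeout=timeout):
        raise RuntimeError("no DB slot free; shed this request")
    try:
        conn = connect()  # any DB-API-style connection factory
        try:
            return conn.execute(sql)
        finally:
            conn.close()
    finally:
        _slots.release()
```

With this in place, postgres can never be asked for more than MAX_CONN backends at once, however many web children exist.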
John
=:->
John A Meinel wrote:
[...]
But wouldn't limiting the number of concurrent connections do this at
the source? If you tell it "You can have at most 20 connections",
you would never have postgres spawn 120 children.
[...]
I have a dual processor system that can support over 150 concurrent
connections handling normal traffic and load. Now suppose I set up
Apache to spawn all of its children instantly; as this happens, the
PostgreSQL server will also receive 150 connection attempts.
This will spawn 150 children in short order, and as this takes place
clients can connect and start requesting information, never allowing
the machine to settle down to normal traffic. That spike, once
initiated, can cripple the machine or even the webserver if a
deadlocked transaction is introduced.
On the webserver side, a slowdown in the database means that it
will just get that many more connection attempts pooled from the
clients. As they keep clicking and hitting reload over and over to get
a page load, that server starts to buckle, hitting unbelievably high load
averages.
When the above happened once, I lost the ability to type on a console
because of a 60+ (OpenBSD) load average on a single processor system.
That is the reason why Apache now returns a 503 Service Unavailable
when loads get too high.
It's that spike I worry about, and it can happen for whatever reason. It
could just as easily be triggered by a massive concurrent request for
processing of an expensive query done in DDoS fashion. This may not
affect the webserver at all, at least immediately, but the same effect
can come into play on the database.
Limiting connections helps, but it's not a silver bullet, and it limits
your ability to support more connections because of that initial spike.
The penalty for forking a new child is hardly unexpected; even
Apache will show the same effect when restarted at a high-traffic time.
Martin Foster
Creator/Designer Ethereal Realms
martin@ethereal-realms.org
I have a dual processor system that can support over 150 concurrent
connections handling normal traffic and load. Now suppose I setup
Apache to spawn all of it's children instantly, what will
...
This will spawn 150 children in a short order of time and as
this takes
"Doctor, it hurts when I do this!"
"Well, don't do that then..."
Sorry, couldn't resist ;-)
Our Apache/PG driven website also needs to be able to deal with occasional
large peaks, so what we do is:
StartServers     15   # Don't create too many children initially
MinSpareServers  10   # Always have at least 10 spares lying around
MaxSpareServers  20   # But no more than 20
MaxClients      150   # Up to 150 - the default 256 is too much for our RAM
So on server restart 15 Apache children are created, then one new child
every second up to a maximum of 150.
Apache's 'ListenBackLog' is around 500 by default, so there's plenty of
scope for queuing inbound requests while we wait for sufficient children to
be spawned.
In addition we (as _every_ high load site should) run Squid as an
accelerator, which dramatically increases the number of client connections
that can be handled. Across 2 webservers at peak times we've had 50,000
concurrently open http & https client connections to Squid, with 150 Apache
children doing the work that squid can't (i.e. all the dynamic stuff), and
PG (on a separate box of course) whipping through nearly 800 mixed selects,
inserts and updates per second - and then had to restart Apache on one of
the servers for a config change... Not a problem :-)
One little tip - if you run squid on the same machine as apache, and use a
dual-proc box, then because squid is single-threaded it will _never_ take
more than half the CPU - nicely self balancing in a way.
M
On Wed, 2004-11-03 at 21:25, Martin Foster wrote:
Simon Riggs wrote:
On Tue, 2004-11-02 at 23:52, Martin Foster wrote:
[...]
Let's use an example from Apache: there is the Apache::LoadAvgLimit
mod_perl module, which allows one to limit based on the system load
averages.
[...]
However, there is no way to restrict based on load averages, only
processor time, which is ineffective for a perpetually running daemon
like PostgreSQL.
All workloads are not created equally, so mixing them can be tricky.
This will be better in 8.0 because seq scans don't spoil the cache.
Apache is effectively able to segregate the workloads because each
workload is "in a directory". SQL isn't stored anywhere for PostgreSQL
to say "just those ones please", so defining which statements are in
which workload is the tricky part.
PostgreSQL workload management could look at userid, tables, processor
load (?) and estimated cost to decide what to do.
There is a TODO item on limiting numbers of connections per
userid/group, in addition to the max number of sessions per server.
Perhaps the easiest way would be to have the Apache workloads segregated
by PostgreSQL userid, then limit connections to each.
For example, using Apache::DBI or pgpool, the DBMS may be required to
spawn a great deal of child processes in short order. [...]
That's been nicely covered off by John and Matt on the other threads, so
you're sorted out for now, and it doesn't look like a bug in PostgreSQL.
Of course, I am not blaming PostgreSQL, there are probably some
instabilities in the AMD64 port of FreeBSD 5.2.1 for dual processor
systems that lead to an increased chance of failure instead of recovery.
Good!
--
Best Regards, Simon Riggs
Matt Clark wrote:
[...]
In addition we (as _every_ high load site should) run Squid as an
accelerator, which dramatically increases the number of client
connections that can be handled. [...]
I've heard of the merits of Squid used as a reverse proxy.
However, well over 99% of my traffic is dynamic, which is why I may be
experiencing behavior that people normally do not expect.
As I have said before in previous threads, the scripts are completely
database driven and at the time the database averaged 65 queries per
second under MySQL before a migration, while the webserver was averaging
2 to 4.
Martin Foster
Creator/Designer Ethereal Realms
martin@ethereal-realms.org
Simon Riggs wrote:
[...]
Perhaps the easiest way would be to have the Apache workloads segregated
by PostgreSQL userid, then limit connections to each.
Apache has a global setting for load average limits; the above was just
a module which extended the capability. It might also make sense to
have limitations set on schemas, which could be used in a similar way to
Apache directories.
While for most people having the database protect itself against a
sudden surge of high traffic would be undesirable, it can help those who
run dynamically driven sites and get slammed by Slashdot, for example.
Martin Foster
Creator/Designer Ethereal Realms
martin@ethereal-realms.org
I am generally interested in a good solution for this. So far our
solution has been to increase the hardware to the point of allowing
800 connections to the DB.
I don't have the mod loaded for Apache, but we haven't had too many
problems there. The site is split pretty well between dynamic and
non-dynamic; it's largely Flash with several plugins to the DB.
However we still can be, and have been, slammed up to the point of the
800 connections.
What I don't get is why not use pgpool? This should eliminate the
rapid-fire forking of postgres instances in the DB server. I'm
assuming your app can safely handle a failure to connect to the DB
(i.e. exceeding the number of DB connections). If not, it should be
fairly simple to send a 503 header when it's unable to get the
connection.
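That fallback is only a few lines in any CGI-style handler; a sketch, with connect standing in for whatever DB connection call the application actually uses:

```python
def render_page(connect, build_page):
    """Try for a DB connection; on failure, return a 503 instead of
    letting requests pile up behind a struggling database."""
    try:
        conn = connect()
    except Exception:
        return ("Status: 503 Service Unavailable\r\n"
                "Retry-After: 120\r\n\r\n"
                "Server busy, please retry shortly.")
    try:
        return "Status: 200 OK\r\n\r\n" + build_page(conn)
    finally:
        conn.close()
```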
On Thu, 04 Nov 2004 08:17:22 -0500, Martin Foster
<martin@ethereal-realms.org> wrote:
[...]
---------------------------(end of broadcast)---------------------------
TIP 4: Don't 'kill -9' the postmaster
Kevin Barnard wrote:
[...]
What I don't get is why not use pgpool? This should eliminate the
rapid-fire forking of postgres instances in the DB server. [...]
Note that I am not necessarily looking for a PostgreSQL solution to the
matter, just a way to prevent the database from killing off the server
it sits on by looking at the load averages.
I have attempted to make use of pgpool and have had some very poor
performance. Error messages were constantly being raised, load
averages on that machine seemed to skyrocket, and it just seemed not
to be suited to my needs.
Apache::DBI overall works better for what I require, even if it is not a
pool per se. Now if pgpool supported variable rate pooling like
Apache does with its children, it might help to even things out. That,
and you'd still get the spike if you have to start the webserver and
database server at or around the same time.
Martin Foster
Creator/Designer Ethereal Realms
martin@ethereal-realms.org
Apache::DBI overall works better for what I require, even if it is not a
pool per se. Now if pgpool supported variable rate pooling like
Apache does with its children, it might help to even things out. That,
and you'd still get the spike if you have to start the webserver and
database server at or around the same time.
I still don't quite get it though - you shouldn't be getting more than one
child per second being launched by Apache, so that's only one PG postmaster
per second, which is really a trivial load. That is unless you have
'StartServers' set high, in which case the 'obvious' answer is to lower it.
Are you launching multiple DB connections per Apache process as well?
Matt - Very interesting information about squid effectiveness, thanks.
Martin,
You mean your site had no images? No CSS files? No JavaScript files? Nearly
everything is dynamic?
I've found that our CMS spends more time sending a 23KB image to a dial up
user than it does generating and serving dynamic content.
This means that if you have a "light" squid process that caches and serves
your images and static content from its cache, then your apache processes
can truly focus on only the dynamic data.
Case in point: A first time visitor hits your home page. A dynamic page is
generated (in about 1 second) and served (taking 2 more seconds) which
contains links to 20 additional files (images, styles, etc.). Then
expensive apache processes are used to serve each of those 20 files, which
takes an additional 14 seconds. Your precious application server processes
have now spent 14 seconds serving stuff that could have been served by an
upstream cache.
I am all for using upstream caches and SSL accelerators to take the load off
of application servers. My apache children often take 16 or 20MB of RAM
each. Why spend all of that on a 1.3KB image?
Just food for thought. There are people who use proxying in apache to
redirect expensive tasks to other servers that are dedicated to just one
heavy challenge. In that case you likely do have 99% dynamic content.
Matthew Nuzum | Makers of "Elite Content Management System"
www.followers.net | View samples of Elite CMS in action
matt@followers.net | http://www.followers.net/portfolio/
-----Original Message-----
From: pgsql-performance-owner@postgresql.org On Behalf Of Martin Foster
[...]
Case in point: A first time visitor hits your home page. A
dynamic page is generated (in about 1 second) and served
(taking 2 more seconds) which contains links to 20 additional
The gain from an accelerator is actually even more than that, as it takes
essentially zero seconds for Apache to return the generated content (which
in the case of a message board could be quite large) to Squid, which can
then feed it slowly to the user, leaving Apache free again to generate
another page. When serving dialup users large dynamic pages this can be a
_huge_ gain.
I think Martin's pages (dimly recalling another thread) take a pretty long
time to generate though, so he may not see quite such a significant gain.
Myself, I like a small Apache with few modules serving static files (no
dynamic content, no db connections), and with a mod_proxy on a special
path directed to another Apache which generates the dynamic pages (few
processes, persistent connections...)
You get the best of both: static files do not hog DB connections, and the
second apache sends generated pages very fast to the first, which then
trickles them down to the clients.
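In Apache configuration terms the split might look like the sketch below (paths and the port are invented for illustration): the front server handles static files and proxies only the dynamic path to the back server:

```
# Front (static) Apache: light, no DB-related modules loaded
ProxyPass        /app http://127.0.0.1:8081/app
ProxyPassReverse /app http://127.0.0.1:8081/app

# Back (dynamic) Apache listening on 8081: few children,
# persistent DB connections
Listen 8081
MaxClients 20
```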
Matt Clark wrote:
[...]
I think Martin's pages (dimly recalling another thread) take a pretty long
time to generate though, so he may not see quite such a significant gain.
Correct: 75% of all hits are on a script that can take anywhere from
a few seconds to half an hour to complete. The script essentially
auto-flushes to the browser so they get new information as it arrives,
creating the illusion of on-demand generation.
A squid proxy would probably cause severe problems when dealing with a
script that does not complete its output for a variable amount of time.
As for images, CSS, JavaScript and such, the site makes use of them, but
in the grand scheme of things the amount of traffic they tie up is
literally inconsequential. Though I will probably move all of that
onto another server just to give the main server the capability of
dealing with almost exclusively dynamic content.
Martin Foster
Creator/Designer Ethereal Realms
martin@ethereal-realms.org
Matt Clark wrote:
[...]
I still don't quite get it though - you shouldn't be getting more than one
child per second being launched by Apache, so that's only one PG postmaster
per second, which is really a trivial load. That is unless you have
'StartServers' set high, in which case the 'obvious' answer is to lower it.
Are you launching multiple DB connections per Apache process as well?
I have StartServers set to a fairly high limit. However, this would
make little difference overall if I restarted the webservers to load in
new modules during a high-load time. When I am averaging 145
concurrent connections before a restart, I can expect that many requests
to hit the server once Apache begins to respond.
As a result, it will literally cause a spike on both machines as new
connections are initiated at a high rate. In my case I don't always
have the luxury of waiting till 0300 just to test a change.
Again, I am not necessarily looking for a PostgreSQL solution. I am
looking for a method that would allow the database or the OS itself to
protect the system it's hosted on. If both the database and the Apache
server were on the same machine, this type of scenario would be unstable
to say the least.
Martin Foster
Creator/Designer Ethereal Realms
martin@ethereal-realms.org
Correct: 75% of all hits are on a script that can take anywhere from
a few seconds to half an hour to complete. The script essentially
auto-flushes to the browser so they get new information as it arrives,
creating the illusion of on-demand generation.
This is more like a streaming data server, which is a very different beast
from a webserver, and probably better suited to the job. Usually either
multithreaded or single-process using select() (just like Squid). You could
probably build one pretty easily. Using a 30MB Apache process to serve one
client for half an hour seems like a hell of a waste of RAM.
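For illustration, the single-process select() pattern can be sketched in a few lines of Python (using the selectors wrapper around select(); serve and make_chunk are invented names, and a real server would add timeouts, write buffering and error handling):

```python
import selectors

def serve(listen_sock, make_chunk, rounds=1):
    """Trickle data to many slow clients from one process: each
    readiness event costs one small handler call, not a 30MB child."""
    sel = selectors.DefaultSelector()
    sel.register(listen_sock, selectors.EVENT_READ, data="accept")
    for _ in range(rounds):
        for key, _mask in sel.select(timeout=1.0):
            if key.data == "accept":
                conn, _addr = key.fileobj.accept()
                sel.register(conn, selectors.EVENT_WRITE, data="client")
            else:
                key.fileobj.sendall(make_chunk())  # next chunk for this client
    sel.close()
```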
A squid proxy would probably cause severe problems when dealing with a
script that does not complete its output for a variable amount of time.
No, it's fine, squid gives it to the client as it gets it, but can receive
from the server faster.
Matt Clark wrote:
[...]
This is more like a streaming data server, which is a very different beast
from a webserver, and probably better suited to the job. Usually either
multithreaded or single-process using select() (just like Squid). You could
probably build one pretty easily. [...]
These are CGI scripts at the lowest level, nothing more and nothing
less. While I could probably embed a small webserver directly into the
Perl scripts and run that as a daemon, it would take away the
portability that the scripts currently offer.
This should be my last question on the matter: does squid report the
proper IP address of the clients themselves? That's a critical
requirement for the scripts.
Martin Foster
Creator/Designer Ethereal Realms
martin@ethereal-realms.org