Service outage: each postgres process use the exact amount of the configured work_mem

Started by Moreno Andreoalmost 9 years ago17 messagesgeneral

moreno.andreo@evolu-s.it

almost 9 years ago

Melvin Davidson

melvin6925@gmail.com

almost 9 years ago

In reply to: Moreno Andreo (#1)

Re: Service outage: each postgres process use the exact amount of the configured work_mem

On Fri, Apr 14, 2017 at 1:12 PM, Moreno Andreo <moreno.andreo@evolu-s.it>
wrote:

Hi all,
About 2 hours and half ago, suddenly (and on the late afternoon of the
Easter Friday), customers reported failing connections to our server, or
even very slow.
After a bit of checking (that also involved server reboot) I noticed
(using top) that every process regarding postgres is using exactly the
amout I configured as work_mem (3 GB). And AFAIK it's not good.

30085 postgres 20 0 *3370048* 156656 153876 S 6.7 0.3 0:00.20 postgres
29833 postgres 20 0 *3370000* 65260 62416 S 1.7 0.1 0:00.17 postgres
29632 postgres 20 0 *3372468* 11712 6028 S 0.7 0.0 0:00.60 postgres

What can be happened?
Nothing has been touched....
postgresql 9.5.6 on debian 8 just apt-get upgrade'd

Any help would be appreciated.
Moreno.

*>using exactly the amout I configured as work_mem (3 GB).*

*You are right, that is bad, but that is your own fault. 3GB of work_mem is
very bad, Try lowing in to something more reasonable, like 20GB.*

*https://www.postgresql.org/docs/9.5/static/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-MEMORY
<https://www.postgresql.org/docs/9.5/static/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-MEMORY>"several
running sessions could be doing such operations concurrently. Therefore,
the total memory used could be many times the value of work_mem; it is
necessary to keep this fact in mind when choosing the value."*

--
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

Moreno Andreo

moreno.andreo@evolu-s.it

almost 9 years ago

In reply to: Melvin Davidson (#2)

Re: Service outage: each postgres process use the exact amount of the configured work_mem

<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">Sorry,<br>
my mistake (I'm a bit nervous...)<br>
<br>
that's not work_mem, but shared_buffers<br>
<br>
Thanks<br>
<br>
Il 14/04/2017 19:33, Melvin Davidson ha scritto:<br>
</div>
<blockquote
cite="mid:CANu8Fixpu_dOGR674jXCPO3TD_q08gbXpCB9hQM1tEWzGKSP6g@mail.gmail.com"
type="cite">
<div dir="ltr"><br>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Fri, Apr 14, 2017 at 1:12 PM,
Moreno Andreo <span dir="ltr"><<a moz-do-not-send="true"
href="mailto:moreno.andreo@evolu-s.it" target="_blank">moreno.andreo@evolu-s.it</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF"> Hi all,<br>
About 2 hours and half ago, suddenly (and on the late
afternoon of the Easter Friday), customers reported
failing connections to our server, or even very slow.<br>
After a bit of checking (that also involved server
reboot) I noticed (using top) that every process
regarding postgres is using exactly the amout I
configured as work_mem (3 GB). And AFAIK it's not good.<br>
<br>
30085 postgres 20 0 <b>3370048</b> 156656 153876 S 6.7
0.3 0:00.20 postgres<br>
29833 postgres 20 0 <b>3370000</b> 65260 62416 S 1.7
0.1 0:00.17 postgres <br>
29632 postgres 20 0 <b>3372468</b> 11712 6028 S 0.7 0.0
0:00.60 postgres <br>
<br>
What can be happened?<br>
Nothing has been touched....<br>
postgresql 9.5.6 on debian 8 just apt-get upgrade'd<br>
<br>
Any help would be appreciated.<span class="gmail-HOEnZb"><font
color="#888888"><br>
Moreno.<br>
</font></span></div>
</blockquote>
</div>
<br>
<b>>using exactly the amout I configured as work_mem (3
GB).<br>
<br>
</b></div>
<div class="gmail_extra"><b>You are right, that is bad, but that
is your own fault. 3GB of work_mem is very bad, Try lowing
in to something more reasonable, like 20GB.<br>
<br clear="all">
</b></div>
<div class="gmail_extra"><b><a moz-do-not-send="true"
href="https://www.postgresql.org/docs/9.5/static/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-MEMORY">https://www.postgresql.org/docs/9.5/static/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-MEMORY</a><br>
"several running sessions could be doing such operations
concurrently. <span style="color:rgb(255,0,0)">Therefore,
the total memory used could be many times the value of <tt
class="gmail-VARNAME">work_mem</tt>;</span> it is
necessary to keep this fact in mind when choosing the
value."</b></div>
<div class="gmail_extra"><br>
-- <br>
<div class="gmail_signature">
<div dir="ltr"><font size="4"><b><span
style="font-family:courier new,monospace">Melvin
Davidson</span></b></font><br>
<font style="font-weight:bold" size="3"><span
style="color:rgb(128,0,255)">I reserve the right to
fantasize. Whether or not you </span><br
style="color:rgb(128,0,255)">
<span style="color:rgb(128,0,255)">wish to share my
fantasy is entirely up to you. </span><img
moz-do-not-send="true" style="color: rgb(128, 0,
255);"
src="http://us.i1.yimg.com/us.yimg.com/i/mesg/tsmileys2/01.gif"></font><br>
</div>
</div>
</div>
</div>
</blockquote>
<p><br>
</p>
</body>
</html>

Melvin Davidson

melvin6925@gmail.com

almost 9 years ago

In reply to: Moreno Andreo (#3)

Re: Service outage: each postgres process use the exact amount of the configured work_mem

On Fri, Apr 14, 2017 at 1:50 PM, Moreno Andreo <moreno.andreo@evolu-s.it>
wrote:

Sorry,
my mistake (I'm a bit nervous...)

that's not work_mem, but shared_buffers

Thanks

Il 14/04/2017 19:33, Melvin Davidson ha scritto:

On Fri, Apr 14, 2017 at 1:12 PM, Moreno Andreo <moreno.andreo@evolu-s.it>
wrote:

Hi all,
About 2 hours and half ago, suddenly (and on the late afternoon of the
Easter Friday), customers reported failing connections to our server, or
even very slow.
After a bit of checking (that also involved server reboot) I noticed
(using top) that every process regarding postgres is using exactly the
amout I configured as work_mem (3 GB). And AFAIK it's not good.

30085 postgres 20 0 *3370048* 156656 153876 S 6.7 0.3 0:00.20 postgres
29833 postgres 20 0 *3370000* 65260 62416 S 1.7 0.1 0:00.17 postgres
29632 postgres 20 0 *3372468* 11712 6028 S 0.7 0.0 0:00.60 postgres

What can be happened?
Nothing has been touched....
postgresql 9.5.6 on debian 8 just apt-get upgrade'd

Any help would be appreciated.
Moreno.

*>using exactly the amout I configured as work_mem (3 GB). *

*You are right, that is bad, but that is your own fault. 3GB of work_mem
is very bad, Try lowing in to something more reasonable, like 20GB. *

*https://www.postgresql.org/docs/9.5/static/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-MEMORY
<https://www.postgresql.org/docs/9.5/static/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-MEMORY>
"several running sessions could be doing such operations concurrently.
Therefore, the total memory used could be many times the value of work_mem;
it is necessary to keep this fact in mind when choosing the value."*

--
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

Moreno,
we are working with minimal information here.
Perhaps if you provided us with the following information it would be more
useful,

A. Total *SYSTEM MEMORY*
B. shared_memory
C. work_memory
D. max_connections
E. How many users were connected when the problem occurred?

--
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

Jeff Janes

jeff.janes@gmail.com

almost 9 years ago

In reply to: Moreno Andreo (#1)

Re: Service outage: each postgres process use the exact amount of the configured work_mem

On Fri, Apr 14, 2017 at 10:12 AM, Moreno Andreo <moreno.andreo@evolu-s.it>
wrote:

Hi all,
About 2 hours and half ago, suddenly (and on the late afternoon of the
Easter Friday), customers reported failing connections to our server, or
even very slow.
After a bit of checking (that also involved server reboot) I noticed
(using top) that every process regarding postgres is using exactly the
amout I configured as work_mem (3 GB). And AFAIK it's not good.

30085 postgres 20 0 *3370048* 156656 153876 S 6.7 0.3 0:00.20 postgres
29833 postgres 20 0 *3370000* 65260 62416 S 1.7 0.1 0:00.17 postgres
29632 postgres 20 0 *3372468* 11712 6028 S 0.7 0.0 0:00.60 postgres

They are all sharing the same shared_buffers (not work_mem), and that
shared space is being listed for each one of them.

This is unlikely to be related to your current problems. Nothing shown in
that output is alarming.

What can be happened?

What error messages are the customers getting when they fail to connect?
What error messages are showing up in the server log file? What do you see
in pg_stat_actvity?

Cheers,

Jeff

Chris Mair

chris@1006.org

almost 9 years ago

In reply to: Moreno Andreo (#3)

Re: Service outage: each postgres process use the exact amount of the configured work_mem

Sorry,
my mistake (I'm a bit nervous...)

that's not work_mem, but shared_buffers

Hi.

The resident set size of the worker processes includes all shared memory blocks they touched.
So it's not that each of those workers allocated their own 3GB...

(in Linux at least)

Bye,
Chris.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

Moreno Andreo

moreno.andreo@evolu-s.it

almost 9 years ago

In reply to: Melvin Davidson (#4)

Re: Service outage: each postgres process use the exact amount of the configured work_mem

<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">Melvin,<br>
Sorry for top-posting. <br>
I'm going ahead in troubleshooting. As Jeff said, there's probably
nothing wrong with my values (at the end of the message you can
find minimal info you requested).<br>
I tried running some queries against psql server and response
times are good, so I'm moving my attentions to Windows server,
which hosts a WCF service, that is the one that actually server
customers.<br>
<br>
Thanks for now<br>
Moreno<br>
<br>
Il 14/04/2017 20:01, Melvin Davidson ha scritto:<br>
</div>
<blockquote
cite="mid:CANu8FixXw3g-fuuPRtyWjDywpe+w+TpO=VfcySo8JXSEOHRmvA@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra"><br>
<div class="gmail_quote">On Fri, Apr 14, 2017 at 1:50 PM,
Moreno Andreo <span dir="ltr"><<a moz-do-not-send="true"
href="mailto:moreno.andreo@evolu-s.it" target="_blank">moreno.andreo@evolu-s.it</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF">
<div class="gmail-m_-6686708562138806043moz-cite-prefix">Sorry,<br>
my mistake (I'm a bit nervous...)<br>
<br>
that's not work_mem, but shared_buffers<br>
<br>
Thanks
<div>
<div class="gmail-h5"><br>
<br>
Il 14/04/2017 19:33, Melvin Davidson ha scritto:<br>
</div>
</div>
</div>
<div>
<div class="gmail-h5">
<blockquote type="cite">
<div dir="ltr"><br>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Fri, Apr 14, 2017
at 1:12 PM, Moreno Andreo <span dir="ltr"><<a
moz-do-not-send="true"
href="mailto:moreno.andreo@evolu-s.it"
target="_blank">moreno.andreo@evolu-s.it</a>></span>
wrote:<br>
<blockquote class="gmail_quote"
style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<div bgcolor="#FFFFFF"> Hi all,<br>
About 2 hours and half ago, suddenly
(and on the late afternoon of the Easter
Friday), customers reported failing
connections to our server, or even very
slow.<br>
After a bit of checking (that also
involved server reboot) I noticed (using
top) that every process regarding
postgres is using exactly the amout I
configured as work_mem (3 GB). And AFAIK
it's not good.<br>
<br>
30085 postgres 20 0 <b>3370048</b>
156656 153876 S 6.7 0.3 0:00.20 postgres<br>
29833 postgres 20 0 <b>3370000</b>
65260 62416 S 1.7 0.1 0:00.17 postgres <br>
29632 postgres 20 0 <b>3372468</b>
11712 6028 S 0.7 0.0 0:00.60 postgres <br>
<br>
What can be happened?<br>
Nothing has been touched....<br>
postgresql 9.5.6 on debian 8 just
apt-get upgrade'd<br>
<br>
Any help would be appreciated.<span
class="gmail-m_-6686708562138806043gmail-HOEnZb"><font
color="#888888"><br>
Moreno.<br>
</font></span></div>
</blockquote>
</div>
<br>
<b>>using exactly the amout I configured as
work_mem (3 GB).<br>
<br>
</b></div>
<div class="gmail_extra"><b>You are right, that
is bad, but that is your own fault. 3GB of
work_mem is very bad, Try lowing in to
something more reasonable, like 20GB.<br>
<br clear="all">
</b></div>
<div class="gmail_extra"><b><a
moz-do-not-send="true"
href="https://www.postgresql.org/docs/9.5/static/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-MEMORY"
target="_blank">https://www.postgresql.org/<wbr>docs/9.5/static/runtime-<wbr>config-resource.html#RUNTIME-<wbr>CONFIG-RESOURCE-MEMORY</a><br>
"several running sessions could be doing
such operations concurrently. <span
style="color:rgb(255,0,0)">Therefore, the
total memory used could be many times the
value of <tt
class="gmail-m_-6686708562138806043gmail-VARNAME">work_mem</tt>;</span>
it is necessary to keep this fact in mind
when choosing the value."</b></div>
<div class="gmail_extra"><br>
-- <br>
<div
class="gmail-m_-6686708562138806043gmail_signature">
<div dir="ltr"><font size="4"><b><span
style="font-family:courier
new,monospace">Melvin Davidson</span></b></font><br>
<font style="font-weight:bold" size="3"><span
style="color:rgb(128,0,255)">I reserve
the right to fantasize. Whether or
not you </span><br
style="color:rgb(128,0,255)">
<span style="color:rgb(128,0,255)">wish
to share my fantasy is entirely up to
you. </span><img
moz-do-not-send="true" style="color:
rgb(128, 0, 255);"
src="http://us.i1.yimg.com/us.yimg.com/i/mesg/tsmileys2/01.gif"></font><br>
</div>
</div>
</div>
</div>
</blockquote>
<p><br>
</p>
</div>
</div>
</div>
</blockquote>
</div>
<br>
<div>
<div>
<div>Moreno,<br>
</div>
we are working with minimal information here.<br>
</div>
Perhaps if you provided us with the following information it
would be more useful,<br>
<br>
</div>
A. Total <u>SYSTEM MEMORY</u><br clear="all">
</div>
</div>
</blockquote>
52 GB<br>
<blockquote
cite="mid:CANu8FixXw3g-fuuPRtyWjDywpe+w+TpO=VfcySo8JXSEOHRmvA@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">B. shared_memory<br>
</div>
</div>
</blockquote>
3 GB (was 13 GB)<br>
<blockquote
cite="mid:CANu8FixXw3g-fuuPRtyWjDywpe+w+TpO=VfcySo8JXSEOHRmvA@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">C. work_memory<br>
</div>
</div>
</blockquote>
default<br>
<blockquote
cite="mid:CANu8FixXw3g-fuuPRtyWjDywpe+w+TpO=VfcySo8JXSEOHRmvA@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">D. max_connections<br>
</div>
</div>
</blockquote>
1000<br>
<blockquote
cite="mid:CANu8FixXw3g-fuuPRtyWjDywpe+w+TpO=VfcySo8JXSEOHRmvA@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">E. How many users were connected when
the problem occurred?<br>
</div>
</div>
</blockquote>
About 350 connections<br>
<br>
Thanks<br>
<br>
<blockquote
cite="mid:CANu8FixXw3g-fuuPRtyWjDywpe+w+TpO=VfcySo8JXSEOHRmvA@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra"><br>
</div>
<div class="gmail_extra">-- <br>
<div class="gmail_signature">
<div dir="ltr"><font size="4"><b><span
style="font-family:courier new,monospace">Melvin
Davidson</span></b></font><br>
<font style="font-weight:bold" size="3"><span
style="color:rgb(128,0,255)">I reserve the right to
fantasize. Whether or not you </span><br
style="color:rgb(128,0,255)">
<span style="color:rgb(128,0,255)">wish to share my
fantasy is entirely up to you. </span><img
moz-do-not-send="true" style="color: rgb(128, 0,
255);"
src="http://us.i1.yimg.com/us.yimg.com/i/mesg/tsmileys2/01.gif"></font><br>
</div>
</div>
</div>
</div>
</blockquote>
<p><br>
</p>
</body>
</html>

Melvin Davidson

melvin6925@gmail.com

almost 9 years ago

In reply to: Moreno Andreo (#7)

Re: Service outage: each postgres process use the exact amount of the configured work_mem

On Fri, Apr 14, 2017 at 2:28 PM, Moreno Andreo <moreno.andreo@evolu-s.it>
wrote:

Melvin,
Sorry for top-posting.
I'm going ahead in troubleshooting. As Jeff said, there's probably nothing
wrong with my values (at the end of the message you can find minimal info
you requested).
I tried running some queries against psql server and response times are
good, so I'm moving my attentions to Windows server, which hosts a WCF
service, that is the one that actually server customers.

Thanks for now
Moreno

Il 14/04/2017 20:01, Melvin Davidson ha scritto:

On Fri, Apr 14, 2017 at 1:50 PM, Moreno Andreo <moreno.andreo@evolu-s.it>
wrote:

Sorry,
my mistake (I'm a bit nervous...)

that's not work_mem, but shared_buffers

Thanks

Il 14/04/2017 19:33, Melvin Davidson ha scritto:

On Fri, Apr 14, 2017 at 1:12 PM, Moreno Andreo <moreno.andreo@evolu-s.it>
wrote:

Hi all,
About 2 hours and half ago, suddenly (and on the late afternoon of the
Easter Friday), customers reported failing connections to our server, or
even very slow.
After a bit of checking (that also involved server reboot) I noticed
(using top) that every process regarding postgres is using exactly the
amout I configured as work_mem (3 GB). And AFAIK it's not good.

30085 postgres 20 0 *3370048* 156656 153876 S 6.7 0.3 0:00.20 postgres
29833 postgres 20 0 *3370000* 65260 62416 S 1.7 0.1 0:00.17 postgres
29632 postgres 20 0 *3372468* 11712 6028 S 0.7 0.0 0:00.60 postgres

What can be happened?
Nothing has been touched....
postgresql 9.5.6 on debian 8 just apt-get upgrade'd

Any help would be appreciated.
Moreno.

*>using exactly the amout I configured as work_mem (3 GB). *

*You are right, that is bad, but that is your own fault. 3GB of work_mem
is very bad, Try lowing in to something more reasonable, like 20GB. *

*https://www.postgresql.org/docs/9.5/static/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-MEMORY
<https://www.postgresql.org/docs/9.5/static/runtime-config-resource.html#RUNTIME-CONFIG-RESOURCE-MEMORY>
"several running sessions could be doing such operations concurrently.
Therefore, the total memory used could be many times the value of work_mem;
it is necessary to keep this fact in mind when choosing the value."*

--
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

Moreno,
we are working with minimal information here.
Perhaps if you provided us with the following information it would be more
useful,

A. Total *SYSTEM MEMORY*

52 GB

B. shared_memory

3 GB (was 13 GB)

C. work_memory

default

D. max_connections

1000

E. How many users were connected when the problem occurred?

About 350 connections

Thanks

--
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

*Moreno, *

*I understand you will troubleshoot on your own, but based on total system
memory, you should increase *

*shared_memory to 40GB. General philosphy is to allocate 80% of system
memory to shared_memory. Of course you will have to do a PostgreSQL restart
*

*for that to take effect.*

*Also, with 1000 max_connections, you would be much better off with a
connection balancer like PgBouncer https://pgbouncer.github.io/downloads/
<https://pgbouncer.github.io/downloads/> *

--
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

Igor Neyman

ineyman@perceptron.com

almost 9 years ago

In reply to: Moreno Andreo (#7)

Re: Service outage: each postgres process use the exact amount of the configured work_mem

E. How many users were connected when the problem occurred?
About 350 connections

Thanks

______________________________________________________________________________________

Probably that is your problem, if you don’t have connection pooler.
I’d recommend to start with PgBouncer, very light-weight and easy to configure.

Regards,
Igor Neyman

#10

Chris Mair

chris@1006.org

almost 9 years ago

In reply to: Melvin Davidson (#8)

Re: Service outage: each postgres process use the exact amount of the configured work_mem

you should increase shared_memory to 40GB. General philosphy is to allocate 80% of system memory to shared_memory

Uhm...

80% is too much, likely:

https://www.postgresql.org/docs/9.5/static/runtime-config-resource.html

Bye,
Chris.

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#11

Melvin Davidson

melvin6925@gmail.com

almost 9 years ago

In reply to: Chris Mair (#10)

Re: Service outage: each postgres process use the exact amount of the configured work_mem

On Fri, Apr 14, 2017 at 2:42 PM, Chris Mair <chris@1006.org> wrote:

you should increase shared_memory to 40GB. General philosphy is to

allocate 80% of system memory to shared_memory

Uhm...

80% is too much, likely:

https://www.postgresql.org/docs/9.5/static/runtime-config-resource.html

Bye,
Chris.

Sorry, I had a brain fart, shared_buffers should be 25% system memory, or
12GB

--
*Melvin Davidson*
I reserve the right to fantasize. Whether or not you
wish to share my fantasy is entirely up to you.

#12

Moreno Andreo

moreno.andreo@evolu-s.it

almost 9 years ago

In reply to: Igor Neyman (#9)

Re: Service outage: each postgres process use the exact amount of the configured work_mem

#13

Moreno Andreo

moreno.andreo@evolu-s.it

almost 9 years ago

In reply to: Melvin Davidson (#11)

Re: Service outage: each postgres process use the exact amount of the configured work_mem

<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">Il 14/04/2017 20:55, Melvin Davidson ha
scritto:<br>
</div>
<blockquote
cite="mid:CANu8FixEzCn3zYi_S6U8yV6OYE=1GYL19UhFdrY71vsdzz02OA@mail.gmail.com"
type="cite">
<div dir="ltr"><br>
<div class="gmail_extra"><br>
<div class="gmail_quote">On Fri, Apr 14, 2017 at 2:42 PM,
Chris Mair <span dir="ltr"><<a moz-do-not-send="true"
href="mailto:chris@1006.org" target="_blank">chris@1006.org</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex"><span
class="">
<blockquote class="gmail_quote" style="margin:0 0 0
.8ex;border-left:1px #ccc solid;padding-left:1ex">
you should increase shared_memory to 40GB. General
philosphy is to allocate 80% of system memory to
shared_memory<br>
</blockquote>
<br>
</span>
Uhm...<br>
<br>
80% is too much, likely:<br>
<br>
<a moz-do-not-send="true"
href="https://www.postgresql.org/docs/9.5/static/runtime-config-resource.html"
rel="noreferrer" target="_blank">https://www.postgresql.org/doc<wbr>s/9.5/static/runtime-config-<wbr>resource.html</a><br>
<br>
Bye,<br>
Chris.<br>
<br>
<br>
</blockquote>
</div>
<br>
</div>
<div class="gmail_extra">Sorry, I had a brain fart,
shared_buffers should be 25% system memory, or 12GB<br>
</div>
</div>
</blockquote>
The original set was 13 GB (I lowered to 3 GB to take a test), but
probably I never acknowledged it with top...<br>
<blockquote
cite="mid:CANu8FixEzCn3zYi_S6U8yV6OYE=1GYL19UhFdrY71vsdzz02OA@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra"><br clear="all">
<br>
-- <br>
<div class="gmail_signature" data-smartmail="gmail_signature">
<div dir="ltr"><font size="4"><b><span
style="font-family:courier new,monospace">Melvin
Davidson</span></b></font><br>
<font style="font-weight:bold" size="3"><span
style="color:rgb(128,0,255)">I reserve the right to
fantasize. Whether or not you </span><br
style="color:rgb(128,0,255)">
<span style="color:rgb(128,0,255)">wish to share my
fantasy is entirely up to you. </span><img
moz-do-not-send="true" style="color:rgb(128,0,255)"
src="http://us.i1.yimg.com/us.yimg.com/i/mesg/tsmileys2/01.gif"></font><br>
</div>
</div>
</div>
</div>
</blockquote>
<p><br>
</p>
</body>
</html>

#14

Igor Neyman

ineyman@perceptron.com

almost 9 years ago

In reply to: Moreno Andreo (#12)

Re: Service outage: each postgres process use the exact amount of the configured work_mem

From: Moreno Andreo [mailto:moreno.andreo@evolu-s.it]
Sent: Friday, April 14, 2017 3:00 PM
To: Igor Neyman <ineyman@perceptron.com>; Melvin Davidson <melvin6925@gmail.com>
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Service outage: each postgres process use the exact amount of the configured work_mem

Attention: This email was sent from someone outside of Perceptron. Always exercise caution when opening attachments or clicking links from unknown senders or when receiving unexpected emails.

Il 14/04/2017 20:40, Igor Neyman ha scritto:

E. How many users were connected when the problem occurred?
About 350 connections

Thanks

______________________________________________________________________________________

Probably that is your problem, if you don’t have connection pooler.
I’d recommend to start with PgBouncer, very light-weight and easy to configure.

Regards,
Igor Neyman

I'm building such a system, but I don't think that's the problem because now, with 30 active connections, it still does not work (actually it's extremely slow).

Both servers are far from being under heavy load (so they were before...)

Having total of 350 connections and only 30 of them being active, is the exact case where connection pooler helps.

Igor Neyman

#15

Moreno Andreo

moreno.andreo@evolu-s.it

almost 9 years ago

In reply to: Igor Neyman (#14)

Re: Service outage: each postgres process use the exact amount of the configured work_mem

#16

Igor Neyman

ineyman@perceptron.com

almost 9 years ago

In reply to: Moreno Andreo (#15)

Re: Service outage: each postgres process use the exact amount of the configured work_mem

From: Moreno Andreo [mailto:moreno.andreo@evolu-s.it]
Sent: Friday, April 14, 2017 3:11 PM
To: Igor Neyman <ineyman@perceptron.com>; Melvin Davidson <melvin6925@gmail.com>
Cc: pgsql-general@postgresql.org
Subject: Re: [GENERAL] Service outage: each postgres process use the exact amount of the configured work_mem

Attention: This email was sent from someone outside of Perceptron. Always exercise caution when opening attachments or clicking links from unknown senders or when receiving unexpected emails.

Il 14/04/2017 21:06, Igor Neyman ha scritto:

From: Moreno Andreo [mailto:moreno.andreo@evolu-s.it]
Sent: Friday, April 14, 2017 3:00 PM
To: Igor Neyman <ineyman@perceptron.com><mailto:ineyman@perceptron.com>; Melvin Davidson <melvin6925@gmail.com><mailto:melvin6925@gmail.com>
Cc: pgsql-general@postgresql.org<mailto:pgsql-general@postgresql.org>
Subject: Re: [GENERAL] Service outage: each postgres process use the exact amount of the configured work_mem

Attention: This email was sent from someone outside of Perceptron. Always exercise caution when opening attachments or clicking links from unknown senders or when receiving unexpected emails.

Il 14/04/2017 20:40, Igor Neyman ha scritto:

E. How many users were connected when the problem occurred?
About 350 connections

Thanks

______________________________________________________________________________________

Probably that is your problem, if you don’t have connection pooler.
I’d recommend to start with PgBouncer, very light-weight and easy to configure.

Regards,
Igor Neyman

I'm building such a system, but I don't think that's the problem because now, with 30 active connections, it still does not work (actually it's extremely slow).

Both servers are far from being under heavy load (so they were before...)

Having total of 350 connections and only 30 of them being active, is the exact case where connection pooler helps.

Igor Neyman

Sorry, I misexplained.
4 hours ago we were in rush hour and the server was running with its usual load (graphs do not show spikes in server load), and we had about 350 connections.
To troubleshoot the whole thing, I needed to restart the servers, in the meantime customers disconnected.
Now I have 30 of them (most of them alive), select * from pg_stat_activity shows 30 connections.
That's why I don't think it's a connection # problem.
____________________________________________________________________________________________________________________

Right.
But you had 300 connections when the problem occurred, and leads me to believe that pgBouncer is needed.

Igor Neyman

#17

Moreno Andreo

moreno.andreo@evolu-s.it

almost 9 years ago

In reply to: Igor Neyman (#16)

Re: Service outage: each postgres process use the exact amount of the configured work_mem [(AUTO-RE)SOLVED...]

<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">Il 14/04/2017 21:15, Igor Neyman ha
scritto:<br>
</div>
<blockquote
cite="mid:DM5PR07MB281034984AF523DC4E0095C6DA050@DM5PR07MB2810.namprd07.prod.outlook.com"
type="cite">
<meta http-equiv="Content-Type" content="text/html; charset=utf-8">
<meta name="Generator" content="Microsoft Word 15 (filtered
medium)">
<style></style>
<div class="WordSection1">
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"><o:p> </o:p></span></p>
<div>
<div style="border:none;border-top:solid #E1E1E1
1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:windowtext">From:</span></b><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:windowtext">
Moreno Andreo [<a class="moz-txt-link-freetext" href="mailto:moreno.andreo@evolu-s.it">mailto:moreno.andreo@evolu-s.it</a>]
<br>
<b>Sent:</b> Friday, April 14, 2017 3:11 PM<br>
<b>To:</b> Igor Neyman <a class="moz-txt-link-rfc2396E" href="mailto:ineyman@perceptron.com"><ineyman@perceptron.com></a>;
Melvin Davidson <a class="moz-txt-link-rfc2396E" href="mailto:melvin6925@gmail.com"><melvin6925@gmail.com></a><br>
<b>Cc:</b> <a class="moz-txt-link-abbreviated" href="mailto:pgsql-general@postgresql.org">pgsql-general@postgresql.org</a><br>
<b>Subject:</b> Re: [GENERAL] Service outage: each
postgres process use the exact amount of the configured
work_mem<o:p></o:p></span></p>
</div>
</div>
<p class="MsoNormal"><o:p> </o:p></p>
<p><i><span style="color:red">Attention: This email was sent
from someone outside of Perceptron. Always exercise
caution when opening attachments or clicking links from
unknown senders or when receiving unexpected emails.</span></i><span
style="color:red"><o:p></o:p></span></p>
<p class="MsoNormal"><o:p> </o:p></p>
<div>
<div>
<p class="MsoNormal">Il 14/04/2017 21:06, Igor Neyman ha
scritto:<o:p></o:p></p>
</div>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"> </span><o:p></o:p></p>
<div>
<div style="border:none;border-top:solid #E1E1E1
1.0pt;padding:3.0pt 0in 0in 0in">
<p class="MsoNormal"><b><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:windowtext">From:</span></b><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:windowtext">
Moreno Andreo [<a moz-do-not-send="true"
href="mailto:moreno.andreo@evolu-s.it">mailto:moreno.andreo@evolu-s.it</a>]
<br>
<b>Sent:</b> Friday, April 14, 2017 3:00 PM<br>
<b>To:</b> Igor Neyman <a moz-do-not-send="true"
href="mailto:ineyman@perceptron.com"><ineyman@perceptron.com></a>;
Melvin Davidson
<a moz-do-not-send="true"
href="mailto:melvin6925@gmail.com"><melvin6925@gmail.com></a><br>
<b>Cc:</b> <a moz-do-not-send="true"
href="mailto:pgsql-general@postgresql.org">pgsql-general@postgresql.org</a><br>
<b>Subject:</b> Re: [GENERAL] Service outage: each
postgres process use the exact amount of the
configured work_mem</span><o:p></o:p></p>
</div>
</div>
<p class="MsoNormal"> <o:p></o:p></p>
<p><i><span style="color:red">Attention: This email was sent
from someone outside of Perceptron. Always exercise
caution when opening attachments or clicking links
from unknown senders or when receiving unexpected
emails.</span></i><o:p></o:p></p>
<p class="MsoNormal"> <o:p></o:p></p>
<div>
<div>
<p class="MsoNormal">Il 14/04/2017 20:40, Igor Neyman ha
scritto:<o:p></o:p></p>
</div>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<p class="MsoNormal"> <o:p></o:p></p>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<div>
<p class="MsoNormal">E. How many users were
connected when the problem occurred?<o:p></o:p></p>
</div>
</div>
</blockquote>
<p class="MsoNormal">About 350 connections<br>
<br>
Thanks<br>
<br>
<span style="color:#1F497D">______________________________________________________________________________________</span><o:p></o:p></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"> </span><o:p></o:p></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">Probably
that is your problem, if you don’t have connection
pooler.</span><o:p></o:p></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">I’d
recommend to start with PgBouncer, very light-weight
and easy to configure.</span><o:p></o:p></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D"> </span><o:p></o:p></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">Regards,</span><o:p></o:p></p>
<p class="MsoNormal"><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">Igor
Neyman</span><o:p></o:p></p>
<p> <o:p></o:p></p>
</blockquote>
<p>I'm building such a system, but I don't think that's
the problem because now, with 30 active connections, it
still does not work (actually it's extremely slow).<o:p></o:p></p>
<div style="border:none;border-bottom:solid windowtext
1.5pt;padding:0in 0in 1.0pt 0in">
<p>Both servers are far from being under heavy load (so
they were before...)<o:p></o:p></p>
</div>
<p><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">Having
total of 350 connections and only 30 of them being
active, is the exact case where connection pooler
helps.</span><o:p></o:p></p>
<p><span
style="font-size:11.0pt;font-family:"Calibri",sans-serif;color:#1F497D">Igor
Neyman</span><o:p></o:p></p>
<p class="MsoNormal"><o:p> </o:p></p>
</div>
</blockquote>
<p class="MsoNormal">Sorry, I misexplained.<br>
4 hours ago we were in rush hour and the server was running
with its usual load (graphs do not show spikes in server
load), and we had about 350 connections.<br>
To troubleshoot the whole thing, I needed to restart the
servers, in the meantime customers disconnected.<br>
Now I have 30 of them (most of them alive), select * from
pg_stat_activity shows 30 connections.<br>
That's why I don't think it's a connection # problem.<br>
<span style="color:#1F497D">____________________________________________________________________________________________________________________<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">Right.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">But you had
300 connections when the problem occurred, and leads me to
believe that pgBouncer is needed.<o:p></o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D"><o:p> </o:p></span></p>
<p class="MsoNormal"><span style="color:#1F497D">Igor Neyman</span><br>
<br>
<o:p></o:p></p>
<blockquote style="margin-top:5.0pt;margin-bottom:5.0pt">
<div>
<p class="MsoNormal"> </p>
</div>
</blockquote>
</div>
</div>
</blockquote>
For whom may be in interest.....<br>
After creating another VM for the Windows instance, using a snapshot
from past night (and choosing a good client to test... our
customer's has some other problems, I think........) everything
worked.<br>
No lags, no delays. <br>
Fast as usual. <br>
So we switched back to the original server. <br>
Everything worked. <br>
<br>
This is the WORST situation. <br>
Honestly, I didn't understand which server and module was the
problem, neither did figure out the cause. I found *nothing* in
logs.<br>
It can happen again.<br>
And I'll be in the same situation.<br>
<br>
I need to hurry up with the new structure (cluster primary/standby
with pgpool II for pooling and load balancing and barman for
backup), that is growing in the test lab.<br>
<br>
Being on Google Cloud Platform, this makes me suspect of some
"strange" problem (next Tuesday I'll ask them) on their
structure.... but on the other side I think they'd be alerting us on
some issues they're experiencing.<br>
<br>
We designed this application 6 years ago, in a hurry (long story
short: we needed to build a product to be *absolutely* sold before
fall 2012, and we started in July 2011) and went a bit superficial
with some modules of the program. One of them was database design. I
was using Postgresql since 2005, but I never used it in scenarios
that go over having 2 or 3 databases that collect logs and do some
reporting, so I was lacking experience in *battle* scenarios like
this, so I went with a design in which every customer has its own
role, which makes him authenticate, and with this role he can
connect to his database and to some of his colleagues' (imagine a
group of 10... if everyone connects to everyone, it's 100
connections!). Now we're updating this, so only one role is used for
everyone, and authentication is made in other ways, and I think that
in this situation pgbouncer will help a little, but with the new
design a pooler will lend us a great hand.<br>
<br>
Thanks for your patience<br>
Happy Easter<br>
Moreno.<br>
</body>
</html>