Question about locking and pg_locks
Hi folks! :-)
This morning I was woken up by a call of a coworker screaming "Help, our
Postgres server is throwing strange errors!"
Not the best way to start your day...
OK, to the serious part.
"Strange errors" were (in postgresql-9.1-main.log)
WARNING: out of shared memory
ERROR: out of shared memory
HINT: you may need to increase max_locks_per_transaction
Restarting Postgresql solved the issue (for now), but that's what I'm
wondering:
- the greatest part of this locks are used by rubyrep (that we use to
replicate users' databases), no new users since 3 weeks, first time
error show up in almost 2 years
- I read this: https://wiki.postgresql.org/wiki/Lock_Monitoring but
still I can't figure out what to do if I need to know if I have to be
worried or not :-)
- I have
OS: Ubuntu 12.04 (won't upgrade because we are leaving this server to a
new one with Debian Jessie)
PG: 9.1.6 (same as above, in new server ve have 9.5.4)
RAM: 32 GB
shared_buffers = 2GB
max_connections=800
max_locks_per_transaction=64 (default value)
max_prepared_transactions = 0
so, I should be able to manage 800*64 = 5120 locks, right?
Now my pg_locks table has more than 6200 rows, but if I reorder them by
pid I see that one of them has 5800 of them, and it keeps on eating locks.
If I dig more and get pid info, its state is "<IDLE> in transaction"
ATM there are no locks that have granted = false.
Now, question time:
- Is there a number of pg_locks rows to be worried about? At more than
6000 I'm still not facing out of shared memory again
- Is there a way to release locks of that pid without pg_terminate() it?
I tried to give most of the details, if you need more, just ask...
Thanks
Moreno.-
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On 09/08/2016 04:30 AM, Moreno Andreo wrote:
Hi folks! :-)
This morning I was woken up by a call of a coworker screaming "Help, our
Postgres server is throwing strange errors!"
Not the best way to start your day...OK, to the serious part.
"Strange errors" were (in postgresql-9.1-main.log)
WARNING: out of shared memory
ERROR: out of shared memory
HINT: you may need to increase max_locks_per_transactionRestarting Postgresql solved the issue (for now), but that's what I'm
wondering:
- the greatest part of this locks are used by rubyrep (that we use to
replicate users' databases), no new users since 3 weeks, first time
error show up in almost 2 years
- I read this: https://wiki.postgresql.org/wiki/Lock_Monitoring but
still I can't figure out what to do if I need to know if I have to be
worried or not :-)
- I have
OS: Ubuntu 12.04 (won't upgrade because we are leaving this server to a
new one with Debian Jessie)
PG: 9.1.6 (same as above, in new server ve have 9.5.4)
RAM: 32 GB
shared_buffers = 2GB
max_connections=800
max_locks_per_transaction=64 (default value)
max_prepared_transactions = 0so, I should be able to manage 800*64 = 5120 locks, right?
Now my pg_locks table has more than 6200 rows, but if I reorder them by
pid I see that one of them has 5800 of them, and it keeps on eating locks.
If I dig more and get pid info, its state is "<IDLE> in transaction"
So some transaction is being held open and the system cannot close out
the locks until it is done.
ATM there are no locks that have granted = false.
Now, question time:
- Is there a number of pg_locks rows to be worried about? At more than
6000 I'm still not facing out of shared memory again
- Is there a way to release locks of that pid without pg_terminate() it?
Look in pg_stat_activity:
https://www.postgresql.org/docs/9.5/static/monitoring-stats.html
for state 'idle in transaction' and the corresponding query. If you know
where that query is coming from you could manually either commit it or
roll it back.
I tried to give most of the details, if you need more, just ask...
Thanks
Moreno.-
--
Adrian Klaver
adrian.klaver@aklaver.com
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On Thu, Sep 8, 2016 at 4:30 AM, Moreno Andreo <moreno.andreo@evolu-s.it>
wrote:
Hi folks! :-)
This morning I was woken up by a call of a coworker screaming "Help, our
Postgres server is throwing strange errors!"
Not the best way to start your day...OK, to the serious part.
"Strange errors" were (in postgresql-9.1-main.log)
WARNING: out of shared memory
ERROR: out of shared memory
HINT: you may need to increase max_locks_per_transactionRestarting Postgresql solved the issue (for now), but that's what I'm
wondering:
- the greatest part of this locks are used by rubyrep (that we use to
replicate users' databases), no new users since 3 weeks, first time error
show up in almost 2 years
- I read this: https://wiki.postgresql.org/wiki/Lock_Monitoring but still
I can't figure out what to do if I need to know if I have to be worried or
not :-)
- I have
OS: Ubuntu 12.04 (won't upgrade because we are leaving this server to a
new one with Debian Jessie)
PG: 9.1.6 (same as above, in new server ve have 9.5.4)
You have a problem now. Upgrading PG now might help you solve the problem
you have now. Why would you choose to work with one hand tied behind your
back now, just because you were already planning on upgrading later? Also,
moving both the OS and the PG version at the same time is great if
everything goes well. But if everything doesn't go well, you have greatly
increased the scope of the problem-solving by making two changes at once.
If it were me (and my employer gave me the freedom to do my job
effectively), I'd rather spend my time bringing forward the date on which I
upgrade PG, rather than spend that time tracking down problems that occur,
or at least are hard to track down, because I am running an obsolete PG.
Just double max_locks_per_transaction (with 32GB of ram, that should not be
a problem) and call it good until after the upgrade.
RAM: 32 GB
shared_buffers = 2GB
max_connections=800
max_locks_per_transaction=64 (default value)
max_prepared_transactions = 0so, I should be able to manage 800*64 = 5120 locks, right?
Now my pg_locks table has more than 6200 rows, but if I reorder them by
pid I see that one of them has 5800 of them, and it keeps on eating locks.
If I dig more and get pid info, its state is "<IDLE> in transaction"
On PG9.2 or above, you would be able to see the most recently run
statement, in addition to the state. That could help a lot in figuring out
how it doing this lock-fest (of course you can already get the
client_hostname and the application_name, which could also be a big help).
What is the distribution of locks by type?
select locktype, count(*) from pg_locks group by 1;
ATM there are no locks that have granted = false.
Now, question time:
- Is there a number of pg_locks rows to be worried about? At more than
6000 I'm still not facing out of shared memory again
I don't think that the exact number of locks that will fit in shared memory
is predictable. For example, if different processes co-hold sharable locks
on the same list of tables, it seems to take up slightly more memory than
if each process was locking a different list tables. And different lock
types also take different amounts of memory. And memory fragmentation
might also cause changes in capacity that are hard to predict--I can run
the same parallel program repeated, and have it sometimes run out of memory
and sometimes not.
- Is there a way to release locks of that pid without pg_terminate() it?
I'm afraid not. I believe your two options are pg_terminate_backend or
things which are functionally equivalent to that; or to identify who is
doing this (see application_name and client_hostname) and haranguing them
until they stop doing it.
Cheers,
Jeff
<html>
<head>
<meta content="text/html; charset=utf-8" http-equiv="Content-Type">
</head>
<body bgcolor="#FFFFFF" text="#000000">
<div class="moz-cite-prefix">Il 10/09/2016 23:07, Jeff Janes ha
scritto:<br>
</div>
<blockquote
cite="mid:CAMkU=1zuN3KeMzb0Wsi6GaHnc=rt-9HycMVB37N1specYwhoMA@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">On Thu, Sep 8, 2016 at 4:30 AM,
Moreno Andreo <span dir="ltr"><<a moz-do-not-send="true"
href="mailto:moreno.andreo@evolu-s.it" target="_blank">moreno.andreo@evolu-s.it</a>></span>
wrote:<br>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">Hi folks! :-)<br>
<br>
This morning I was woken up by a call of a coworker
screaming "Help, our Postgres server is throwing strange
errors!"<br>
Not the best way to start your day...<br>
<br>
OK, to the serious part.<br>
<br>
"Strange errors" were (in postgresql-9.1-main.log)<br>
WARNING: out of shared memory<br>
ERROR: out of shared memory<br>
HINT: you may need to increase max_locks_per_transaction<br>
<br>
Restarting Postgresql solved the issue (for now), but
that's what I'm wondering:<br>
- the greatest part of this locks are used by rubyrep
(that we use to replicate users' databases), no new users
since 3 weeks, first time error show up in almost 2 years<br>
- I read this: <a moz-do-not-send="true"
href="https://wiki.postgresql.org/wiki/Lock_Monitoring"
rel="noreferrer" target="_blank">https://wiki.postgresql.org/wi<wbr>ki/Lock_Monitoring</a>
but still I can't figure out what to do if I need to know
if I have to be worried or not :-)<br>
- I have<br>
OS: Ubuntu 12.04 (won't upgrade because we are leaving
this server to a new one with Debian Jessie)<br>
PG: 9.1.6 (same as above, in new server ve have 9.5.4)<br>
</blockquote>
<div><br>
</div>
<div>You have a problem now. Upgrading PG now might help
you solve the problem you have now. Why would you choose
to work with one hand tied behind your back now, just
because you were already planning on upgrading later?
Also, moving both the OS and the PG version at the same
time is great if everything goes well. But if everything
doesn't go well, you have greatly increased the scope of
the problem-solving by making two changes at once. If it
were me (and my employer gave me the freedom to do my job
effectively), I'd rather spend my time bringing forward
the date on which I upgrade PG, rather than spend that
time tracking down problems that occur, or at least are
hard to track down, because I am running an obsolete PG.
Just double max_locks_per_transaction (with 32GB of ram,
that should not be a problem) and call it good until after
the upgrade.</div>
</div>
</div>
</div>
</blockquote>
This migration has to be done in steps, some databases have already
been migrated to new server (and they seem OK for now), some are
still on the old server.<br>
Since these steps can't be that fast and migration will last at
least 2 months, I was trying to track down the problem to see if
there was some connection with a particular user activity, so if
there were chances for it to come back.<br>
I think I'll definitely set max_locks_per_transaction to 128 and
focus on the migration (You're right about changing OS, but since
Ubuntu and Debian are same family I thought it should't be that
risky...).<br>
<br>
<blockquote
cite="mid:CAMkU=1zuN3KeMzb0Wsi6GaHnc=rt-9HycMVB37N1specYwhoMA@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div> <br>
<br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
RAM: 32 GB<br>
shared_buffers = 2GB<br>
max_connections=800<br>
max_locks_per_transaction=64 (default value)<br>
max_prepared_transactions = 0<br>
<br>
so, I should be able to manage 800*64 = 5120 locks, right?<br>
<br>
Now my pg_locks table has more than 6200 rows, but if I
reorder them by pid I see that one of them has 5800 of
them, and it keeps on eating locks.<br>
If I dig more and get pid info, its state is "<IDLE>
in transaction"<br>
</blockquote>
<div><br>
</div>
<div>On PG9.2 or above, you would be able to see the most
recently run statement, in addition to the state. That
could help a lot in figuring out how it doing this
lock-fest (of course you can already get the
client_hostname and the application_name, which could also
be a big help).<br>
</div>
<div><br>
</div>
<div>What is the distribution of locks by type?</div>
<div><br>
</div>
<div>select locktype, count(*) from pg_locks group by 1;<br>
</div>
</div>
</div>
</div>
</blockquote>
<br>
virtualxid | 27<br>
transactionid | 133<br>
relation | 775<br>
<br>
(number of rows in pg_locks is quite small at the moment...)<br>
<br>
<blockquote
cite="mid:CAMkU=1zuN3KeMzb0Wsi6GaHnc=rt-9HycMVB37N1specYwhoMA@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div><br>
</div>
<div> <br>
</div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
<br>
ATM there are no locks that have granted = false.<br>
<br>
Now, question time:<br>
- Is there a number of pg_locks rows to be worried about?
At more than 6000 I'm still not facing out of shared
memory again<br>
</blockquote>
<div><br>
</div>
<div>I don't think that the exact number of locks that will
fit in shared memory is predictable. For example, if
different processes co-hold sharable locks on the same
list of tables, it seems to take up slightly more memory
than if each process was locking a different list tables.
And different lock types also take different amounts of
memory. And memory fragmentation might also cause changes
in capacity that are hard to predict--I can run the same
parallel program repeated, and have it sometimes run out
of memory and sometimes not.<br>
</div>
</div>
</div>
</div>
</blockquote>
Ok, got it.<br>
<blockquote
cite="mid:CAMkU=1zuN3KeMzb0Wsi6GaHnc=rt-9HycMVB37N1specYwhoMA@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div> </div>
<blockquote class="gmail_quote" style="margin:0px 0px 0px
0.8ex;border-left:1px solid
rgb(204,204,204);padding-left:1ex">
- Is there a way to release locks of that pid without
pg_terminate() it?<br>
</blockquote>
<div><br>
</div>
<div>I'm afraid not. I believe your two options are
pg_terminate_backend or things which are functionally
equivalent to that; or to identify who is doing this (see
application_name and client_hostname) and haranguing them
until they stop doing it.</div>
</div>
</div>
</div>
</blockquote>
OK.<br>
<blockquote
cite="mid:CAMkU=1zuN3KeMzb0Wsi6GaHnc=rt-9HycMVB37N1specYwhoMA@mail.gmail.com"
type="cite">
<div dir="ltr">
<div class="gmail_extra">
<div class="gmail_quote">
<div><br>
</div>
<div><br>
</div>
<div>Cheers,</div>
<div><br>
</div>
<div>Jeff</div>
</div>
</div>
</div>
</blockquote>
<p>Thanks,</p>
<p>Moreno.-<br>
</p>
</body>
</html>
On 09/12/2016 12:47 AM, Moreno Andreo wrote:
Ccing list.
Il 08/09/2016 15:26, Adrian Klaver ha scritto:
so, I should be able to manage 800*64 = 5120 locks, right?
OMG, time to go back to school... 800*64 = 51200 ! ! !
Now my pg_locks table has more than 6200 rows, but if I reorder them by
pid I see that one of them has 5800 of them, and it keeps on eating
locks.
If I dig more and get pid info, its state is "<IDLE> in transaction"So some transaction is being held open and the system cannot close out
the locks until it is done.Right, but I can't figure why these transactions are not closed.
If I'm right, when backend is dropped (i.e. the connection is closed)
all its locks, transactions, and so on, are dropped and resources are
released for someone else's use.
In this case we have the backend alive, holding dozens of transaction
IDs and some relation lock. But it's recent_query reports <IDLE> in
transaction
ATM there are no locks that have granted = false.Now, question time:
- Is there a number of pg_locks rows to be worried about? At more than
6000 I'm still not facing out of shared memory again
- Is there a way to release locks of that pid without pg_terminate() it?Look in pg_stat_activity:
https://www.postgresql.org/docs/9.5/static/monitoring-stats.html
for state 'idle in transaction' and the corresponding query. If you
know where that query is coming from you could manually either commit
it or roll it back.pg_stat_activity is where I got these informations, but column query
still says <IDLE in transaction>postgres=# select * from pg_stat_activity where procpid = 31570;
datid | datname | procpid | usesysid | usename |
application_name | client_addr | client_hostname | client_port |
backend_start | xact_start |
query_start | waiting | current_query
---------+---------------+---------+----------+----------+------------------+-------------+-----------------+-------------+-------------------------------+-------------------------------+-------------------------------+---------+-----------------------1067066 | dbname | 31570 | 1067065 | username | | 10.20.9.206
| | 50146 | 2016-09-08 14:23:33.146383+02 |
2016-09-08 14:27:02.585666+02 | 2016-09-08 15:57:32.107801+02 | f
| <IDLE> in transaction
(1 row)
Well you do have the client host IP as well as the port it is connecting
to Postgres on. From that you may be able to use something like ps to
work out what application is sending the query.
You might want to take a look at the below also:
https://www.postgresql.org/docs/9.1/static/monitoring-stats.html
27.2.1. Statistics Collection Configuration
and see what your settings are configured for.
Am I missing something?
I tried to give most of the details, if you need more, just ask...
Thanks
Moreno.-
--
Adrian Klaver
adrian.klaver@aklaver.com
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
Import Notes
Reply to msg id not found: e9ae516c-c2d1-d02a-db84-a1ab562a6d98@evolu-s.it