ALTER SESSION
Hello.
/messages/by-id/20190128.133143.115303042.horiguchi.kyotaro@lab.ntt.co.jp
C. Provide session-specific GUC variable (that overrides the global one)
- Add new configuration file "postgresql.conf.<PID>" and
pg_reload_conf() lets the session with the PID load it as if
it is the last included file. All such files are removed at
startup or at the end of the corresponding session.
- Add a new syntax like this:
ALTER SESSION WITH (pid=xxxx)
SET configuration_parameter {TO | =} {value | 'value' | DEFAULT}
RESET configuration_parameter
RESET ALL
- Target variables are marked with GUC_REMOTE.
I'll consider the last choice and will come up with a patch.
This is that, with a small change in design.
ALTER SESSION WITH (pid <pid>) SET param {TO|=} value [ IMMEDIATE ]
ALTER SESSION WITH (pid <pid>) RESET param [ IMMEDIATE ]
ALTER SESSION WITH (pid <pid>) RESET ALL
The first form creates an entry in
$PGDATA/pg_session_conf/postgresql.<beid>.conf.
The second form removes the entry.
The third form removes the file itself.
IMMEDIATE specifies that the change is applied immediately by
sending SIGHUP to the process. pg_reload_conf() works as well.
The session configuration is removed at session-end and the
directory is cleaned up at startup.
It can change variables of PGC_USERSET by non-superuser or
PGC_SUSET by superuser. The local session user should have the
same privileges as pg_signal_backend() on the target session.
This patch contains documentation but doesn't contain tests yet.
I would appreciate any comments or suggestions on this.
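To make the three forms concrete, here is a minimal Python sketch of the file handling described above. It is not the patch's code: the function names, the "name = value" line format and the write-then-rename step are assumptions made for illustration, and only the file name pattern postgresql.<beid>.conf comes from the description.

```python
import os
import tempfile


def conf_path(confdir, beid):
    # Mirrors the described layout: <confdir>/postgresql.<beid>.conf
    return os.path.join(confdir, f"postgresql.{beid}.conf")


def read_entries(path):
    """Parse 'name = value' lines; a missing file means no entries."""
    entries = {}
    if os.path.exists(path):
        with open(path) as f:
            for line in f:
                line = line.strip()
                if line:
                    name, _, value = line.partition(" = ")
                    entries[name] = value
    return entries


def write_entries(path, entries):
    # Assumption: write to a temp file and rename, so a reader never
    # sees a half-written configuration file.
    tmp = path + ".tmp"
    with open(tmp, "w") as f:
        for name, value in sorted(entries.items()):
            f.write(f"{name} = {value}\n")
    os.replace(tmp, path)


def alter_session_set(confdir, beid, name, value):
    """First form: create or overwrite one entry."""
    path = conf_path(confdir, beid)
    entries = read_entries(path)
    entries[name] = value
    write_entries(path, entries)


def alter_session_reset(confdir, beid, name):
    """Second form: remove one entry."""
    path = conf_path(confdir, beid)
    entries = read_entries(path)
    entries.pop(name, None)
    write_entries(path, entries)


def alter_session_reset_all(confdir, beid):
    """Third form: remove the file itself."""
    path = conf_path(confdir, beid)
    if os.path.exists(path):
        os.remove(path)
```

The target backend would pick the file up on its next configuration reload; that part is left out of the sketch.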
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
0001-ALTER-SESSION.patch (text/x-patch; charset=us-ascii, +433 -79)
At Tue, 29 Jan 2019 20:32:54 +0900 (Tokyo Standard Time), Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp> wrote in <20190129.203254.115361483.horiguchi.kyotaro@lab.ntt.co.jp>
Hello.
/messages/by-id/20190128.133143.115303042.horiguchi.kyotaro@lab.ntt.co.jp
C. Provide session-specific GUC variable (that overrides the global one)
- Add new configuration file "postgresql.conf.<PID>" and
pg_reload_conf() lets the session with the PID load it as if
it is the last included file. All such files are removed at
startup or at the end of the corresponding session.
- Add a new syntax like this:
ALTER SESSION WITH (pid=xxxx)
SET configuration_parameter {TO | =} {value | 'value' | DEFAULT}
RESET configuration_parameter
RESET ALL
- Target variables are marked with GUC_REMOTE.
I'll consider the last choice and will come up with a patch.
This is that, with a small change in design.
ALTER SESSION WITH (pid <pid>) SET param {TO|=} value [ IMMEDIATE ]
ALTER SESSION WITH (pid <pid>) RESET param [ IMMEDIATE ]
ALTER SESSION WITH (pid <pid>) RESET ALL
The first form creates an entry in
$PGDATA/pg_session_conf/postgresql.<beid>.conf.
The second form removes the entry.
The third form removes the file itself.
IMMEDIATE specifies that the change is applied immediately by
sending SIGHUP to the process. pg_reload_conf() works as well.
The session configuration is removed at session-end and the
directory is cleaned up at startup.
It can change variables of PGC_USERSET by non-superuser or
PGC_SUSET by superuser. The local session user should have the
same privileges as pg_signal_backend() on the target session.
This patch contains documentation but doesn't contain tests yet.
I would appreciate any comments or suggestions on this.
Minor updates and rebased.
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
v2-0001-ALTER-SESSION.patch (text/x-patch; charset=us-ascii, +422 -80)
Hi,
On 2019-01-29 20:32:54 +0900, Kyotaro HORIGUCHI wrote:
Hello.
/messages/by-id/20190128.133143.115303042.horiguchi.kyotaro@lab.ntt.co.jp
C. Provide session-specific GUC variable (that overrides the global one)
- Add new configuration file "postgresql.conf.<PID>" and
pg_reload_conf() lets the session with the PID load it as if
it is the last included file. All such files are removed at
startup or at the end of the corresponding session.
- Add a new syntax like this:
ALTER SESSION WITH (pid=xxxx)
SET configuration_parameter {TO | =} {value | 'value' | DEFAULT}
RESET configuration_parameter
RESET ALL
- Target variables are marked with GUC_REMOTE.
I'll consider the last choice and will come up with a patch.
This is that, with a small change in design.
ALTER SESSION WITH (pid <pid>) SET param {TO|=} value [ IMMEDIATE ]
ALTER SESSION WITH (pid <pid>) RESET param [ IMMEDIATE ]
ALTER SESSION WITH (pid <pid>) RESET ALL
The first form creates an entry in
$PGDATA/pg_session_conf/postgresql.<beid>.conf.
The second form removes the entry.
The third form removes the file itself.
IMMEDIATE specifies that the change is applied immediately by
sending SIGHUP to the process. pg_reload_conf() works as well.
The session configuration is removed at session-end and the
directory is cleaned up at startup.
It can change variables of PGC_USERSET by non-superuser or
PGC_SUSET by superuser. The local session user should have the
same privileges as pg_signal_backend() on the target session.
This patch contains documentation but doesn't contain tests yet.
I would appreciate any comments or suggestions on this.
Leaving the desirability of the feature aside, isn't this racy as hell?
I.e. it seems entirely possible that backends stop/start between
determining the PID, and the ALTER SESSION creating the file, and it
actually being processed. By the time that happens an entirely different
session might be using that pid.
And IMMEDIATE wouldn't be very immediate, considering e.g. longrunning
queries / VACUUM etc, which'll only process new config in the mainloop.
- Andres
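The race can be shown with a toy model in Python. Everything here (the Cluster class, its method names, the pid value) is hypothetical and only illustrates the ordering problem; it says nothing about how the patch is implemented. The model even cleans up the per-pid file at session end, as the proposal does, and the wrong session still receives the setting.

```python
class Cluster:
    """Toy model: live sessions and per-pid config files, keyed by OS pid."""

    def __init__(self):
        self.sessions = {}   # pid -> session name
        self.files = {}      # pid -> {param: value}, stands in for the file

    def start(self, pid, name):
        self.sessions[pid] = name

    def stop(self, pid):
        del self.sessions[pid]
        self.files.pop(pid, None)   # cleanup at session end, as proposed

    def alter_session(self, pid, param, value):
        if pid not in self.sessions:
            raise LookupError(f"no session with pid {pid}")
        self.files.setdefault(pid, {})[param] = value

    def reload(self, pid):
        # The backend currently owning this pid reads "its" file.
        return self.sessions[pid], dict(self.files.get(pid, {}))


def demo_race():
    cluster = Cluster()
    cluster.start(4242, "session A")
    target_pid = 4242                 # operator looks up session A's pid
    cluster.stop(4242)                # session A exits in the meantime
    cluster.start(4242, "session B")  # the OS hands the same pid to B
    cluster.alter_session(target_pid, "work_mem", "128kB")
    return cluster.reload(4242)       # B applies a setting meant for A
```

The existence check in alter_session passes because some session owns the pid, just not the intended one.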
Greetings,
* Andres Freund (andres@anarazel.de) wrote:
On 2019-01-29 20:32:54 +0900, Kyotaro HORIGUCHI wrote:
Hello.
/messages/by-id/20190128.133143.115303042.horiguchi.kyotaro@lab.ntt.co.jp
C. Provide session-specific GUC variable (that overrides the global one)
- Add new configuration file "postgresql.conf.<PID>" and
pg_reload_conf() lets the session with the PID load it as if
it is the last included file. All such files are removed at
startup or at the end of the corresponding session.
- Add a new syntax like this:
ALTER SESSION WITH (pid=xxxx)
SET configuration_parameter {TO | =} {value | 'value' | DEFAULT}
RESET configuration_parameter
RESET ALL
- Target variables are marked with GUC_REMOTE.
I'll consider the last choice and will come up with a patch.
This is that, with a small change in design.
ALTER SESSION WITH (pid <pid>) SET param {TO|=} value [ IMMEDIATE ]
ALTER SESSION WITH (pid <pid>) RESET param [ IMMEDIATE ]
ALTER SESSION WITH (pid <pid>) RESET ALL
The first form creates an entry in
$PGDATA/pg_session_conf/postgresql.<beid>.conf.
The second form removes the entry.
The third form removes the file itself.
IMMEDIATE specifies that the change is applied immediately by
sending SIGHUP to the process. pg_reload_conf() works as well.
The session configuration is removed at session-end and the
directory is cleaned up at startup.
It can change variables of PGC_USERSET by non-superuser or
PGC_SUSET by superuser. The local session user should have the
same privileges as pg_signal_backend() on the target session.
This patch contains documentation but doesn't contain tests yet.
I would appreciate any comments or suggestions on this.
Leaving the desirability of the feature aside, isn't this racy as hell?
I.e. it seems entirely possible that backends stop/start between
determining the PID, and the ALTER SESSION creating the file, and it
actually being processed. By the time that happens an entirely different
session might be using that pid.
That seems like something that could possibly be fixed, by adding in
other things to make it more likely to be the 'right' backend, but my
complaint here is that we are, again, using files to pass data between
backend processes and that seems like a pretty terrible direction to be
going in.
Isn't there a whole system for passing information between different
backend processes that we could and probably should be using here
instead..? I get that it wasn't quite intended for this originally, but
hopefully it would be possible to make it work...
And IMMEDIATE wouldn't be very immediate, considering e.g. longrunning
queries / VACUUM etc, which'll only process new config in the mainloop.
That's certainly a good point.
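A small Python sketch of why the change is deferred, assuming a backend that only checks for a pending reload at the top of its command loop. The Backend class and its fields are made up for illustration; the real signal handling is considerably more involved.

```python
class Backend:
    """Toy backend that looks at the reload flag only between commands."""

    def __init__(self, work_mem):
        self.work_mem = work_mem     # value currently in effect
        self.reload_pending = None   # recorded by the "signal handler"

    def sighup(self, new_work_mem):
        # A signal handler only notes that a reload was requested.
        self.reload_pending = new_work_mem

    def run_query(self, steps, signal_at=None, new_work_mem=None):
        # Top of the main loop: the only place pending config is applied.
        if self.reload_pending is not None:
            self.work_mem = self.reload_pending
            self.reload_pending = None
        seen = []
        for step in range(steps):
            if step == signal_at:
                self.sighup(new_work_mem)   # signal arrives mid-query
            seen.append(self.work_mem)      # query keeps using the old value
        return seen
```

A query that is already running when the signal arrives finishes with the old value; only the next command sees the new one.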
Thanks!
Stephen
On 2019-01-29 20:52:08 -0500, Stephen Frost wrote:
* Andres Freund (andres@anarazel.de) wrote:
Leaving the desirability of the feature aside, isn't this racy as hell?
I.e. it seems entirely possible that backends stop/start between
determining the PID, and the ALTER SESSION creating the file, and it
actually being processed. By the time that happens an entirely different
session might be using that pid.
That seems like something that could possibly be fixed, by adding in
other things to make it more likely to be the 'right' backend, but my
complaint here is that we are, again, using files to pass data between
backend processes and that seems like a pretty terrible direction to be
going in.
I think pid would be wholly unsuitable for this, and if so we'd have to
use something entirely independent.
Isn't there a whole system for passing information between different
backend processes that we could and probably should be using here
instead..? I get that it wasn't quite intended for this originally, but
hopefully it would be possible to make it work...
I'm not sure which system you're referring to? Procsignals? Those rely
on the fact that it's harmless to send such signals even when the pid
has been recycled, so that doesn't really address the issue. And
realistically, you're going to need something to persist such settings
to - they're not fixed size, and using DSM here would complicate things
to a significant degree. I don't think files would necessarily be wrong
here, if we actually want this; alternatively we could go with some
magic catalog, but that'd be a lot of infrastructure for not
particularly much gain.
Greetings,
Andres Freund
Greetings,
* Andres Freund (andres@anarazel.de) wrote:
On 2019-01-29 20:52:08 -0500, Stephen Frost wrote:
* Andres Freund (andres@anarazel.de) wrote:
Leaving the desirability of the feature aside, isn't this racy as hell?
I.e. it seems entirely possible that backends stop/start between
determining the PID, and the ALTER SESSION creating the file, and it
actually being processed. By the time that happens an entirely different
session might be using that pid.
That seems like something that could possibly be fixed, by adding in
other things to make it more likely to be the 'right' backend, but my
complaint here is that we are, again, using files to pass data between
backend processes and that seems like a pretty terrible direction to be
going in.
I think pid would be wholly unsuitable for this, and if so we'd have to
use something entirely independent.
I would think you'd use pid + other stuff (user OID, backend proc entry
number, other things). Basically, if you see a file there with your pid
on it, then you look and see if the other things match- if so, act on
it, if not, discard the file. I still don't like this approach though,
Isn't there a whole system for passing information between different
backend processes that we could and probably should be using here
instead..? I get that it wasn't quite intended for this originally, but
hopefully it would be possible to make it work...
I'm not sure which system you're referring to? Procsignals? Those rely
on the fact that it's harmless to send such signals even when the pid
has been recycled, so that doesn't really address the issue. And
realistically, you're going to need something to persist such settings
to - they're not fixed size, and using DSM here would complicate things
to a significant degree. I don't think files would necessarily be wrong
here, if we actually want this; alternatively we could go with some
magic catalog, but that'd be a lot of infrastructure for not
particularly much gain.
I would think we'd use proc signals to say "hey, go check this when you
get a chance" or similar, but, no, I was thinking for actually passing
the data we'd use a DSM. I can see how that would complicate things but
that seems like something we might be able to solve by making it easier
to use them for this simplified use-case.
I really don't think files are the right way to be going about this.
A magic catalog sounds like an interesting idea. Another thought I had
was something around pipes but it seems like that would require we have
pipes between every pair of backends.. Instead, I'd think we'd have a
way for any backend to plop a message into some other backend's message
queue and then that backend processes it when it gets to it.
I don't think this is going to be the last time we want to do something
like this and so having a bunch of individually built file-based systems
for passing information between backends seems really grotty.
Thanks!
Stephen
On 2019-01-29 21:09:22 -0500, Stephen Frost wrote:
Greetings,
* Andres Freund (andres@anarazel.de) wrote:
On 2019-01-29 20:52:08 -0500, Stephen Frost wrote:
* Andres Freund (andres@anarazel.de) wrote:
Leaving the desirability of the feature aside, isn't this racy as hell?
I.e. it seems entirely possible that backends stop/start between
determining the PID, and the ALTER SESSION creating the file, and it
actually being processed. By the time that happens an entirely different
session might be using that pid.
That seems like something that could possibly be fixed, by adding in
other things to make it more likely to be the 'right' backend, but my
complaint here is that we are, again, using files to pass data between
backend processes and that seems like a pretty terrible direction to be
going in.
I think pid would be wholly unsuitable for this, and if so we'd have to
use something entirely independent.
I would think you'd use pid + other stuff (user OID, backend proc entry
number, other things). Basically, if you see a file there with your pid
on it, then you look and see if the other things match- if so, act on
it, if not, discard the file. I still don't like this approach though,
What do we gain by including the pid here? Seems much more reasonable to
use a session id that's just unique over the life of a cluster.
I really don't think files are the right way to be going about this.
Why? They persist and can be removed, they are introspectable, they
automatically are removed from memory when there's no demand...
Greetings,
Andres Freund
Greetings,
* Andres Freund (andres@anarazel.de) wrote:
On 2019-01-29 21:09:22 -0500, Stephen Frost wrote:
* Andres Freund (andres@anarazel.de) wrote:
On 2019-01-29 20:52:08 -0500, Stephen Frost wrote:
* Andres Freund (andres@anarazel.de) wrote:
Leaving the desirability of the feature aside, isn't this racy as hell?
I.e. it seems entirely possible that backends stop/start between
determining the PID, and the ALTER SESSION creating the file, and it
actually being processed. By the time that happens an entirely different
session might be using that pid.
That seems like something that could possibly be fixed, by adding in
other things to make it more likely to be the 'right' backend, but my
complaint here is that we are, again, using files to pass data between
backend processes and that seems like a pretty terrible direction to be
going in.
I think pid would be wholly unsuitable for this, and if so we'd have to
use something entirely independent.
I would think you'd use pid + other stuff (user OID, backend proc entry
number, other things). Basically, if you see a file there with your pid
on it, then you look and see if the other things match- if so, act on
it, if not, discard the file. I still don't like this approach though,
What do we gain by including the pid here? Seems much more reasonable to
use a session id that's just unique over the life of a cluster.
Are you suggesting we have one of those already, or is the idea that
we'd add a cluster-lifetime session id for this?
I really don't think files are the right way to be going about this.
Why? They persist and can be removed, they are introspectable, they
automatically are removed from memory when there's no demand...
Well, we don't actually want these to persist, and it's because they do
that we have to deal with removing them, and I don't see a whole lot of
gain from them being introspectable; indeed, that seems like more of a
drawback than anything since it will invite people to whack those files
around and abuse them as if they were some externally documented
interface.
They also cost disk space, they require inodes, they have to be cleaned
up and managed on shutdown/restart, backup tools need to understand what
to do with them, potentially, we have to consider if we should have a
checksum for them, we have to handle out-of-disk space cases with them,
they could cause us to run out of disk space...
These same arguments could have been made about how we could have
implemented parallel query too. I agree that the use-case is somewhat
different there but there's also a lot of similarity when it comes to
managing this passing of information to that use-case.
Thanks!
Stephen
At Tue, 29 Jan 2019 21:46:28 -0500, Stephen Frost <sfrost@snowman.net> wrote in <20190130024628.GE5118@tamriel.snowman.net>
Greetings,
* Andres Freund (andres@anarazel.de) wrote:
On 2019-01-29 21:09:22 -0500, Stephen Frost wrote:
* Andres Freund (andres@anarazel.de) wrote:
On 2019-01-29 20:52:08 -0500, Stephen Frost wrote:
* Andres Freund (andres@anarazel.de) wrote:
Leaving the desirability of the feature aside, isn't this racy as hell?
I.e. it seems entirely possible that backends stop/start between
determining the PID, and the ALTER SESSION creating the file, and it
actually being processed. By the time that happens an entirely different
session might be using that pid.
That seems like something that could possibly be fixed, by adding in
other things to make it more likely to be the 'right' backend, but my
complaint here is that we are, again, using files to pass data between
backend processes and that seems like a pretty terrible direction to be
going in.
I think pid would be wholly unsuitable for this, and if so we'd have to
use something entirely independent.
I would think you'd use pid + other stuff (user OID, backend proc entry
number, other things). Basically, if you see a file there with your pid
on it, then you look and see if the other things match- if so, act on
it, if not, discard the file. I still don't like this approach though,
What do we gain by including the pid here? Seems much more reasonable to
use a session id that's just unique over the life of a cluster.
Are you suggesting we have one of those already, or is the idea that
we'd add a cluster-lifetime session id for this?
A 32-bit counter would suffice for such a period. But in the
attached patch the worst that can happen is that the new session
reads the single config line written by the last command, which
doesn't seem harmful. (Of course that's not ideal, though.)
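For comparison, a Python sketch of the counter idea: each session gets an id from a cluster-wide 32-bit counter, and settings are keyed by that id rather than by pid. The Cluster class and its methods are hypothetical names for illustration only, not anything in the attached patch.

```python
import itertools


class Cluster:
    """Toy model: pending settings are keyed by session id, not by pid."""

    def __init__(self):
        self._counter = itertools.count(1)
        self.session_of_pid = {}   # pid -> session id of the live session
        self.pending = {}          # session id -> {param: value}

    def start(self, pid):
        # Wraps after 2**32 sessions, like a 32-bit counter would.
        sid = next(self._counter) & 0xFFFFFFFF
        self.session_of_pid[pid] = sid
        return sid

    def stop(self, pid):
        sid = self.session_of_pid.pop(pid)
        self.pending.pop(sid, None)

    def alter_session(self, sid, param, value):
        if sid not in self.session_of_pid.values():
            raise LookupError(f"no live session with id {sid}")
        self.pending.setdefault(sid, {})[param] = value

    def reload(self, pid):
        # A backend only ever reads settings filed under its own session id.
        return dict(self.pending.get(self.session_of_pid[pid], {}))
```

With this keying, a request aimed at a session that has already exited fails instead of landing on whichever session inherited the pid.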
I really don't think files are the right way to be going about this.
Why? They persist and can be removed, they are introspectable, they
automatically are removed from memory when there's no demand...
Well, we don't actually want these to persist, and it's because they do
that we have to deal with removing them, and I don't see a whole lot of
gain from them being introspectable; indeed, that seems like more of a
drawback than anything since it will invite people to whack those files
around and abuse them as if they were some externally documented
interface.
.auto.conf is already something of that kind. My first version
signaled the change via shared memory (in a way that could be much
improved) and added the complex "nontransactional SET" feature to
the GUC system, which lets a change persist beyond the end of the
transaction, if any. Holding changes until the session becomes
idle also seems complex.
/messages/by-id/20181127.193622.252197705.horiguchi.kyotaro@lab.ntt.co.jp
The most significant reason for passing-by-file is the affinity
with the current GUC system.
They also cost disk space, they require inodes, they have to be cleaned
up and managed on shutdown/restart, backup tools need to understand what
to do with them, potentially, we have to consider if we should have a
checksum for them, we have to handle out-of-disk space cases with them,
they could cause us to run out of disk space...
The files are too small to cause such problems easily, but I agree
that file handling is bothersome and fragile. A weak point of
signalling via shared memory is incompatibility with the current
GUC system, as I mentioned above.
These same arguments could have been made about how we could have
implemented parallel query too. I agree that the use-case is somewhat
different there but there's also a lot of similarity when it comes to
managing this passing of information to that use-case.
Parallel query passes data via DSM?
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center
Attachments:
v3-0001-ALTER-SESSION.patch (text/x-patch; charset=us-ascii, +612 -81)
Thanks for a cool feature with nice UI. Can I expect it to work for background processes? For troubleshooting, I wanted to investigate how autovacuum launcher/worker behaves by having them emit DEBUG messages.
(My comments follow)
From: Kyotaro HORIGUCHI [mailto:horiguchi.kyotaro@lab.ntt.co.jp]
.auto.conf is already something of that kind. My first version signaled the
change via shared memory (in a way that could be much improved) and added the
complex "nontransactional SET" feature to the GUC system, which lets a change
persist beyond the end of the transaction, if any. Holding changes until the
session becomes idle also seems complex.
/messages/by-id/20181127.193622.252197705.horiguchi.kyotaro@lab.ntt.co.jp
The most significant reason for passing-by-file is the affinity with the
current GUC system.
Regarding the target session specification, do we want to use pid or a session ID like the following?
https://www.postgresql.org/docs/devel/runtime-config-logging.html
--------------------------------------------------
The %c escape prints a quasi-unique session identifier, consisting of two 4-byte hexadecimal numbers (without leading zeros) separated by a dot. The numbers are the process start time and the process ID, so %c can also be used as a space saving way of printing those items. For example, to generate the session identifier from pg_stat_activity, use this query:
SELECT to_hex(trunc(EXTRACT(EPOCH FROM backend_start))::integer) || '.' ||
to_hex(pid)
FROM pg_stat_activity;
PRIMARY KEY (session_id, session_line_num)
--------------------------------------------------
pid is easier to type. However, the session startup needs to try to delete the leftover file. Is the file system access negligible compared with the total heavy session startup processing?
If we choose session ID, the session startup doesn't have to care about the leftover file. However, a background process crash could leave the file for a long time, since the crash may not lead to a postmaster restart. Also, we would be inclined to add a sessionid column to pg_stat_activity (the concept of a session ID can be useful for other purposes.)
I'm OK about passing parameter changes via a file. But I'm not sure whether using DSM (DSA) is easier with less code.
And considering the multi-threaded server Konstantin is proposing, would it be better to take the pass-by-memory approach? I imagine that if the server gets multi-threaded, the parameter change would be handled like:
1. Allocate memory for one parameter change.
2. Write the change to that memory.
3. Link the memory to a session-specific list.
4. The target session removes the entry from the list, applies the change, and frees the memory.
The code modification may be minimal when we migrate to the multi-threaded server -- only memory allocation and free functions.
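The four steps above can be sketched in Python as follows. This is only an illustration under assumptions: SessionInbox and its methods are invented names, a dict stands in for the session's GUC state, and Python's garbage collector plays the role of the explicit free.

```python
import threading
from collections import deque


class SessionInbox:
    """Per-session list of pending parameter changes."""

    def __init__(self):
        self._lock = threading.Lock()
        self._changes = deque()

    def post(self, name, value):
        # Steps 1 and 2: allocate memory for one change and fill it in.
        change = {"name": name, "value": value}
        # Step 3: link it to the session-specific list.
        with self._lock:
            self._changes.append(change)

    def drain(self, gucs):
        # Step 4: the target session unlinks each entry, applies it,
        # and drops it (the garbage collector frees the memory).
        while True:
            with self._lock:
                if not self._changes:
                    return
                change = self._changes.popleft()
            gucs[change["name"]] = change["value"]
```

The lock is held only while linking or unlinking an entry, so applying a change never blocks a sender.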
Regards
Takayuki Tsunakawa
Hello.
At Fri, 1 Feb 2019 06:31:40 +0000, "Tsunakawa, Takayuki" <tsunakawa.takay@jp.fujitsu.com> wrote in <0A3221C70F24FB45833433255569204D1FB927B1@G01JPEXMBYT05>
Thanks for a cool feature with nice UI. Can I expect it to work for background processes? For troubleshooting, I wanted to investigate how autovacuum launcher/worker behaves by having them emit DEBUG messages.
(My comments follow)
I haven't actually tried it, but it doesn't reject background
workers. However, background workers seem to assume that variables
don't change while they are working. I should consider that.
From: Kyotaro HORIGUCHI [mailto:horiguchi.kyotaro@lab.ntt.co.jp]
.auto.conf is already something of that kind. My first version signaled the
change via shared memory (in a way that could be much improved) and added the
complex "nontransactional SET" feature to the GUC system, which lets a change
persist beyond the end of the transaction, if any. Holding changes until the
session becomes idle also seems complex.
/messages/by-id/20181127.193622.252197705.horiguchi.kyotaro@lab.ntt.co.jp
The most significant reason for passing-by-file is the affinity with the
current GUC system.
Regarding the target session specification, do we want to use pid or a session ID like the following?
https://www.postgresql.org/docs/devel/runtime-config-logging.html
--------------------------------------------------
The %c escape prints a quasi-unique session identifier, consisting of two 4-byte hexadecimal numbers (without leading zeros) separated by a dot. The numbers are the process start time and the process ID, so %c can also be used as a space saving way of printing those items. For example, to generate the session identifier from pg_stat_activity, use this query:
SELECT to_hex(trunc(EXTRACT(EPOCH FROM backend_start))::integer) || '.' ||
to_hex(pid)
FROM pg_stat_activity;
pid is easier to type. However, the session startup needs to try to delete the leftover file. Is the file system access negligible compared with the total heavy session startup processing?
If we choose session ID, the session startup doesn't have to care about the leftover file. However, a background process crash could leave the file for a long time, since the crash may not lead to a postmaster restart. Also, we would be inclined to add a sessionid column to pg_stat_activity (the concept of a session ID can be useful for other purposes.)
Sounds reasonable.
The attached version happens to add backend startup time in
PGPROC and I added session id as a usable key. (Heavily WIP)
ALTER SESSION WITH (id '5c540141.b7f') SET work_mem to '128kB';
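The id in the statement above follows the %c format quoted earlier: the backend start time and the pid, each in hexadecimal without leading zeros, joined by a dot. A short Python sketch of building and splitting such an id (the function names are made up for illustration):

```python
def session_id(start_epoch, pid):
    """Format a session id as <start time in hex>.<pid in hex>."""
    return f"{start_epoch:x}.{pid:x}"


def parse_session_id(sid):
    """Split a session id back into (start time in epoch seconds, pid)."""
    start_hex, pid_hex = sid.split(".")
    return int(start_hex, 16), int(pid_hex, 16)
```

For example, parse_session_id('5c540141.b7f') gives a start time of 1549009217 seconds since the epoch and pid 2943.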
I'm OK about passing parameter changes via a file. But I'm not sure whether using DSM (DSA) is easier with less code.
Perhaps DSA is not required. Currently it uses rather a large
area but I think we can do the same thing with smaller memory by
sending long strings part by part.
And considering the multi-threaded server Konstantin is proposing, would it be better to take the pass-by-memory approach? I imagine that if the server gets multi-threaded, the parameter change would be handled like:
1. Allocate memory for one parameter change.
2. Write the change to that memory.
3. Link the memory to a session-specific list.
4. The target session removes the entry from the list, applies the change, and frees the memory.
The code modification may be minimal when we migrate to the multi-threaded server -- only memory allocation and free functions.
The attached is a WIP patch that:
- Uses the non-transactional SET (for my convenience).
- Is based not on files but on static shared memory.
It uses a new signal.
- Adds PGC_S_REMOTE with the same precedence as PGC_S_SESSION.
(This causes arguable behavior..)
- Adds the ALTER SESSION syntax. (The key can be a pid or a session id.)
(Sorry for the inconsistent name of the patch files..)
regards.
--
Kyotaro Horiguchi
NTT Open Source Software Center