warn if GUC set to an invalid shared library
forking <CA+TgmoawONZqEwe-GqmKERNY1ug0z1QhBzkHdA158xfToHKN9w@mail.gmail.com>
On Mon, Dec 13, 2021 at 09:01:57AM -0500, Robert Haas wrote:
On Thu, Dec 9, 2021 at 2:32 AM Maciek Sakrejda <m.sakrejda@gmail.com> wrote:
Considering the vanishingly small number of actual complaints we've
seen about this, that sounds ridiculously over-engineered.
A documentation example should be sufficient.I don't know if this will tip the scales, but I'd like to lodge a
belated complaint. I've gotten myself in this server-fails-to-start
situation several times (in development, for what it's worth). The
syntax (as Bharath pointed out in the original message) is pretty
picky, there are no guard rails, and if you got there through ALTER
SYSTEM, you can't fix it with ALTER SYSTEM (because the server isn't
up). If you go to fix it manually, you get a scary "Do not edit this
file manually!" warning that you have to know to ignore in this case
(that's if you find the file after you realize what the fairly generic
"FATAL: ... No such file or directory" error in the log is telling
you). Plus you have to get the (different!) quoting syntax right or
cut your losses and delete the change.+1. I disagree that trying to detect this kind of problem would be
"ridiculously over-engineered." I don't know whether it can be done
elegantly enough that we'd be happy with it and I don't know whether
it would end up just garden variety over-engineered. But there's
nothing ridiculous about trying to prevent people from putting their
system into a state where it won't start.(To be clear, I also think updating the documentation is sensible,
without taking a view on exactly what that update should look like.)
Yea, I think documentation won't help to avoid this issue:
If ALTER SYSTEM gives an ERROR, someone will likely to check the docs after a
few minutes if they know that they didn't get the correct syntax.
But if it gives no error nor warning, then most likely they won't know to check
the docs.
We should check session_preload_libraries too, right ? It's PGC_SIGHUP, so if
someone sets the variable and sends sighup, clients will be rejected, and they
had no good opportunity to avoid that.
0001 adds WARNINGs when doing SET:
postgres=# SET local_preload_libraries=xyz;
WARNING: could not load library: xyz: cannot open shared object file: No such file or directory
SET
postgres=# ALTER SYSTEM SET shared_preload_libraries =asdf;
WARNING: could not load library: $libdir/plugins/asdf: cannot open shared object file: No such file or directory
ALTER SYSTEM
0002 adds context when failing to start.
2021-12-27 17:01:12.996 CST postmaster[1403] WARNING: could not load library: $libdir/plugins/asdf: cannot open shared object file: No such file or directory
2021-12-27 17:01:14.938 CST postmaster[1403] FATAL: could not access file "asdf": No such file or directory
2021-12-27 17:01:14.938 CST postmaster[1403] CONTEXT: guc "shared_preload_libraries"
2021-12-27 17:01:14.939 CST postmaster[1403] LOG: database system is shut down
But I wonder whether it'd be adequate context if dlopen were to fail rather
than stat() ?
Before 0003:
2021-12-18 23:13:57.861 CST postmaster[11956] FATAL: could not access file "asdf": No such file or directory
2021-12-18 23:13:57.862 CST postmaster[11956] LOG: database system is shut down
After 0003:
2021-12-18 23:16:05.719 CST postmaster[13481] FATAL: could not load library: asdf: cannot open shared object file: No such file or directory
2021-12-18 23:16:05.720 CST postmaster[13481] LOG: database system is shut down
--
Justin
Attachments:
0001-warn-if-shared_preload_libraries-refers-to-an-nonext.patchtext/x-diff; charset=us-asciiDownload+145-46
0002-errcontext-if-server-fails-to-start-due-to-library-G.patchtext/x-diff; charset=us-asciiDownload+17-7
0003-wip-allow-dlopen-to-error-rather-than-stat.patchtext/x-diff; charset=us-asciiDownload+10-15
On Tue, Dec 28, 2021 at 11:15 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
forking <CA+TgmoawONZqEwe-GqmKERNY1ug0z1QhBzkHdA158xfToHKN9w@mail.gmail.com>
On Mon, Dec 13, 2021 at 09:01:57AM -0500, Robert Haas wrote:
On Thu, Dec 9, 2021 at 2:32 AM Maciek Sakrejda <m.sakrejda@gmail.com> wrote:
Considering the vanishingly small number of actual complaints we've
seen about this, that sounds ridiculously over-engineered.
A documentation example should be sufficient.I don't know if this will tip the scales, but I'd like to lodge a
belated complaint. I've gotten myself in this server-fails-to-start
situation several times (in development, for what it's worth). The
syntax (as Bharath pointed out in the original message) is pretty
picky, there are no guard rails, and if you got there through ALTER
SYSTEM, you can't fix it with ALTER SYSTEM (because the server isn't
up). If you go to fix it manually, you get a scary "Do not edit this
file manually!" warning that you have to know to ignore in this case
(that's if you find the file after you realize what the fairly generic
"FATAL: ... No such file or directory" error in the log is telling
you). Plus you have to get the (different!) quoting syntax right or
cut your losses and delete the change.+1. I disagree that trying to detect this kind of problem would be
"ridiculously over-engineered." I don't know whether it can be done
elegantly enough that we'd be happy with it and I don't know whether
it would end up just garden variety over-engineered. But there's
nothing ridiculous about trying to prevent people from putting their
system into a state where it won't start.(To be clear, I also think updating the documentation is sensible,
without taking a view on exactly what that update should look like.)Yea, I think documentation won't help to avoid this issue:
If ALTER SYSTEM gives an ERROR, someone will likely to check the docs after a
few minutes if they know that they didn't get the correct syntax.
But if it gives no error nor warning, then most likely they won't know to check
the docs.We should check session_preload_libraries too, right ? It's PGC_SIGHUP, so if
someone sets the variable and sends sighup, clients will be rejected, and they
had no good opportunity to avoid that.0001 adds WARNINGs when doing SET:
postgres=# SET local_preload_libraries=xyz;
WARNING: could not load library: xyz: cannot open shared object file: No such file or directory
SETpostgres=# ALTER SYSTEM SET shared_preload_libraries =asdf;
WARNING: could not load library: $libdir/plugins/asdf: cannot open shared object file: No such file or directory
ALTER SYSTEM0002 adds context when failing to start.
2021-12-27 17:01:12.996 CST postmaster[1403] WARNING: could not load library: $libdir/plugins/asdf: cannot open shared object file: No such file or directory
2021-12-27 17:01:14.938 CST postmaster[1403] FATAL: could not access file "asdf": No such file or directory
2021-12-27 17:01:14.938 CST postmaster[1403] CONTEXT: guc "shared_preload_libraries"
2021-12-27 17:01:14.939 CST postmaster[1403] LOG: database system is shut downBut I wonder whether it'd be adequate context if dlopen were to fail rather
than stat() ?Before 0003:
2021-12-18 23:13:57.861 CST postmaster[11956] FATAL: could not access file "asdf": No such file or directory
2021-12-18 23:13:57.862 CST postmaster[11956] LOG: database system is shut downAfter 0003:
2021-12-18 23:16:05.719 CST postmaster[13481] FATAL: could not load library: asdf: cannot open shared object file: No such file or directory
2021-12-18 23:16:05.720 CST postmaster[13481] LOG: database system is shut down
Overall the idea looks good to me. A warning on ALTER SYSTEM SET seems
reasonable than nothing. ERROR isn't the way to go as it limits the
users of setting the extensions in shared_preload_libraries first,
installing them later. Is NOTICE here a better idea than WARNING?
I haven't looked at the patches though.
Regards,
Bharath Rupireddy.
Thanks for working on this! I tried it out and it worked for me. I
reviewed the patch and didn't see any problems, but I'm not much of a
C programmer.
On Tue, Dec 28, 2021 at 9:45 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
0002 adds context when failing to start.
2021-12-27 17:01:12.996 CST postmaster[1403] WARNING: could not load library: $libdir/plugins/asdf: cannot open shared object file: No such file or directory
2021-12-27 17:01:14.938 CST postmaster[1403] FATAL: could not access file "asdf": No such file or directory
2021-12-27 17:01:14.938 CST postmaster[1403] CONTEXT: guc "shared_preload_libraries"
2021-12-27 17:01:14.939 CST postmaster[1403] LOG: database system is shut down
For whatever reason, I get slightly different (and somewhat redundant)
output on failing to start:
2022-01-08 12:59:36.784 PST [324482] WARNING: could not load library:
$libdir/plugins/totally bogus: cannot open shared object file: No such
file or directory
2022-01-08 12:59:36.787 PST [324482] FATAL: could not load library:
totally bogus: cannot open shared object file: No such file or
directory
2022-01-08 12:59:36.787 PST [324482] LOG: database system is shut down
I'm on a pretty standard Ubuntu 20.04 (with PGDG packages installed
for 11, 12, and 13). I did configure to a --prefix and set
LD_LIBRARY_PATH and PATH. Not sure if this is an issue in my
environment or a platform difference?
Also, regarding the original warning:
2022-01-08 12:57:24.953 PST [324338] WARNING: could not load library:
$libdir/plugins/totally bogus: cannot open shared object file: No such
file or directory
I think this is pretty clear to users familiar with
shared_preload_libraries. However, for someone less experienced with
Postgres and just following instructions on how to set up, e.g.,
auto_explain (and making a typo), it's not clear from this message
that your server will fail to start again after this if you shut it
down (or it crashes!), and how to get out of this situation. Should we
add a HINT to that effect?
Similarly, for
0001 adds WARNINGs when doing SET:
postgres=# SET local_preload_libraries=xyz;
WARNING: could not load library: xyz: cannot open shared object file: No such file or directory
SET
This works for me, but should this explain the impact (especially if
used with something like ALTER ROLE)? I guess that's probably because
the context may not be easily accessible. I think
shared_preload_libraries is much more common, though, so I'm more
interested in a warning there.
Thanks,
Maciek
On Thu, Dec 30, 2021 at 12:21 AM Bharath Rupireddy
<bharath.rupireddyforpostgres@gmail.com> wrote:
Overall the idea looks good to me. A warning on ALTER SYSTEM SET seems
reasonable than nothing. ERROR isn't the way to go as it limits the
users of setting the extensions in shared_preload_libraries first,
installing them later. Is NOTICE here a better idea than WARNING?
I don't think so--I'm skeptical that "updated shared_preload_libraries
first, then install them" is much more than a theoretical use case. We
may not want to block that off completely, but I think a warning is
reasonable here, because you're *probably* doing something wrong if
you get to this message at all (and if you're not, you're probably
familiar enough with Postgres to know to ignore the warning).
Thanks,
Maciek
On Sat, Jan 08, 2022 at 01:29:24PM -0800, Maciek Sakrejda wrote:
Thanks for working on this! I tried it out and it worked for me. I
reviewed the patch and didn't see any problems, but I'm not much of a
C programmer.
Thanks for looking at it. I was just hacking on it myself.
Unfortunately, the output for dlopen() is not portable, which (I think) means
most of what I wrote can't be made to work.. Since it doesn't work to call
dlopen() when using SET, I tried using just stat(). But that also fails on
windows, since one of the regression tests has an invalid filename involving
unbalanced quotes, which cause it to return EINVAL rather than ENOENT. So SET
cannot warn portably, unless it includes no details at all (or we specially
handle the windows case), or change the pre-existing regression test. But
there's a 2nd instability, too, apparently having to do with timing. So I'm
planning to drop the 0001 patch.
On Tue, Dec 28, 2021 at 9:45 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
0002 adds context when failing to start.
2021-12-27 17:01:12.996 CST postmaster[1403] WARNING: could not load library: $libdir/plugins/asdf: cannot open shared object file: No such file or directory
2021-12-27 17:01:14.938 CST postmaster[1403] FATAL: could not access file "asdf": No such file or directory
2021-12-27 17:01:14.938 CST postmaster[1403] CONTEXT: guc "shared_preload_libraries"
2021-12-27 17:01:14.939 CST postmaster[1403] LOG: database system is shut downFor whatever reason, I get slightly different (and somewhat redundant)
output on failing to start:2022-01-08 12:59:36.784 PST [324482] WARNING: could not load library: $libdir/plugins/totally bogus: cannot open shared object file: No such file or directory
2022-01-08 12:59:36.787 PST [324482] FATAL: could not load library: totally bogus: cannot open shared object file: No such file or directory
2022-01-08 12:59:36.787 PST [324482] LOG: database system is shut down
I think the first WARNING is from the GUC mechanism "setting" the library.
And then the FATAL is from trying to apply the GUC.
It looks like you didn't apply the 0002 patch for that test so got no CONTEXT ?
$ ./tmp_install/usr/local/pgsql/bin/postgres -D src/test/regress/tmp_check/data -c shared_preload_libraries=asdf
2022-01-08 16:05:00.050 CST postmaster[2588] FATAL: could not access file "asdf": No such file or directory
2022-01-08 16:05:00.050 CST postmaster[2588] CONTEXT: while loading shared libraries for GUC "shared_preload_libraries"
2022-01-08 16:05:00.050 CST postmaster[2588] LOG: database system is shut down
--
Justin
Attachments:
v2-0001-errcontext-if-server-fails-to-start-due-to-librar.patchtext/x-diff; charset=us-asciiDownload+17-7
On Sat, Jan 8, 2022 at 2:07 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
Unfortunately, the output for dlopen() is not portable, which (I think) means
most of what I wrote can't be made to work.. Since it doesn't work to call
dlopen() when using SET, I tried using just stat(). But that also fails on
windows, since one of the regression tests has an invalid filename involving
unbalanced quotes, which cause it to return EINVAL rather than ENOENT. So SET
cannot warn portably, unless it includes no details at all (or we specially
handle the windows case), or change the pre-existing regression test. But
there's a 2nd instability, too, apparently having to do with timing. So I'm
planning to drop the 0001 patch.
Hmm. I think 001 is a big part of the usability improvement here.
Could we not at least warn generically, without relaying the
underlying error? The notable thing in this situation is that the
specified library could not be loaded (and that it will almost
certainly cause problems on restart). The specific error would be nice
to have, but it's less important. What is the timing instability?
On Tue, Dec 28, 2021 at 9:45 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
For whatever reason, I get slightly different (and somewhat redundant)
output on failing to start:2022-01-08 12:59:36.784 PST [324482] WARNING: could not load library: $libdir/plugins/totally bogus: cannot open shared object file: No such file or directory
2022-01-08 12:59:36.787 PST [324482] FATAL: could not load library: totally bogus: cannot open shared object file: No such file or directory
2022-01-08 12:59:36.787 PST [324482] LOG: database system is shut downI think the first WARNING is from the GUC mechanism "setting" the library.
And then the FATAL is from trying to apply the GUC.
It looks like you didn't apply the 0002 patch for that test so got no CONTEXT ?
I still had the terminal open where I tested this, and the scrollback
did show me applying the patch (and building after). I tried a make
clean and applying the patch again, and I do see the CONTEXT line now.
I'm not sure what the problem was but seems like PEBKAC--sorry about
that.
Thanks,
Maciek
On Sun, Jan 09, 2022 at 11:58:18AM -0800, Maciek Sakrejda wrote:
On Sat, Jan 8, 2022 at 2:07 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
Unfortunately, the output for dlopen() is not portable, which (I think) means
most of what I wrote can't be made to work.. Since it doesn't work to call
dlopen() when using SET, I tried using just stat(). But that also fails on
windows, since one of the regression tests has an invalid filename involving
unbalanced quotes, which cause it to return EINVAL rather than ENOENT. So SET
cannot warn portably, unless it includes no details at all (or we specially
handle the windows case), or change the pre-existing regression test. But
there's a 2nd instability, too, apparently having to do with timing. So I'm
planning to drop the 0001 patch.Hmm. I think 001 is a big part of the usability improvement here.
I agree - it helps people avoid causing a disruption, rather than just helping
them to fix it faster.
Could we not at least warn generically, without relaying the
underlying error? The notable thing in this situation is that the
specified library could not be loaded (and that it will almost
certainly cause problems on restart). The specific error would be nice
to have, but it's less important. What is the timing instability?
I saw regression diffs like this, showing that the warning could be displayed
before or after the SELECT was echoed.
https://cirrus-ci.com/task/6301672321318912
-SELECT * FROM schema4.counted;
WARNING: could not load library: $libdir/plugins/worker_spi: cannot open shared object file: No such file or directory
+SELECT * FROM schema4.counted;
It's certainly possible to show a static message without additional text from
errno.
On Tue, Dec 28, 2021 at 9:45 AM Justin Pryzby <pryzby@telsasoft.com> wrote:
For whatever reason, I get slightly different (and somewhat redundant)
output on failing to start:2022-01-08 12:59:36.784 PST [324482] WARNING: could not load library: $libdir/plugins/totally bogus: cannot open shared object file: No such file or directory
2022-01-08 12:59:36.787 PST [324482] FATAL: could not load library: totally bogus: cannot open shared object file: No such file or directory
2022-01-08 12:59:36.787 PST [324482] LOG: database system is shut downI think the first WARNING is from the GUC mechanism "setting" the library.
And then the FATAL is from trying to apply the GUC.
It looks like you didn't apply the 0002 patch for that test so got no CONTEXT ?I still had the terminal open where I tested this, and the scrollback
did show me applying the patch (and building after). I tried a make
clean and applying the patch again, and I do see the CONTEXT line now.
I'm not sure what the problem was but seems like PEBKAC--sorry about
that.
Maybe you missed "make install" or similar.
I took the liberty of adding you as a reviewer here:
https://commitfest.postgresql.org/36/3482/
--
Justin
On Tue, Dec 28, 2021 at 12:45 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
0002 adds context when failing to start.
2021-12-27 17:01:12.996 CST postmaster[1403] WARNING: could not load library: $libdir/plugins/asdf: cannot open shared object file: No such file or directory
2021-12-27 17:01:14.938 CST postmaster[1403] FATAL: could not access file "asdf": No such file or directory
2021-12-27 17:01:14.938 CST postmaster[1403] CONTEXT: guc "shared_preload_libraries"
2021-12-27 17:01:14.939 CST postmaster[1403] LOG: database system is shut down
-1 from me on using "guc" in any user-facing error message. And even
guc -> setting isn't a big improvement. If we're going to structure
the reporting this way there, we should try to use a meaningful phrase
there, probably beginning with the word "while"; see "git grep
errcontext.*while" for interesting precedents.
That said, that series of messages seems a bit suspect to me, because
the WARNING seems to be stating the same problem as the subsequent
FATAL and CONTEXT lines. Ideally we'd tighten that somehow.
--
Robert Haas
EDB: http://www.enterprisedb.com
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: tested, passed
Documentation: not tested
Hello
I tested the patches on master branch on Ubuntu 18.04 and regression turns out fine. I did a manual test following the query examples in this email thread and I do see the warnings when I attempted: these queries:
postgres=# SET local_preload_libraries=xyz.so;
2022-01-28 15:11:00.592 PST [13622] WARNING: could not access file "xyz.so"
WARNING: could not access file "xyz.so"
SET
postgres=# ALTER SYSTEM SET shared_preload_libraries=abc.so;
2022-01-28 15:11:07.729 PST [13622] WARNING: could not access file "$libdir/plugins/abc.so"
WARNING: could not access file "$libdir/plugins/abc.so"
ALTER SYSTEM
This is fine as this is what these patches are aiming to provide. However, when I try to restart the server, it fails to start because abc.so and xyz.so do not exist. Setting the parameters "local_preload_libraries" and "local_preload_libraries" to something else in postgresql.conf does not seem to take effect either.
It still complains shared_preload_libraries abc.so does not exist even though I have already set shared_preload_libraries to something else in postgresql.conf. This seems a little strange to me
thank you
Cary
Thanks for loooking
On Fri, Jan 28, 2022 at 11:36:20PM +0000, Cary Huang wrote:
This is fine as this is what these patches are aiming to provide. However, when I try to restart the server, it fails to start because abc.so and xyz.so do not exist. Setting the parameters "local_preload_libraries" and "local_preload_libraries" to something else in postgresql.conf does not seem to take effect either.
It still complains shared_preload_libraries abc.so does not exist even though I have already set shared_preload_libraries to something else in postgresql.conf. This seems a little strange to me
Could you show exactly what you did and the output ?
The patches don't entirely prevent someone from putting the server config into
a bad state. It only aims to tell them if they've done that, so they can fix
it, rather than letting someone (else) find the error at some later (probably
inconvenient) time.
ALTER SYSTEM adds config into postgresql.auto.conf. If you stop the server
after adding bad config there (after getting a warning), the server won't
start. Once the server is off, you have to remove it manually.
The goal of the patch is to 1) warn someone that they've put a bad config in
place, so they don't leave it there; and, 2) if the server fails to start for
such a reason, provide a CONTEXT line to help them resolve it quickly.
Maybe you know all that and I didn't understand what you're saying.
--
Justin
The following review has been posted through the commitfest application:
make installcheck-world: tested, passed
Implements feature: tested, passed
Spec compliant: not tested
Documentation: not tested
I tried the latest version of the patch, and it works as discussed. There is no documentation, but I think that's moot for this warning (we may want to note something in the setting docs, but even if so, I think we should figure out the message first and then decide if it merits additional explanation in the docs). I do not know whether it is spec-compliant, but I doubt the spec has much to say on something like this.
I tried running ALTER SYSTEM and got the warnings as expected:
postgres=# alter system set shared_preload_libraries = no_such_library,not_this_one_either;
WARNING: could not access file "$libdir/plugins/no_such_library"
WARNING: could not access file "$libdir/plugins/not_this_one_either"
ALTER SYSTEM
I think this is great, but it would be really helpful to also indicate that at this point the server will fail to come back up after a restart. In my mind, that's a big part of the reason for having a warning here. Having made this mistake a couple of times, I would be able to read between the lines, as would many other users, but if you're not familiar with Postgres this might still be pretty opaque. I think if I'm reading the code correctly, this warning path is shared between ALTER SYSTEM and a SET of local_preload_libraries so it might be tricky to word this in a way that works in all situations, but it could make the precarious situation a lot clearer. I don't really know a good wording here, but maybe a hint like "The server or session will not be able to start if it has been configured to use libraries that cannot be loaded."?
Also, there are two sides to this: one is actually applying the possibly-bogus setting, and the other is when that setting takes effect (e.g., attempting to start the server or to start a new session). I think Robert had good feedback regarding the latter:
On Fri, Jan 28, 2022 at 6:42 AM Robert Haas <robertmhaas@gmail.com> wrote:
On Tue, Dec 28, 2021 at 12:45 PM Justin Pryzby <pryzby@telsasoft.com> wrote:
0002 adds context when failing to start.
2021-12-27 17:01:12.996 CST postmaster[1403] WARNING: could not load library: $libdir/plugins/asdf: cannot open shared object file: No such file or directory
2021-12-27 17:01:14.938 CST postmaster[1403] FATAL: could not access file "asdf": No such file or directory
2021-12-27 17:01:14.938 CST postmaster[1403] CONTEXT: guc "shared_preload_libraries"
2021-12-27 17:01:14.939 CST postmaster[1403] LOG: database system is shut down-1 from me on using "guc" in any user-facing error message. And even
guc -> setting isn't a big improvement. If we're going to structure
the reporting this way there, we should try to use a meaningful phrase
there, probably beginning with the word "while"; see "git grep
errcontext.*while" for interesting precedents.That said, that series of messages seems a bit suspect to me, because
the WARNING seems to be stating the same problem as the subsequent
FATAL and CONTEXT lines. Ideally we'd tighten that somehow.
Maybe we don't even need the WARNING in this case? At this point, it's clear what the problem is. I think the CONTEXT line does actually help, because otherwise it's not clear why the server failed to start, but the warning does seem superfluous here. I do agree that GUC is awkward here, and I like the "while..." wording suggested both for consistency with other messages and how it could work here:
CONTEXT: while loading "shared_preload_libraries"
I think that would be pretty clear. In the ALTER SYSTEM case, you still need to know to edit the file in spite of the warning telling you not to edit it, but I think that's still better. Based on Cary's feedback, maybe that could be clearer, too (like you, I'm not sure if I understood what he did correctly), but I think that could certainly be future work.
Thanks,
Maciek
On Tue, Feb 1, 2022 at 11:06 PM Maciek Sakrejda <m.sakrejda@gmail.com>
wrote:
I tried running ALTER SYSTEM and got the warnings as expected:
postgres=# alter system set shared_preload_libraries =
no_such_library,not_this_one_either;
WARNING: could not access file "$libdir/plugins/no_such_library"
WARNING: could not access file "$libdir/plugins/not_this_one_either"
ALTER SYSTEMI think this is great, but it would be really helpful to also indicate
that at this point the server will fail to come back up after a restart. In
my mind, that's a big part of the reason for having a warning here. Having
made this mistake a couple of times, I would be able to read between the
lines, as would many other users, but if you're not familiar with Postgres
this might still be pretty opaque.
+1
I would at least consider having the UX go something like:
postgres=# ALTER SYSTEM SET shared_preload_libraries = not_such_library;
ERROR: <paraphrase: your system will not reboot in its current state as
that library is not present>.
HINT: to bypass the error please add FORCE before SET
postgres=# ALTER SYSTEM FORCE SET shared_preload_libraries =
no_such_library;
NOTICE: Error suppressed while setting shared_preload_libraries.
That is, have the user express their desire to leave the system in a
precarious state explicitly before actually doing so.
Upon startup, if the system already can track each separate location that
shared_preload_libraries is set, printing out those locations and current
values would be useful context. Seeing ALTER SYSTEM in that listing would
be helpful.
David J.
On Wed, Feb 2, 2022 at 7:39 AM David G. Johnston <david.g.johnston@gmail.com>
wrote:
I would at least consider having the UX go something like:
postgres=# ALTER SYSTEM SET shared_preload_libraries = not_such_library;
ERROR: <paraphrase: your system will not reboot in its current state as
that library is not present>.
HINT: to bypass the error please add FORCE before SET
postgres=# ALTER SYSTEM FORCE SET shared_preload_libraries =
no_such_library;
NOTICE: Error suppressed while setting shared_preload_libraries.That is, have the user express their desire to leave the system in a
precarious state explicitly before actually doing so.
While I don't have a problem with that behavior, given that there are
currently no such facilities for asserting "yes, really" with ALTER SYSTEM,
I don't think it's worth introducing that just for this patch. A warning
seems like a reasonable first step. This can always be expanded later. I'd
rather see a warning ship than move the goalposts out of reach.