overwriting an existing .so while being used crashes the server process
Hi,
whenever I run a C-function (part of an .so file) and the file is
overwritten, the connection crashes. Tested on 9.1.3 and 9.2-beta1.
It's 100% reproducible:
1) compile the attached file and copy the .so to pkglibdir
$ gcc -I/home/tomas/tmp/postgresql-9.1.2/src/include testcomp.c \
      -shared -fPIC -o testcomp.so
$ cp testcomp.so `pg_config --pkglibdir`
2) create a function, calling the .so
CREATE FUNCTION test_computation()
RETURNS void
AS 'testcomp','test_computation'
LANGUAGE C STRICT;
3) call the function and while it's running, repeat step (1).
4) an example of the output
WARNING: i = 532000000 v = 141512000266000000
WARNING: i = 533000000 v = 142044500266500000
WARNING: i = 534000000 v = 142578000267000000
The connection to the server was lost. Attempting reset: Failed.
and the server log says this:
LOG: server process (PID 17161) was terminated by signal 7: Bus error
LOG: terminating any other active server processes
WARNING: terminating connection because of crash of another server process
...
This does not happen when the .so is removed or just touched; it has to
be overwritten (even with a file that is byte-for-byte identical).
Basic info about the box: Linux rimmer 3.3.2-gentoo #1 SMP PREEMPT Wed
Apr 18 14:54:04 CEST 2012 x86_64 Intel(R) Core(TM) i5-2500K CPU @
3.30GHz GenuineIntel GNU/Linux
kind regards
Tomas
Attachments:
testcomp.c (text/x-c)
Tomas Vondra <tv@fuzzy.cz> writes:
> whenever I run a C-function (part of an .so file) and the file is
> overwritten, the connection crashes. Tested on 9.1.3 and 9.2-beta1.
"Doctor, it hurts when I do this."
"So don't do that."
What exactly would you expect Postgres to do about such a thing, anyway?
It has no control over people overwriting its executable files.
regards, tom lane
On 30.5.2012 22:35, Tom Lane wrote:
> Tomas Vondra <tv@fuzzy.cz> writes:
>> whenever I run a C-function (part of an .so file) and the file is
>> overwritten, the connection crashes. Tested on 9.1.3 and 9.2-beta1.
> "Doctor, it hurts when I do this."
> "So don't do that."
> What exactly would you expect Postgres to do about such a thing, anyway?
> It has no control over people overwriting its executable files.
Well, I expected the existing connection will use the old .so, while new
connections would use the new version (although they're exactly the
same). I suppose there are issues with that option too, but crashing the
server is a bit unfortunate ...
And it actually happens even when the file is overwritten between two
queries. I wonder how this affects installing new versions of extensions
- does that mean I can't do that while the database is running?
Is this mentioned in the docs, somewhere? IMHO there should be a big red
banner "DON'T DO THIS" but all I found is this:
http://www.postgresql.org/docs/9.1/interactive/xfunc-c.html
After it is used for the first time, a dynamically loaded object
file is retained in memory. Future calls in the same session to the
function(s) in that file will only incur the small overhead of a
symbol table lookup. If you need to force a reload of an object
file, for example after recompiling it, begin a fresh session.
Which kinda suggests that my expectation that the session won't crash was
correct. This clearly seems like a bug to me.
Tomas
Tomas Vondra <tv@fuzzy.cz> writes:
> On 30.5.2012 22:35, Tom Lane wrote:
>> Tomas Vondra <tv@fuzzy.cz> writes:
>>> whenever I run a C-function (part of an .so file) and the file is
>>> overwritten, the connection crashes. Tested on 9.1.3 and 9.2-beta1.
>> What exactly would you expect Postgres to do about such a thing, anyway?
>> It has no control over people overwriting its executable files.
> Well, I expected the existing connection will use the old .so, while new
> connections would use the new version (although they're exactly the
> same).
Well, that would be something to discuss with the implementors of shared
library functionality on your platform, not with us.
I suspect it depends on how you install the new version of the library,
too. I would somewhat expect it to work as you're thinking if the
install consists of "rename old file out of the way, copy new file into
place, unlink old file" or equivalent. If you are actually
*overwriting* the file in place, a crash does not seem especially
surprising --- it would make perfect sense if the kernel expects the
file to be usable as backing store for the in-memory image, which is not
exactly unreasonable. IOW, if the in-memory bits we're executing are
just an mmap'd image of the .so file, changing the .so file could
entirely be expected to lead to a crash.
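
A minimal sketch of that difference, under loud assumptions: plain text files stand in for the .so, an open file descriptor stands in for the backend's mapping of it, and all file names are hypothetical. It uses a "copy to a temporary name, then rename into place" variant of the sequence described above. A real mmap'd library would fail differently (for example with the bus error reported earlier), but the inode behavior is the same.

```shell
#!/bin/sh
# Sketch: in-place overwrite vs. replace-by-rename, seen through a handle
# that was opened before the file was replaced.
set -eu
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT

printf 'new\n' > "$dir/build.so"      # stand-in for the freshly built library

# (a) In-place overwrite: cp truncates and rewrites the existing inode.
printf 'old\n' > "$dir/a.so"
exec 3< "$dir/a.so"                   # "backend" holds the old file open
cp "$dir/build.so" "$dir/a.so"
in_place=$(cat <&3)                   # old handle now sees the new bytes
exec 3<&-

# (b) Rename into place: the path points to a new inode, the old one survives
#     for as long as somebody still has it open.
printf 'old\n' > "$dir/b.so"
exec 4< "$dir/b.so"
cp "$dir/build.so" "$dir/b.so.tmp"
mv -f "$dir/b.so.tmp" "$dir/b.so"
renamed=$(cat <&4)                    # old handle still sees the old bytes
exec 4<&-
fresh=$(cat "$dir/b.so")              # a new open sees the new bytes

echo "in-place overwrite, old handle reads: $in_place"
echo "rename into place,  old handle reads: $renamed"
echo "rename into place,  new open reads:   $fresh"
```

In case (b) an existing session keeps using the old contents while anything that opens the path afterwards gets the new ones, which is the behavior originally expected.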
> After it is used for the first time, a dynamically loaded object
> file is retained in memory. Future calls in the same session to the
> function(s) in that file will only incur the small overhead of a
> symbol table lookup. If you need to force a reload of an object
> file, for example after recompiling it, begin a fresh session.
> Which kinda looks like my expectation that the session won't crash was
> correct. Clearly seems like bug to me.
No, that just means that we don't unload it from memory. Where the bits
actually are, and whether the kernel has defenses against somebody
modifying the executable, is not something you should be asking us.
Talk to a kernel hacker for your platform.
regards, tom lane
On 30.5.2012 23:19, Tom Lane wrote:
> I suspect it depends on how you install the new version of the library,
> too. I would somewhat expect it to work as you're thinking if the
> install consists of "rename old file out of the way, copy new file into
> place, unlink old file" or equivalent. If you are actually
> *overwriting* the file in place, a crash does not seem especially
> surprising --- it would make perfect sense if the kernel expects the
> file to be usable as backing store for the in-memory image, which is not
> exactly unreasonable. IOW, if the in-memory bits we're executing are
> just an mmap'd image of the .so file, changing the .so file could
> entirely be expected to lead to a crash.
Aha! That might be the culprit - I've just tested that deleting the old
file and copying the new version (thus not overwriting it) did not cause a
crash. Funny.
>> After it is used for the first time, a dynamically loaded object
>> file is retained in memory. Future calls in the same session to the
>> function(s) in that file will only incur the small overhead of a
>> symbol table lookup. If you need to force a reload of an object
>> file, for example after recompiling it, begin a fresh session.
>> Which kinda looks like my expectation that the session won't crash was
>> correct. Clearly seems like bug to me.
> No, that just means that we don't unload it from memory. Where the bits
> actually are, and whether the kernel has defenses against somebody
> modifying the executable, is not something you should be asking us.
> Talk to a kernel hacker for your platform.
OK, thanks for the explanation.
I still think it's worth mentioning this issue in the docs ...
Tomas
On Wed, 2012-05-30 at 23:43 +0200, Tomas Vondra wrote:
> On 30.5.2012 23:19, Tom Lane wrote:
>> I suspect it depends on how you install the new version of the library,
>> too. I would somewhat expect it to work as you're thinking if the
>> install consists of "rename old file out of the way, copy new file into
>> place, unlink old file" or equivalent. If you are actually
>> *overwriting* the file in place, a crash does not seem especially
>> surprising --- it would make perfect sense if the kernel expects the
>> file to be usable as backing store for the in-memory image, which is not
>> exactly unreasonable. IOW, if the in-memory bits we're executing are
>> just an mmap'd image of the .so file, changing the .so file could
>> entirely be expected to lead to a crash.
> Aha! That might be the culprit - I've just tested that deleting the old
> file and copying new version (thus not overwriting it) did not cause a
> crash. Funny.
That's one of the reasons why one normally uses "install" rather than
"cp" to install files. So this shouldn't be a problem in practice if
people use the provided pgxs infrastructure or something similar.
GNU cp has the --remove-destination option, which should also work for
this purpose.
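
A rough way to check this on a given box, assuming GNU coreutils "install" (other implementations may behave differently). The file names are hypothetical stand-ins, and an open descriptor plays the role of a backend that has already loaded the library:

```shell
#!/bin/sh
# Sketch: verify that install(1) replaces the destination with a new inode
# rather than rewriting the existing one. Assumes GNU coreutils.
set -eu
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT

printf 'old\n' > "$dir/testcomp.so"   # stand-in for the installed library
printf 'new\n' > "$dir/build.so"      # stand-in for the rebuilt library

# Keep the old file open so its inode stays allocated and cannot be reused.
exec 3< "$dir/testcomp.so"
inode_before=$(ls -i "$dir/testcomp.so" | awk '{print $1}')

install -m 755 "$dir/build.so" "$dir/testcomp.so"

inode_after=$(ls -i "$dir/testcomp.so" | awk '{print $1}')
old_view=$(cat <&3)                   # what the pre-existing handle sees
exec 3<&-
new_view=$(cat "$dir/testcomp.so")    # what a fresh open sees

echo "inode before: $inode_before, after: $inode_after"
echo "old handle reads: $old_view; new open reads: $new_view"
```

If the two inode numbers differ and the old handle still reads the old contents, the file was replaced rather than overwritten in place.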