Thread: overwriting an existing .so while being used crashes the server process
Hi,

whenever I run a C-function (part of an .so file) and the file is overwritten, the connection crashes. Tested on 9.1.3 and 9.2-beta1. It's 100% reproducible:

1) compile the attached file and copy the .so to pkglibdir

$ gcc -I/home/tomas/tmp/postgresql-9.1.2/src/include testcomp.c -shared -fPIC -o testcomp.so
$ cp testcomp.so `pg_config --pkglibdir`

2) create a function, calling the .so

CREATE FUNCTION test_computation() RETURNS void AS 'testcomp','test_computation' LANGUAGE C STRICT;

3) call the function and while it's running, repeat step (1).

4) an example of the output

WARNING: i = 532000000 v = 141512000266000000
WARNING: i = 533000000 v = 142044500266500000
WARNING: i = 534000000 v = 142578000267000000
The connection to the server was lost. Attempting reset: Failed.

and a log says this

LOG: server process (PID 17161) was terminated by signal 7: Bus error
LOG: terminating any other active server processes
WARNING: terminating connection because of crash of another server process
...

This does not happen when the .so is removed or just touched, it needs to be overwritten (although with a file that's binary exactly the same).

Basic info about the box:

Linux rimmer 3.3.2-gentoo #1 SMP PREEMPT Wed Apr 18 14:54:04 CEST 2012 x86_64 Intel(R) Core(TM) i5-2500K CPU @ 3.30GHz GenuineIntel GNU/Linux

kind regards
Tomas
Tomas Vondra <tv@fuzzy.cz> writes:
> whenever I run a C-function (part of an .so file) and the file is
> overwritten, the connection crashes. Tested on 9.1.3 and 9.2-beta1.

"Doctor, it hurts when I do this."
"So don't do that."

What exactly would you expect Postgres to do about such a thing, anyway? It has no control over people overwriting its executable files.

regards, tom lane
On 30.5.2012 22:35, Tom Lane wrote:
> Tomas Vondra <tv@fuzzy.cz> writes:
>> whenever I run a C-function (part of an .so file) and the file is
>> overwritten, the connection crashes. Tested on 9.1.3 and 9.2-beta1.
>
> "Doctor, it hurts when I do this."
> "So don't do that."
>
> What exactly would you expect Postgres to do about such a thing, anyway?
> It has no control over people overwriting its executable files.

Well, I expected the existing connection will use the old .so, while new connections would use the new version (although they're exactly the same). I suppose there are issues with that option too, but crashing the server is a bit unfortunate ... And it actually happens even when the file is overwritten between two queries.

I wonder how this affects installing new versions of extensions - does that mean I can't do that while the database is running?

Is this mentioned in the docs, somewhere? IMHO there should be a big red banner "DON'T DO THIS" but all I found is this:

http://www.postgresql.org/docs/9.1/interactive/xfunc-c.html

  After it is used for the first time, a dynamically loaded object file is retained in memory. Future calls in the same session to the function(s) in that file will only incur the small overhead of a symbol table lookup. If you need to force a reload of an object file, for example after recompiling it, begin a fresh session.

Which kinda looks like my expectation that the session won't crash was correct. Clearly seems like bug to me.

Tomas
Tomas Vondra <tv@fuzzy.cz> writes:
> On 30.5.2012 22:35, Tom Lane wrote:
>> Tomas Vondra <tv@fuzzy.cz> writes:
>>> whenever I run a C-function (part of an .so file) and the file is
>>> overwritten, the connection crashes. Tested on 9.1.3 and 9.2-beta1.

>> What exactly would you expect Postgres to do about such a thing, anyway?
>> It has no control over people overwriting its executable files.

> Well, I expected the existing connection will use the old .so, while new
> connections would use the new version (although they're exactly the
> same).

Well, that would be something to discuss with the implementors of shared library functionality on your platform, not with us.

I suspect it depends on how you install the new version of the library, too. I would somewhat expect it to work as you're thinking if the install consists of "rename old file out of the way, copy new file into place, unlink old file" or equivalent. If you are actually *overwriting* the file in place, a crash does not seem especially surprising --- it would make perfect sense if the kernel expects the file to be usable as backing store for the in-memory image, which is not exactly unreasonable. IOW, if the in-memory bits we're executing are just an mmap'd image of the .so file, changing the .so file could entirely be expected to lead to a crash.

> http://www.postgresql.org/docs/9.1/interactive/xfunc-c.html

> After it is used for the first time, a dynamically loaded object
> file is retained in memory. Future calls in the same session to the
> function(s) in that file will only incur the small overhead of a
> symbol table lookup. If you need to force a reload of an object
> file, for example after recompiling it, begin a fresh session.

> Which kinda looks like my expectation that the session won't crash was
> correct. Clearly seems like bug to me.

No, that just means that we don't unload it from memory. Where the bits actually are, and whether the kernel has defenses against somebody modifying the executable, is not something you should be asking us. Talk to a kernel hacker for your platform.

regards, tom lane
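The difference between overwriting in place and the "rename, copy, unlink" sequence described above can be observed without PostgreSQL. What follows is only a minimal sketch, assuming a POSIX shell with the standard cp, mv and rm; the file names lib.so and build.so are hypothetical, and an open file descriptor merely stands in for a backend that has the library mapped.

```shell
#!/bin/sh
# Sketch only: fd 3 plays the role of a process that already uses the file.
# All file names are made up for the illustration.
set -eu
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT

printf 'old\n' > "$dir/lib.so"      # the "installed" library
printf 'new\n' > "$dir/build.so"    # the freshly built replacement

# Case 1: overwrite in place. cp truncates and rewrites the same file,
# so the already-open descriptor sees the new bytes.
exec 3< "$dir/lib.so"
cp "$dir/build.so" "$dir/lib.so"
seen_after_overwrite=$(cat <&3)
exec 3<&-

# Case 2: rename old file out of the way, copy new file into place,
# unlink old file. The open descriptor keeps the old file alive.
printf 'old\n' > "$dir/lib.so"
exec 3< "$dir/lib.so"
mv "$dir/lib.so" "$dir/lib.so.old"
cp "$dir/build.so" "$dir/lib.so"
rm "$dir/lib.so.old"
seen_after_replace=$(cat <&3)
exec 3<&-

echo "overwrite in place: reader sees '$seen_after_overwrite'"
echo "rename/copy/unlink: reader sees '$seen_after_replace'"
```

The reader sees 'new' in the first case and 'old' in the second. With a real mmap'd library the in-place case is harsher than stale data: cp truncates the file before rewriting it, and touching a mapped page that lies beyond the current end of the file raises SIGBUS, which would be consistent with the "signal 7: Bus error" in the log.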
On 30.5.2012 23:19, Tom Lane wrote:
> I suspect it depends on how you install the new version of the library,
> too. I would somewhat expect it to work as you're thinking if the
> install consists of "rename old file out of the way, copy new file into
> place, unlink old file" or equivalent. If you are actually
> *overwriting* the file in place, a crash does not seem especially
> surprising --- it would make perfect sense if the kernel expects the
> file to be usable as backing store for the in-memory image, which is not
> exactly unreasonable. IOW, if the in-memory bits we're executing are
> just an mmap'd image of the .so file, changing the .so file could
> entirely be expected to lead to a crash.

Aha! That might be the culprit - I've just tested that deleting the old file and copying the new version (thus not overwriting it) did not cause a crash. Funny.

>> http://www.postgresql.org/docs/9.1/interactive/xfunc-c.html
>
>> After it is used for the first time, a dynamically loaded object
>> file is retained in memory. Future calls in the same session to the
>> function(s) in that file will only incur the small overhead of a
>> symbol table lookup. If you need to force a reload of an object
>> file, for example after recompiling it, begin a fresh session.
>
>> Which kinda looks like my expectation that the session won't crash was
>> correct. Clearly seems like bug to me.
>
> No, that just means that we don't unload it from memory. Where the bits
> actually are, and whether the kernel has defenses against somebody
> modifying the executable, is not something you should be asking us.
> Talk to a kernel hacker for your platform.

OK, thanks for the explanation. I still think it's worth mentioning this issue in the docs ...

Tomas
From: Peter Eisentraut
On Wed, 2012-05-30 at 23:43 +0200, Tomas Vondra wrote:
> On 30.5.2012 23:19, Tom Lane wrote:
> > I suspect it depends on how you install the new version of the library,
> > too. I would somewhat expect it to work as you're thinking if the
> > install consists of "rename old file out of the way, copy new file into
> > place, unlink old file" or equivalent. If you are actually
> > *overwriting* the file in place, a crash does not seem especially
> > surprising --- it would make perfect sense if the kernel expects the
> > file to be usable as backing store for the in-memory image, which is not
> > exactly unreasonable. IOW, if the in-memory bits we're executing are
> > just an mmap'd image of the .so file, changing the .so file could
> > entirely be expected to lead to a crash.
>
> Aha! That might be the culprit - I've just tested that deleting the old
> file and copying new version (thus not overwriting it) did not cause a
> crash. Funny.

That's one of the reasons why one normally uses "install" rather than "cp" to install files. So this shouldn't be a problem in practice if people use the provided pgxs infrastructure or something similar.

GNU cp has the --remove-destination option, which should also work for this purpose.
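The point about "install" versus "cp" can be checked by watching the inode number of the destination file. This is a rough sketch, assuming GNU coreutils on Linux (install, and cp with --remove-destination); it works in a temporary directory, the file names are hypothetical, and the inode helper is introduced only for the illustration. Open descriptors keep the earlier files in use, as a running backend would, so their inode numbers cannot be recycled during the comparison.

```shell
#!/bin/sh
# Sketch assuming GNU coreutils; nothing here touches a real pkglibdir.
set -eu
dir=$(mktemp -d)
trap 'rm -rf "$dir"' EXIT

# Hypothetical helper: print the inode number of a file.
inode() { ls -i "$1" | awk '{print $1}'; }

printf 'v1\n' > "$dir/testcomp.so"
printf 'v2\n' > "$dir/new.so"

exec 3< "$dir/testcomp.so"          # the original file stays in use
ino_start=$(inode "$dir/testcomp.so")

# Plain cp rewrites the existing file: same inode, contents changed
# underneath whoever is using it.
cp "$dir/new.so" "$dir/testcomp.so"
ino_cp=$(inode "$dir/testcomp.so")

# install removes the destination and creates a fresh file.
install -m 755 "$dir/new.so" "$dir/testcomp.so"
ino_install=$(inode "$dir/testcomp.so")

exec 4< "$dir/testcomp.so"          # keep the installed file in use too
# GNU cp --remove-destination also unlinks before copying.
cp --remove-destination "$dir/new.so" "$dir/testcomp.so"
ino_remove=$(inode "$dir/testcomp.so")
exec 3<&- 4<&-

echo "start=$ino_start cp=$ino_cp install=$ino_install remove=$ino_remove"
```

Only the plain cp leaves the inode unchanged; the other two produce a new file each time. For the library from this thread, the equivalent install step would presumably look like install -m 755 testcomp.so "$(pg_config --pkglibdir)".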