Thread: overwriting an existing .so while being used crashes the server process

From: Tomas Vondra

Hi,

whenever I run a C-function (part of an .so file) and the file is
overwritten, the connection crashes. Tested on 9.1.3 and 9.2-beta1.

It's 100% reproducible:

 1) compile the attached file and copy the .so to pkglibdir

    $ gcc -I/home/tomas/tmp/postgresql-9.1.2/src/include testcomp.c \
          -shared -fPIC -o testcomp.so

    $ cp testcomp.so `pg_config --pkglibdir`

 2) create a function, calling the .so

    CREATE FUNCTION test_computation()
           RETURNS void
           AS 'testcomp','test_computation'
           LANGUAGE C STRICT;

 3) call the function and while it's running, repeat step (1).

 4) an example of the output

    WARNING:  i = 532000000 v = 141512000266000000
    WARNING:  i = 533000000 v = 142044500266500000
    WARNING:  i = 534000000 v = 142578000267000000
    The connection to the server was lost. Attempting reset: Failed.

    and the server log says this:

    LOG:  server process (PID 17161) was terminated by signal 7: Bus
          error
    LOG:  terminating any other active server processes
    WARNING:  terminating connection because of crash of another server
              process
    ...

This does not happen when the .so is removed or just touched; it has to
be overwritten (even with a file that is byte-for-byte identical).

Basic info about the box: Linux rimmer 3.3.2-gentoo #1 SMP PREEMPT Wed
Apr 18 14:54:04 CEST 2012 x86_64 Intel(R) Core(TM) i5-2500K CPU @
3.30GHz GenuineIntel GNU/Linux

kind regards
Tomas

Tomas Vondra <tv@fuzzy.cz> writes:
> whenever I run a C-function (part of an .so file) and the file is
> overwritten, the connection crashes. Tested on 9.1.3 and 9.2-beta1.

"Doctor, it hurts when I do this."
"So don't do that."

What exactly would you expect Postgres to do about such a thing, anyway?
It has no control over people overwriting its executable files.

            regards, tom lane
On 30.5.2012 22:35, Tom Lane wrote:
> Tomas Vondra <tv@fuzzy.cz> writes:
>> whenever I run a C-function (part of an .so file) and the file is
>> overwritten, the connection crashes. Tested on 9.1.3 and 9.2-beta1.
>
> "Doctor, it hurts when I do this."
> "So don't do that."
>
> What exactly would you expect Postgres to do about such a thing, anyway?
> It has no control over people overwriting its executable files.

Well, I expected the existing connection will use the old .so, while new
connections would use the new version (although they're exactly the
same). I suppose there are issues with that option too, but crashing the
server is a bit unfortunate ...

And it actually happens even when the file is overwritten between two
queries. I wonder how this affects installing new versions of extensions
- does that mean I can't do that while the database is running?

Is this mentioned in the docs, somewhere? IMHO there should be a big red
banner "DON'T DO THIS" but all I found is this:

   http://www.postgresql.org/docs/9.1/interactive/xfunc-c.html

   After it is used for the first time, a dynamically loaded object
   file is retained in memory. Future calls in the same session to the
   function(s) in that file will only incur the small overhead of a
   symbol table lookup. If you need to force a reload of an object
   file, for example after recompiling it, begin a fresh session.

Which kinda looks like my expectation that the session won't crash was
correct. Clearly seems like bug to me.

Tomas
Tomas Vondra <tv@fuzzy.cz> writes:
> On 30.5.2012 22:35, Tom Lane wrote:
>> Tomas Vondra <tv@fuzzy.cz> writes:
>>> whenever I run a C-function (part of an .so file) and the file is
>>> overwritten, the connection crashes. Tested on 9.1.3 and 9.2-beta1.

>> What exactly would you expect Postgres to do about such a thing, anyway?
>> It has no control over people overwriting its executable files.

> Well, I expected the existing connection will use the old .so, while new
> connections would use the new version (although they're exactly the
> same).

Well, that would be something to discuss with the implementors of shared
library functionality on your platform, not with us.

I suspect it depends on how you install the new version of the library,
too.  I would somewhat expect it to work as you're thinking if the
install consists of "rename old file out of the way, copy new file into
place, unlink old file" or equivalent.  If you are actually
*overwriting* the file in place, a crash does not seem especially
surprising --- it would make perfect sense if the kernel expects the
file to be usable as backing store for the in-memory image, which is not
exactly unreasonable.  IOW, if the in-memory bits we're executing are
just an mmap'd image of the .so file, changing the .so file could
entirely be expected to lead to a crash.
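
The "rename old file out of the way, copy new file into place, unlink old
file" sequence could be sketched in shell roughly as follows. This is only
an illustration under stated assumptions: replace_so is a hypothetical
helper name (nothing PostgreSQL or pgxs provides), and it assumes a POSIX
shell and a writable target directory. On Linux, a process that already has
the old file mapped should keep using the old (now unlinked) inode, while
new sessions load the new file.

```shell
#!/bin/sh
# replace_so SRC DST  (hypothetical helper, for illustration only)
# Replace DST with a copy of SRC without ever writing into the old file.
replace_so() {
    src=$1
    dst=$2

    # 1) rename the old file out of the way (its inode stays intact)
    if [ -e "$dst" ]; then
        mv -- "$dst" "$dst.old" || return 1
    fi

    # 2) copy the new file into place; this creates a brand-new inode
    if cp -- "$src" "$dst"; then
        # 3) unlink the old file; existing mappings keep it alive
        rm -f -- "$dst.old"
    else
        # copy failed: put the old file back and report the error
        [ -e "$dst.old" ] && mv -- "$dst.old" "$dst"
        return 1
    fi
}

# Example call (paths taken from the steps earlier in this thread):
#   replace_so testcomp.so "$(pg_config --pkglibdir)/testcomp.so"
```

There is a short window between steps 1 and 2 in which the file does not
exist, so a session that loads the library at exactly that moment would get
a "file not found" error rather than a crash.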

>    http://www.postgresql.org/docs/9.1/interactive/xfunc-c.html

>    After it is used for the first time, a dynamically loaded object
>    file is retained in memory. Future calls in the same session to the
>    function(s) in that file will only incur the small overhead of a
>    symbol table lookup. If you need to force a reload of an object
>    file, for example after recompiling it, begin a fresh session.

> Which kinda looks like my expectation that the session won't crash was
> correct. Clearly seems like bug to me.

No, that just means that we don't unload it from memory.  Where the bits
actually are, and whether the kernel has defenses against somebody
modifying the executable, is not something you should be asking us.
Talk to a kernel hacker for your platform.

            regards, tom lane
On 30.5.2012 23:19, Tom Lane wrote:
> I suspect it depends on how you install the new version of the library,
> too.  I would somewhat expect it to work as you're thinking if the
> install consists of "rename old file out of the way, copy new file into
> place, unlink old file" or equivalent.  If you are actually
> *overwriting* the file in place, a crash does not seem especially
> surprising --- it would make perfect sense if the kernel expects the
> file to be usable as backing store for the in-memory image, which is not
> exactly unreasonable.  IOW, if the in-memory bits we're executing are
> just an mmap'd image of the .so file, changing the .so file could
> entirely be expected to lead to a crash.

Aha! That might be the culprit - I've just tested that deleting the old
file and copying the new version (thus not overwriting it) did not cause a
crash. Funny.

>>    http://www.postgresql.org/docs/9.1/interactive/xfunc-c.html
>
>>    After it is used for the first time, a dynamically loaded object
>>    file is retained in memory. Future calls in the same session to the
>>    function(s) in that file will only incur the small overhead of a
>>    symbol table lookup. If you need to force a reload of an object
>>    file, for example after recompiling it, begin a fresh session.
>
>> Which kinda looks like my expectation that the session won't crash was
>> correct. Clearly seems like bug to me.
>
> No, that just means that we don't unload it from memory.  Where the bits
> actually are, and whether the kernel has defenses against somebody
> modifying the executable, is not something you should be asking us.
> Talk to a kernel hacker for your platform.

OK, thanks for the explanation.

I still think it's worth mentioning this issue in the docs ...

Tomas

Re: overwriting an existing .so while being used crashes the server process

From: Peter Eisentraut

On Wed, 2012-05-30 at 23:43 +0200, Tomas Vondra wrote:
> On 30.5.2012 23:19, Tom Lane wrote:
> > I suspect it depends on how you install the new version of the library,
> > too.  I would somewhat expect it to work as you're thinking if the
> > install consists of "rename old file out of the way, copy new file into
> > place, unlink old file" or equivalent.  If you are actually
> > *overwriting* the file in place, a crash does not seem especially
> > surprising --- it would make perfect sense if the kernel expects the
> > file to be usable as backing store for the in-memory image, which is not
> > exactly unreasonable.  IOW, if the in-memory bits we're executing are
> > just an mmap'd image of the .so file, changing the .so file could
> > entirely be expected to lead to a crash.
>
> Aha! That might be the culprit - I've just tested that deleting the old
> file and copying the new version (thus not overwriting it) did not cause a
> crash. Funny.

That's one of the reasons why one normally uses "install" rather than
"cp" to install files.  So this shouldn't be a problem in practice if
people use the provided pgxs infrastructure or something similar.

GNU cp has the --remove-destination option, which should also work for
this purpose.
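
The difference between "install" and plain "cp" can be observed without
PostgreSQL at all. The sketch below is an illustration only, assuming a
POSIX shell and an "install" that unlinks the destination before writing
(GNU coreutils does this). The scratch file names are made up, and the open
file descriptor 3 merely stands in for a backend that already has the
library open; it is not a real mmap of a shared object.

```shell
#!/bin/sh
work=$(mktemp -d)
printf 'old\n' > "$work/lib.so"
printf 'new\n' > "$work/new.so"

# Case 1: install. The reader opened the file before the update.
exec 3< "$work/lib.so"
install -m 755 "$work/new.so" "$work/lib.so"
seen_after_install=$(cat <&3)    # reader still sees the old contents
exec 3<&-

# Reset the scratch file for the second case.
printf 'old\n' > "$work/lib.so"

# Case 2: cp. The destination is truncated and rewritten in place.
exec 3< "$work/lib.so"
cp "$work/new.so" "$work/lib.so"
seen_after_cp=$(cat <&3)         # reader now sees the new contents
exec 3<&-

echo "after install, old reader sees: $seen_after_install"
echo "after cp,      old reader sees: $seen_after_cp"

rm -rf "$work"
```

With "install" the old reader keeps its own copy ("old"), whereas "cp"
changes the bytes underneath it ("new"). For a mapped shared library, that
in-place change is the case that can end in a bus error like the one
reported at the start of this thread.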