Re: dynamic shared memory - Mailing list pgsql-hackers

From Jim Nasby
Subject Re: dynamic shared memory
Date
Msg-id 522A2FBE.9090104@nasby.net
In response to Re: dynamic shared memory  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: dynamic shared memory
List pgsql-hackers
On 9/5/13 11:37 AM, Robert Haas wrote:
>> ISTM that at some point we'll want to look at putting top-level shared
>> memory into this system (ie: allowing dynamic resizing of GUCs that affect
>> shared memory size).
> A lot of people want that, but being able to resize the shared memory
> chunk itself is only the beginning of the problem.  So I wouldn't hold
> my breath.

<starts breathing again>

>> Wouldn't it protect against a crash while writing the file? I realize the
>> odds of that are pretty remote, but AFAIK it wouldn't cost that much to
>> write a new file and do an atomic mv...
> If there's an OS-level crash, we don't need the state file; the shared
> memory will be gone anyway.  And if it's a PostgreSQL-level failure,
> this game neither helps nor hurts.
>
>>> Sure.  A messed-up backend can clobber the control segment just as it
>>> can clobber anything else in shared memory.  There's really no way
>>> around that problem.  If the control segment has been overwritten by a
>>> memory stomp, we can't use it to clean up.  There's no way around that
>>> problem except to not use the control segment, which wouldn't be better.
>>
>> Are we trying to protect against "memory stomps" when we restart after a
>> backend dies? I thought we were just trying to ensure that all shared data
>> structures were correct and consistent. If that's the case, then I was
>> thinking that by using a pointer that can be updated in a CPU-atomic fashion
>> we know we'd never end up with a corrupted entry that was in use; the
>> partial write would be to a slot with nothing pointing at it so it could be
>> safely reused.
> When we restart after a backend dies, shared memory contents are
> completely reset, from scratch.  This is true of both the fixed size
> shared memory segment and of the dynamic shared memory control
> segment.  The only difference is that, with the dynamic shared memory
> control segment, we need to use the segment for cleanup before
> throwing it out and starting over.  Extra caution is required because
> we're examining memory that could hypothetically have been stomped on;
> we must not let the postmaster do anything suicidal.

Not doing something suicidal is what I'm worried about (that and not cleaning up as well as possible).

The specific scenario I'm worried about is something like a PANIC in the middle of the snprintf call in
dsm_write_state_file(). That would leave the file in a completely unknown state, so who knows what would then happen on
restart. ISTM that writing a temp file and then doing a filesystem mv would eliminate that issue.
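
A minimal sketch of the temp-file-plus-rename idea, under stated assumptions: this is not the actual dsm_write_state_file() code, the function name write_state_file_atomically() and the ".tmp" suffix are hypothetical, and the state is assumed to be a single unsigned long written as decimal text. It relies on POSIX rename() replacing the target atomically, so a reader sees either the old file or the complete new one. Syncing the containing directory is left out.

```c
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper (not PostgreSQL code): write "handle" as decimal
 * text to "path" by writing a temp file first and then renaming it over
 * the target. Returns 0 on success, -1 on failure.
 */
int
write_state_file_atomically(const char *path, unsigned long handle)
{
    char    tmppath[1024];
    char    buf[64];
    int     n;
    int     len;
    int     fd;

    /* Build the temp file name next to the target (same filesystem). */
    n = snprintf(tmppath, sizeof(tmppath), "%s.tmp", path);
    if (n < 0 || (size_t) n >= sizeof(tmppath))
    {
        errno = ENAMETOOLONG;
        return -1;
    }

    /* Format the contents in memory before touching the filesystem. */
    len = snprintf(buf, sizeof(buf), "%lu\n", handle);
    if (len < 0 || (size_t) len >= sizeof(buf))
        return -1;

    fd = open(tmppath, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    /* Write and flush the temp file; on any failure, discard it. */
    if (write(fd, buf, (size_t) len) != (ssize_t) len || fsync(fd) != 0)
    {
        close(fd);
        unlink(tmppath);
        return -1;
    }
    if (close(fd) != 0)
    {
        unlink(tmppath);
        return -1;
    }

    /* Atomically replace the old state file with the complete new one. */
    if (rename(tmppath, path) != 0)
    {
        unlink(tmppath);
        return -1;
    }
    return 0;
}
```

A crash before the rename() leaves at worst a stray ".tmp" file; the file at "path" is never seen half-written.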
 

Or is it safe to assume that the snprintf call will be atomic since we're just spitting out a long?
-- 
Jim C. Nasby, Data Architect                       jim@nasby.net
512.569.9461 (cell)                         http://jim.nasby.net


