Re: Inconsistent use of "volatile" when accessing shared memory? - Mailing list pgsql-hackers

From Jeff Davis
Subject Re: Inconsistent use of "volatile" when accessing shared memory?
Date
Msg-id 1b9325d827558eb03a59c5281abed35ad75b165e.camel@j-davis.com
In response to Re: Inconsistent use of "volatile" when accessing shared memory?  (Andres Freund <andres@anarazel.de>)
Responses Re: Inconsistent use of "volatile" when accessing shared memory?
List pgsql-hackers
On Fri, 2023-11-03 at 15:59 -0700, Andres Freund wrote:
> I don't think so. We used to use volatile for most shared memory
> accesses, but volatile doesn't provide particularly useful semantics -
> and generates *vastly* slower code in a lot of circumstances. Most of
> that usage predates spinlocks being proper compiler barriers,

A compiler barrier doesn't always force the compiler to generate loads
and stores, though.

For instance (example code I placed at the bottom of xlog.c):

  typedef struct DummyStruct {
      XLogRecPtr recptr;
  } DummyStruct;
  extern void DummyFunction(void);
  static DummyStruct Dummy = { 5 };
  static DummyStruct *pDummy = &Dummy;
  void
  DummyFunction(void)
  {
      while(true)
      {
          pg_compiler_barrier();
          pg_memory_barrier();
          if (pDummy->recptr == 0)
              break;
          pg_compiler_barrier();
          pg_memory_barrier();
      }
  }


It generates the following code (clang -O2), which contains no load of
recptr at all:

  000000000016ed10 <DummyFunction>:
    16ed10:       f0 83 04 24 00          lock addl $0x0,(%rsp)
    16ed15:       f0 83 04 24 00          lock addl $0x0,(%rsp)
    16ed1a:       eb f4                   jmp    16ed10 <DummyFunction>
    16ed1c:       0f 1f 40 00             nopl   0x0(%rax)

Obviously this is an oversimplified example. If I complicate it in
almost any way, the compiler starts generating actual loads and
stores, and then the compiler and memory barriers should do their job.
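
One such complication, as a rough sketch rather than anything taken from
the PostgreSQL tree: give the object external linkage, so the compiler
can no longer prove that nothing else writes it. All names below are
hypothetical, and the barrier macro assumes GCC or Clang, where an empty
asm with a "memory" clobber is the usual way to write a compiler
barrier. The loop is bounded only so that the sketch terminates.

```c
#include <stdint.h>

/* Hypothetical stand-in for a compiler barrier (GCC/Clang syntax only). */
#define demo_compiler_barrier() __asm__ __volatile__("" ::: "memory")

typedef struct DemoStruct
{
    uint64_t recptr;
} DemoStruct;

/* External linkage: another translation unit could modify this object. */
DemoStruct DemoGlobal = { 5 };

/*
 * Poll DemoGlobal.recptr at most max_polls times and return the number of
 * polls made.  Because the barrier may have clobbered DemoGlobal, the
 * compiler is expected to reload recptr on every iteration.
 */
unsigned
demo_poll(unsigned max_polls)
{
    unsigned polls = 0;

    while (polls < max_polls)
    {
        demo_compiler_barrier();
        polls++;
        if (DemoGlobal.recptr == 0)
            break;
    }
    return polls;
}
```

With this variant I would expect the disassembly to show a load of
recptr inside the loop, but that is an expectation about typical
compiler behavior, not something the C standard promises.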


> Note that use of volatile does *NOT* guarantee anything about memory
> ordering!

Right, but it does force loads/stores to be emitted by the compiler;
and without loads/stores a memory barrier is useless.

I understand that my example is too simple and I'm not claiming that
there's a problem. I'd just like to understand the key difference
between my example and what we do with XLogCtl.

Another way to phrase my question: under what specific circumstances
must we use something like UINT32_ACCESS_ONCE()? That seems to be used
for local pointers, but it's not clear to me exactly why that matters.
Intuitively, access through a local pointer seems much more likely to
be optimized and therefore more dangerous, but that doesn't imply that
access through global variables is not dangerous.
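
To make the question concrete, here is a minimal sketch of the
ACCESS_ONCE idiom as I understand it: the read goes through a
volatile-qualified lvalue, so the compiler has to emit one load per
evaluation and cannot hoist it out of the loop. The macro and function
names are hypothetical, and this is not meant as PostgreSQL's actual
definition. The loop is bounded only so that the sketch terminates.

```c
#include <stdint.h>

/*
 * Hypothetical macro modeled on the ACCESS_ONCE idiom: read var through a
 * volatile-qualified lvalue.  This forces a load to be emitted; it does
 * not provide any memory ordering or atomicity guarantee.
 */
#define DEMO_READ_ONCE_U32(var) (*(volatile uint32_t *) &(var))

typedef struct DemoShared
{
    uint32_t flag;
} DemoShared;

static DemoShared demo_shared = { 5 };
static DemoShared *demo_ptr = &demo_shared;

/* Store a new value; present only so the sketch can be exercised. */
void
demo_set_flag(uint32_t value)
{
    demo_ptr->flag = value;
}

/*
 * Re-read flag through the volatile lvalue until it is zero or max_reads
 * reads have been made; return the number of reads.  No barrier is used
 * here: the volatile access alone keeps the load inside the loop.
 */
unsigned
demo_wait_for_zero(unsigned max_reads)
{
    unsigned reads = 0;

    while (reads < max_reads)
    {
        reads++;
        if (DEMO_READ_ONCE_U32(demo_ptr->flag) == 0)
            break;
    }
    return reads;
}
```

Without the volatile cast and without a barrier, the compiler would be
free to read flag once before the loop and reuse that value.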

Regards,
    Jeff Davis




