Re: Remove header lock BufferGetLSNAtomic() on architectures with 64 bit atomic operations - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Remove header lock BufferGetLSNAtomic() on architectures with 64 bit atomic operations
Date
Msg-id d5632a49-06c1-4d07-9cfd-dd2459e6b101@vondra.me
Whole thread Raw
In response to Re: Remove header lock BufferGetLSNAtomic() on architectures with 64 bit atomic operations  (Andres Freund <andres@anarazel.de>)
Responses Re: Remove header lock BufferGetLSNAtomic() on architectures with 64 bit atomic operations
List pgsql-hackers

On 3/11/26 01:21, Andres Freund wrote:
> Hi,
> 
> On 2026-03-11 01:00:34 +0100, Tomas Vondra wrote:
>> On 3/10/26 21:41, Andres Freund wrote:
>>> On 2026-03-10 19:41:54 +0100, Tomas Vondra wrote:
>>>> On 2/5/26 16:38, Peter Geoghegan wrote:
>>>>> On Wed, Jan 14, 2026 at 1:31 AM Andreas Karlsson <andreas@proxel.se> wrote:
>>>>>> Yeah, that was a quite big thinko. I have a attached a patch with the
>>>>>> thinko fixed but I am still not happy with it. I think I will try to use
>>>>>> atomics.h and see if that makes the code nicer to read.
>>>>>>
>>>>>> Also will after that see what I can do about pageinspect.
>>>>>
>>>>> I believe that pageinspect's heap_page_items function needs to use
>>>>> get_page_from_raw -- see commit 14e9b18fed.
>>>
>>> Maybe I'm slow today, but how's that related to the patch at hand?
>>>
>>> Regardless of that, it seems a bit confusing to fold the pageinspect changes
>>> into the main commit.
>>>
>>
>> That's because of alignment requirements, per this comment:
>>
>>  * On machines with MAXALIGN = 8, the payload of a bytea is not
>>  * maxaligned, since it will start 4 bytes into a palloc'd value.
>>  * PageHeaderData requires 8 byte alignment, so always use this
>>  * function when accessing page header fields from a raw page bytea.
> 
> I guess we are now reading 8 bytes as one.
> 

Not sure I understand. Are you saying it's OK or not?

> 
>> AFAIK proper alignment is required to make the load atomic.
> 
> Sure, but it's a copy of the page, so that'd itself would not matter.
> 

I does seem a bit unnecessary, in the sense that there shouldn't be
anything changing the data.

> 
>>>> +static inline PageXLogRecPtr
>>>> +PageXLogRecPtrGet(PageXLogRecPtr pd_lsn)
>>>>  {
>>>> -    return (uint64) val.xlogid << 32 | val.xrecoff;
>>>> +#ifdef WORDS_BIGENDIAN
>>>> +    return pd_lsn;
>>>> +#else
>>>> +    return (pd_lsn << 32) | (pd_lsn >> 32);
>>>> +#endif
>>>>  }
>>>>  
>>>> -#define PageXLogRecPtrSet(ptr, lsn) \
>>>> -    ((ptr).xlogid = (uint32) ((lsn) >> 32), (ptr).xrecoff = (uint32) (lsn))
>>>> -
>>>>  /*
>>>>   * disk page organization
>>>>   *
>>>> @@ -387,10 +390,11 @@ PageGetLSN(const PageData *page)
>>>>  {
>>>>      return PageXLogRecPtrGet(((const PageHeaderData *) page)->pd_lsn);
>>>>  }
>>>
>>> I think this may need to actually use a volatile to force the read to happen
>>> as the 64bit value, otherwise the compiler would be entirely free to implement
>>> it as two 32bit reads.
>>>
>>
>> I did that in v5 (I think). But at the same time I'm now rather confused
>> about the meaning of PG_HAVE_8BYTE_SINGLE_COPY_ATOMICITY. Because if a
>> well-aligned load/store of 8B can be implemented as two 32-bit reads
>> (with a "sequence point" in between), then how is that atomic?
> 
> All PG_HAVE_8BYTE_SINGLE_COPY_ATOMICITY says is that the hardware can do 8
> byte reads/writes without tearing. But that still requires 8 bytes
> reads/writes to be done as one, not multiple instructions.
> 

OK

> 
>> I'm also a bit ... unsure about "volatile" in general. Is that actually
>> something the specification says is needed here, or is it just an
>> attempt to "disable optimizations" (like the split into two stores)?
> 
> It's definitely suboptimal.  We should use something C11's
> atomic_load()/atomic_store(). But we can't rely on that yet (it's not enabled
> without extra flags in at least msvc).
> 
> The volatile does prevent the compiler from just doing a read/write twice or
> never, just because it'd be nicer for code generation.  But it doesn't
> actually guarantee that the right instructions for reading writing are
> generated :(
> 

That sounds a bit scary, TBH.


regards

-- 
Tomas Vondra




pgsql-hackers by date:

Previous
From: Tender Wang
Date:
Subject: Re: [PATCH] Simplify trivial shmem size calculations
Next
From: Michael Paquier
Date:
Subject: Add documentation for PG_ABS_SRCDIR, PG_ABS_BUILDDIR, PG_LIBDIR, PG_DLSUFFIX