Re: Buffer locking is special (hints, checksums, AIO writes) - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Buffer locking is special (hints, checksums, AIO writes)
Date
Msg-id b6ca8cfa-e93a-4755-ad46-55e053d0605f@iki.fi
Whole thread Raw
In response to Re: Buffer locking is special (hints, checksums, AIO writes)  (Andres Freund <andres@anarazel.de>)
Responses Re: Buffer locking is special (hints, checksums, AIO writes)
List pgsql-hackers
A few minor nitpicks on v12 below. Other than these and the comments I 
wrote in separate emails, looks good to me.

> @@ -371,8 +382,6 @@ _bt_killitems(IndexScanDesc scan)
>         }
>  
>         /*
> -        * Since this can be redone later if needed, mark as dirty hint.
> -        *
>          * Whenever we mark anything LP_DEAD, we also set the page's
>          * BTP_HAS_GARBAGE flag, which is likewise just a hint.  (Note that we
>          * only rely on the page-level flag in !heapkeyspace indexes.)

Seems a bit random to remove that.

> +/*
> + * Try to set a single hint bit in a buffer.
> + *
> + * This is a bit faster than BufferBeginSetHintBits() /
> + * BufferFinishSetHintBits() when setting a single hint bit, but slower than
> + * the former when setting several hint bits.
> + */
> +bool
> +BufferSetHintBits16(uint16 *ptr, uint16 val, Buffer buffer)

This could use some more explanation. The point is that this does "*ptr 
= val", if it's allowed to set hint bits. That's not obvious. And 
"single hint bit" isn't really accurate, as you could update multiple 
bits in *ptr with one call.

>     /*
>      * If the buffer was dirty, try to write it out.  There is a race
>      * condition here, in that someone might dirty it after we released the
>      * buffer header lock above.  We will recheck the dirty bit after
>      * re-locking the buffer header.
>      */

It's not clear what "above" means in that paragraph. Where do we release 
the buffer header lock? In StrategyGetBuffer?

(This is not actually new with this patch; it goes back to commit 
5e89985928. Before that, there was a call to PinBuffer_Locked() which 
released the spinlock.)

> @@ -2516,18 +2515,21 @@ again:
>                 /*
>                  * If using a nondefault strategy, and writing the buffer would
>                  * require a WAL flush, let the strategy decide whether to go ahead
> -                * and write/reuse the buffer or to choose another victim.  We need a
> -                * lock to inspect the page LSN, so this can't be done inside
> +                * and write/reuse the buffer or to choose another victim.  We need to
> +                * hold the content lock in at least share-exclusive mode to safely
> +                * inspect the page LSN, so this couldn't have been done inside
>                  * StrategyGetBuffer.
>                  */
>                 if (strategy != NULL)
>                 {
>                         XLogRecPtr      lsn;
>  
> -                       /* Read the LSN while holding buffer header lock */
> -                       buf_state = LockBufHdr(buf_hdr);
> +                       /*
> +                        * As we now hold at least a share-exclusive lock on the buffer,
> +                        * the LSN cannot change during the flush (and thus can't be
> +                        * torn).
> +                        */
>                         lsn = BufferGetLSN(buf_hdr);
> -                       UnlockBufHdr(buf_hdr);
>  
>                         if (XLogNeedsFlush(lsn)
>                                 && StrategyRejectBuffer(strategy, buf_hdr, from_ring))

I think the second comment is redundant with the first one. Let's just 
remove it.

> +/*
> + * Helper for BufferBeginSetHintBits() and BufferSetHintBits16().
> + *
> + * This checks if the current lock mode already suffices to allow hint bits
> + * being set and, if not, whether the current lock can be upgraded.
> + *
> + * Updates *lockstate when returning true.
> + */
> +static inline bool
> +SharedBufferBeginSetHintBits(Buffer buffer, BufferDesc *buf_hdr, uint64 *lockstate)

Would be good to be more explicit what returning true/false here means.

- Heikki




pgsql-hackers by date:

Previous
From: Chengpeng Yan
Date:
Subject: Re: Unfortunate pushing down of expressions below sort
Next
From: Robert Haas
Date:
Subject: Re: pg_waldump: support decoding of WAL inside tarfile