Re: [PATCH] Refactoring of LWLock tranches - Mailing list pgsql-hackers

From andres@anarazel.de
Subject Re: [PATCH] Refactoring of LWLock tranches
Date
Msg-id 20151116002028.GA24814@alap3.anarazel.de
In response to Re: [PATCH] Refactoring of LWLock tranches  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: [PATCH] Refactoring of LWLock tranches  (Robert Haas <robertmhaas@gmail.com>)
List pgsql-hackers
On 2015-11-15 16:24:24 -0500, Robert Haas wrote:
> I think it needs to be adapted to use predefined constants for the
> tranche IDs instead of hardcoded values, maybe based on the attached
> tranche-constants.patch.

Yea, that part is clearly not ok. Let me look at the patch.

> Also, I think we should rip all the volatile qualifiers out of
> bufmgr.c, using the second attached patch, devolatileize-bufmgr.patch.
> The comment claiming that we need this because spinlocks aren't
> compiler barriers is obsolete.

Same here.

> @@ -457,7 +457,7 @@ CreateLWLocks(void)
>          LWLockCounter = (int *) ((char *) MainLWLockArray - 3 * sizeof(int));
>          LWLockCounter[0] = NUM_FIXED_LWLOCKS;
>          LWLockCounter[1] = numLocks;
> -        LWLockCounter[2] = 1;    /* 0 is the main array */
> +        LWLockCounter[2] = LWTRANCHE_LAST_BUILTIN_ID + 1;
>      }

Man, this LWLockCounter[0] stuff is just awful. Nothing that needs to be
fixed in this patch, but it really should be fixed someday.

>  /*
> + * We reserve a few predefined tranche IDs.  These values will never be
> + * returned by LWLockNewTrancheId.
> + */
> +#define LWTRANCHE_MAIN                        0
> +#define LWTRANCHE_BUFFER_CONTENT            1
> +#define LWTRANCHE_BUFFER_IO_IN_PROGRESS        2
> +#define LWTRANCHE_LAST_BUILTIN_ID            LWTRANCHE_BUFFER_IO_IN_PROGRESS

Nitpick: I'm inclined to use an enum to avoid having to adjust the last
builtin id when adding a new builtin tranche.


Besides that minor thing I think this works for me. We might separately
want something that makes adding non-builtin tranches easier, but that's
really an independent matter.

> From 9e7f9219b5e752da46be0e26a0be074191ae8f62 Mon Sep 17 00:00:00 2001
> From: Robert Haas <rhaas@postgresql.org>
> Date: Sun, 15 Nov 2015 10:24:03 -0500
> Subject: [PATCH 1/3] De-volatile-ize.

I very strongly think this should be done. It's painful having to
cast-away volatile, and it de-optimizes a fair bit of performance
relevant code.

>  /* local state for StartBufferIO and related functions */
> -static volatile BufferDesc *InProgressBuf = NULL;
> +static BufferDesc *InProgressBuf = NULL;
>  static bool IsForInput;
>  
>  /* local state for LockBufferForCleanup */
> -static volatile BufferDesc *PinCountWaitBuf = NULL;
> +static BufferDesc *PinCountWaitBuf = NULL;

Hm. These could also be relevant for sigsetjmp et al. Haven't found
relevant code tho.

> -    volatile BufferDesc *bufHdr;
> +    BufferDesc *bufHdr;
>      Block        bufBlock;

Looks mis-indented now, similarly in a bunch of other places. Maybe run
pgindent afterwards?


>          /*
>           * Header spinlock is enough to examine BM_DIRTY, see comment in
> @@ -1707,7 +1707,7 @@ BufferSync(int flags)
>      num_written = 0;
>      while (num_to_scan-- > 0)
>      {
> -        volatile BufferDesc *bufHdr = GetBufferDescriptor(buf_id);
> +        BufferDesc *bufHdr = GetBufferDescriptor(buf_id);
>

This case has some theoretical behavioural implications: as
bufHdr->flags is accessed without the spinlock, the compiler might
re-use an older value. But that's OK: a) there's no older value it
could actually use, b) the whole race condition exists anyway, and the
comment in the body explains why that's OK.

>  BlockNumber
>  BufferGetBlockNumber(Buffer buffer)
>  {
> -    volatile BufferDesc *bufHdr;
> +    BufferDesc *bufHdr;
>  
>      Assert(BufferIsPinned(buffer));
>  
> @@ -2346,7 +2346,7 @@ void
>  BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
>               BlockNumber *blknum)
>  {
> -    volatile BufferDesc *bufHdr;
> +    BufferDesc *bufHdr;

>      /* Do the same checks as BufferGetBlockNumber. */
>      Assert(BufferIsPinned(buffer));
> @@ -2382,7 +2382,7 @@ BufferGetTag(Buffer buffer, RelFileNode *rnode, ForkNumber *forknum,
>   * as the second parameter.  If not, pass NULL.
>   */
>  static void
> -FlushBuffer(volatile BufferDesc *buf, SMgrRelation reln)
> +FlushBuffer(BufferDesc *buf, SMgrRelation reln)
>  {

>      XLogRecPtr    recptr;
>      ErrorContextCallback errcallback;
> @@ -2520,7 +2520,7 @@ RelationGetNumberOfBlocksInFork(Relation relation, ForkNumber forkNum)
>  bool
>  BufferIsPermanent(Buffer buffer)
>  {
> -    volatile BufferDesc *bufHdr;
> +    BufferDesc *bufHdr;


These require that the buffer is pinned (an operation that implies a
barrier). The pin prevents the contents from changing in any relevant
manner, and the barrier implied by the pin forces the core's view to be
non-stale.


> @@ -2613,7 +2613,7 @@ DropRelFileNodeBuffers(RelFileNodeBackend rnode, ForkNumber forkNum,
>  
>      for (i = 0; i < NBuffers; i++)
>      {
> -        volatile BufferDesc *bufHdr = GetBufferDescriptor(i);
> +        BufferDesc *bufHdr = GetBufferDescriptor(i);

>          /*
>           * We can make this a tad faster by prechecking the buffer tag before
> @@ -2703,7 +2703,7 @@ DropRelFileNodesAllBuffers(RelFileNodeBackend *rnodes, int nnodes)
>      for (i = 0; i < NBuffers; i++)
>      {
>          RelFileNode *rnode = NULL;
> -        volatile BufferDesc *bufHdr = GetBufferDescriptor(i);
> +        BufferDesc *bufHdr = GetBufferDescriptor(i);
>  
>          /*
>           * As in DropRelFileNodeBuffers, an unlocked precheck should be safe
> @@ -2767,7 +2767,7 @@ DropDatabaseBuffers(Oid dbid)
>  
>      for (i = 0; i < NBuffers; i++)
>      {
> -        volatile BufferDesc *bufHdr = GetBufferDescriptor(i);
> +        BufferDesc *bufHdr = GetBufferDescriptor(i);

> @@ -2863,7 +2863,7 @@ void
>  FlushRelationBuffers(Relation rel)
>  {
>      int            i;
> -    volatile BufferDesc *bufHdr;
> +    BufferDesc *bufHdr;
>  
>      /* Open rel at the smgr level if not already done */
>      RelationOpenSmgr(rel);
> @@ -2955,7 +2955,7 @@ void
>  FlushDatabaseBuffers(Oid dbid)
>  {
>      int            i;
> -    volatile BufferDesc *bufHdr;
> +    BufferDesc *bufHdr;

These all are correct based on the premise that some heavy lock -
implying a barrier - prevents new blocks with relevant tags from being
loaded into s_b. And it's fine if some other backend concurrently writes
out the buffer before we do - we'll notice that in the re-check after
acquiring the buffer header spinlock.

So, looks good to me.

Andres


