Re: Fast promotion failure - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Fast promotion failure
Date
Msg-id 519A7C1E.2050604@vmware.com
Whole thread Raw
In response to Re: Fast promotion failure  (Simon Riggs <simon@2ndQuadrant.com>)
Responses Re: Fast promotion failure  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
On 20.05.2013 22:18, Simon Riggs wrote:
> On 20 May 2013 18:47, Heikki Linnakangas<hlinnakangas@vmware.com>  wrote:
>> Not sure what the best fix would be. Perhaps change the code in
>> CreateRestartPoint() to do something like this instead:
>>
>> GetXLogReplayRecPtr(&replayTLI);
>> if (RecoveryInProgress())
>>    ThisTimeLineID = replayTLI;
>
> Installing a few extra WAL files doesn't seem to be a good reason to
> mess with such an important variable as ThisTimeLineID, especially
> since its such a rare event and hardly worth optimising for.
>
> I would prefer it if we didn't set ThisTimeLineID at all in that way,
> or anywhere else. If we do, it needs to be done safely, otherwise any
> caller could make the same mistake.

> Suggested patch attached.

> @@ -7466,14 +7468,6 @@ CreateRestartPoint(int flags)
>           * not be used, and will go wasted until recycled on the next
>           * restartpoint. We'll live with that.
>           */
> -        (void) GetXLogReplayRecPtr(&ThisTimeLineID);
> -
> -        RemoveOldXlogFiles(_logSegNo, endptr);
> -
> -        /*
> -         * Make more log segments if needed.  (Do this after recycling old log
> -         * segments, since that may supply some of the needed files.)
> -         */
>          PreallocXlogFiles(endptr);
>      }

That's essentially reverting commit 343ee00b7, resurrecting the bug that 
that fixed.

> @@ -9098,10 +9092,10 @@ GetXLogReplayRecPtr(TimeLineID *replayTLI)
>      SpinLockAcquire(&xlogctl->info_lck);
>      recptr = xlogctl->lastReplayedEndRecPtr;
>      tli = xlogctl->lastReplayedTLI;
> +    if (replayTLI && xlogctl->SharedRecoveryInProgress)
> +        *replayTLI = tli;
>      SpinLockRelease(&xlogctl->info_lck);
>
> -    if (replayTLI)
> -        *replayTLI = tli;
>      return recptr;
>  }


That breaks the only remaining caller that passes a replayTLI pointer, 
GetStandbyFlushRecPtr(); *replayTLI points to a local variable in that 
function, which is left uninitialized if !xlogctl->SharedRecoveryInProgress.

- Heikki



pgsql-hackers by date:

Previous
From: Alvaro Herrera
Date:
Subject: Re: FK locking concurrency improvement
Next
From: Heikki Linnakangas
Date:
Subject: Re: Heap truncation without AccessExclusiveLock (9.4)