Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least9.5)? - Mailing list pgsql-hackers

From Andres Freund
Subject Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least9.5)?
Date
Msg-id 20170904070434.fngklyhuqg5lflcl@alap3.anarazel.de
Whole thread Raw
In response to Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least9.5)?  (Kyotaro HORIGUCHI <horiguchi.kyotaro@lab.ntt.co.jp>)
Responses Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least 9.5)?
List pgsql-hackers
Hi,

On 2017-09-04 15:51:51 +0900, Kyotaro HORIGUCHI wrote:
>          SpinLockAcquire(&walsnd->mutex);
> +        oldFlushPtr = walsnd->flush;
>          walsnd->write = writePtr;
>          walsnd->flush = flushPtr;
>          walsnd->apply = applyPtr;
> @@ -1821,7 +1836,93 @@ ProcessStandbyReplyMessage(void)
>          if (SlotIsLogical(MyReplicationSlot))
>              LogicalConfirmReceivedLocation(flushPtr);
>          else
> -            PhysicalConfirmReceivedLocation(flushPtr);
> +        {
> +            /*
> +             * Recovery on standby requires that a continuation record is
> +             * available from single WAL source. For the reason, physical
> +             * replication slot should stay in the first segment of the
> +             * multiple segments that a continued record is spanning
> +             * over. Since we look pages and don't look into individual record
> +             * here, restartLSN may stay a bit too behind but it doesn't
> +             * matter.
> +             *
> +             * Since the objective is avoiding to remove required segments,
> +             * checking at the beginning of every segment is enough. But once
> +             * restartLSN goes behind, check every page for quick restoration.
> +             *
> +             * restartLSN has a valid value only when it is behind flushPtr.
> +             */
> +            bool    check_contrecords = false;
> +
> +            if (restartLSN != InvalidXLogRecPtr)
> +            {
> +                /*
> +                 * We're retarding restartLSN, check continuation records
> +                 * at every page boundary for quick restoration.
> +                 */
> +                if (restartLSN / XLOG_BLCKSZ != flushPtr / XLOG_BLCKSZ)
> +                    check_contrecords = true;
> +            }
> +            else if (oldFlushPtr != InvalidXLogRecPtr)
> +            {
> +                /*
> +                 * We're not retarding restartLSN , check continuation records
> +                 * only at segment boundaries. No need to check if
> +                 */
> +                if (oldFlushPtr / XLOG_SEG_SIZE != flushPtr / XLOG_SEG_SIZE)
> +                    check_contrecords = true;
> +            }
> +
> +            if (check_contrecords)
> +            {
> +                XLogRecPtr rp;
> +
> +                if (restartLSN == InvalidXLogRecPtr)
> +                    restartLSN = oldFlushPtr;
> +
> +                rp = restartLSN - (restartLSN % XLOG_BLCKSZ);
> +
> +                /*
> +                 * We may have let the record at flushPtr be sent, so it's
> +                 * worth looking
> +                 */
> +                while (rp <= flushPtr)
> +                {
> +                    XLogPageHeaderData header;
> +
> +                    /*
> +                     * If the page header is not available for now, don't move
> +                     * restartLSN forward. We can read it by the next chance.
> +                     */
> +                    if(sentPtr - rp >= sizeof(XLogPageHeaderData))
> +                    {
> +                        bool found;
> +                        /*
> +                         * Fetch the page header of the next page. Move
> +                         * restartLSN forward only if it is not a continuation
> +                         * page.
> +                         */
> +                        found = XLogRead((char *)&header, rp,
> +                                             sizeof(XLogPageHeaderData), true);
> +                        if (found &&
> +                            (header.xlp_info & XLP_FIRST_IS_CONTRECORD) == 0)
> +                            restartLSN = rp;
> +                    }
> +                    rp += XLOG_BLCKSZ;
> +                }
> +
> +                /*
> +                 * If restartLSN is on the same page with flushPtr, it means
> +                 * that we are catching up.
> +                 */
> +                if (restartLSN / XLOG_BLCKSZ == flushPtr / XLOG_BLCKSZ)
> +                    restartLSN = InvalidXLogRecPtr;
> +            }
> +
> +            /* restartLSN == InvalidXLogRecPtr means catching up */
> +            PhysicalConfirmReceivedLocation(restartLSN != InvalidXLogRecPtr ?
> +                                            restartLSN : flushPtr);
> +        }

I've not read through the thread, but this seems like the wrong approach
to me. The receiving side should use a correct value, instead of putting
this complexity on the sender's side.

Don't we also use the value on the receiving side to delete old segments
and such?

- Andres



pgsql-hackers by date:

Previous
From: Kyotaro HORIGUCHI
Date:
Subject: Re: [HACKERS] [BUGS] Bug in Physical Replication Slots (at least9.5)?
Next
From: Ashutosh Bapat
Date:
Subject: Re: [HACKERS] dropping partitioned tables without CASCADE