Hot Standby b-tree delete records review - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Hot Standby b-tree delete records review
Date
Msg-id 4BCFF9AA.6000703@enterprisedb.com
Whole thread Raw
Responses Re: Hot Standby b-tree delete records review  (Simon Riggs <simon@2ndQuadrant.com>)
List pgsql-hackers
btree_redo:
>     case XLOG_BTREE_DELETE:
> 
>         /*
>          * Btree delete records can conflict with standby queries. You
>          * might think that vacuum records would conflict as well, but
>          * we've handled that already. XLOG_HEAP2_CLEANUP_INFO records
>          * provide the highest xid cleaned by the vacuum of the heap
>          * and so we can resolve any conflicts just once when that
>          * arrives. After that any we know that no conflicts exist
>          * from individual btree vacuum records on that index.
>          */
>         {
>             TransactionId latestRemovedXid = btree_xlog_delete_get_latestRemovedXid(record);
>             xl_btree_delete *xlrec = (xl_btree_delete *) XLogRecGetData(record);
> 
>             /*
>              * XXX Currently we put everybody on death row, because
>              * currently _bt_delitems() supplies InvalidTransactionId.
>              * This can be fairly painful, so providing a better value
>              * here is worth some thought and possibly some effort to
>              * improve.
>              */
>             ResolveRecoveryConflictWithSnapshot(latestRemovedXid, xlrec->node);
>         }
>         break;

The XXX comment is out-of-date, the latestRemovedXid value is calculated
by btree_xlog_delete_get_latestRemovedXid() nowadays.

If we're re-replaying the WAL record, for example after restarting the
standby server, btree_xlog_delete_get_latestRemovedXid() won't find the
deleted records and will return InvalidTransactionId. That's OK, because
until we reach again the point in WAL where we were before the restart,
we don't accept read-only connections so there's no-one to kill anyway,
but you do get a useless "Invalid latestRemovedXid reported, using
latestCompletedXid instead" message in the log (that shouldn't be
capitalized, BTW).

It would be nice to check if there's any potentially conflicting
read-only queries before calling
btree_xlog_delete_get_latestRemovedXid(), which can be quite expensive.

If the "Invalid latestRemovedXid reported, using latestCompletedXid
instead" message is going to happen commonly, I think it should be
downgraded to DEBUG1. If it's an unexpected scenario, it should be
upgraded to WARNING.

In btree_xlog_delete_get_latestRemovedXid:
>     Assert(num_unused == 0);

Can't that happen as well in a re-replay scenario, if a heap item was
vacuumed away later on?

>     /*
>      * Note that if all heap tuples were LP_DEAD then we will be
>      * returning InvalidTransactionId here. This seems very unlikely
>      * in practice.
>      */

If none of the removed heap tuples were present anymore, we currently
return InvalidTransactionId, which kills/waits out all read-only
queries. But if none of the tuples were present anymore, the read-only
queries wouldn't have seen them anyway, so ISTM that we should treat
InvalidTransactionId return value as "we don't need to kill anyone".

Why does btree_xlog_delete_get_latestRemovedXid() keep the
num_unused/num_dead/num_redirect counts, it doesn't actually do anything
with them.

--  Heikki Linnakangas EnterpriseDB   http://www.enterprisedb.com


pgsql-hackers by date:

Previous
From: Simon Riggs
Date:
Subject: Re: testing HS/SR - 1 vs 2 performance
Next
From: Heikki Linnakangas
Date:
Subject: Re: Assertion failure twophase.c (3) (testing HS/SR)