Hot Standby, deferred conflict resolution for cleanup records (v2) - Mailing list pgsql-hackers

From Simon Riggs
Subject Hot Standby, deferred conflict resolution for cleanup records (v2)
Msg-id 1260630406.1984.97.camel@ebony
Responses Re: Hot Standby, deferred conflict resolution for cleanup records (v2)  (Greg Stark <gsstark@mit.edu>)
List pgsql-hackers
I think I've found a better way of doing deferred conflict resolution
for WAL cleanup records. (This does not check for conflicts at block
level.)

When a cleanup arrives, check *lock* conflicts to see who is accessing
the relation about to be cleaned.

If there are any lock conflicts, then wait, if requested.

If we waited, re-check *lock* conflicts to see who is still accessing
the relation about to be cleaned. While holding the lock, set
latestRemovedXid for the relation (protected by the partition lock).

Anyone acquiring a lock on a table should check the latestRemovedXid for
the table and abort if their xmin is too old. This prevents new lockers
from accessing a cleaned relation immediately after we decide to abort
anyone looking at that table. (Anyone queuing for the existing locks
would be caught by this).

We then cancel the list of current lock conflicts using the
latestRemovedXid (if there is one) as a cross-check to see if we can
avoid cancelling the query.

So if latestRemovedXid advances on a table you have locked, you will
have your xmin re-checked. If you access a table that has been, or is
about to be, cleaned, your xmin is checked as well.
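
The xmin check described above can be sketched roughly as follows. This
is only an illustration, not code from any patch: the names
snapshot_conflicts, select_holders_to_cancel and LockHolder are made up
for the example, and it assumes that a snapshot conflicts when its xmin
is at or before the relation's latestRemovedXid, compared with the usual
wraparound-aware XID arithmetic.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)

/* Wraparound-aware "a <= b" for normal XIDs (modulo-2^32 comparison). */
static bool
xid_precedes_or_equals(TransactionId a, TransactionId b)
{
    return (int32_t) (a - b) <= 0;
}

/*
 * Hypothetical predicate: would a snapshot with this xmin still need row
 * versions that cleanup has removed from the relation?  The same test
 * would apply both to new lockers and to existing lock holders.
 */
static bool
snapshot_conflicts(TransactionId xmin, TransactionId latest_removed_xid)
{
    if (latest_removed_xid == InvalidTransactionId)
        return false;           /* nothing cleaned on this relation yet */
    if (xmin == InvalidTransactionId)
        return false;           /* backend holds no snapshot */
    return xid_precedes_or_equals(xmin, latest_removed_xid);
}

/* Hypothetical description of one backend holding a conflicting lock. */
typedef struct LockHolder
{
    int           backend_id;
    TransactionId xmin;
} LockHolder;

/*
 * Cross-check the list of current lock conflicts against latestRemovedXid.
 * Only holders whose xmin is too old are written to out[]; the rest keep
 * running.  Returns the number of backends selected for cancellation.
 */
static int
select_holders_to_cancel(const LockHolder *holders, int nholders,
                         TransactionId latest_removed_xid, int *out)
{
    int ncancel = 0;

    for (int i = 0; i < nholders; i++)
    {
        if (snapshot_conflicts(holders[i].xmin, latest_removed_xid))
            out[ncancel++] = holders[i].backend_id;
    }
    return ncancel;
}
```

With holders at xmin 100, 200 and 150 and a latestRemovedXid of 150, only
the first and third would be cancelled under these assumptions; the
holder at xmin 200 survives even though it locks the same relation.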

Taken together this will mean that far fewer queries get cancelled,
since we check on both relid and latestRemovedXid. Reasonably simple
queries that take locks on a small number of relations at the start of
their execution will continue processing for long periods if they do not
access fast changing relations.

In particular, IMHO, this will cure about 90% of the btree delete issue,
since only users accessing a particularly busy index will need to cancel
themselves. Since many longer-running queries don't use indexes at all,
that trait alone will ensure that queries survive longer.

We need to keep track of latestRemovedXids for various relations in
shared memory. ISTM we can track the top 8 or so most common relids per
lock partition using a trivial LRU, and then have a catch-all value for
the others. That will allow us to track more than 100 relations without
sweating too much. All the fuss is handled during hot standby, so if you
choose not to use it, there is no impact on you.
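
A minimal sketch of such a per-partition tracker is below. All names
(PartitionXidCache, cache_record_cleanup, cache_lookup) are hypothetical
and locking is omitted; the caller is assumed to hold the partition
lock. Two choices here are assumptions of the sketch rather than part of
the proposal: an evicted entry is folded into the catch-all, and a newly
tracked relid starts no lower than the catch-all, so a lookup never
under-reports. With 16 lock partitions, 8 slots each gives 128 tracked
relations.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t Oid;
typedef uint32_t TransactionId;
#define InvalidTransactionId ((TransactionId) 0)
#define TRACKED_PER_PARTITION 8     /* the "top 8?" from the proposal */

typedef struct RelXidSlot
{
    Oid           relid;
    TransactionId latest_removed_xid;
} RelXidSlot;

/* One per lock partition; zero-initialise before use. */
typedef struct PartitionXidCache
{
    RelXidSlot    slots[TRACKED_PER_PARTITION]; /* slots[0] = most recent */
    int           nused;
    TransactionId catch_all;    /* covers every untracked relid */
} PartitionXidCache;

/* Wraparound-aware maximum; an invalid XID counts as "no value". */
static TransactionId
xid_max(TransactionId a, TransactionId b)
{
    if (a == InvalidTransactionId)
        return b;
    if (b == InvalidTransactionId)
        return a;
    return ((int32_t) (a - b) > 0) ? a : b;
}

/* Move slot i to the front, shifting the more recent entries down. */
static void
cache_move_to_front(PartitionXidCache *c, int i)
{
    RelXidSlot s = c->slots[i];

    for (; i > 0; i--)
        c->slots[i] = c->slots[i - 1];
    c->slots[0] = s;
}

/* Record that cleanup on relid removed row versions up to xid. */
static void
cache_record_cleanup(PartitionXidCache *c, Oid relid, TransactionId xid)
{
    for (int i = 0; i < c->nused; i++)
    {
        if (c->slots[i].relid == relid)
        {
            c->slots[i].latest_removed_xid =
                xid_max(c->slots[i].latest_removed_xid, xid);
            cache_move_to_front(c, i);
            return;
        }
    }

    if (c->nused == TRACKED_PER_PARTITION)
    {
        /* Evict the least recently cleaned relid into the catch-all. */
        c->catch_all = xid_max(c->catch_all,
                               c->slots[c->nused - 1].latest_removed_xid);
        c->nused--;
    }

    /* The relid may have been evicted earlier, so start from catch_all. */
    c->slots[c->nused].relid = relid;
    c->slots[c->nused].latest_removed_xid = xid_max(xid, c->catch_all);
    c->nused++;
    cache_move_to_front(c, c->nused - 1);
}

/* latestRemovedXid to test a locker's xmin against for this relid. */
static TransactionId
cache_lookup(const PartitionXidCache *c, Oid relid)
{
    for (int i = 0; i < c->nused; i++)
    {
        if (c->slots[i].relid == relid)
            return c->slots[i].latest_removed_xid;
    }
    return c->catch_all;
}
```

The cost of the catch-all is precision, not correctness: a query on a
rarely cleaned relation may be cancelled because of some other untracked
relation's cleanup, but it is never wrongly allowed to continue.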

-- Simon Riggs           www.2ndQuadrant.com


