On Sat, Feb 24, 2018 at 12:04 AM, Thomas Munro
<thomas.munro@enterprisedb.com> wrote:
> I'm testing another version that is a lot simpler: like v10, it relies
> on the knowledge that the leader's transaction will always end after
> the workers have finished, but it handles the RO_SAFE optimisation by
> keeping the SERIALIZABLEXACT alive but freeing its locks etc. More
> soon.
I've now broken it into two patches.
Patch 0001 is like my original patch with some minor improvements,
except that it now disables the RO_SAFE optimisation completely in
parallel mode. In other words, it's the stupidest fix possible to the
problem you flagged up. I think the main questions to answer about
the 0001 patch are whether this new locking protocol is sufficient,
whether anything bad could happen as a result of lock
escalation/transfer, and whether the underlying assumption about the
SERIALIZABLEXACT's lifetime holds true (that the leader will never
call ReleasePredicateLocks() while a worker is still running).
There are a couple of easy incremental improvements that could be made
on top of that patch, but I didn't make them because I'm trying to be
conservative in the hope of landing at least the basic feature in
PostgreSQL 11. Namely:
1. We could still return false if we see SXACT_FLAG_RO_SAFE in
SerializationNeededForRead() (we just couldn't call
ReleasePredicateLocks()).
2. We could set MySerializableXact to InvalidSerializableXact in
worker backends so at least they'd benefit from the optimisation (we
just couldn't do that in the leader or it'd leak resources).
Patch 0002 aims a bit higher than those ideas. I wanted to make sure
that the leader wouldn't arbitrarily miss out on the optimisation, and
I also suspect that the optimisation might be contagious in the sense
that actually releasing sooner might cause the RO_SAFE flag to be set
on *other* transactions sooner. Patch 0002 works like this:
The first backend to observe the RO_SAFE flag 'partially releases' the
SERIALIZABLEXACT, so that the SERIALIZABLEXACT itself remains valid.
(The concept of 'partial release' already existed, but I'm using it in
a new way.) All backends clear their MySerializableXact variable so
that they drop to faster SI in their own time. The leader keeps a
copy of it in SavedSerializableXact, so that it can fully release it
at the end of the transaction when we know that no other backend has a
reference to it.
These patches survive hammering with a simple test that generates a
mixture of read only and read write parallel queries that hit the
interesting case (attached; this test helped me understand that the
refcount scheme I considered was going to be hard). I haven't
personally tried to measure the value of the optimisation (though I'm
pretty sure it exists, based on the VLDB paper and the knowledge that
REPEATABLE READ (what the optimisation effectively gives you) just has
to be faster than SERIALIZABLE 'cause I've see all that code you get
to not run!). I'd like to propose the 0001 patch for now, but keep
the 0002 patch back for a bit as it's very new and needs more
feedback, if possible from Kevin and others involved in the SSI
project. Of course their input on the 0001 patch is also super
welcome.
--
Thomas Munro
http://www.enterprisedb.com