Re: cheaper snapshots redux - Mailing list pgsql-hackers

From Robert Haas
Subject Re: cheaper snapshots redux
Date
Msg-id CA+TgmoaQutxtcbMhTOAPXtOUhhjvAQb99uW_4N1C2aDPgLg3ig@mail.gmail.com
Whole thread Raw
In response to cheaper snapshots redux  (Robert Haas <robertmhaas@gmail.com>)
Responses Re: cheaper snapshots redux
List pgsql-hackers
On Mon, Sep 12, 2011 at 11:07 AM, Amit Kapila <amit.kapila@huawei.com> wrote:
>>If you know what transactions were running the last time a snapshot summary
>> was written and what >transactions have ended since then, you can work out
>> the new xmin on the fly.  I have working >code for this and it's actually
>> quite simple.
>
> I believe one method to do same is as follows:
>
> Let us assume at some point of time the snapshot and completed XID list is
> somewhat as follows:
>
> Snapshot
>
> { Xmin – 5, Xip[] – 8 10 12, Xmax  - 15 }
>
> Committed XIDS – 8, 10 , 12, 18, 20, 21
>
> So it means 16,17,19 are running transactions. So it will behave as follows:
>
> { Xmin – 16, Xmax – 21, Xip[] – 17,19 }

Yep, that's pretty much what it does, although xmax is actually
defined as the XID *following* the last one that ended, and I think
xmin needs to also be in xip, so in this case you'd actually end up
with xmin = 15, xmax = 22, xip = { 15, 16, 17, 19 }.  But you've got
the basic idea of it.

> But if we do above way to calculate Xmin, we need to check in existing Xip
> array and committed Xid array to find Xmin. Won’t this cause reasonable time
> even though it is outside lock time if Xip and Xid are large.

Yes, Tom raised this concern earlier.  I can't answer it for sure
without benchmarking, but clearly xip[] can't be allowed to get too
big.

>> Because GetSnapshotData() computes a new value for RecentGlobalXmin by
>> scanning the ProcArray.  > This isn't costing a whole lot extra right now
>> because the xmin and xid fields are normally in > the same cache line, so
>> once you've looked at one of them it doesn't cost that much extra to
>> look at the other.  If, on the other hand, you're not looking at (or even
>> locking) the
>> ProcArray, then doing so just to recomputed RecentGlobalXmin sucks.
>
> Yes, this is more time as compare to earlier, but if our approach to
> calculate Xmin is like above point, then one extra read outside lock should
> not matter. However if for above point approach is different then it will be
> costlier.

It's not one extra read - you'd have to look at every PGPROC.  And it
is not outside a lock, either.  You definitely need locking around
computing RecentGlobalXmin; see src/backend/access/transa/README.  In
particular, if someone with proc->xmin = InvalidTransactionId is
taking a snapshot while you're computing RecentGlobalXmin, and then
stores a proc->xmin less than your newly-computed RecentGlobalXmin,
you've got a problem.  That can't happen right now because no
transactions can commit while RecentGlobalXmin is being computed, but
the point here is precisely to allow those operations to (mostly) run
in parallel.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


pgsql-hackers by date:

Previous
From: Alexey Klyukin
Date:
Subject: ALTER TABLE ONLY ...DROP CONSTRAINT is broken in HEAD.
Next
From: Mark Hills
Date:
Subject: libpq: Return of NULL from PQexec