Re: Proposal for CSN based snapshots - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Proposal for CSN based snapshots
Msg-id 69041c5a-ec60-3c54-49d0-5df071b27f52@iki.fi
In response to Re: Proposal for CSN based snapshots  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
Responses Re: Proposal for CSN based snapshots  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
Re: Proposal for CSN based snapshots  (Heikki Linnakangas <hlinnaka@iki.fi>)
List pgsql-hackers
(Reviving an old thread)

I spent some time dusting off this old patch, to implement CSN
snapshots. Attached is a new patch, rebased over current master, and
with tons of comments etc. cleaned up. There's more work to be done
here; I'm posting this to let people know I'm working on this again, and
to have a backup on the 'net :-).

I switched to using a separate counter for CSNs. CSN is no longer the
same as the commit WAL record's LSN. While I liked the conceptual
simplicity of CSN == LSN a lot, and the fact that the standby would see
the same commit order as master, I couldn't figure out how to make async
commits work.
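
To illustrate what "a separate counter for CSNs" could look like, here is
a minimal sketch in C. It is not taken from the patch: the names
(next_csn, assign_commit_csn, take_snapshot, csn_visible), the starting
value of 1, the use of 0 for "not committed", and the visibility rule
"commit CSN < snapshot CSN" are all assumptions made for illustration.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t CommitSeqNo;

/* Hypothetical shared counter, independent of any WAL position (LSN).
 * 0 is reserved here to mean "no commit CSN assigned". */
static _Atomic CommitSeqNo next_csn = 1;

/* At commit: hand out the next CSN. Each committer gets a distinct value. */
static CommitSeqNo
assign_commit_csn(void)
{
    return atomic_fetch_add(&next_csn, 1);
}

/* A snapshot is just the counter value at the time it is taken. */
static CommitSeqNo
take_snapshot(void)
{
    return atomic_load(&next_csn);
}

/* Assumed rule: a commit is visible if its CSN was assigned before the
 * snapshot was taken. */
static bool
csn_visible(CommitSeqNo commit_csn, CommitSeqNo snapshot_csn)
{
    return commit_csn != 0 && commit_csn < snapshot_csn;
}
```

In this sketch, a transaction that commits before take_snapshot() is
visible to that snapshot, and one that commits afterwards is not,
regardless of where its commit record sits in the WAL.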

Next steps:

* Hot standby feedback is broken, now that CSN != LSN again. Will have
to switch this back to using an "oldest XID", rather than a CSN.

* I plan to replace pg_subtrans with a special range of CSNs in the
csnlog. Something like, start the CSN counter at 2^32 + 1, and use CSNs
< 2^32 to mean "this is a subtransaction, parent is XXX". One less SLRU
to maintain.

* Put per-proc xmin back into procarray. I removed it, because it's not
necessary for snapshots or GetOldestSnapshot() (which replaces
GetOldestXmin()) anymore. But on second thoughts, we still need it for
deciding when it's safe to truncate the csnlog.

* In this patch, HeapTupleSatisfiesVacuum() is rewritten to use an
"oldest CSN", instead of "oldest xmin", but that's not strictly
necessary. To limit the size of the patch, I might revert those changes
for now.

* Rewrite the way RecentGlobalXmin is updated. As Alvaro pointed out in
his review comments two years ago, that was quite complicated. And I'm
worried that the lazy scheme I had might not allow pruning fast enough.
I plan to make it more aggressive, so that whenever the currently oldest
transaction finishes, it's responsible for advancing the "global xmin"
in shared memory. It does that by scanning the csnlog,
starting from the current "global xmin", until the next still
in-progress XID. That could be a lot, if you have a very long-running
transaction that ends, but we'll see how it performs.

* Performance testing. Clearly this should have a performance benefit,
at least under some workloads, to be worthwhile, and it must not regress
the others.
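
Two of the items above (the reserved CSN range for subtransactions, and
advancing the "global xmin" by scanning the csnlog) can be sketched
together in C. This is a toy model under stated assumptions, not the
patch's code: the csnlog is a plain array indexed by XID, 0 is assumed to
mean "in progress", aborted transactions and XID wraparound are ignored,
and all names (CSN_INPROGRESS, CSN_SUBTRANS_LIMIT, CSN_FIRST_NORMAL,
resolve_csn, advance_global_xmin) are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

typedef uint32_t TransactionId;
typedef uint64_t CommitSeqNo;

/* Assumed meaning of a csnlog entry:
 *   0               still in progress
 *   1 .. 2^32 - 1   subtransaction; the value is the parent's XID
 *   >= 2^32 + 1     committed; the value is the commit CSN
 */
#define CSN_INPROGRESS      ((CommitSeqNo) 0)
#define CSN_SUBTRANS_LIMIT  (((CommitSeqNo) 1) << 32)
#define CSN_FIRST_NORMAL    (CSN_SUBTRANS_LIMIT + 1)

static bool
csn_is_subtrans(CommitSeqNo v)
{
    return v != CSN_INPROGRESS && v < CSN_SUBTRANS_LIMIT;
}

static bool
csn_is_committed(CommitSeqNo v)
{
    return v >= CSN_FIRST_NORMAL;
}

/* Follow "parent is XXX" links until a top-level entry is reached.
 * Terminates because a parent's XID is always lower than its child's. */
static CommitSeqNo
resolve_csn(const CommitSeqNo *csnlog, TransactionId xid)
{
    CommitSeqNo v = csnlog[xid];

    while (csn_is_subtrans(v))
        v = csnlog[(TransactionId) v];
    return v;
}

/* Scan forward from the current global xmin and return the first XID
 * that is still in progress (or next_xid if there is none). */
static TransactionId
advance_global_xmin(const CommitSeqNo *csnlog,
                    TransactionId global_xmin,
                    TransactionId next_xid)
{
    TransactionId xid = global_xmin;

    while (xid < next_xid && csn_is_committed(resolve_csn(csnlog, xid)))
        xid++;
    return xid;
}
```

For example, if XIDs 3 and 4 are committed, 5 is a subtransaction of 4,
and 6 is still in progress, a scan starting at 3 stops at 6. The cost of
the scan is proportional to the number of XIDs between the old and new
global xmin, which is the "could be a lot" case mentioned above.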

- Heikki


