Re: change in LOCK behavior - Mailing list pgsql-hackers
From | Ants Aasma |
---|---|
Subject | Re: change in LOCK behavior |
Date | |
Msg-id | CA+CSw_tFbZ71jWDJGXtvJSrHPZ2y16yb6QDvCTRy4eRr=NsYiA@mail.gmail.com |
In response to | Re: change in LOCK behavior (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
On Thu, Oct 11, 2012 at 7:53 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Maybe what we really need is to find a way to make taking a snapshot a
> lot cheaper, such that the whole need for this patch goes away. We're
> not going to get far with the idea of making SnapshotNow MVCC-safe
> unless it becomes a lot cheaper to get an MVCC snapshot. I recall some
> discussion of trying to reduce a snapshot to a WAL offset --- did that
> idea crash and burn, or is it still viable?

This was mostly covered in the cheaper snapshots thread. [1] Robert
decided to abandon the idea after concluding that the memory overhead
was untenable with very old snapshots. [2] I had a really hand-wavy idea
of lazily converting snapshots from sequence number based snapshots to
traditional list-of-xids snapshots to limit the overhead. That idea was
promptly shot down because in that incarnation it needed snapshots to be
stored in shared memory. [3]

I have done some more thinking on this topic, although I have to admit
that it has been on the backburner. It seems to me that the problems are
all surmountable. To recap briefly, the idea is to define visibility and
snapshots through commit sequence numbers (LSNs have problems due to
async commit). The tricky part is the data structure to support fast
xid-to-csn lookup for visibility checks. To support visibility checks,
enough information needs to be kept so that the oldest CSN based
snapshot can resolve its xmin-xmax range to csns.

My idea currently is to have two fixed size shared memory buffers and an
overflow log. The first ring buffer is a dense array mapping of xids to
csns. Entries that overflow from the dense ring buffer are checked to
see whether they might be invisible to any CSN based snapshot, and if so
are inserted into the sparse buffer. The sparse buffer is a sorted array
containing xid-csn pairs for transactions that are still running or are
concurrent with an active CSN based snapshot.
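
To make the two-buffer arrangement a bit more concrete, here is a toy
model of it in Python. It is only a sketch under my own assumptions, not
PostgreSQL code: the class and method names, the RUNNING marker, the
buffer size and the visibility rule (a transaction is visible if its csn
is lower than the snapshot's csn) are all invented for illustration. It
ignores aborts, subtransactions, snapshot release, locking and eviction
from the sparse buffer to the CSN log.

```python
from bisect import bisect_left

RUNNING = None  # assumed marker: transaction has no commit sequence number yet


class CsnMap:
    """Toy xid -> csn map: a dense ring buffer plus a sorted sparse buffer."""

    def __init__(self, dense_size=4):
        self.dense_size = dense_size
        self.dense = [RUNNING] * dense_size  # slot = xid % dense_size
        self.next_xid = 0
        self.next_csn = 1
        self.sparse_xids = []  # stays sorted: evictions happen in xid order
        self.sparse_csns = []  # parallel to sparse_xids
        self.snapshots = []    # csns of active CSN based snapshots

    def begin(self):
        """Assign a new xid, pushing the oldest dense entry out if needed."""
        xid = self.next_xid
        slot = xid % self.dense_size
        if xid >= self.dense_size:
            self._overflow(xid - self.dense_size, self.dense[slot])
        self.dense[slot] = RUNNING
        self.next_xid += 1
        return xid

    def _overflow(self, xid, csn):
        """Keep an overflowing entry only if some snapshot might not see it."""
        oldest = min(self.snapshots, default=self.next_csn)
        if csn is RUNNING or csn >= oldest:
            self.sparse_xids.append(xid)
            self.sparse_csns.append(csn)

    def commit(self, xid):
        """Record the next csn for xid in whichever buffer holds it."""
        csn = self.next_csn
        self.next_csn += 1
        if xid >= self.next_xid - self.dense_size:
            self.dense[xid % self.dense_size] = csn
        else:
            # A running xid is always kept on overflow, so it is present.
            self.sparse_csns[bisect_left(self.sparse_xids, xid)] = csn
        return csn

    def snapshot(self):
        """Taking a snapshot is just reading the next csn."""
        csn = self.next_csn
        self.snapshots.append(csn)
        return csn

    def visible(self, xid, snap_csn):
        """Resolve xid to a csn and compare it with the snapshot's csn."""
        if xid >= self.next_xid:
            return False  # not started yet
        if xid >= self.next_xid - self.dense_size:
            csn = self.dense[xid % self.dense_size]
        else:
            i = bisect_left(self.sparse_xids, xid)
            if i == len(self.sparse_xids) or self.sparse_xids[i] != xid:
                return True  # dropped on overflow: visible to every snapshot
            csn = self.sparse_csns[i]
        return csn is not RUNNING and csn < snap_csn
```

In this model an xid that has left the dense buffer and is not in the
sparse buffer is treated as committed and visible to everyone, which is
what lets both buffers stay at a fixed size.
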
Once the sparse buffer is filled up, the smallest xid-csn pairs are
evicted to the CSN log. The long running CSN based snapshots then need
to read this log to build up the SnapshotData->xip/subxip arrays. The
backends can either discover that their snapshots' CSN values have
overflowed by checking the appropriate horizon value, or be signaled via
an interrupt to enable CSN log cleanup ASAP.

I still have to work out some details on how to handle subtransaction
overflow, how to maintain reasonably fresh values for different horizons
and which ordering barriers are necessary to get lock-free visibility
checks. The idea currently seems workable and will make taking snapshots
really cheap, while the worst case maintenance overhead is mostly
shifted to sessions that acquire lots of writing transactions and hold
snapshots open for a long time.

If anyone is interested I can do a slightly longer write up detailing
what I have worked out so far.

Ants Aasma

[1] http://archives.postgresql.org/message-id/CA%2BTgmoaAjiq%3Dd%3DkYt3qNj%2BUvi%2BMB-aRovCwr75Ca9egx-Ks9Ag%40mail.gmail.com
[2] http://archives.postgresql.org/message-id/CA%2BTgmoYD6EhYy1Rb%2BSEuns5smreY1_3rAMeL%3D76rX8deijy56Q%40mail.gmail.com
[3] http://archives.postgresql.org/message-id/CA%2BCSw_uDfg2SBMicGNu13bpr2upbnVL_edoTbzvacR1FrNrZ1g%40mail.gmail.com
--
Cybertec Schönig & Schönig GmbH
Gröhrmühlgasse 26
A-2700 Wiener Neustadt
Web: http://www.postgresql-support.de