Re: Snapshot management, final - Mailing list pgsql-patches
From | Tom Lane |
---|---|
Subject | Re: Snapshot management, final |
Date | |
Msg-id | 19801.1210109194@sss.pgh.pa.us Whole thread Raw |
In response to | Re: Snapshot management, final (Alvaro Herrera <alvherre@commandprompt.com>) |
Responses |
Re: Snapshot management, final
Re: Snapshot management, final Re: Snapshot management, final |
List | pgsql-patches |
Alvaro Herrera <alvherre@commandprompt.com> writes: > Simon Riggs wrote: >> On Tue, 2008-04-22 at 15:49 -0400, Alvaro Herrera wrote: >>> - Three CopySnapshot call sites remain outside snapmgr.c: DoCopy() on >>> copy.c, ExplainOnePlan() on explain.c and _SPI_execute_plan() on spi.c. >>> They are there because they grab the current ActiveSnapshot, modify it, >>> and then use the resulting snapshot. There is no corresponding >>> FreeSnapshot, because it's not needed. >> >> Not needed? How can we be certain that the modified snapshot does not >> outlive its original source? > It's not CopySnapshot that's not needed, but FreeSnapshot. The point > here is that the snapshot will be freed automatically as soon as it is > PopActiveSnapshot'd out of existance. CopySnapshot creates a new, > separate copy of the passed snapshot, and each of them will be freed > (separately) as soon as their refcounts reach zero. I looked over this patch and I think it still needs work. I concur with Simon's discomfort about the external CopySnapshot calls: they are reference leaks waiting to happen. As an example, you have the pattern snap = CopySnapshot(...); do some stuff; RegisterSnapshot(); but what if the "stuff" fails? In most of these code paths there's at least one palloc in between, so a failure is certainly possible. In the current form of the patch, a failure might not cost more than some leaked memory in TopTransactionContext, but it's still pretty ugly. I think it would be best if it were simply not possible to do CopySnapshot without simultaneously putting the snap into either the registered or active lists. In the COPY and EXPLAIN cases I think you misunderstood the point of the original code anyway: the modified snapshot is still the active snapshot for the duration of the command. So I think the right approach for both of these is the equivalent of what you put into spi.c: snap = CopySnapshot(snapshot); if (!read_only) snap->curcid = GetCurrentCommandId(false); PushActiveSnapshot(snap); and then a Pop at the end. It might be worth encapsulating the above series as a single copy-modify-and-push subroutine in snapmgr.c, so that you don't have to expose CopySnapshot publicly, nor expose adjustment of a snap's curcid outside snapmgr. Also, I don't think the subtransaction management is correct at all. How can it handle cases where a subtransaction and a sub-sub-transaction both take out reference counts on the same snap? If the sub-sub-xact aborts, you have no way to determine how many refcounts should go away. I think it would work better to keep the subxact level indicators in the ActiveSnapshotElt/RegdSnapshotElt list members. This would mean multiple RegdSnapshotElt members pointing at the same snapshot in any case where two different subxact levels actually had refs to the same snapshot. The RegdSnapshotElt members would each count registration counts for their own subtransaction, and the underlying snapshot would just count how many RegdSnapshotElts were pointing at it. This would allow the same amount of verification in subxact commit as you have in xact commit, ie there should be no counts left for the current subtransaction. Also, I think that the whole snapshot-sharing mechanism is not working as intended except for the serializable case; otherwise sequences like x = RegisterSnapshot(GetTransactionSnapshot()); y = RegisterSnapshot(GetTransactionSnapshot()); will result in x and y being separate copies. Or are you assuming that this just isn't worth optimizing? Some minor points: It seems odd that snapshot cleanup is late in main xact commit/abort and early in subxact commit/abort. Unless there's a really good reason for that, keep the ordering of module cleanup calls consistent across the various cases in xact.c. GetSnapshotData's serializable parameter is obsolete and should be removed. I'm not entirely comfortable with the reference counts being only uint16 wide. int32 seems safer and it isn't really saving anything much. Push/PopActiveSnap leaks the ActiveSnapshotElt. PopActiveSnapshot should probably assert active_count > 0 before decrementing, likewise in UnregisterSnapshot. regards, tom lane
pgsql-patches by date: