Thread: [subxacts] Fixing TODO items
Hackers, Here is a patch that fixes some of the TODO items in the nested xacts list: handle large objects password file namespace (temp namespace cleanup) files Make TransactionIdIsInProgress more efficient. Make a subxid cache and a negative xid cache. Nesting level report via a ParameterStatus message sent just before ReadyForQuery (Z message). Allow a readonly subtransaction inside a read-write parent, but not viceversa Several tests in tqual.c have been 'optimized'. This means most uses of SubTransXidsHaveCommonAncestor are gone. Also there are additional places where HEAP_XMAX_INVALID is set. There are some likely controversial changes; the Xid caches, in the first place. I have tested carefully but I'm not sure if all the possible race conditions are accounted for. Second, the storage/file/fd.c management. I unified the arrays for AllocateDir and AllocateFile in a single array. It works fine AFAICS, but I'd feel better if someone more knowledgeable takes a look. Password file was already reviewed -- comments needed some work, which is in. Temp namespace handling uses the same design idea. Large objects are managed setting the CurrentResourceOwner to TopTransactionResourceOwner; they are manually freed on subtransaction abort (if allocated inside the subtransaction), and kept open till main transaction commit. Please review this patch. In the meantime I'll concentrate on some SGML docs and sourcecode comments where needed. -- Alvaro Herrera (<alvherre[a]dcc.uchile.cl>) "I personally became interested in Linux while I was dating an English major who wouldn't know an operating system if it walked up and bit him." (Val Henson)
Attachment
Alvaro Herrera <alvherre@dcc.uchile.cl> writes: > Nesting level report via a ParameterStatus message sent just before > ReadyForQuery (Z message). Hasn't this really been obsoleted by events? In the SAVEPOINT world I doubt that it's needed. regards, tom lane
Alvaro Herrera <alvherre@dcc.uchile.cl> writes: > Here is a patch that fixes some of the TODO items in the nested xacts > list: > handle large objects On the large-objects stuff: it seems like a really bad idea to be fooling with CurrentResourceOwner like that. What that means is that if a subtransaction opens a LO and then fails, the associated low-level resources will remain allocated until the top transaction ends ... even though the LargeObjectDesc went away in subtransaction abort. In the worst case with repeated retries, it seems like this could run you out of free buffers, for example. It occurs to me that the only resource held for any long period in this code is the open relations (heap_r and index_r), and that it's pretty silly to have every open LO have its own reference to pg_largeobject. (I believe the code in inv_api.c is a holdover from a time when LOs could live in different tables; but that's been dead for a long time and seems quite unlikely to get resurrected.) Also, there's no urgent need to maintain a distinction between read and write access in terms of the kind of lock we take on pg_largeobject. There aren't any operations that anyone should be performing on pg_largeobject that would care about the difference between AccessShareLock and RowExclusiveLock. This means that upon first use of any LO within a main transaction, we can open pg_largeobject *once* and use it for all LOs for the rest of the transaction. This reference would be okay to assign to the top ResourceOwner. The resources acquired and released during inv_read, inv_write, etc, can be left with the normal CurrentResourceOwner so that they'll be dropped immediately in case of error. Any comments on this plan? regards, tom lane
Alvaro Herrera <alvherre@dcc.uchile.cl> writes: > There are some likely controversial changes; the Xid caches, in the > first place. No kidding ;-) Do you have any theoretical or practical evidence for the usefulness of the negxids cache? Seems to me the shared memory space would be better spent on allowing deeper nesting of the running-subxids list. Also, what happened to marking whether the running-subxids list has overflowed? If none of them have then there's no need to wonder whether we have a still-running subxact. The apparent lack of any locking on these data structures seems wrong too. regards, tom lane
On Tue, Jul 27, 2004 at 01:32:01PM -0400, Tom Lane wrote: > Alvaro Herrera <alvherre@dcc.uchile.cl> writes: > > There are some likely controversial changes; the Xid caches, in the > > first place. > > No kidding ;-) > > Do you have any theoretical or practical evidence for the usefulness of > the negxids cache? Seems to me the shared memory space would be better > spent on allowing deeper nesting of the running-subxids list. Also, > what happened to marking whether the running-subxids list has > overflowed? If none of them have then there's no need to wonder whether > we have a still-running subxact. Oh, sure, I had forgot about the overflow flag. I am working on that now. > The apparent lack of any locking on these data structures seems wrong > too. Well, storing the main Xid of a transaction in PGPROC is regarded as "atomic" and it has no locking (other than SInvalLock in shared mode). I think the correct solution would be to lock both the main Xid and the XidCache with a per-backend LWLock, but you already rejected that idea. My current patch uses SInvalLock in shared mode to protect reading and writing the XidCaches. I am not sure this is a very good idea, but it doesn't seem unreasonable either. -- Alvaro Herrera (<alvherre[a]dcc.uchile.cl>) "Et put se mouve" (Galileo Galilei)
On Tue, Jul 27, 2004 at 01:32:01PM -0400, Tom Lane wrote: > Alvaro Herrera <alvherre@dcc.uchile.cl> writes: > > There are some likely controversial changes; the Xid caches, in the > > first place. > > No kidding ;-) Ok, here is another try. This patch includes both the Xid cache rewrite, source documentation (removed the big comment in xact.c and added a README file), htup.h comments updated, and some user documentation. BTW, the overflow bit is certainly a better approach than the negative cache. I am not sure about locking w.r.t. the XidCache though. Please take a look at it. -- Alvaro Herrera (<alvherre[a]dcc.uchile.cl>) Maybe there's lots of data loss but the records of data loss are also lost. (Lincoln Yeoh)