Thread: [subxacts] Fixing TODO items

[subxacts] Fixing TODO items

From
Alvaro Herrera
Date:
Hackers,

Here is a patch that fixes some of the TODO items in the nested xacts
list:

handle large objects
password file
namespace (temp namespace cleanup)
files
Make TransactionIdIsInProgress more efficient.  Make a subxid cache and
  a negative xid cache.
Nesting level report via a ParameterStatus message sent just before
  ReadyForQuery (Z message).
Allow a readonly subtransaction inside a read-write parent, but not
  viceversa
Several tests in tqual.c have been 'optimized'.  This means most uses of
  SubTransXidsHaveCommonAncestor are gone.  Also there are additional
  places where HEAP_XMAX_INVALID is set.



There are some likely controversial changes; the Xid caches, in the
first place.  I have tested carefully but I'm not sure if all the
possible race conditions are accounted for.

Second, the storage/file/fd.c management.  I unified the arrays for
AllocateDir and AllocateFile in a single array.  It works fine AFAICS,
but I'd feel better if someone more knowledgeable takes a look.

Password file was already reviewed -- comments needed some work, which
is in.  Temp namespace handling uses the same design idea.

Large objects are managed setting the CurrentResourceOwner to
TopTransactionResourceOwner; they are manually freed on subtransaction
abort (if allocated inside the subtransaction), and kept open till main
transaction commit.


Please review this patch.  In the meantime I'll concentrate on some SGML
docs and sourcecode comments where needed.

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"I personally became interested in Linux while I was dating an English major
who wouldn't know an operating system if it walked up and bit him."
(Val Henson)

Attachment

Re: [subxacts] Fixing TODO items

From
Tom Lane
Date:
Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> Nesting level report via a ParameterStatus message sent just before
>   ReadyForQuery (Z message).

Hasn't this really been obsoleted by events?  In the SAVEPOINT world
I doubt that it's needed.

            regards, tom lane

Re: [subxacts] Fixing TODO items

From
Tom Lane
Date:
Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> Here is a patch that fixes some of the TODO items in the nested xacts
> list:
> handle large objects

On the large-objects stuff: it seems like a really bad idea to be
fooling with CurrentResourceOwner like that.  What that means is that if
a subtransaction opens a LO and then fails, the associated low-level
resources will remain allocated until the top transaction ends ... even
though the LargeObjectDesc went away in subtransaction abort.  In the
worst case with repeated retries, it seems like this could run you out
of free buffers, for example.

It occurs to me that the only resource held for any long period in this
code is the open relations (heap_r and index_r), and that it's pretty
silly to have every open LO have its own reference to pg_largeobject.
(I believe the code in inv_api.c is a holdover from a time when LOs
could live in different tables; but that's been dead for a long time
and seems quite unlikely to get resurrected.)

Also, there's no urgent need to maintain a distinction between read and
write access in terms of the kind of lock we take on pg_largeobject.
There aren't any operations that anyone should be performing on
pg_largeobject that would care about the difference between
AccessShareLock and RowExclusiveLock.

This means that upon first use of any LO within a main transaction,
we can open pg_largeobject *once* and use it for all LOs for the rest
of the transaction.  This reference would be okay to assign to the top
ResourceOwner.  The resources acquired and released during inv_read,
inv_write, etc, can be left with the normal CurrentResourceOwner so that
they'll be dropped immediately in case of error.

Any comments on this plan?

            regards, tom lane

Re: [subxacts] Fixing TODO items

From
Tom Lane
Date:
Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> There are some likely controversial changes; the Xid caches, in the
> first place.

No kidding ;-)

Do you have any theoretical or practical evidence for the usefulness of
the negxids cache?  Seems to me the shared memory space would be better
spent on allowing deeper nesting of the running-subxids list.  Also,
what happened to marking whether the running-subxids list has
overflowed?  If none of them have then there's no need to wonder whether
we have a still-running subxact.

The apparent lack of any locking on these data structures seems wrong
too.

            regards, tom lane

Re: [subxacts] Fixing TODO items

From
Alvaro Herrera
Date:
On Tue, Jul 27, 2004 at 01:32:01PM -0400, Tom Lane wrote:
> Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> > There are some likely controversial changes; the Xid caches, in the
> > first place.
>
> No kidding ;-)
>
> Do you have any theoretical or practical evidence for the usefulness of
> the negxids cache?  Seems to me the shared memory space would be better
> spent on allowing deeper nesting of the running-subxids list.  Also,
> what happened to marking whether the running-subxids list has
> overflowed?  If none of them have then there's no need to wonder whether
> we have a still-running subxact.

Oh, sure, I had forgot about the overflow flag.  I am working on that
now.

> The apparent lack of any locking on these data structures seems wrong
> too.

Well, storing the main Xid of a transaction in PGPROC is regarded as
"atomic" and it has no locking (other than SInvalLock in shared mode).
I think the correct solution would be to lock both the main Xid and the
XidCache with a per-backend LWLock, but you already rejected that idea.

My current patch uses SInvalLock in shared mode to protect reading and
writing the XidCaches.  I am not sure this is a very good idea, but it
doesn't seem unreasonable either.

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
"Et put se mouve" (Galileo Galilei)


Re: [subxacts] Fixing TODO items

From
Alvaro Herrera
Date:
On Tue, Jul 27, 2004 at 01:32:01PM -0400, Tom Lane wrote:
> Alvaro Herrera <alvherre@dcc.uchile.cl> writes:
> > There are some likely controversial changes; the Xid caches, in the
> > first place.
>
> No kidding ;-)

Ok, here is another try.  This patch includes both the Xid cache
rewrite, source documentation (removed the big comment in xact.c and
added a README file), htup.h comments updated, and some user
documentation.

BTW, the overflow bit is certainly a better approach than the negative
cache.  I am not sure about locking w.r.t. the XidCache though.

Please take a look at it.

--
Alvaro Herrera (<alvherre[a]dcc.uchile.cl>)
Maybe there's lots of data loss but the records of data loss are also lost.
(Lincoln Yeoh)

Attachment