Re: TransactionIdIsInProgress() cache - Mailing list pgsql-patches

From Heikki Linnakangas
Subject Re: TransactionIdIsInProgress() cache
Date
Msg-id 47D681B3.5070309@enterprisedb.com
Whole thread Raw
In response to TransactionIdIsInProgress() cache  (Simon Riggs <simon@2ndquadrant.com>)
Responses Re: TransactionIdIsInProgress() cache  (Alvaro Herrera <alvherre@commandprompt.com>)
Re: TransactionIdIsInProgress() cache  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: TransactionIdIsInProgress() cache  (Simon Riggs <simon@2ndquadrant.com>)
List pgsql-patches
Simon Riggs wrote:
> We currently have a single item cache of the last checked TransactionId,
> which optimises the call to TransactionIdDidCommit() during
> HeapTupleSatisfiesMVCC() and partners.
>
> Before we call TransactionIdDidCommit() we always call
> TransactionIdIsInProgress().
>
> TransactionIdIsInProgress() doesn't check the single item cache, so even
> if we have just checked for this xid, we will check it again. Since this
> function takes ProcArrayLock and may be called while holding other locks
> it will improve scalability if we can skip the call, for the cost of an
> integer comparison.

Seems plausible. Have you done any performance testing?

I presume the case where this would help would be when you populate a
large table, with COPY for example, and the run a seq scan on it. As all
rows in the table have the same xmin, you keep testing for the same XID
over and over again.

To matter from scalability point of view, there would need to be a lot
of concurrent activity that compete for the lock. Can you formulate a
test case for that?

> Following patch implements fastpath in TransactionIdIsInProgress() to
> utilise single item cache.

Hmm. The pattern in tqual.c is:

>  if (!TransactionIdIsInProgress(xvac))
>  {
>     if (TransactionIdDidCommit(xvac))
>     {
>        /* committed */
>     }
>     else
>     {
>        /* aborted */
>     }
>  }
>  else
>  {
>      /* in-progress */
>  }

We could do this instead:

>  if (TransactionIdDidCommit(xvac))
>  {
>      /* committed */
>  }
>  else if (!TransactionIdIsInProgress(xvac))
>  {
>     if (TransactionIdDidCommit(xvac))
>     {
>        /* committed */
>     }
>     else
>     {
>        /* aborted */
>     }
>  }
>  else
>  {
>      /* in-progress */
>  }

(hopefully there would be a way to macroize that or something to avoid
bloating the code any more.)

For committed transactions, this would save the
TransactionIdIsInProgress call completely, whether or not it's in the
one-item cache. The tradeoff is that we would have to call
TransactionIdDidCommit twice for aborted transactions.

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com


pgsql-patches by date:

Previous
From: "Heikki Linnakangas"
Date:
Subject: Re: [PERFORM] Very slow (2 tuples/second) sequential scan after bulk insert; speed returns to ~500 tuples/second after commit
Next
From: "Heikki Linnakangas"
Date:
Subject: Re: Bulk Insert tuning