Thread: Optimizing TransactionIdIsCurrentTransactionId()
TransactionIdIsCurrentTransactionId() doesn't seem to be well optimized for the case when an xid has not yet been assigned, so for read only transactions.
A patch for this is attached.
--
Attachment
On Wed, Dec 18, 2019 at 5:07 AM Simon Riggs <simon@2ndquadrant.com> wrote: > TransactionIdIsCurrentTransactionId() doesn't seem to be well optimized for the case when an xid has not yet been assigned,so for read only transactions. > > A patch for this is attached. It might be an idea to first call TransactionIdIsNormal(xid), then GetTopTransactionIdIfAny(), then TransactionIdIsNormal(topxid), so that we don't bother with GetTopTransactionIdIfAny() when !TransactionIdIsNormal(xid). But it's also not clear to me whether this is actually a win. You're dong an extra TransactionIdIsNormal() test to sometimes avoid a GetTopTransactionIdIfAny() test. TransactionIdIsNormal() is pretty cheap, but GetTopTransactionIdIfAny() isn't all that expensive either, and adding more branches costs something. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Thu, Dec 19, 2019 at 02:27:01PM -0500, Robert Haas wrote: >On Wed, Dec 18, 2019 at 5:07 AM Simon Riggs <simon@2ndquadrant.com> wrote: >> TransactionIdIsCurrentTransactionId() doesn't seem to be well optimized for the case when an xid has not yet been assigned,so for read only transactions. >> >> A patch for this is attached. > >It might be an idea to first call TransactionIdIsNormal(xid), then >GetTopTransactionIdIfAny(), then TransactionIdIsNormal(topxid), so >that we don't bother with GetTopTransactionIdIfAny() when >!TransactionIdIsNormal(xid). > >But it's also not clear to me whether this is actually a win. You're >dong an extra TransactionIdIsNormal() test to sometimes avoid a >GetTopTransactionIdIfAny() test. TransactionIdIsNormal() is pretty >cheap, but GetTopTransactionIdIfAny() isn't all that expensive either, >and adding more branches costs something. > I think "optimization" patches should generally come with some sort of quantification of the gains - e.g. a benchmark with somewhat realistic workload (but even synthetic is better than nothing). Or at least some explanation *why* it's going to be an improvement. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Thu, 19 Dec 2019 at 19:27, Robert Haas <robertmhaas@gmail.com> wrote:
On Wed, Dec 18, 2019 at 5:07 AM Simon Riggs <simon@2ndquadrant.com> wrote:
> TransactionIdIsCurrentTransactionId() doesn't seem to be well optimized for the case when an xid has not yet been assigned, so for read only transactions.
>
> A patch for this is attached.
It might be an idea to first call TransactionIdIsNormal(xid), then
GetTopTransactionIdIfAny(), then TransactionIdIsNormal(topxid), so
that we don't bother with GetTopTransactionIdIfAny() when
!TransactionIdIsNormal(xid).
But it's also not clear to me whether this is actually a win. You're
dong an extra TransactionIdIsNormal() test to sometimes avoid a
GetTopTransactionIdIfAny() test.
That's not the point of the patch.
If the TopTransactionId is not assigned, we can leave the whole function more quickly, not just avoid a test.
Read only transactions should have a fast path thru this function since they frequently read more data than write transactions.
On Fri, Dec 20, 2019 at 12:46 AM Simon Riggs <simon@2ndquadrant.com> wrote: > If the TopTransactionId is not assigned, we can leave the whole function more quickly, not just avoid a test. Those things are not really any different from each other. You leave the function when you've done all the necessary tests.... > Read only transactions should have a fast path thru this function since they frequently read more data than write transactions. With regard to this point, I second Tomas's comments. -- Robert Haas EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On Fri, 20 Dec 2019 at 13:07, Robert Haas <robertmhaas@gmail.com> wrote:
> Read only transactions should have a fast path thru this function since they frequently read more data than write transactions.
With regard to this point, I second Tomas's comments.
I also agree with Tomas' comments. I am explaining *why* it will be an improvement, expanding on my earlier notes.
This function is called extremely frequently in query processing and is fairly efficient. I'm pointing out cases where making it even quicker makes sense.
The TopXid is assigned in very few calls. Write transactions perform searching before the xid is assigned, so UPDATE and DELETE transactions will call this with TopXid unassigned in many small transactions, e.g. simple pgbench. In almost all read-only cases and especially on standby nodes there will be no TopXid assigned, so I estimate that 90-99% of calls will be made with TopXid invalid. In this case it makes a great deal of sense to have a fastpath out of this function, by testing TransactionIdIsNormal(topxid).
I also now notice that on entry the xid provided is hardly ever InvalidTransactionId. Once, it might have been called repeatedly with FrozenTransactionId, but that is no longer the case since we no longer reset the xid on freezing. So the test for TransactionIdIsNormal(xid) appears to need rethinking since it is now mostly redundant.
Such a frequently used function is worth discussing, just as we previously optimised TransactionIdIsInProgress() and MVCC visibility routines, where we discussed what the most common routes through the functions were before deciding how to optimize them.
Simon Riggs <simon@2ndquadrant.com> writes: > On Fri, 20 Dec 2019 at 13:07, Robert Haas <robertmhaas@gmail.com> wrote: >> With regard to this point, I second Tomas's comments. > I also agree with Tomas' comments. I am explaining *why* it will be an > improvement, expanding on my earlier notes. > This function is called extremely frequently in query processing and is > fairly efficient. I'm pointing out cases where making it even quicker makes > sense. I think the point is that you haven't demonstrated that this particular patch makes it quicker. regards, tom lane
On Fri, 20 Dec 2019 at 17:46, Tom Lane <tgl@sss.pgh.pa.us> wrote:
Simon Riggs <simon@2ndquadrant.com> writes:
> On Fri, 20 Dec 2019 at 13:07, Robert Haas <robertmhaas@gmail.com> wrote:
>> With regard to this point, I second Tomas's comments.
> I also agree with Tomas' comments. I am explaining *why* it will be an
> improvement, expanding on my earlier notes.
> This function is called extremely frequently in query processing and is
> fairly efficient. I'm pointing out cases where making it even quicker makes
> sense.
I think the point is that you haven't demonstrated that this particular
patch makes it quicker.
Not yet, but I was trying to agree what an appropriate test would be before running it.
On Fri, Dec 20, 2019 at 05:57:55PM +0000, Simon Riggs wrote: >On Fri, 20 Dec 2019 at 17:46, Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> Simon Riggs <simon@2ndquadrant.com> writes: >> > On Fri, 20 Dec 2019 at 13:07, Robert Haas <robertmhaas@gmail.com> wrote: >> >> With regard to this point, I second Tomas's comments. >> >> > I also agree with Tomas' comments. I am explaining *why* it will be an >> > improvement, expanding on my earlier notes. >> > This function is called extremely frequently in query processing and is >> > fairly efficient. I'm pointing out cases where making it even quicker >> makes >> > sense. >> >> I think the point is that you haven't demonstrated that this particular >> patch makes it quicker. >> > >Not yet, but I was trying to agree what an appropriate test would be before >running it. > Isn't that a bit backwards? I mean, we usually identify opportunities for optimizations by observing poor performance with a workload, which means that workload can serve as a test. Of course, it's possible to notice an opprtunity by eye-balling the code, but you've already said this is supposed to improve read-only transactions. I've actually tried to measure if/how this affects performance using a simple read-only pgbench pgbench -S -M prepared -T 60 test I did this with a long-running transaction to prevent hint bits from getting set. But I've not measured any difference in performane. So either this improves a different workload, or maybe I'm doing something silly that makes the patch irrelevant. regards -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services