Re: Reduce heap tuple header size - Mailing list pgsql-patches
From: Manfred Koizar
Subject: Re: Reduce heap tuple header size
Msg-id: dalkgugaoqd8hqdjvvjn0ord2vtfo68irt@4ax.com
In response to: Re: Reduce heap tuple header size (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-patches
On Fri, 14 Jun 2002 10:16:22 -0400, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>As I commented before, I am not in favor of this.  I don't think that a
>four-byte savings justifies a forced initdb with no chance of
>pg_upgrade,

As I don't know other users' preferences, I cannot comment on this.
I just think that four bytes per tuple can amount to a sum that
justifies this effort.  Disk space is not my argument here, but
reduced disk IO is.

>plus significantly slower access to
>several critical header fields.  The tqual.c routines are already
>hotspots in many scenarios.  I believe this will make them noticeably
>slower.

Significantly slower?  I tried to analyze HeapTupleSatisfiesUpdate(),
as I think it is the most complicated of these Satisfies functions
(look for the !! comments):

/*
 * !! HeapTupleHeaderGetXmin  no slowdown
 * !! HeapTupleHeaderGetXmax  one infomask compare
 * !! HeapTupleHeaderGetCmin  two infomask compares
 * !! HeapTupleHeaderGetCmax  two infomask compares
 * !! HeapTupleHeaderGetXvac  no slowdown
 */
int
HeapTupleSatisfiesUpdate(HeapTuple htuple, CommandId curcid)
{
	HeapTupleHeader tuple = htuple->t_data;

	if (!(tuple->t_infomask & HEAP_XMIN_COMMITTED))
	{
		if (tuple->t_infomask & HEAP_XMIN_INVALID)
			/* !! no slowdown */
			return HeapTupleInvisible;

		if (tuple->t_infomask & HEAP_MOVED_OFF)
		{
			if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXvac(tuple)))
				/* !! no slowdown */
				return HeapTupleInvisible;
			if (!TransactionIdIsInProgress(HeapTupleHeaderGetXvac(tuple)))
			{
				if (TransactionIdDidCommit(HeapTupleHeaderGetXvac(tuple)))
				{
					tuple->t_infomask |= HEAP_XMIN_INVALID;
					/* !! no slowdown */
					return HeapTupleInvisible;
				}
				tuple->t_infomask |= HEAP_XMIN_COMMITTED;
			}
		}
		else if (tuple->t_infomask & HEAP_MOVED_IN)
		{
			if (!TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXvac(tuple)))
			{
				if (TransactionIdIsInProgress(HeapTupleHeaderGetXvac(tuple)))
					/* !! no slowdown */
					return HeapTupleInvisible;
				if (TransactionIdDidCommit(HeapTupleHeaderGetXvac(tuple)))
					tuple->t_infomask |= HEAP_XMIN_COMMITTED;
				else
				{
					tuple->t_infomask |= HEAP_XMIN_INVALID;
					/* !! no slowdown */
					return HeapTupleInvisible;
				}
			}
		}
		/* !! no slowdown up to here */
		else if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmin(tuple)))
		{
			/* !! GetCmin does 2 infomask compares */
			if (HeapTupleHeaderGetCmin(tuple) >= curcid)
				/* !! returning with 2 additional infomask compares */
				return HeapTupleInvisible;		/* inserted after scan
												 * started */

			if (tuple->t_infomask & HEAP_XMAX_INVALID)	/* xid invalid */
				/* !! returning with 2 additional infomask compares */
				return HeapTupleMayBeUpdated;

			/* !! assertions turned off in production: no slowdown */
			Assert(TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmax(tuple)));

			if (tuple->t_infomask & HEAP_MARKED_FOR_UPDATE)
				/* !! returning with 2 additional infomask compares */
				return HeapTupleMayBeUpdated;

			/* !! GetCmax does 2 infomask compares */
			if (HeapTupleHeaderGetCmax(tuple) >= curcid)
				/* !! returning with 4 additional infomask compares */
				return HeapTupleSelfUpdated;	/* updated after scan
												 * started */
			else
				/* !! returning with 4 additional infomask compares */
				return HeapTupleInvisible;		/* updated before scan
												 * started */
		}
		/* !! no slowdown up to here */
		else if (!TransactionIdDidCommit(HeapTupleHeaderGetXmin(tuple)))
		{
			if (TransactionIdDidAbort(HeapTupleHeaderGetXmin(tuple)))
				tuple->t_infomask |= HEAP_XMIN_INVALID; /* aborted */
			/* !! no slowdown */
			return HeapTupleInvisible;
		}
		else
			tuple->t_infomask |= HEAP_XMIN_COMMITTED;
	}

	/* !! no slowdown */
	/* by here, the inserting transaction has committed */

	if (tuple->t_infomask & HEAP_XMAX_INVALID)	/* xid invalid or aborted */
		/* !! no slowdown */
		return HeapTupleMayBeUpdated;

	if (tuple->t_infomask & HEAP_XMAX_COMMITTED)
	{
		if (tuple->t_infomask & HEAP_MARKED_FOR_UPDATE)
			/* !! no slowdown
			 * ?? BTW, shouldn't we set HEAP_XMAX_INVALID here? */
			return HeapTupleMayBeUpdated;
		/* !! no slowdown */
		return HeapTupleUpdated;	/* updated by other */
	}

	/*
	 * !! no slowdown up to here,
	 * !! one infomask compare to follow in GetXmax(),
	 * !! additional waste of precious CPU cycles could be avoided by:
	 * !! TransactionId xmax = HeapTupleHeaderGetXmax(tuple);
	 */
	if (TransactionIdIsCurrentTransactionId(HeapTupleHeaderGetXmax(tuple)))
	{
		if (tuple->t_infomask & HEAP_MARKED_FOR_UPDATE)
			/* !! returning with 1 additional infomask compare */
			return HeapTupleMayBeUpdated;
		/* !! GetCmax does 2 infomask compares */
		if (HeapTupleHeaderGetCmax(tuple) >= curcid)
			/* !! returning with 3 additional infomask compares */
			return HeapTupleSelfUpdated;	/* updated after scan
											 * started */
		else
			/* !! returning with 3 additional infomask compares */
			return HeapTupleInvisible;	/* updated before scan started */
	}

	/*
	 * !! 1 infomask compare up to here,
	 * !! another infomask compare ...
	 * !! or use xmax
	 */
	if (!TransactionIdDidCommit(HeapTupleHeaderGetXmax(tuple)))
	{
		/* !! and a third infomask compare */
		if (TransactionIdDidAbort(HeapTupleHeaderGetXmax(tuple)))
		{
			tuple->t_infomask |= HEAP_XMAX_INVALID;		/* aborted */
			/* !! returning with 1 or 3 additional infomask compares */
			return HeapTupleMayBeUpdated;
		}
		/* running xact */
		/* !! returning with 1 or 3 additional infomask compares */
		return HeapTupleBeingUpdated;	/* in updation by other */
	}

	/* !! 2 (or 1 with a local xmax) infomask compares up to here */
	/* xmax transaction committed */
	tuple->t_infomask |= HEAP_XMAX_COMMITTED;
	if (tuple->t_infomask & HEAP_MARKED_FOR_UPDATE)
		/* ?? BTW, shouldn't we set HEAP_XMAX_INVALID here? */
		return HeapTupleMayBeUpdated;
	return HeapTupleUpdated;	/* updated by other */
}

So in the worst case we return after having done four more compares
than without the patch.  Note that in the most common cases there is
no additional cost at all.
If you still think we have a performance problem here, we could
replace GetCmin and GetCmax by cheaper macros:

#define HeapTupleHeaderGetCminKnowingThatNotMoved(tup) \
( \
	AssertMacro(!((tup)->t_infomask & HEAP_MOVED)), \
	( \
		((tup)->t_infomask & (HEAP_XMIN_IS_XMAX | HEAP_XMAX_INVALID)) ? \
			(CommandId) (tup)->t_cid \
		: \
			FirstCommandId \
	) \
)

thus reducing the additional cost to one t_infomask compare, because
the Satisfies functions only access Cmin and Cmax when HEAP_MOVED is
known to be not set.

OTOH, experimenting with a moderately sized "out of production"
database I got the following results:

         |       |        | pages  | pages  |
 relkind | count | tuples | before | after  | savings
---------+-------+--------+--------+--------+---------
 i       |    31 | 808146 |   8330 |   8330 |   0.00%
 r       |    32 | 612968 |  13572 |  13184 |   2.86%
 all     |    63 |        |  21902 |  21514 |   1.77%

2.86% fewer heap pages mean 2.86% less disk IO caused by heap pages.
Considering that index pages tend to benefit more from caching, we
conclude that heap pages contribute more to the overall IO load, so
the total savings in the number of disk IOs should be better than the
1.77% shown in the table above.  I think this outweighs a few CPU
cycles now and then.

Servus
 Manfred