Thread: Save a few bytes in pg_attribute
After the discussion in [0] ff., I was looking around in pg_attribute and noticed that we could possibly save 4 bytes. We could change both attstattarget and attinhcount from int4 to int2, which together with some reordering would save 4 bytes from the fixed portion. attstattarget is already limited to 10000, so this wouldn't lose anything. For attinhcount, I don't see any documented limits. But it seems unlikely to me that someone would need more than 32k immediate inheritance parents on a column. (Maybe an overflow check would be useful, though, to prevent shenanigans.) The attached patch seems to work. Thoughts? [0]: https://www.postgresql.org/message-id/20230313204119.4mkepdvixcxrwpsc%40awork3.anarazel.de
Attachment
Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes: > After the discussion in [0] ff., I was looking around in pg_attribute > and noticed that we could possibly save 4 bytes. We could change both > attstattarget and attinhcount from int4 to int2, which together with > some reordering would save 4 bytes from the fixed portion. > attstattarget is already limited to 10000, so this wouldn't lose > anything. For attinhcount, I don't see any documented limits. But it > seems unlikely to me that someone would need more than 32k immediate > inheritance parents on a column. (Maybe an overflow check would be > useful, though, to prevent shenanigans.) > The attached patch seems to work. Thoughts? I agree that attinhcount could be narrowed, but I have some concern about attstattarget. IIRC, the limit on attstattarget was once 1000 and then we raised it to 10000. Is it inconceivable that we might want to raise it to 100000 someday? regards, tom lane
On 3/20/23 15:37, Tom Lane wrote: > Peter Eisentraut <peter.eisentraut@enterprisedb.com> writes: >> After the discussion in [0] ff., I was looking around in pg_attribute >> and noticed that we could possibly save 4 bytes. We could change both >> attstattarget and attinhcount from int4 to int2, which together with >> some reordering would save 4 bytes from the fixed portion. > >> attstattarget is already limited to 10000, so this wouldn't lose >> anything. For attinhcount, I don't see any documented limits. But it >> seems unlikely to me that someone would need more than 32k immediate >> inheritance parents on a column. (Maybe an overflow check would be >> useful, though, to prevent shenanigans.) > >> The attached patch seems to work. Thoughts? > > I agree that attinhcount could be narrowed, but I have some concern > about attstattarget. IIRC, the limit on attstattarget was once 1000 > and then we raised it to 10000. Is it inconceivable that we might > want to raise it to 100000 someday? > Yeah, I don't think it'd be wise to make it harder to increase the statistics target limit. IMHO it'd be much better to just not store the statistics target for attributes that have it default (which we now identify by -1), or for system attributes (where we store 0). I'd bet vast majority of systems will just use the default / GUC value. So if we're interested in saving these bytes, we could just store NULL in these cases, no? regards -- Tomas Vondra EnterpriseDB: http://www.enterprisedb.com The Enterprise PostgreSQL Company
Tomas Vondra <tomas.vondra@enterprisedb.com> writes: > IMHO it'd be much better to just not store the statistics target for > attributes that have it default (which we now identify by -1), or for > system attributes (where we store 0). I'd bet vast majority of systems > will just use the default / GUC value. So if we're interested in saving > these bytes, we could just store NULL in these cases, no? Hmm, we'd have to move it to the nullable part of the row and expend more code to fetch it; but I don't think it's touched in many places, so that might be a good tradeoff. Couple of notes: * As things stand I think we have a null bitmap in every row of pg_attribute already (surely attfdwoptions and attmissingval are never both filled), so there's no extra cost there. * Putting it in the variable part of the row means it wouldn't appear in tuple descriptors, but that seems fine. I wonder if the same is true of attinhcount. Since it's nonzero for partitions' attributes, it might be non-null in a fairly large fraction of pg_attribute rows in some use-cases, but it still seems like we'd not be losing anything. It wouldn't need to be touched in any high-performance code paths AFAICS. regards, tom lane
On Mon, 20 Mar 2023, 11:00 pm Peter Eisentraut, <peter.eisentraut@enterprisedb.com> wrote: > After the discussion in [0] ff., I was looking around in pg_attribute > and noticed that we could possibly save 4 bytes. We could change both > attstattarget and attinhcount from int4 to int2, which together with > some reordering would save 4 bytes from the fixed portion. I just want to highlight 1ef61ddce9, which fixed a very long-standing bug that meant that pg_inherits.inhseqno was effectively just 16-bit. Perhaps because nobody seemed to report that as a limitation 16 bits is enough space. I only noticed that as a bug from code reading. David
Hi, On 2023-03-20 10:37:36 -0400, Tom Lane wrote: > I agree that attinhcount could be narrowed, but I have some concern > about attstattarget. IIRC, the limit on attstattarget was once 1000 > and then we raised it to 10000. Is it inconceivable that we might > want to raise it to 100000 someday? Hard to believe that'd happen in a minor version - and I don't think there'd an issue with widening it again in a major version? I doubt we'll ever go to 100k without a major redesign of stats storage/access - the size of the stats datums would make that pretty impractical right now. Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > On 2023-03-20 10:37:36 -0400, Tom Lane wrote: >> I agree that attinhcount could be narrowed, but I have some concern >> about attstattarget. IIRC, the limit on attstattarget was once 1000 >> and then we raised it to 10000. Is it inconceivable that we might >> want to raise it to 100000 someday? > Hard to believe that'd happen in a minor version - and I don't think there'd > an issue with widening it again in a major version? True. However, I think Tomas' idea of making these columns nullable is even better than narrowing them. regards, tom lane
Hi, On 2023-03-20 11:00:12 +0100, Peter Eisentraut wrote: > After the discussion in [0] ff., I was looking around in pg_attribute and > noticed that we could possibly save 4 bytes. We could change both > attstattarget and attinhcount from int4 to int2, which together with some > reordering would save 4 bytes from the fixed portion. attndims seems like another good candidate to shrink. Greetings, Andres Freund
On 21.03.23 00:51, Tom Lane wrote: > Andres Freund <andres@anarazel.de> writes: >> On 2023-03-20 10:37:36 -0400, Tom Lane wrote: >>> I agree that attinhcount could be narrowed, but I have some concern >>> about attstattarget. IIRC, the limit on attstattarget was once 1000 >>> and then we raised it to 10000. Is it inconceivable that we might >>> want to raise it to 100000 someday? > >> Hard to believe that'd happen in a minor version - and I don't think there'd >> an issue with widening it again in a major version? > > True. However, I think Tomas' idea of making these columns nullable > is even better than narrowing them. The context of my message was to do the proposed change for PG16 to buy back a few bytes that are being added by another feature, and then consider doing a larger detangling of pg_attribute and tuple descriptors in PG17, which might well involve taking the attstattarget out of the hot path. Making attstattarget nullable (i.e., not part of the fixed part of pg_attribute) would require fairly significant surgery, so I think it would be better done as part of a more comprehensive change that would allow the same treatment for other columns as well.
Hi, On 2023-03-21 17:36:48 +0100, Peter Eisentraut wrote: > On 21.03.23 00:51, Tom Lane wrote: > > Andres Freund <andres@anarazel.de> writes: > > > On 2023-03-20 10:37:36 -0400, Tom Lane wrote: > > > > I agree that attinhcount could be narrowed, but I have some concern > > > > about attstattarget. IIRC, the limit on attstattarget was once 1000 > > > > and then we raised it to 10000. Is it inconceivable that we might > > > > want to raise it to 100000 someday? > > > > > Hard to believe that'd happen in a minor version - and I don't think there'd > > > an issue with widening it again in a major version? > > > > True. However, I think Tomas' idea of making these columns nullable > > is even better than narrowing them. Why not do both? > The context of my message was to do the proposed change for PG16 to buy back > a few bytes that are being added by another feature How much would you need to buy back to "reach parity"? Greetings, Andres Freund
On 21.03.23 17:43, Andres Freund wrote: >> The context of my message was to do the proposed change for PG16 to buy back >> a few bytes that are being added by another feature > How much would you need to buy back to "reach parity"? I don't think we can find enough to make the impact zero bytes. It's also not clear exactly what the impact of each byte would be (compared to possible complications in other parts of the code, for example). But if there are a few low-hanging fruit, it seems like we could pick them, to old us over until we have a better solution to the underlying issue.
Hi, On 2023-03-21 18:15:40 +0100, Peter Eisentraut wrote: > On 21.03.23 17:43, Andres Freund wrote: > > > The context of my message was to do the proposed change for PG16 to buy back > > > a few bytes that are being added by another feature > > How much would you need to buy back to "reach parity"? > > I don't think we can find enough to make the impact zero bytes. It's also > not clear exactly what the impact of each byte would be (compared to > possible complications in other parts of the code, for example). But if > there are a few low-hanging fruit, it seems like we could pick them, to old > us over until we have a better solution to the underlying issue. attndims 4->2 attstattarget 4->2 attinhcount 4->2 + some reordering only gets you from 112->108 unfortunately, due to a 1 byte alignment hole and 2 bytes of trailing padding. before: /* size: 112, cachelines: 2, members: 22 */ /* sum members: 111, holes: 1, sum holes: 1 */ /* last cacheline: 48 bytes */ after: /* size: 108, cachelines: 2, members: 22 */ /* sum members: 105, holes: 1, sum holes: 1 */ /* padding: 2 */ /* last cacheline: 44 bytes */ You might be able to fill the hole + padding with your data - but IIRC that was 3 4byte integers? FWIW, I think we should consider getting rid of attcacheoff. I doubt it's worth its weight these days, because deforming via slots starts at the beginning anyway. The overhead of maintaining it is not insubstantial, and it's just architecturally ugly to to update tupledescs continually. Not for your current goal, but I do wonder how hard it'd be to make it work to store multiple booleans as bitmasks. Probably ties into the discussion around not relying on struct "mapping" for catalog tables (which we IIRC decided is the sensible way the NAMEDATALEN restriction). E.g. pg_attribute has 6 booleans, and attgenerated effectively is a boolean too, and attidentity could easily be modeled as such as well. If were to not rely on struct mapping anymore, we could possibly transparently do this as part of forming/deforming heap tuples. Using something like TYPALIGN_BIT. The question is whether it'd be too expensive to decode... Greetings, Andres Freund
Andres Freund <andres@anarazel.de> writes: > FWIW, I think we should consider getting rid of attcacheoff. I doubt it's > worth its weight these days, because deforming via slots starts at the > beginning anyway. The overhead of maintaining it is not insubstantial, and > it's just architecturally ugly to to update tupledescs continually. I'd be for that if we can convince ourselves there's not a material speed penalty. As you say, it's quite ugly. regards, tom lane
On Tue, 21 Mar 2023 at 19:55, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Andres Freund <andres@anarazel.de> writes: > > FWIW, I think we should consider getting rid of attcacheoff. I doubt it's > > worth its weight these days, because deforming via slots starts at the > > beginning anyway. The overhead of maintaining it is not insubstantial, and > > it's just architecturally ugly to to update tupledescs continually. > > I'd be for that if we can convince ourselves there's not a material > speed penalty. As you say, it's quite ugly. Yes, attcacheoff is a tremendous performance boon in many cases. But all is not lost: When I was working on other improvements I experimented with storing the attributes used in (de)serializing tuples to disk in a separate structured array in the TupleDesc, a prototype patch of which I shared here [0]. I didn't see a speed difference back then so I didn't further venture into that path (as it adds complexity without performance benefits), but I think it can be relevant to this thread because with that patch we actually don't need the attcacheoff in the pg_atttribute struct: it only needs to be present in the derived "TupleAttrAlignData" structs which carry the length/alignment/storage/byval info. Kind regards, Matthias van de Meent [0] https://www.postgresql.org/message-id/CAEze2Wh8-metSryZX_Ubj-uv6kb%2B2YnzHAejmEdubjhmGusBAg%40mail.gmail.com
Matthias van de Meent <boekewurm+postgres@gmail.com> writes: > ... with that patch we actually don't need the attcacheoff in the > pg_atttribute struct: it only needs to be present in the derived > "TupleAttrAlignData" structs which carry the > length/alignment/storage/byval info. Yeah, I was wondering about that too: keeping attcacheoff as local state in slots might get us all its win without so much conceptual dirtiness. regards, tom lane
Hi, On 2023-03-21 20:20:40 +0100, Matthias van de Meent wrote: > On Tue, 21 Mar 2023 at 19:55, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > > Andres Freund <andres@anarazel.de> writes: > > > FWIW, I think we should consider getting rid of attcacheoff. I doubt it's > > > worth its weight these days, because deforming via slots starts at the > > > beginning anyway. The overhead of maintaining it is not insubstantial, and > > > it's just architecturally ugly to to update tupledescs continually. > > > > I'd be for that if we can convince ourselves there's not a material > > speed penalty. As you say, it's quite ugly. > > Yes, attcacheoff is a tremendous performance boon in many cases. Which? We don't use fastgetattr() in many places these days. And in some quick measurements it's a wash or small loss when deforming slot tuples, even when the attcacheoff optimization would apply, because the branches for managing it add more overhead than they safe. Greetings, Andres Freund
On Tue, 21 Mar 2023 at 20:58, Andres Freund <andres@anarazel.de> wrote: > > Hi, > > On 2023-03-21 20:20:40 +0100, Matthias van de Meent wrote: > > On Tue, 21 Mar 2023 at 19:55, Tom Lane <tgl@sss.pgh.pa.us> wrote: > > > > > > Andres Freund <andres@anarazel.de> writes: > > > > FWIW, I think we should consider getting rid of attcacheoff. I doubt it's > > > > worth its weight these days, because deforming via slots starts at the > > > > beginning anyway. The overhead of maintaining it is not insubstantial, and > > > > it's just architecturally ugly to to update tupledescs continually. > > > > > > I'd be for that if we can convince ourselves there's not a material > > > speed penalty. As you say, it's quite ugly. > > > > Yes, attcacheoff is a tremendous performance boon in many cases. > > Which? We don't use fastgetattr() in many places these days. And in some quick > measurements it's a wash or small loss when deforming slot tuples, even when > the attcacheoff optimization would apply, because the branches for managing it > add more overhead than they safe. My experience with attcacheoff performance is in indexes, specifically index_getattr(). Sure, multi-column indexes are uncommon, but the difference between have and have-not for cached attribute offsets is several %. Kind regards, Matthias van de Meent
Hi, On 2023-03-21 15:26:38 -0400, Tom Lane wrote: > Matthias van de Meent <boekewurm+postgres@gmail.com> writes: > > ... with that patch we actually don't need the attcacheoff in the > > pg_atttribute struct: it only needs to be present in the derived > > "TupleAttrAlignData" structs which carry the > > length/alignment/storage/byval info. > > Yeah, I was wondering about that too: keeping attcacheoff as local > state in slots might get us all its win without so much conceptual > dirtiness. It's also the place where it's the least likely to help - afaict attcacheoff is only really beneficial for fastgetattr(). Which conditions it's use more strictly - not only can there not be any NULLs before the accessed column, there may not be any NULLs in the tuple at all. Greetings, Andres Freund
Hi, On 2023-03-21 21:02:08 +0100, Matthias van de Meent wrote: > On Tue, 21 Mar 2023 at 20:58, Andres Freund <andres@anarazel.de> wrote: > > On 2023-03-21 20:20:40 +0100, Matthias van de Meent wrote: > > > Yes, attcacheoff is a tremendous performance boon in many cases. > > > > Which? We don't use fastgetattr() in many places these days. And in some quick > > measurements it's a wash or small loss when deforming slot tuples, even when > > the attcacheoff optimization would apply, because the branches for managing it > > add more overhead than they safe. > > My experience with attcacheoff performance is in indexes, specifically > index_getattr(). Sure, multi-column indexes are uncommon, but the > difference between have and have-not for cached attribute offsets is > several %. I did indeed not think of index_getattr(), just heap related things. Do you have a good test workload handy - I'm kinda curious to compare the cost of removing attcacheoff vs the gain of not maintaining it for index workloads. It looks like many of the index_getattr() cases could be made faster without attcacheoff. A lot of places seem to loop over all attributes, and the key to accelerating that is to keep state between the iterations. Attcacheoff is that, but quite stunted, because it only works if there aren't any NULLs (even if the NULL is in a later column). Greetings, Andres Freund
On Tue, 21 Mar 2023 at 23:05, Andres Freund <andres@anarazel.de> wrote: > > Hi, > > On 2023-03-21 21:02:08 +0100, Matthias van de Meent wrote: > > On Tue, 21 Mar 2023 at 20:58, Andres Freund <andres@anarazel.de> wrote: > > > On 2023-03-21 20:20:40 +0100, Matthias van de Meent wrote: > > > > Yes, attcacheoff is a tremendous performance boon in many cases. > > > > > > Which? We don't use fastgetattr() in many places these days. And in some quick > > > measurements it's a wash or small loss when deforming slot tuples, even when > > > the attcacheoff optimization would apply, because the branches for managing it > > > add more overhead than they safe. > > > > My experience with attcacheoff performance is in indexes, specifically > > index_getattr(). Sure, multi-column indexes are uncommon, but the > > difference between have and have-not for cached attribute offsets is > > several %. > > I did indeed not think of index_getattr(), just heap related things. > > Do you have a good test workload handy - I'm kinda curious to compare the cost > of removing attcacheoff vs the gain of not maintaining it for index workloads. Rebuilding indexes has been my go-to workload for comparing attribute-related btree performance optimizations in [0] and [1]. Results of tests from '21 in which we're always calculating offsets from 0 show a slowdown of 4-18% in attcacheoff-enabled workloads if we're calculating offsets dynamically. > It looks like many of the index_getattr() cases could be made faster without > attcacheoff. A lot of places seem to loop over all attributes, and the key to > accelerating that is to keep state between the iterations. Indeed, it's not great. You can take a look at [1], which is where I'm trying to optimize btree's handling of comparing tuples; which includes work on reducing overhead for attribute accesses. Note that each btree page should be able to do with comparing at most 2*log(ntups) columns, where this is currently natts * log(ntups). > Attcacheoff is > that, but quite stunted, because it only works if there aren't any NULLs (even > if the NULL is in a later column). Yes, that isn't great either, but most indexes I've seen have tuples that are either all NULL, or have no nulls; only seldom I see indexes that have mixed NULL/not-null index tuple attributes. Kind regards, Matthias van de Meent. [0] https://www.postgresql.org/message-id/flat/CAEze2WhyBT2bKZRdj_U0KS2Sbewa1XoO_BzgpzLC09sa5LUROg%40mail.gmail.com#fe3369c4e202a7ed468e47bf5420f530 [1] https://www.postgresql.org/message-id/flat/CAEze2Wg52tsSWA9Fy7OCXx-K7pPLMNxA_fmQ6-+_pzR-AoODDA@mail.gmail.com
On 21.03.23 18:46, Andres Freund wrote: > FWIW, I think we should consider getting rid of attcacheoff. I doubt it's > worth its weight these days, because deforming via slots starts at the > beginning anyway. The overhead of maintaining it is not insubstantial, and > it's just architecturally ugly to to update tupledescs continually. Btw., could attcacheoff be int16?
On Wed, 22 Mar 2023 at 10:42, Peter Eisentraut <peter.eisentraut@enterprisedb.com> wrote: > > On 21.03.23 18:46, Andres Freund wrote: > > FWIW, I think we should consider getting rid of attcacheoff. I doubt it's > > worth its weight these days, because deforming via slots starts at the > > beginning anyway. The overhead of maintaining it is not insubstantial, and > > it's just architecturally ugly to to update tupledescs continually. > > Btw., could attcacheoff be int16? I had the same thought in '21, and in the patch linked upthread[0] I added an extra comment on the field: > + Note: Although the maximum offset encountered in stored tuples is > + limited to the max BLCKSZ (2**15), FormData_pg_attribute is used for > + all internal tuples as well, so attcacheoff may be larger for those > + tuples, and it is therefore not safe to use int16. So, we can't reduce its size while we use attcacheoff for (de)serialization of tuples with up to MaxAttributeNumber (=INT16_MAX) of attributes which each can be larger than one byte (such as in tuplestore, tuplesort, spilling hash aggregates, ...) Kind regards, Matthias van de Meent [0] https://www.postgresql.org/message-id/CAEze2Wh8-metSryZX_Ubj-uv6kb+2YnzHAejmEdubjhmGusBAg@mail.gmail.com
On 21.03.23 18:46, Andres Freund wrote: >> I don't think we can find enough to make the impact zero bytes. It's also >> not clear exactly what the impact of each byte would be (compared to >> possible complications in other parts of the code, for example). But if >> there are a few low-hanging fruit, it seems like we could pick them, to old >> us over until we have a better solution to the underlying issue. > > attndims 4->2 > attstattarget 4->2 > attinhcount 4->2 > > + some reordering only gets you from 112->108 unfortunately, due to a 1 byte > alignment hole and 2 bytes of trailing padding. > > before: > /* size: 112, cachelines: 2, members: 22 */ > /* sum members: 111, holes: 1, sum holes: 1 */ > /* last cacheline: 48 bytes */ > > after: > /* size: 108, cachelines: 2, members: 22 */ > /* sum members: 105, holes: 1, sum holes: 1 */ > /* padding: 2 */ > /* last cacheline: 44 bytes */ > > You might be able to fill the hole + padding with your data - but IIRC that > was 3 4byte integers? Here is an updated patch that handles those three fields, including some overflow checks. I also changed coninhcount to match attinhcount. I structured the inhcount overflow checks to be independent of the integer size, but maybe others find this approach weird. Given the calculation shown, there is no value in reducing all three fields versus just two, but I don't find compelling reasons to leave out one or the other field. (attstattarget got the most discussion, but that one is actually the easiest part of the patch.) I took another hard look at some of the other proposals, including moving some fields to the variable length part or combining some bool or char fields. Those changes all appear to have a really long tail of issues all over the code that I wouldn't want to attack them now in an ad hoc way. My suggestion is to use this patch and then consider the column encryption patch as it stands now. The discussion about attcacheoff seems to be still ongoing. But it seems whatever the outcome would be independent of this patch: Either we keep it or we remove it; there is no proposal to resize it.
Attachment
On 23.03.23 13:45, Peter Eisentraut wrote: > My suggestion is to use this patch and then consider the column > encryption patch as it stands now. I have committed this.