Re: Removing alignment padding for byval types - Mailing list pgsql-hackers
From | Tomas Vondra |
---|---|
Subject | Re: Removing alignment padding for byval types |
Date | |
Msg-id | 20191031194534.5kssk4c5uiieakpp@development Whole thread Raw |
In response to | Re: Removing alignment padding for byval types (Andres Freund <andres@anarazel.de>) |
List | pgsql-hackers |
On Thu, Oct 31, 2019 at 12:24:33PM -0700, Andres Freund wrote: >Hi, > >On 2019-10-31 20:15:12 +0100, Tomas Vondra wrote: >> On Thu, Oct 31, 2019 at 11:48:21AM -0700, Andres Freund wrote: >> > We currently align byval types such as int4/8, float4/8, timestamp *, >> > date etc, even though we mostly don't need to. When tuples are deformed, >> > all byval types are copied out from the tuple data into the >> > corresponding Datum array, therefore the original alignment in the tuple >> > data doesn't matter. This is different from byref types, where the >> > Datum formed will often be a pointer into the tuple data. >> > >> > While there are some older systems where it could be a bit slower to >> > copy data out from unaligned positions into the datum array, this is >> > more than bought back by the next point: >> > >> > >> > The fact that these types are aligned has substantial costs: >> > >> > For one, we often waste substantial amounts of space inside tables with >> > alignment padding. It's not uncommon to see about 30% or more of space >> > wasted (especially when taking alignment of the first column into >> > account). >> > >> > For another, and this I think is less obvious, we actually waste >> > substantial amounts of CPU maintaining the alignment. This is primarily >> > the case because we have to perform to align the pointer to the next >> > field during tuple [de]forming. Those instructions [1] have to be >> > executed taking time, but what's worse, they also reduce the ability of >> > out-of-order execution to hide latencies. There's a hard dependency on >> > knowing the offset to the next column to be able to continue with the >> > next column. >> > >> >> Right. Reducing this overhead was one of the goals to allow "logical >> ordering" of columns in a table (while arbitrarily reordering the >> physical ones), but that patch got out of hand pretty quickly. Also, >> it'd still keep some of the overhead, because it was not removing the >> alignment/padding entirely. > >Yea. It'd keep just about all the CPU overhead, because we'd still need >to align as soon as there is a preceding nulled or varlena colum. > >There's still some benefit for logical column order, as grouping NOT >NULL fixed-length columns at the start is beneficial. And it's also >beneficial to have frequently accessed columns at the start. But I think >this proposal gains a lot of the space related benefits, at a much lower >complexity, together with a lot of other benefits. > +1 > >> > There's two reasons why we can't just set the alignment for these types >> > to 'c'. >> > 1) pg_upgrade, for fairly obvious reasons >> > 2) We map catalog table rows to structs, in a *lot* of places. >> > >> > >> > It seems to me that, despite the above, it's still worth trying to >> > improve upon the current state, to benefit from reduced space and CPU >> > usage. >> > >> > As it turns out we already separate out the alignment for a type, and a >> > column, between pg_type.typalign and pg_attribute.attalign. One way to >> > tackle this would be to allow to specify, for byval types only, at >> > column creation time whether a column uses a 'struct-mappable' alignment >> > or not. When set, set attalign to pg_type.typalign for alignment, when >> > not, to 'c'. By changing pg_dump in binary upgrade mode to emit the >> > necessary options, and by adding such options during bki processing, >> > we'd solve 1) and 2), but otherwise gain the benefits. >> > >> > Alternatively we could declare such a propert on the table level, but >> > that seems more restrictive, without a corresponding upside. > >> I don't know, but it seems entirely sufficient specifying this at the >> table level, no? What would be the use case for removing padding for >> only some of the columns? I don't see the use case for that. > >Well, if we had it on a per-table level, we'd also align >a) catalog table columns that follow the first varlena column - which we don't need > to align, as they can't be accessed via mapping >b) columns in pg_upgraded tables that have been added after the upgrade > Hmm, OK. I think the question is whether it's worth the extra complexity. I'd say it's not, but perhaps I'm wrong. > >> > It's possible that we should do something related with a few varlena >> > datatypes. We currently use intalign for types like text, json, and as >> > far as I can tell that does not make all that much sense. They're not >> > struct mappable *anyway* (and if they were, they'd need to be 8 byte >> > aligned on common platforms, __alignof__(void*) is 8). We'd have to take >> > a bit of care to treat the varlena header as unaligned - but we need to >> > do so anyway, because of 1byte varlenas. Short varlenas seems to make it >> > less crucial to pursue this, as the average datum that'd benefit is long >> > enough to make padding a non-issue. So I don't think it'd make sense to >> > tackle this project at the same time. >> > >> >> Not sure, but how come it's not failing on the picky platforms, then? On >> x86 it's probably OK because it's pretty permissive, but I'd expect some >> platforms (s390, parisc, itanium, powerpc, ...) to be much pickier. > >I'm not quite following? As I said, we already need to use alignment >aware code due to caring for short varlenas. And these types aren't >actually struct mappable. > Sorry, I misread/misunderstood what you wrote. cheers -- Tomas Vondra http://www.2ndQuadrant.com PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
pgsql-hackers by date: