Greetings,

* Peter Geoghegan (pg@bowt.ie) wrote:
> Andres has suggested that I work on teaching nbtree to accommodate
> variable-width, logical table identifiers, such as those required for
> indirect indexes, or clustered indexes, where secondary indexes must
> use a logical primary key value instead of a heap TID. I'm not
> currently committed to working on this as a project, but I definitely
> don't want to make it any harder. This has caused me to think about
> the problem as it relates to the new on-disk representation for v4
> nbtree indexes in Postgres 12. I do have a minor misgiving about one
> particular aspect of what I came up with: The precise details of how
> we represent heap TID in pivot tuples seem like they might make things
> harder than they need to be for a future logical/varwidth table
> identifier project. This probably isn't worth doing anything about
> now, but it seems worth discussing now, just in case.

This seems like it would be helpful for global indexes as well, wouldn't
it?

> This also results in an immediate though small benefit for v4 nbtree
> indexes: _bt_truncate() produces smaller pivot tuples in a few cases.
> For example, indexes with one or two boolean fields will have pivot
> tuples that are 15 bytes and 16 bytes in length respectively,
> occupying 16 bytes of tuple space on internal pages. The saving comes
> because we can use the alignment padding hole that was empty in the
> original non-pivot index tuple that the new pivot tuple is to be
> formed from. Currently, the size of these pivot tuples would be 24
> bytes, so we're occasionally saving a MAXALIGN() quantum in space this
> way. It is unlikely that anyone would actually care very much about
> these kinds of space savings, but at the same time it feels more
> elegant to me. The heap TID may not have a pg_attribute entry, but
> ISTM that the on-disk representation should not have padding "in the
> wrong place", on general principle.
>
> Thoughts?

I agree with trying to avoid having padding 'in the wrong place', and if
it makes some indexes smaller, great. Even if such indexes are unlikely
to be interesting in the vast majority of cases, they may still exist
out there. Of course, this is provided that it doesn't overly complicate
the code, but it sounds like it wouldn't be too bad in this case.
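
If I'm following the arithmetic correctly, the sizes quoted above fall
out of something like the sketch below. To be clear, this is only an
illustration and not the actual nbtree code: the function names are made
up, and the 8-byte MAXALIGN quantum, the 8-byte index tuple header, the
6-byte heap TID, and the way the current layout appends a MAXALIGN()'d
TID after the already-aligned tuple are all my assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration; typical 64-bit platform. */
#define MAXIMUM_ALIGNOF     8
#define MAXALIGN(len) \
    (((size_t) (len) + (MAXIMUM_ALIGNOF - 1)) & ~((size_t) (MAXIMUM_ALIGNOF - 1)))
#define TUPLE_HEADER_BYTES  8   /* assumed index tuple header size */
#define HEAP_TID_BYTES      6   /* assumed heap TID size */

/*
 * Current layout (as assumed here): the TID goes after the padding of
 * the already-MAXALIGN()'d non-pivot tuple, in its own aligned slot.
 */
static size_t
pivot_size_current(size_t key_bytes)
{
    assert(key_bytes > 0);
    return MAXALIGN(TUPLE_HEADER_BYTES + key_bytes) + MAXALIGN(HEAP_TID_BYTES);
}

/* Proposed layout: the TID directly follows the last key attribute. */
static size_t
pivot_size_proposed_raw(size_t key_bytes)
{
    assert(key_bytes > 0);
    return TUPLE_HEADER_BYTES + key_bytes + HEAP_TID_BYTES;
}

/* Space actually occupied on the page once the tuple is aligned. */
static size_t
pivot_size_proposed(size_t key_bytes)
{
    return MAXALIGN(pivot_size_proposed_raw(key_bytes));
}
```

With one boolean key (key_bytes = 1) the proposed tuple is 15 bytes and
occupies 16, with two booleans it is exactly 16, and under these
assumptions the current layout comes to 24 in both cases.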

Thanks!

Stephen