Hi,
I noticed that effectively all indexes use the special region of a
page to store some index-specific data on the page. In all cases I've
noticed, this is a constant-sized struct, located at what is
effectively a fixed offset from the end of the page; indicated on the
page by pd_special; and accessed through PageGetSpecialPointer() (that
is about equal to `page + page->pd_special` modulo typing and
assertions).
Seeing that these indexes effectively always use a constant-sized
struct as the only data in the special region; why does this code
depend on the pd_special pointer to retrieve their PageOpaque pointer?
Wouldn't a constant offset off of (page), based on the size of the
opaque structure and BLCKSZ, be enough?
Sure, in assertion builds this also validates that pd_special indeed
points in the bounds of the page, but PageGetSpecialPointer() does not
validate that the struct itself is in the bounds of the page, so
there's little guarantee that this is actually safe.
Additionally, this introduces a layer of indirection that is not
necessarily useful: for example the_bt_step_left code has no use for
the page's header information other than that it contains the
pd_special pointer (which is effectively always BLCKSZ -
MAXALIGN(sizeof(opaque))). This introduces more data and ordering
dependencies in the CPU, where this should be as simple as a constant
offset over the base page pointer.
Assuming there's no significant reason to _not_ to change this code to
a constant offset off the page pointer and only relying on pd_special
in asserts when retrieving the IndexPageOpaques, I propose the
attached patches:
0001 adds a new macro PageGetSpecialOpaque(page, opaquedatatyp); which
replaces PageGetSpecialPointer for constant sized special area
structures, and sets up the SPGist, GIST and Gin index methods to use
that instead of PageGetSpecialPointer.
0002 replaces manual PageGetSpecialPointer calls & casts in btree code
with a new macro BTPageGetOpaque, which utilizes PageGetSpecialOpaque.
0003 does the same as 0002, but for hash indexes.
A first good reason to do this is preventing further damage when a
page is corrupted: if I can somehow overwrite pd_special,
non-assert-enabled builds would start reading and writing at arbitrary
offsets from the page pointer, quite possibly in subsequent buffers
(or worse, on the stack, in case of stack-allocated blocks).
A second reason would be less indirection to get to the opaque
pointer. This should improve performance a bit in those cases where we
(initially) only want to access the [Index]PageOpaque struct.
Lastly, 0002 and 0003 remove some repetitive tasks of dealing with the
[nbtree, hash] opaques by providing a typed accessor macro similar to
what is used in the GIN and (SP-)GIST index methods; improving the
legibility of the code and decreasing the churn.
Kind regards,
Matthias van de Meent.