Thread: Issue in _bt_getrootheight
Hi everyone,
We have been working on the pg_adviser extension whose goal is to suggest indexes by creating virtual/hypothetical indexes and see how it affects the query cost.But the problem comes from here when the function get_relation_info is called in planning stage, it tries to calculate the B-Tree height by calling function _bt_getrootheight, but the B-Tree is not built at all, and its metadata page (which is block 0 in our case) doesn't exist, so this returns error that it cannot read the page (since it doesn't exist).
I tried to debug the code and found that this feature was introduced in version 9.3 under this commit [1]. I think that in the code we need to check if it's a B-Tree index AND the index is built/have some pages, then we can go and calculate it otherwise just put it to -1
I mean instead of this
if (info->relam == BTREE_AM_OID)
{
/* For btrees, get tree height while we have the index open */
info->tree_height = _bt_getrootheight(indexRelation);
}
else
{
/* For other index types, just set it to "unknown" for now */
info->tree_height = -1;
}
The first line should be
if (info->relam == BTREE_AM_OID && info->pages > 0)
or use the storage manager (smgr) to know if the first block exists.
I would appreciate it if anyone can agree/approve or deny so that I know if anything I am missing :)
Thanks everyone :)
On Tue, Jul 11, 2023 at 9:35 AM Ahmed Ibrahim <ahmed.ibr.hashim@gmail.com> wrote: > > We have been working on the pg_adviser extension whose goal is to suggest indexes by creating virtual/hypothetical indexesand see how it affects the query cost. > > The hypothetical index shouldn't take any space on the disk (allocates 0 pages) so we give it the flag INDEX_CREATE_SKIP_BUILD. > But the problem comes from here when the function get_relation_info is called in planning stage, it tries to calculatethe B-Tree height by calling function _bt_getrootheight, but the B-Tree is not built at all, and its metadata page(which is block 0 in our case) doesn't exist, so this returns error that it cannot read the page (since it doesn't exist). > > I tried to debug the code and found that this feature was introduced in version 9.3 under this commit [1]. I think thatin the code we need to check if it's a B-Tree index AND the index is built/have some pages, then we can go and calculateit otherwise just put it to -1 > I mean instead of this > if (info->relam == BTREE_AM_OID) > { > /* For btrees, get tree height while we have the index open */ > info->tree_height = _bt_getrootheight(indexRelation); > } > else > { > /* For other index types, just set it to "unknown" for now */ > info->tree_height = -1; > } > > The first line should be > if (info->relam == BTREE_AM_OID && info->pages > 0) > or use the storage manager (smgr) to know if the first block exists. I think the better method would be to calculate the index height *after* get_relation_info_hook is called. That way, instead of the server guessing whether or not an index is hypothetical it can rely on the index adviser's notion of which index is hypothetical. The hook implementer has the opportunity to not only mark the indexOptInfo->hypothetical = true, but also calculate the tree height, if they can. Please see attached the patch that does this. Let me know if this patch helps. Best regards, Gurjeet http://Gurje.et
Attachment
Gurjeet Singh <gurjeet@singh.im> writes: > Please see attached the patch that does this. Let me know if this patch helps. I don't like this patch one bit, because it adds a lot of overhead (i.e., an extra index_open/close cycle for every btree index in every query) to support a tiny minority use-case. How come we don't already know whether the index is hypothetical at the point where _bt_getrootheight is called now? Actually, looking at the existing comment at the call site: /* * Allow a plugin to editorialize on the info we obtained from the * catalogs. Actions might include altering the assumed relation size, * removing an index, or adding a hypothetical index to the indexlist. */ if (get_relation_info_hook) (*get_relation_info_hook) (root, relationObjectId, inhparent, rel); reminds me that the design intention was that hypothetical indexes would get added to the list by get_relation_info_hook itself. If that's not how the index adviser is operating, maybe we need to have a discussion about that. regards, tom lane
On Fri, Jul 21, 2023 at 10:42 AM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > Gurjeet Singh <gurjeet@singh.im> writes: > > Please see attached the patch that does this. Let me know if this patch helps. > > I don't like this patch one bit, because it adds a lot of overhead > (i.e., an extra index_open/close cycle for every btree index in every > query) to support a tiny minority use-case. I anticipated the patch's performance impact may be a concern, but before addressing it I wanted to see if the patch actually helped Index Adviser. Ahmed has confirmed that my proposed patch works for him. I believe the additional index_open() would not affect the performance significantly, since the very same indexes were index_open()ed just before calling the get_relation_info_hook. All the relevant caches would be quite fresh because of the index_open() in the same function above. And since the locks taken on these indexes haven't been released, we don't have to work hard to take any new locks (hence the index_open() with NoLock flag). > How come we don't > already know whether the index is hypothetical at the point where > _bt_getrootheight is called now? Because the 'hypthetical' flag is not stored in catalogs, and that's okay; see below. At that point, the only indication that an index may be a hypothetical index is if RelationGetNumberOfBlocks() returns 0 for it, and that's what Ahmed's proposed patch relied on. But I think extrapolating that info->pages==0 implies it's a hypothetical index, is stretching that assumption too far. > Actually, looking at the existing comment at the call site: > > /* > * Allow a plugin to editorialize on the info we obtained from the > * catalogs. Actions might include altering the assumed relation size, > * removing an index, or adding a hypothetical index to the indexlist. > */ > if (get_relation_info_hook) > (*get_relation_info_hook) (root, relationObjectId, inhparent, rel); > > reminds me that the design intention was that hypothetical indexes > would get added to the list by get_relation_info_hook itself. > If that's not how the index adviser is operating, maybe we need > to have a discussion about that. Historically, to avoid having to hand-create the IndexOptInfo and risk getting something wrong, the Index Adviser has used index_create() to create a full-blown btree index, (sans that actual build step, with skip_build = true), and saving the returned OID. This ensured that all the catalog entries were in place before it called the standard_planner(). This way Postgres would build IndexOptInfo from the entries in the catalog, as usual. Then, inside the get_relation_info_hook() callback, Index Adviser identifies these virtual indexes by their OID, and at that point marks them with hypothetical=true. After planning is complete, the Index Adviser scans the plan to find any IndexScan objects that have indexid matching the saved OIDs. Index Adviser performs the whole thing in a subtransaction, which gets rolled back. So the hypothetical indexes are not visible to any other transaction, ever. Assigning OID to a hypothetical index is necessary, and I believe index_create() is the right way to do it. In fact, in the 9.1 cycle there was a bug fixed (a2095f7fb5, where the hypothetical flag was also invented), to solve precisely this problem; to allow the Index Adviser to use OIDs to identify hypothetical indexes that were used/chosen by the planner. But now I believe this architecture of the Index Adviser needs to change, primarily to alleviate the performance impact of creating catalog entries, subtransaction overhead, and the catalog bloat caused by index_create() (and then rolling back the subtransaction). As part of this architecture change, the Index Adviser will have to cook up IndexOptInfo objects and append them to the relation. And that should line up with the design intention you mention. But the immediate blocker is how to assign OIDs to the hypothetical indexes so that all hypothetical indexes chosen by the planner can be identified by the Index Adviser. I'd like the Index Adviser to work on read-only /standby nodes as well, but that wouldn't be possible because calling GetNewObjectId() is not allowed during recovery. I see that HypoPG uses a neat little hack [1]. Perhaps Index Adviser will also have to resort to that trick. [1]: hypo_get_min_fake_oid() finds the usable oid range below FirstNormalObjectId https://github.com/HypoPG/hypopg/blob/57d832ce7a2937fe7d42b113c7e95dd1f129795b/hypopg.c#L458 Best regards, Gurjeet http://Gurje.et