Thread: Issue in _bt_getrootheight

Issue in _bt_getrootheight

From

Ahmed Ibrahim

Date:

11 July 2023, 16:35:14

Hi everyone,

We have been working on the pg_adviser extension whose goal is to suggest indexes by creating virtual/hypothetical indexes and see how it affects the query cost.

The hypothetical index shouldn't take any space on the disk (allocates 0 pages) so we give it the flag INDEX_CREATE_SKIP_BUILD.

But the problem comes from here when the function get_relation_info is called in planning stage, it tries to calculate the B-Tree height by calling function _bt_getrootheight, but the B-Tree is not built at all, and its metadata page (which is block 0 in our case) doesn't exist, so this returns error that it cannot read the page (since it doesn't exist).

I tried to debug the code and found that this feature was introduced in version 9.3 under this commit [1]. I think that in the code we need to check if it's a B-Tree index AND the index is built/have some pages, then we can go and calculate it otherwise just put it to -1

I mean instead of this

if (info->relam == BTREE_AM_OID)

{

/* For btrees, get tree height while we have the index open */

info->tree_height = _bt_getrootheight(indexRelation);

}

else

{

/* For other index types, just set it to "unknown" for now */

info->tree_height = -1;

}

The first line should be

if (info->relam == BTREE_AM_OID && info->pages > 0)

or use the storage manager (smgr) to know if the first block exists.

I would appreciate it if anyone can agree/approve or deny so that I know if anything I am missing :)

Thanks everyone :)

[1] https://github.com/postgres/postgres/commit/31f38f28b00cbe2b9267205359e3cf7bafa1cb97

Re: Issue in _bt_getrootheight

From

Gurjeet Singh

Date:

20 July 2023, 03:45:41

On Tue, Jul 11, 2023 at 9:35 AM Ahmed Ibrahim
<ahmed.ibr.hashim@gmail.com> wrote:
>
> We have been working on the pg_adviser extension whose goal is to suggest indexes by creating virtual/hypothetical
indexesand see how it affects the query cost. 
>
> The hypothetical index shouldn't take any space on the disk (allocates 0 pages) so we give it the flag
INDEX_CREATE_SKIP_BUILD.
> But the problem comes from here when the function get_relation_info is called in planning stage, it tries to
calculatethe B-Tree height by calling function _bt_getrootheight, but the B-Tree is not built at all, and its metadata
page(which is block 0 in our case) doesn't exist, so this returns error that it cannot read the page (since it doesn't
exist).
>
> I tried to debug the code and found that this feature was introduced in version 9.3 under this commit [1]. I think
thatin the code we need to check if it's a B-Tree index AND the index is built/have some pages, then we can go and
calculateit otherwise just put it to -1 

> I mean instead of this
> if (info->relam == BTREE_AM_OID)
> {
>     /* For btrees, get tree height while we have the index open */
>     info->tree_height = _bt_getrootheight(indexRelation);
> }
> else
> {
>      /* For other index types, just set it to "unknown" for now */
>     info->tree_height = -1;
> }
>
> The first line should be
> if (info->relam == BTREE_AM_OID && info->pages > 0)
> or use the storage manager (smgr) to know if the first block exists.

I think the better method would be to calculate the index height
*after* get_relation_info_hook is called. That way, instead of the
server guessing whether or not an index is hypothetical it can rely on
the index adviser's notion of which index is hypothetical. The hook
implementer has the opportunity to not only mark the
indexOptInfo->hypothetical = true, but also calculate the tree height,
if they can.

Please see attached the patch that does this. Let me know if this patch helps.

Best regards,
Gurjeet
http://Gurje.et

Attachment

calculate-index-height-after-calling-get_relation_info_hook.patch

Re: Issue in _bt_getrootheight

From

Tom Lane

Date:

21 July 2023, 17:41:56

Gurjeet Singh <gurjeet@singh.im> writes:
> Please see attached the patch that does this. Let me know if this patch helps.

I don't like this patch one bit, because it adds a lot of overhead
(i.e., an extra index_open/close cycle for every btree index in every
query) to support a tiny minority use-case.  How come we don't
already know whether the index is hypothetical at the point where
_bt_getrootheight is called now?

Actually, looking at the existing comment at the call site:

    /*
     * Allow a plugin to editorialize on the info we obtained from the
     * catalogs.  Actions might include altering the assumed relation size,
     * removing an index, or adding a hypothetical index to the indexlist.
     */
    if (get_relation_info_hook)
        (*get_relation_info_hook) (root, relationObjectId, inhparent, rel);

reminds me that the design intention was that hypothetical indexes
would get added to the list by get_relation_info_hook itself.
If that's not how the index adviser is operating, maybe we need
to have a discussion about that.

            regards, tom lane

Re: Issue in _bt_getrootheight

From

Gurjeet Singh

Date:

24 July 2023, 10:33:58

On Fri, Jul 21, 2023 at 10:42 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:
>
> Gurjeet Singh <gurjeet@singh.im> writes:
> > Please see attached the patch that does this. Let me know if this patch helps.
>
> I don't like this patch one bit, because it adds a lot of overhead
> (i.e., an extra index_open/close cycle for every btree index in every
> query) to support a tiny minority use-case.

I anticipated the patch's performance impact may be a concern, but
before addressing it I wanted to see if the patch actually helped
Index Adviser. Ahmed has confirmed that my proposed patch works for
him.

I believe the additional index_open() would not affect the performance
significantly, since the very same indexes were index_open()ed just
before calling the get_relation_info_hook. All the relevant caches
would be quite fresh because of the index_open() in the same function
above. And since the locks taken on these indexes haven't been
released, we don't have to work hard to take any new locks (hence the
index_open() with NoLock flag).

> How come we don't
> already know whether the index is hypothetical at the point where
> _bt_getrootheight is called now?

Because the 'hypthetical' flag is not stored in catalogs, and that's
okay; see below.

At that point, the only indication that an index may be a hypothetical
index is if RelationGetNumberOfBlocks() returns 0 for it, and that's
what Ahmed's proposed patch relied on. But I think extrapolating that
info->pages==0 implies it's a hypothetical index, is stretching that
assumption too far.

> Actually, looking at the existing comment at the call site:
>
>     /*
>      * Allow a plugin to editorialize on the info we obtained from the
>      * catalogs.  Actions might include altering the assumed relation size,
>      * removing an index, or adding a hypothetical index to the indexlist.
>      */
>     if (get_relation_info_hook)
>         (*get_relation_info_hook) (root, relationObjectId, inhparent, rel);
>
> reminds me that the design intention was that hypothetical indexes
> would get added to the list by get_relation_info_hook itself.
> If that's not how the index adviser is operating, maybe we need
> to have a discussion about that.

Historically, to avoid having to hand-create the IndexOptInfo and risk
getting something wrong, the Index Adviser has used index_create() to
create a full-blown btree index, (sans that actual build step, with
skip_build = true), and saving the returned OID. This ensured that all
the catalog entries were in place before it called the
standard_planner(). This way Postgres would build IndexOptInfo from
the entries in the catalog, as usual. Then, inside the
get_relation_info_hook() callback, Index Adviser identifies these
virtual indexes by their OID, and at that point marks them with
hypothetical=true.

After planning is complete, the Index Adviser scans the plan to find
any IndexScan objects that have indexid matching the saved OIDs.

Index Adviser performs the whole thing in a subtransaction, which gets
rolled back. So the hypothetical indexes are not visible to any other
transaction, ever.

Assigning OID to a hypothetical index is necessary, and I believe
index_create() is the right way to do it. In fact, in the 9.1 cycle
there was a bug fixed (a2095f7fb5, where the hypothetical flag was
also invented), to solve precisely this problem; to allow the Index
Adviser to use OIDs to identify hypothetical indexes that were
used/chosen by the planner.

But now I believe this architecture of the Index Adviser needs to
change, primarily to alleviate the performance impact of creating
catalog entries, subtransaction overhead, and the catalog bloat caused
by index_create() (and then rolling back the subtransaction). As part
of this architecture change, the Index Adviser will have to cook up
IndexOptInfo objects and append them to the relation. And that should
line up with the design intention you mention.

But the immediate blocker is how to assign OIDs to the hypothetical
indexes so that all hypothetical indexes chosen by the planner can be
identified by the Index Adviser. I'd like the Index Adviser to work on
read-only /standby nodes as well, but that wouldn't be possible
because calling GetNewObjectId() is not allowed during recovery. I see
that HypoPG uses a neat little hack [1]. Perhaps Index Adviser will
also have to resort to that trick.

[1]: hypo_get_min_fake_oid() finds the usable oid range below
FirstNormalObjectId
https://github.com/HypoPG/hypopg/blob/57d832ce7a2937fe7d42b113c7e95dd1f129795b/hypopg.c#L458

Best regards,
Gurjeet
http://Gurje.et