Re: Yet another fast GiST build - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Re: Yet another fast GiST build
Date
Msg-id 282f7007-326a-f3d6-7394-0e5ea52df84f@iki.fi
Whole thread Raw
In response to Re: Yet another fast GiST build  ("Andrey M. Borodin" <x4mmm@yandex-team.ru>)
Responses Re: Yet another fast GiST build  (Pavel Borisov <pashkin.elfe@gmail.com>)
List pgsql-hackers
On 29/09/2020 21:04, Andrey M. Borodin wrote:
> 28 сент. 2020 г., в 13:12, Heikki Linnakangas <hlinnaka@iki.fi> написал(а):
>> I did some testing with your patch. It seems that the rightlinks
>> are still not always set. I didn't try to debug why.
> 
> Yes, there is a bug. Now it seems to me so obvious, yet it took some 
> time to understand that links were shifted by one extra jump. PFA 
> fixed rightlinks installation.
Ah, that was simple. I propose adding a comment on it:

--- a/src/backend/access/gist/gistbuild.c
+++ b/src/backend/access/gist/gistbuild.c
@@ -540,6 +540,19 @@ gist_indexsortbuild_pagestate_flush(GISTBuildState 
*state,
      /* Re-initialize the page buffer for next page on this level. */
      pagestate->page = palloc(BLCKSZ);
      gistinitpage(pagestate->page, isleaf ? F_LEAF : 0);
+
+    /*
+     * Set the right link to point to the previous page. This is just for
+     * debugging purposes: GiST only follows the right link if a page is split
+     * concurrently to a scan, and that cannot happen during index build.
+     *
+     * It's a bit counterintuitive that we set the right link on the new page
+     * to point to the previous page, and not the other way round. But GiST
+     * pages are not ordered like B-tree pages are, so as long as the
+     * right-links form a chain through all the pages in the same level, the
+     * order doesn't matter.
+     */
+    GistPageGetOpaque(pagestate->page)->rightlink = blkno;
  }

> BTW some one more small thing: we initialise page buffers with
> palloc() and palloc0(), while first one is sufficient for
> gistinitpage().
Hmm. Only the first one, in gist_indexsortbuild(), but that needs to be 
palloc0, because it's used to write an all-zeros placeholder for the 
root page.

> Also I'm working on btree_gist opclasses and found out that new
> sortsupport function is not mentioned in gistadjustmembers(). I think
> this can be fixed with gist_btree patch.

Thanks!

> Your pageinspect patch seems very useful. How do you think, should we
> provide a way to find invalid tuples in GiST within
> gist_page_items()? At some point we will have to ask user to reindex
> GiSTs with invalid tuples.
You mean invalid tuples created by crash on PostgreSQL version 9.0 or 
below, and pg_upgraded? I doubt there are any of those still around in 
the wild. We have to keep the code to detect them, though.

It would be nice to improve gist_page_items() to display more 
information about the items, although I wouldn't worry much about 
invalid tuples. The 'gevel' extension that Oleg mentioned upthread does 
more, it would be nice to incorporate that into pageinspect somehow.

- Heikki



pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Dumping/restoring fails on inherited generated column
Next
From: David Rowley
Date:
Subject: Re: Planner making bad choice in alternative subplan decision