Re: why do hash index builds use smgrextend() for new splitpoint pages - Mailing list pgsql-hackers

From	Melanie Plageman
Subject	Re: why do hash index builds use smgrextend() for new splitpoint pages
Date	February 26, 2022 00:31:23
Msg-id	CAAKRu_ZnNM-FAYNOsgFD6JT9_c0Dc5b61atykpy_9sAhevLh9g@mail.gmail.com
In response to	Re: why do hash index builds use smgrextend() for new splitpoint pages (Amit Kapila <amit.kapila16@gmail.com>)
Responses	Re: why do hash index builds use smgrextend() for new splitpoint pages (Amit Kapila <amit.kapila16@gmail.com>)
List	pgsql-hackers

On Thu, Feb 24, 2022 at 10:24 PM Amit Kapila <amit.kapila16@gmail.com> wrote:
>
> On Fri, Feb 25, 2022 at 4:41 AM Melanie Plageman
> <melanieplageman@gmail.com> wrote:
> >
> > I'm trying to understand why hash indexes are built primarily in shared
> > buffers except when allocating a new splitpoint's worth of bucket pages
> > -- which is done with smgrextend() directly in _hash_alloc_buckets().
> >
> > Is this just so that the value returned by smgrnblocks() includes the
> > new splitpoint's worth of bucket pages?
> >
> > All writes of tuple data to pages in this new splitpoint will go
> > through shared buffers (via _hash_getnewbuf()).
> >
> > I asked this and got some thoughts from Robert in [1], but I still don't
> > really get it.
> >
> > When a new page is needed during the hash index build, why can't
> > _hash_expandtable() just call ReadBufferExtended() with P_NEW instead of
> > _hash_getnewbuf()? Does it have to do with the BUCKET_TO_BLKNO mapping?
> >
>
> We allocate the chunk of pages (power-of-2 groups) at the time of
> split which allows them to appear consecutively in an index. This
> helps us to compute the physical block number from bucket number
> easily (BUCKET_TO_BLKNO mapping) with some minimal control
> information.

Got it, thanks.
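
To make the mapping concrete, here is a minimal toy sketch of how a block
number can be computed from a bucket number when bucket pages are allocated
in consecutive power-of-2 groups. It is modeled loosely on the shape of
BUCKET_TO_BLKNO but is not the PostgreSQL source: ToyMeta, ceil_log2() and
toy_bucket_to_blkno() are hypothetical names, and the meaning given to
spares[] here (overflow pages physically preceding a group's bucket pages)
is an assumption of the sketch.

```c
#include <stdint.h>

#define TOY_MAX_SPLITPOINTS 32

/* Hypothetical, simplified stand-in for the control data in the metapage. */
typedef struct ToyMeta
{
    /*
     * Assumption of this sketch: spares[g - 1] is the total number of
     * overflow pages that physically precede the bucket pages of group g,
     * where group g holds buckets 2^(g-1) .. 2^g - 1.
     */
    uint32_t spares[TOY_MAX_SPLITPOINTS];
} ToyMeta;

/* Smallest i such that 2^i >= num. */
static uint32_t
ceil_log2(uint32_t num)
{
    uint32_t i = 0;
    uint32_t limit = 1;

    while (limit < num)
    {
        limit <<= 1;
        i++;
    }
    return i;
}

/*
 * Map a bucket number to a block number. Because each group's bucket pages
 * are consecutive, the only extra information needed is how many overflow
 * pages were placed before that group.
 */
static uint32_t
toy_bucket_to_blkno(const ToyMeta *meta, uint32_t bucket)
{
    uint32_t overflow_before = 0;

    if (bucket != 0)
        overflow_before = meta->spares[ceil_log2(bucket + 1) - 1];

    /* +1 skips the metapage, assumed to live at block 0. */
    return bucket + overflow_before + 1;
}
```

With spares = {0, 2, 5}, buckets 2 and 3 land on blocks 5 and 6, and
buckets 4 through 7 land on blocks 10 through 13. If pages were instead
added one at a time wherever the relation happened to end, overflow pages
could be interleaved with a group's bucket pages and a single per-group
counter would no longer be enough.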

Since _hash_alloc_buckets() WAL-logs the last page of the
splitpoint, is it safe to skip the smgrimmedsync()? Suppose the last
page of the splitpoint never has any tuples added to it during the
index build, the redo pointer moves past the WAL record for this page,
and a crash then occurs before the page makes it to permanent storage.
Does it matter that this page is lost? If not, then why bother
WAL-logging it?

- Melanie
