Re: [PoC] Improve dead tuple storage for lazy vacuum - Mailing list pgsql-hackers

From John Naylor
Subject Re: [PoC] Improve dead tuple storage for lazy vacuum
Msg-id CANWCAZbeQnXH2yr32iW=+9s+WJUi7+dbyr7pmJ8OJ51+ObeUyQ@mail.gmail.com
In response to Re: [PoC] Improve dead tuple storage for lazy vacuum  (Masahiko Sawada <sawada.mshk@gmail.com>)
Responses Re: [PoC] Improve dead tuple storage for lazy vacuum
List pgsql-hackers
On Tue, Mar 19, 2024 at 10:24 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
>
> On Tue, Mar 19, 2024 at 8:35 AM John Naylor <johncnaylorls@gmail.com> wrote:
> >
> > On Mon, Mar 18, 2024 at 11:12 AM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> > >
> > > On Sun, Mar 17, 2024 at 11:46 AM John Naylor <johncnaylorls@gmail.com> wrote:

> > It might also be worth reducing the number of blocks in the random
> > test -- multiple runs will have different offsets anyway.
>
> Yes. If we reduce the number of blocks from 1000 to 100, the
> regression test takes the following time in my environment:
>
> 1000 blocks : 516 ms
> 100 blocks   : 228 ms

Sounds good.

> Removed some unnecessary variables in 0002 patch.

Looks good.

> So the MaxBlocktableEntrySize calculation would be as follows?
>
> #define MaxBlocktableEntrySize \
>     (offsetof(BlocktableEntry, words) + \
>         (sizeof(bitmapword) * \
>         WORDS_PER_PAGE(Min(MaxOffsetNumber, \
>                            BITS_PER_BITMAPWORD * PG_INT8_MAX - 1))))
>
> I've made this change in the 0003 patch.

This is okay, but one side effect is that we now have both an assert
and an elog, each enforcing a different limit. I think we'll need a
separate #define to help. For now, though, I don't want to hold up
tidstore over this, because almost everything else in v74 is in
pretty good shape. I'll save this for later as part of the
optimization I proposed.
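
To sketch what I mean (the macro name below is purely illustrative,
not something in the patch), the limit could be defined once and
referenced by both checks:

#define MAX_OFFSET_BITMAP_BITS \
    Min(MaxOffsetNumber, BITS_PER_BITMAPWORD * PG_INT8_MAX - 1)

#define MaxBlocktableEntrySize \
    (offsetof(BlocktableEntry, words) + \
     sizeof(bitmapword) * WORDS_PER_PAGE(MAX_OFFSET_BITMAP_BITS))

That way the assert and the elog can't silently drift apart.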

Remaining things I noticed:

+#define RT_PREFIX local_rt
+#define RT_PREFIX shared_rt

Prefixes for simplehash, for example, don't include "sh" -- maybe
"local_ts"/"shared_ts" here.
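
For context, the prefix becomes part of every generated symbol, so
tidstore.c would instantiate the template along these lines (a rough
sketch; the exact set of RT_* parameters is from memory and may not
match the patch):

#define RT_PREFIX local_ts
#define RT_SCOPE static
#define RT_DECLARE
#define RT_DEFINE
#define RT_VALUE_TYPE BlocktableEntry
#include "lib/radixtree.h"

which would generate local_ts_create(), local_ts_set(), and so on.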

+ /* MemoryContext where the radix tree uses */

s/where/that/

+/*
+ * Lock support functions.
+ *
+ * We can use the radix tree's lock for shared TidStore as the data we
+ * need to protect is only the shared radix tree.
+ */
+void
+TidStoreLockExclusive(TidStore *ts)

The comment covers multiple things, so maybe add a blank line after it.
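
For what it's worth, I read these as thin wrappers along the
following lines (the field and function names here are my guesses at
the patch's spelling):

void
TidStoreLockExclusive(TidStore *ts)
{
    /* only the shared radix tree needs locking */
    if (TidStoreIsShared(ts))
        shared_ts_lock_exclusive(ts->tree.shared);
}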

With those, I think you can go ahead and squash all the tidstore
patches except for 0003 and commit it.

> While reviewing the vacuum patch, I realized that we always pass
> LWTRANCHE_SHARED_TIDSTORE to RT_CREATE(), so the wait event related
> to the tidstore is always the same. I think it would be better to
> make the caller of TidStoreCreate() specify the tranche_id and pass
> it to RT_CREATE(). That way, the caller can specify its own wait
> event for the tidstore. The 0008 patch tries this idea; dshash.c
> does the same thing.

Sounds reasonable. I'll just note that src/include/storage/lwlock.h
still has an entry for LWTRANCHE_SHARED_TIDSTORE.
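
For the record, the caller side I'd expect with this change looks
roughly like this (the tranche name and variable names are
illustrative only):

    int     tranche_id = LWLockNewTrancheId();

    LWLockRegisterTranche(tranche_id, "VacuumDeadTidStore");

    /* the radix tree's lock then reports the caller's wait event */
    ts = TidStoreCreate(max_bytes, dsa, tranche_id);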


