
Re: [PoC] Improve dead tuple storage for lazy vacuum - Mailing list pgsql-hackers

From	Masahiko Sawada
Subject	Re: [PoC] Improve dead tuple storage for lazy vacuum
Date	July 13, 2023 11:08:38
Msg-id	CAD21AoDCTS573Tp5TnpgUDmMYeH=Xz19UabctYus_Eib0-jWQQ@mail.gmail.com
In response to	Re: [PoC] Improve dead tuple storage for lazy vacuum (John Naylor <john.naylor@enterprisedb.com>)
Responses	Re: [PoC] Improve dead tuple storage for lazy vacuum Re: [PoC] Improve dead tuple storage for lazy vacuum
List	pgsql-hackers

On Sat, Jul 8, 2023 at 11:54 AM John Naylor
<john.naylor@enterprisedb.com> wrote:
>
>
> On Fri, Jul 7, 2023 at 2:19 PM Masahiko Sawada <sawada.mshk@gmail.com> wrote:
> >
> > On Wed, Jul 5, 2023 at 8:21 PM John Naylor <john.naylor@enterprisedb.com> wrote:
> > > Well, it's going to be a bit of a mess until I can demonstrate it working (and working well) with bitmap heap
> > > scan. Fixing that now is just going to create conflicts. I do have a couple small older patches laying around that were
> > > quick experiments -- I think at least some of them should give a performance boost in loading speed, but haven't had
> > > time to test. Would you like to take a look?
> >
> > Yes, I can experiment with these patches in the meantime.
>
> Okay, here it is in v36. 0001-6 are same as v35.
>
> 0007 removes a wasted extra computation newly introduced by refactoring growing nodes. 0008 just makes 0011 nicer.
> Not worth testing by themselves, but better to be tidy.
> 0009 is an experiment to get rid of slow memmoves in node4, addressing a long-standing inefficiency. It looks a bit
> tricky, but I think it's actually straightforward after drawing out the cases with pen and paper. It works if the fanout
> is either 4 or 5, so we have some wiggle room. This may give a noticeable boost if the input is reversed or random.
> 0010 allows RT_EXTEND_DOWN to reduce function calls, so should help with sparse trees.
> 0011 reduces function calls when growing the smaller nodes. Not sure about this one -- possibly worth it for node4
> only?
>
> If these help, it'll show up more easily in smaller inputs. Large inputs tend to be more dominated by RAM latency.

Thanks for sharing the patches!

0007, 0008, 0010, and 0011 are straightforward, and I agree to merge them.

I have some questions on 0009 patch:

+       /* shift chunks and children
+
+               Unfortunately, gcc has gotten too aggressive in turning simple loops
+               into slow memmove's, so we have to be a bit more clever.
+               See https://gcc.gnu.org/bugzilla/show_bug.cgi?id=101481
+
+               We take advantage of the fact that a good
+               compiler can turn a memmove of a small constant power-of-two
+               number of bytes into a single load/store.
+       */

According to the comment, is this optimization only for gcc? And does
this change have no negative impact when building with other compilers
such as clang?

I'm not sure it's a good approach to hand-optimize the code this much
just to get better instructions out of gcc. I think this change reduces
readability and maintainability. According to the bugzilla ticket
referred to in the comment, the gcc community recognizes it as a bug,
so once it is fixed in gcc, we might no longer need this trick, no?
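
To make the trade-off concrete, here is a minimal sketch of the kind of technique the quoted comment describes. It is not the code from the 0009 patch: the Node4 struct and the node4_* functions are hypothetical names made up for illustration, and the sketch assumes a fanout of exactly 4 and a zero-initialized node, so that copying unused slots is harmless. The loop version is the simple form; the fixed version only issues memmove calls whose length is a compile-time constant power of two.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NODE4_FANOUT 4

/* Hypothetical stand-in for a radix tree node4; not the patch's struct. */
typedef struct Node4
{
	uint8_t		count;
	uint8_t		chunks[NODE4_FANOUT];
	void	   *children[NODE4_FANOUT];
} Node4;

/* Simple form: shift slots idx..count-1 up by one with a variable-length loop. */
static void
node4_shift_loop(Node4 *node, int idx)
{
	for (int i = node->count; i > idx; i--)
	{
		node->chunks[i] = node->chunks[i - 1];
		node->children[i] = node->children[i - 1];
	}
}

/*
 * Hand-optimized form: every memmove has a constant power-of-two length.
 * Slots at or beyond count hold no live data, so moving more slots than
 * strictly needed is fine as long as we stay inside the arrays.
 */
static void
node4_shift_fixed(Node4 *node, int idx)
{
	switch (idx)
	{
		case 0:
			/* move slot 2 first, then slots 0..1 as one two-element block */
			node->chunks[3] = node->chunks[2];
			node->children[3] = node->children[2];
			memmove(&node->chunks[1], &node->chunks[0], 2 * sizeof(uint8_t));
			memmove(&node->children[1], &node->children[0], 2 * sizeof(void *));
			break;
		case 1:
			memmove(&node->chunks[2], &node->chunks[1], 2 * sizeof(uint8_t));
			memmove(&node->children[2], &node->children[1], 2 * sizeof(void *));
			break;
		case 2:
			node->chunks[3] = node->chunks[2];
			node->children[3] = node->children[2];
			break;
		default:
			/* idx == 3: appending, nothing to shift */
			break;
	}
}

/* Insert chunk/child at position idx; the node must not be full. */
static void
node4_insert(Node4 *node, int idx, uint8_t chunk, void *child)
{
	assert(node->count < NODE4_FANOUT);
	assert(idx >= 0 && idx <= node->count);

	node4_shift_fixed(node, idx);
	node->chunks[idx] = chunk;
	node->children[idx] = child;
	node->count++;
}
```

Both shift functions leave the live slots in the same state; the second one just spells out each case by hand, which is the readability cost I'm asking about.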

Regards,

--
Masahiko Sawada
Amazon Web Services: https://aws.amazon.com
