Re: _mdfd_getseg can be expensive - Mailing list pgsql-hackers

From	Peter Geoghegan
Subject	Re: _mdfd_getseg can be expensive
Date	July 1, 2016 21:18:12
Msg-id	CAM3SWZR3ScU+NhNnL-sEsyDadV0DfyoyKS+Wu0rBdbq2E2M3_g@mail.gmail.com
In response to	Re: _mdfd_getseg can be expensive (Andres Freund <andres@anarazel.de>)
List	pgsql-hackers

On Thu, Jun 30, 2016 at 7:08 PM, Andres Freund <andres@anarazel.de> wrote:
> If you have a big enough index (maybe ~150GB+), sure. Before that,
> probably not.
>
> It's usually pretty easy to see in cpu profiles whether this issue
> exists.

I think that this is a contributing factor to why merging in parallel
CREATE INDEX becomes much more CPU bound when building such very large
indexes, which Corey Huinker has benchmarked using an advanced copy of
the patch. He has shown cases that are sped up by 3.6x when 8 parallel
workers are used (compared to a serial CREATE INDEX), but a several
hundred gigabyte index case only sees a speedup of about 1.5x. (This
bottleneck affects serial CREATE INDEX merging just as much as
parallel, since that part isn't parallelized, but it's far more
noticeable with parallel CREATE INDEX simply because merging in the
leader becomes a huge bottleneck.)
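
As a rough illustration of the reasoning above, Amdahl's law can be inverted to estimate how much of each build must have been parallelizable to produce the reported speedups. This is only a sketch under loud assumptions: the function names are hypothetical, the model ignores everything except a single serial portion (such as the leader's merge), and the worker count for the 1.5x case is not stated in the message, so 8 is assumed purely for comparison.

```python
def amdahl_speedup(parallel_fraction: float, workers: int) -> float:
    """Speedup predicted by Amdahl's law for a given parallelizable fraction."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / workers)


def implied_parallel_fraction(speedup: float, workers: int) -> float:
    """Invert Amdahl's law: 1/S = 1 - p * (1 - 1/n), solved for p."""
    return (1.0 - 1.0 / speedup) / (1.0 - 1.0 / workers)


if __name__ == "__main__":
    # 3.6x with 8 workers is taken from the message.
    # ASSUMPTION: the 1.5x case also used 8 workers (not stated in the message).
    for label, speedup in (("smaller index", 3.6), ("very large index", 1.5)):
        p = implied_parallel_fraction(speedup, workers=8)
        print(f"{label}: speedup {speedup}x implies parallel fraction {p:.3f}")
```

Under these assumptions the 3.6x case implies that roughly 83% of the work was parallelizable, while the 1.5x case implies only about 38%, which is consistent with a serial merge step taking up a much larger share of the very large build.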

Those two cases are not exactly comparable, and may differ in several
other ways, but even so my sense is that the weaker speedup can be at
least partially explained by md.c bottlenecks. This is something that we'll
need to confirm through profiling. Hopefully it's just this one
bottleneck.
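
To give a sense of why index size matters for an md.c bottleneck, the sketch below models the cost of locating a relation segment. It is not PostgreSQL code. It assumes default build settings (8kB blocks, 1GB segments) and assumes that a segment lookup walks a chain of open segments starting from the first one, so the cost grows with the segment number; that is the behaviour the thread subject refers to, but treat it here as an assumption. All function names are hypothetical.

```python
BLCKSZ = 8192          # assumed default PostgreSQL block size, in bytes
RELSEG_SIZE = 131072   # assumed default blocks per segment (1GB / 8kB)


def segment_count(relation_bytes: int) -> int:
    """Number of segment files needed for a relation of the given size."""
    segment_bytes = BLCKSZ * RELSEG_SIZE
    return -(-relation_bytes // segment_bytes)  # ceiling division


def chain_hops(block_number: int) -> int:
    """Hops needed to reach a block's segment when walking from segment 0."""
    return block_number // RELSEG_SIZE


def average_hops(segments: int) -> float:
    """Mean hops per lookup if block accesses are spread evenly over segments."""
    return sum(range(segments)) / segments


if __name__ == "__main__":
    gib = 1024 ** 3
    for size_gib in (10, 150, 500):
        n = segment_count(size_gib * gib)
        print(f"{size_gib} GiB: {n} segments, "
              f"about {average_hops(n):.1f} hops per block lookup on average")
```

In this model a 10 GiB index costs only a few hops per lookup, while a 150 GiB index averages about 75, which matches the observation quoted above that the problem tends to show up only for indexes of roughly that size or larger.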

--
Peter Geoghegan
