Home > mailing lists

Re: track_planning causing performance regression - Mailing list pgsql-hackers

From	Fujii Masao
Subject	Re: track_planning causing performance regression
Date	June 30, 2020 08:43:39
Msg-id	5725cc95-6aea-d3a8-6ea5-36acf780a55d@oss.nttdata.com Whole thread Raw
In response to	Re: track_planning causing performance regression (Ants Aasma <ants@cybertec.at>)
Responses	Re: track_planning causing performance regression Re: track_planning causing performance regression
List	pgsql-hackers

Tree view


On 2020/06/29 22:23, Ants Aasma wrote:
> On Mon, 29 Jun 2020 at 12:17, Julien Rouhaud <rjuju123@gmail.com <mailto:rjuju123@gmail.com>> wrote:
> 
>     On Mon, Jun 29, 2020 at 10:55 AM Fujii Masao
>     <masao.fujii@oss.nttdata.com <mailto:masao.fujii@oss.nttdata.com>> wrote:
>      >
>      > On 2020/06/29 16:05, Julien Rouhaud wrote:
>      > > On Mon, Jun 29, 2020 at 7:49 AM Tharakan, Robins <tharar@amazon.com <mailto:tharar@amazon.com>> wrote:
>      > >>
>      > >> During fully-cached SELECT-only test using pgbench, Postgres v13Beta1 shows
>      >
>      > Thanks for the benchmark!
>      >
>      >
>      > >> ~45% performance drop [2] at high DB connection counts (when compared with v12.3)
>      >
>      > That's bad :(
>      >
>      >
>      > >>
>      > >> Disabling pg_stat_statements.track_planning (which is 'On' by default)
>      > >> brings the TPS numbers up to v12.3 levels.
>      > >>
>      > >> The inflection point (in this test-case) is 128 Connections, beyond which the
>      > >> TPS numbers are consistently low. Looking at the mailing list [1], this issue
>      > >> didn't surface earlier possibly since the regression is trivial at low connection counts.
>      > >>
>      > >> It would be great if this could be optimized further, or track_planning
>      > >> disabled (by default) so as to not trip users upgrading from v12 with pg_stat_statement
>      > >> enabled (but otherwise not particularly interested in track_planning).
>      >
>      > Your benchmark result seems to suggest that the cause of the problem is
>      > the contention of per-query spinlock in pgss_store(). Right?
>      > This lock contention is likely to happen when multiple sessions run
>      > the same queries.
>      >
>      > One idea to reduce that lock contention is to separate per-query spinlock
>      > into two; one is for planning, and the other is for execution. pgss_store()
>      > determines which lock to use based on the given "kind" argument.
>      > To make this idea work, also every pgss counters like shared_blks_hit
>      > need to be separated into two, i.e., for planning and execution.
> 
>     This can probably remove some overhead, but won't it eventually hit
>     the same issue when multiple connections try to plan the same query,
>     given the number of different queries and very low execution runtime?
>     It'll also quite increase the shared memory consumption.
> 
>     I'm wondering if we could instead use atomics to store the counters.
>     The only downside is that we won't guarantee per-row consistency
>     anymore, which may be problematic.
> 
> 
> 
> The problem looks to be that spinlocks are terrible with overloaded CPU and a contended spinlock. A process holding
thespinlock might easily get scheduled out leading to excessive spinning by everybody. I think a simple thing to try
wouldbe to replace the spinlock with LWLock.
 

Yes. Attached is the POC patch that replaces per-counter spinlock with LWLock.

> 
> I did a prototype patch that replaces spinlocks with futexes, but was not able to find a workload where it mattered.

I'm not familiar with futex, but could you tell me why you used futex instead
of LWLock that we already have? Is futex portable?

> We have done a great job at eliminating spinlocks from contended code paths. Robins, perhaps you could try it to see
ifit reduces the regression you are observing.
 

Yes. Also we need to check that this change doesn't increase performance
overhead in other workloads.

Regards,

-- 
Fujii Masao
Advanced Computing Technology Center
Research and Development Headquarters
NTT DATA CORPORATION

Attachment

pgss_lwlock_v1.patch

pgsql-hackers by date:

From: Dipesh Pandit
Date: 30 June 2020, 08:15:43
Subject: Re: refactoring basebackup.c

From: Amit Langote
Date: 30 June 2020, 08:53:55
Subject: Re: POC: postgres_fdw insert batching

Re: track_planning causing performance regression - Mailing list pgsql-hackers

Attachment

Previous

Next