Re: Gather performance analysis - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: Gather performance analysis
Date
Msg-id d76a759d-9240-94f5-399e-ae244e5f0285@enterprisedb.com
In response to Re: Gather performance analysis  (Dilip Kumar <dilipbalaut@gmail.com>)
Responses Re: Gather performance analysis
List pgsql-hackers

On 9/8/21 9:40 AM, Dilip Kumar wrote:
> On Wed, Sep 8, 2021 at 12:03 PM Andres Freund <andres@anarazel.de> wrote:
> 
>     Hi,
> 
>     On 2021-09-08 11:45:16 +0530, Dilip Kumar wrote:
>     > On Wed, Sep 8, 2021 at 3:08 AM Andres Freund <andres@anarazel.de> wrote:
>     >
>     > > Looking at this profile made me wonder if this was a build without
>     > > optimizations. The pg_atomic_read_u64()/pg_atomic_read_u64_impl()
>     > > calls should be inlined. And while perf can reconstruct inlined
>     > > functions when using --call-graph=dwarf, they show up like
>     > > "pg_atomic_read_u64 (inlined)" for me.
>     > >
>     >
>     > Yeah, for profiling generally I build without optimizations so that I
>     > can see all the functions in the stack, so yeah profile results are
>     > without optimizations build but the performance results are with
>     > optimizations build.
> 
>     I'm afraid that makes the profiles just about meaningless :(.
> 
> 
> Maybe it can be misleading sometimes, but I feel sometimes it is more
> informative compared to the optimized build where it makes some function
> inline, and then it becomes really hard to distinguish which function
> really has the problem.  But your point is taken and I will run with an
> optimized build.
> 

IMHO Andres is right: building without optimization may make profiles
useless in most cases - it may skew timings for different parts of the
code differently, so something that'd be optimized out may take much
more time.

An unoptimized build may still provide valuable insights, but we
definitely should not use such binaries for benchmarking and comparisons
of the patches.

As mentioned, I did some benchmarks, and I do see some nice improvements
even with properly optimized builds (-O2).

Attached is a simple script that varies a bunch of parameters (number of
workers, number of rows/columns, ...) and then measures duration of a
simple query, similar to what you did. I haven't varied the queue size,
though that might be interesting too.
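
The attached script is not reproduced here. As a rough illustration only,
a benchmark driver along those lines might look like the sketch below.
Everything in it is an assumption rather than the attached script: the
table name bench_t, the parameter values, the EXPLAIN ANALYZE query, and
the helper names are all hypothetical. It times the whole psql invocation,
so connection overhead is included in each measurement.

```python
import itertools
import subprocess
import time

# Hypothetical parameter grid; the values in the real script are not known.
WORKERS = [0, 1, 2, 4, 8]
ROWS = [10_000, 1_000_000, 10_000_000]
COLUMNS = [1, 5, 10]


def setup_statements(rows, columns):
    """SQL that (re)creates a test table with the given shape."""
    cols = ", ".join(f"i AS c{n}" for n in range(columns))
    return [
        "DROP TABLE IF EXISTS bench_t",
        f"CREATE TABLE bench_t AS SELECT {cols} "
        f"FROM generate_series(1, {rows}) AS s(i)",
        "VACUUM ANALYZE bench_t",
    ]


def query_statements(workers):
    """Session settings that favor a parallel plan, then the timed query."""
    return [
        f"SET max_parallel_workers_per_gather = {workers}",
        "SET parallel_setup_cost = 0",
        "SET parallel_tuple_cost = 0",
        "SET min_parallel_table_scan_size = 0",
        "EXPLAIN (ANALYZE, TIMING OFF) SELECT * FROM bench_t",
    ]


def run_psql(statements, dbname="postgres"):
    """Run the statements in one psql session, one -c option per statement."""
    cmd = ["psql", "-X", "-q", "-v", "ON_ERROR_STOP=1", "-d", dbname]
    for stmt in statements:
        cmd += ["-c", stmt]
    subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)


def run_benchmark(runner, rows_list=ROWS, columns_list=COLUMNS,
                  workers_list=WORKERS, repeats=3):
    """Return (rows, columns, workers, best_seconds) for each combination."""
    results = []
    for rows, columns in itertools.product(rows_list, columns_list):
        runner(setup_statements(rows, columns))
        for workers in workers_list:
            timings = []
            for _ in range(repeats):
                start = time.perf_counter()
                runner(query_statements(workers))
                timings.append(time.perf_counter() - start)
            results.append((rows, columns, workers, min(timings)))
    return results
```

Calling run_benchmark(run_psql) against a scratch database would produce
one row per parameter combination. Running it once per build (master and
each patched build) gives the numbers to compare.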

The PDF shows a comparison of master and the two patches. For 10k rows
there's not much difference, but for 1M and 10M rows there are some nice
improvements in the 20-30% range. Of course, it's just a single query in
a simple benchmark.


regards

-- 
Tomas Vondra
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Attachment
