Thread: v17 vs v16 performance comparison

From

Alexander Lakhin

Date:

01 August 2024, 03:00:00

Hello hackers,

I've repeated the performance measurement for REL_17_STABLE (1e020258e)
and REL_16_STABLE (6f6b0f193) and found several benchmarks where v16 is
significantly better than v17. Please find attached an html table with
all the benchmarking results.

In May I noticed this degradation:
Best pg-src-17--.* worse than pg-src-16--.* by 57.9 percents (225.11 > 142.52): pg_tpcds.query15
Average pg-src-17--.* worse than pg-src-16--.* by 55.5 percents (230.57 > 148.29): pg_tpcds.query15
and ran `git bisect` for it, which led me to commit
b7b0f3f27 [1].
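
For reference, the percentages in these report lines appear to be the relative slowdown of the v17 figure over the v16 figure. A minimal Python sketch, assuming that reading (the function name is illustrative and not part of pg-mark):

```python
def percent_worse(new: float, old: float) -> float:
    """Relative slowdown of `new` over `old`, in percent, to one decimal."""
    return round((new / old - 1.0) * 100.0, 1)


# Figures copied from the pg_tpcds.query15 lines above.
print(percent_worse(225.11, 142.52))  # "Best" line, reported as 57.9
print(percent_worse(230.57, 148.29))  # "Average" line, reported as 55.5
```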

This time I bisected the following anomaly:
Best pg-src-17--.* worse than pg-src-16--.* by 23.6 percents (192.25 > 155.58): pg_tpcds.query21
Average pg-src-17--.* worse than pg-src-16--.* by 25.1 percents (196.19 > 156.85): pg_tpcds.query21
and to my surprise I got "b7b0f3f27 is the first bad commit".

Moreover, bisecting another anomaly:
Best pg-src-17--.* worse than pg-src-16--.* by 24.2 percents (24269.21 > 19539.89): pg_tpcds.query72
Average pg-src-17--.* worse than pg-src-16--.* by 24.2 percents (24517.66 > 19740.12): pg_tpcds.query72
pointed at the same commit again.

So it looks like q15 from TPC-DS is not the only query suffering from that
change.

But besides that, I've found a separate regression. Bisecting this degradation:
Best pg-src-17--.* worse than pg-src-16--.* by 105.0 percents (356.63 > 173.96): s64da_tpcds.query95
Average pg-src-17--.* worse than pg-src-16--.* by 105.2 percents (357.79 > 174.38): s64da_tpcds.query95
pointed at f7816aec2.
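
As background, `git bisect` locates the first bad commit by binary search over an ordered commit range. A minimal Python sketch of that idea, assuming a linear history and a deterministic `is_bad` check (both simplifications; the commit names below are placeholders, not real hashes):

```python
from typing import Callable, Sequence


def first_bad(commits: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the first commit for which is_bad() holds.

    Assumes commits are ordered oldest to newest, the last one is bad,
    and every commit after the first bad one is also bad.
    """
    lo, hi = 0, len(commits) - 1  # invariant: commits[hi] is bad
    while lo < hi:
        mid = (lo + hi) // 2
        if is_bad(commits[mid]):
            hi = mid      # the first bad commit is at mid or earlier
        else:
            lo = mid + 1  # the first bad commit is after mid
    return commits[lo]


# Placeholder history: the query is assumed to be slow from "c6" onwards.
history = [f"c{i}" for i in range(10)]
print(first_bad(history, lambda c: int(c[1:]) >= 6))
```

In a real bisection the check would build each commit and compare the benchmark timing against a threshold, so noisy timings can mislead the search.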

Does this deserve more analysis and maybe fixing?

[1] https://www.postgresql.org/message-id/63a63690-dd92-c809-0b47-af05459e95d1%40gmail.com

Best regards,
Alexander

Re: v17 vs v16 performance comparison

From

Tom Lane

Date:

01 August 2024, 03:41:16

Alexander Lakhin <exclusion@gmail.com> writes:
> I've repeated the performance measurement for REL_17_STABLE (1e020258e)
> and REL_16_STABLE (6f6b0f193) and found several benchmarks where v16 is
> significantly better than v17. Please find attached an html table with
> all the benchmarking results.

Thanks for doing that!

I have no opinion about b7b0f3f27, but as far as this goes:

> But beside that, I've found a separate regression. Bisecting for this degradation:
> Best pg-src-17--.* worse than pg-src-16--.* by 105.0 percents (356.63 > 173.96): s64da_tpcds.query95
> Average pg-src-17--.* worse than pg-src-16--.* by 105.2 percents (357.79 > 174.38): s64da_tpcds.query95
> pointed at f7816aec2.

I'm not terribly concerned about that.  The nature of planner changes
like that is that some queries will get worse and some better, because
the statistics and cost estimates we're dealing with are not perfect.
It is probably worth drilling down into that test case to understand
where the planner is going wrong, with an eye to future improvements;
but I doubt it's something we need to address for v17.

            regards, tom lane

Re: v17 vs v16 performance comparison

From

Thomas Munro

Date:

01 August 2024, 05:57:36

On Thu, Aug 1, 2024 at 3:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
> So it looks like q15 from TPC-DS is not the only query suffering from that
> change.

I'm going to try to set up a local repro to study these new cases.  If
you have a write-up somewhere of how exactly you run that, that'd be
useful.

Re: v17 vs v16 performance comparison

From

Alexander Lakhin

Date:

01 August 2024, 07:00:00

Hello Thomas.

01.08.2024 08:57, Thomas Munro wrote:
> On Thu, Aug 1, 2024 at 3:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
>> So it looks like q15 from TPC-DS is not the only query suffering from that
>> change.
> I'm going to try to set up a local repro to study these new cases.  If
> you have a write-up somewhere of how exactly you run that, that'd be
> useful.

I'm using this instrumentation (on my Ubuntu 22.04 workstation):
https://github.com/alexanderlaw/pg-mark.git
README.md can probably serve as such a write-up.

If you install all the prerequisites (some tests, including pg_tpcds,
require downloading additional resources; run-benchmarks.py will ask to
do that), there should be no problems with running benchmarks.

I just added two instances to config.xml:
         <instance id="pg-src-16" type="src" pg_version="16devel" git_branch="REL_16_STABLE" />
         <instance id="pg-src-17" type="src" pg_version="17devel" git_branch="REL_17_STABLE" />
and ran
1)
./prepare-instances.py -i pg-src-16 pg-src-17

2)
time ./run-benchmarks.py -i pg-src-16 pg-src-17 pg-src-16 pg-src-17 pg-src-17 pg-src-16
(it took 1045m55,215s on my machine, so you may prefer to run a single
benchmark (-b pg_tpcds or maybe s64da_tpcds))

3)
./analyze-benchmarks.py -i 'pg-src-17--.*' 'pg-src-16--.*'

All the upper-level commands to run benchmarks are contained in config.xml,
so you can just execute them separately, but my instrumentation eases
processing of the results by creating one unified benchmark-results.xml.
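
The comparison in step 3 can be pictured roughly as follows. This is only a sketch, not the actual analyze-benchmarks.py logic: it assumes run identifiers of the form `pg-src-17--1` (guessed from the patterns above), made-up timings, and that "Best" means the lowest time.

```python
import re
from statistics import mean


def summarize(results: dict[str, float], pattern: str) -> tuple[float, float]:
    """Return (best, average) over runs whose id fully matches `pattern`."""
    rx = re.compile(pattern)
    values = [v for run_id, v in results.items() if rx.fullmatch(run_id)]
    return min(values), mean(values)


# Hypothetical timings for one query, interleaved like the run order in
# step 2; lower is better.
runs = {
    "pg-src-16--1": 100.0, "pg-src-17--1": 130.0,
    "pg-src-16--2": 104.0, "pg-src-17--2": 126.0,
    "pg-src-17--3": 128.0, "pg-src-16--3": 102.0,
}
best17, avg17 = summarize(runs, r"pg-src-17--.*")
best16, avg16 = summarize(runs, r"pg-src-16--.*")
print(f"Best worse by {(best17 / best16 - 1) * 100:.1f} percent")
print(f"Average worse by {(avg17 / avg16 - 1) * 100:.1f} percent")
```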

Please feel free to ask any questions or give your feedback.

Thank you for paying attention to this!

Best regards,
Alexander

Re: v17 vs v16 performance comparison

From

Alexander Lakhin

Date:

02 August 2024, 09:00:00

01.08.2024 06:41, Tom Lane wrote:
>
>> But beside that, I've found a separate regression. Bisecting for this degradation:
>> Best pg-src-17--.* worse than pg-src-16--.* by 105.0 percents (356.63 > 173.96): s64da_tpcds.query95
>> Average pg-src-17--.* worse than pg-src-16--.* by 105.2 percents (357.79 > 174.38): s64da_tpcds.query95
>> pointed at f7816aec2.
> I'm not terribly concerned about that.  The nature of planner changes
> like that is that some queries will get worse and some better, because
> the statistics and cost estimates we're dealing with are not perfect.
> It is probably worth drilling down into that test case to understand
> where the planner is going wrong, with an eye to future improvements;
> but I doubt it's something we need to address for v17.

Please find attached two plans for that query [1].
(I repeated the benchmark for f7816aec2 and f7816aec2~1 five times and
made sure that both plans are stable.)

Meanwhile I've bisected another degradation:
Best pg-src-17--.* worse than pg-src-16--.* by 11.3 percents (7.17 > 6.44): job.query6f
and came to the commit b7b0f3f27 again.

[1] https://github.com/swarm64/s64da-benchmark-toolkit/blob/master/benchmarks/tpcds/queries/queries_10/95.sql

Best regards,
Alexander

Re: v17 vs v16 performance comparison

From

Thomas Munro

Date:

03 September 2024, 09:21:59

On Tue, Sep 3, 2024 at 5:00 PM Alexander Lakhin <exclusion@gmail.com> wrote:
>  From a bird's eye view, new v17-vs-v16 comparison has only 87 "worse",
> while the previous one had 115 (it requires deeper analysis, of course, but
> still...).

Any chance you could share that whole pgdata dir with me, assuming it
compresses to a manageable size?  Perhaps we could discuss that
off-list?