Re: [PoC] Reducing planning time when tables have many partitions - Mailing list pgsql-hackers

From Yuya Watari
Subject Re: [PoC] Reducing planning time when tables have many partitions
Msg-id CAJ2pMkYvniGV96EmfefGwvSQREoiEpDP+Rhruz_VpKHLVKG_QA@mail.gmail.com
In response to Re: [PoC] Reducing planning time when tables have many partitions  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Responses Re: [PoC] Reducing planning time when tables have many partitions
List pgsql-hackers
Hello Alvaro,

Thank you for your reply, and I'm sorry if my previous emails caused
confusion or made it seem like I was ignoring more important issues.

On Thu, Dec 12, 2024 at 9:09 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
>
> I'm repeating myself, but I disagree that this is something we should
> spend _any_ time on.  Developers running assertion-enabled builds do not
> care if a complicated query with one thousand partitions is planned in
> 500 ms instead of 300 ms.  Heck, I bet nobody cares if it took 2000 ms
> either, because, you know what?  The developers don't have a thousand
> partitions to begin with; if they do, it's precisely because they want
> to measure this kind of effect.  This is not going to bother anyone
> ever, unless you stick a hundred of these queries in the regression
> tests.  In regression tests you're going to have, say, 64 partitions at
> most, because having more than that doesn't test anything additional;
> having that go from 40 ms to 60 ms (or whatever) isn't going to bother
> anyone.

I agree that focusing too much on assert-enabled builds is not
productive at this point. In my last email, I shared benchmark results
for debug builds, but I understand your point that even a few seconds
of regression is not practically important for debug builds.

For context, there have been reports in the past of minute-order
regressions in assert-enabled builds (100 seconds [1] and 50 seconds
[2]). I mentioned these minute-order regressions not to refocus the
discussion on debug builds right now, but to clarify why we have been
concerned about them in the past. I should have shared this background
and run benchmarks that target those cases (regressions on the order
of minutes, not milliseconds). My sincere apologies. Once we have
addressed the primary goals (release build performance and memory
usage), I will revisit these regressions.

> If anything, you can add a note to remove the USE_ASSERTIONS blocks once
> we get past the beta process; by then any bugs will have been noticed
> and the asserts will be of less value.

Thank you for your advice. I will consider removing these assertions
after the beta process or using OPTIMIZER_DEBUG, which is Ashutosh's
idea.

> I would like to see this patch series get committed, and this concern
> about planning time in development builds under conditions that are
> unrealistic for testing is slowing the process down.  (The process is
> slow enough.  This patch has already missed two releases.)  Please stop.

I will speed up the process for committing this patch series.

> Memory usage and planning time in production builds is important.  You
> can better spend your energy there.

As you said, memory usage is the other major problem. I will focus on
it first, as you suggested. After that is fixed, we can revisit the
assert-enabled build regressions as a final step if necessary. What do
you think about this approach?

[1] https://www.postgresql.org/message-id/d8db5b4e-e358-2567-8c56-a85d2d8013df%40postgrespro.ru
[2] https://www.postgresql.org/message-id/CAExHW5uVZ3E5RT9cXHaxQ_DEK7tasaMN%3DD6rPHcao5gcXanY5w%40mail.gmail.com

--
Best regards,
Yuya Watari


