Re: PG15 beta1 sort performance regression due to Generation context change - Mailing list pgsql-hackers

From Jonathan S. Katz
Subject Re: PG15 beta1 sort performance regression due to Generation context change
Date
Msg-id 0cd21b8c-13fb-9b94-1cba-525597a9da67@postgresql.org
Whole thread Raw
In response to Re: PG15 beta1 sort performance regression due to Generation context change  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: PG15 beta1 sort performance regression due to Generation context change
List pgsql-hackers
On 7/15/22 4:36 PM, Tomas Vondra wrote:
> 
> 
> On 7/13/22 17:32, Andres Freund wrote:
>> Hi,
>>
>> On 2022-07-13 09:23:00 -0400, Jonathan S. Katz wrote:
>>> On 7/13/22 12:13 AM, David Rowley wrote:
>>>> On Tue, 12 Jul 2022 at 17:15, David Rowley <dgrowleyml@gmail.com> wrote:
>>>>> So far only Robert has raised concerns with this regression for PG15
>>>>> (see [2]). Tom voted for leaving things as they are for PG15 in [3].
>>>>> John agrees, as quoted above. Does anyone else have any opinion?
>>>>
>>>> Let me handle this slightly differently.  I've moved the open item for
>>>> this into the "won't fix" section.  If nobody shouts at me for that
>>>> then I'll let that end the debate.  Otherwise, we can consider the
>>>> argument when it arrives.
>>>
>>> The RMT discussed this issue at its meeting today (and a few weeks back --
>>> apologies for not writing sooner). While we agree with your analysis that 1/
>>> this issue does appear to be a corner case and 2/ the benefits outweigh the
>>> risks, we still don't know how prevalent it may be in the wild and the
>>> general impact to user experience.
>>>
>>> The RMT suggests that you make one more pass at attempting to solve it.
>>
>> I think without a more concrete analysis from the RMT that's not really
>> actionable. What risks are we willing to accept to resolve this? This is
>> mostly a question of tradeoffs.
>>
>> Several "senior" postgres hackers looked at this and didn't find any solution
>> that makes sense to apply to 15. I don't think having David butt his head
>> further against this code is likely to achieve much besides a headache.  Note
>> that David already has a patch to address this for 16.
>>
> 
> I agree with this. It's not clear to me how we'd asses how prevalent it
> really is (reports on a mailing list surely are not a very great way to
> measure this). My personal opinion is that it's a rare regression. Other
> optimization patches have similar rare regressions, except that David
> spent so much time investigating this one it seems more serious.
> 
> I think it's fine to leave this as is. If we feel we have to fix this
> for v15, it's probably best to just apply the v16. I doubt we'll find
> anything simpler.

I think the above is reasonable.

>>> If there does not appear to be a clear path forward, we should at least
>>> document how a user can detect and resolve the issue.
>>
>> To me that doesn't really make sense. We have lots of places were performance
>> changes once you cross some threshold, and lots of those are related to
>> work_mem.

Yes, but in this case this is nonobvious to a user. A sort that may be 
performing just fine on a pre-PG15 version is suddenly degraded, and the 
user has no guidance as to why or how to remediate.

>> We don't, e.g., provide tooling to detect when performance in aggregation
>> regresses due to crossing work_mem and could be fixed by a tiny increase in
>> work_mem.
>>
> 
> Yeah. I find it entirely reasonable to tell people to increase work_mem
> a bit to fix this. The problem is knowing you're affected :-(

This is the concern the RMT discussed. Yes it is reasonable to say 
"increase work_mem to XYZ", but how does a user know how to detect and 
apply that? This is the part that is worrisome, especially because we 
don't have any sense of what the overall impact will be.

Maybe it's not much, but we should document that there is the potential 
for a regression.

Jonathan

Attachment

pgsql-hackers by date:

Previous
From: "David G. Johnston"
Date:
Subject: Re: doc: Bring mention of unique index forced transaction wait behavior outside of the internal section
Next
From: Justin Pryzby
Date:
Subject: Re: MERGE and parsing with prepared statements