Re: "memory exhausted" in query parser/simplifier for many nested parentheses - Mailing list pgsql-bugs

From: Niklas Hambüchen
Subject: Re: "memory exhausted" in query parser/simplifier for many nested parentheses
Msg-id: a14f012f-a05d-4a19-855c-36a5d512e438@nh2.me
In response to: Re: "memory exhausted" in query parser/simplifier for many nested parentheses (Greg Sabino Mullane <htamfids@gmail.com>)
List: pgsql-bugs

Hi Greg,

thanks for testing the parser behaviour of "x in (A, B, ...)"; it is useful to know that.

> queries this large are not going to be caused by checkboxes being clicked.

This is how I encountered it:
Doing a selection of 10k files (equivalent to the Shift-click range-selection in most GUI file managers), and using them in a query.
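For reference, here is a minimal sketch that reproduces the failure mode (not my actual query; the 20000 is just an arbitrary value comfortably above the limit):

    DO $$
    DECLARE
      depth int := 20000;  -- well above Bison's YYMAXDEPTH of 10000
    BEGIN
      -- The same trivial value, wrapped in many redundant parentheses,
      -- makes the parser give up with "memory exhausted" long before any
      -- real resource is under pressure:
      EXECUTE 'SELECT ' || repeat('(', depth) || '1' || repeat(')', depth);
    END $$;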

> We cannot simply push the parser to accept anything, regardless of the impact on other parts of the system.

I'm not suggesting that we remove limits regardless of impact.
There are good limits and bad limits:

I think the happy path is when programs scale to the users' machines and tasks, constrained by limits that achieve specific goals.
Many of postgres's limits from https://www.postgresql.org/docs/17/runtime-config-resource.html achieve the goal of not having queries blow up unreasonably.
They are good limits, set to sensible defaults by developers, and tunable by users to fit their workloads.
Same for most of Linux's limits, such as `ulimit`.
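To pick one concrete example from that page: `max_stack_depth` limits the server's own execution-stack depth, ships with a modest default, and can be raised to fit the machine:

    SHOW max_stack_depth;           -- sensible default (2MB)
    SET max_stack_depth = '4MB';    -- tunable (superuser or SET privilege),
                                    -- bounded by the kernel's `ulimit -s`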


In contrast, the impact of `YYMAXDEPTH` on postgres looks like a historical accident.

* It was set to `YMAXDEPTH 500` by Richard Stallman in _1988_ in Bison commit f791019c "entered into RCS" (thus probably even older).
* Then changed to `YYMAXDEPTH 10000` in 1993.

So we're looking at a hardcode that is more than 35 years old; a bad limit that probably should always have been a runtime-configurable parameter.

Maybe in 1993 "10000 of something" was a huge limit, but it isn't today.
Most modern parsers do not hardcode key aspects such as nesting depth, because what's "a lot" of course depends on the application, and it often isn't 4 or 500 or 10000.
It also doesn't fit well into what Postgres usually delivers:
As you showed, postgres by default supports "millions of things" in queries, and it really feels like a surprise bug that this limit is much lower when you write the query in an equally long but slightly different way, because that specific syntax code path goes through a library with hardcodes from 1993.

So I think the ideal solution would be:

* Make `YYMAXDEPTH` configurable at run-time (it always should have been), and add a postgres config for it with a good default (a sketch of what that could look like is below).
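Purely to illustrate the shape of such a setting (the parameter name below is hypothetical and does not exist in any Postgres release):

    -- Hypothetical illustration only; no such GUC exists today.
    SET parser_max_nesting_depth = 100000;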

Of course it's not for me to judge whether this is a high-priority topic for postgres, but I think it would be fair to classify it as a "one of the tools we use hardcodes a silly limit that doesn't fit our other limits and thus triggers user surprise" bug, as opposed to "postgres authors designed it this way".

Greetings,
Niklas


