Re: "memory exhausted" in query parser/simplifier for many nested parentheses - Mailing list pgsql-bugs
From | Tomas Vondra |
---|---|
Subject | Re: "memory exhausted" in query parser/simplifier for many nested parentheses |
Date | |
Msg-id | cb8fa16c-0141-4569-8599-d009e0f1d96d@vondra.me Whole thread Raw |
In response to | Re: "memory exhausted" in query parser/simplifier for many nested parentheses (Greg Sabino Mullane <htamfids@gmail.com>) |
Responses |
Re: "memory exhausted" in query parser/simplifier for many nested parentheses
|
List | pgsql-bugs |
On 12/13/24 15:44, Greg Sabino Mullane wrote: > On Fri, Dec 13, 2024 at 8:53 AM Niklas Hambüchen <mail@nh2.me > <mailto:mail@nh2.me>> wrote: > > As a user, how can I know that 10000 list entries in "x in (A, > B, ...)" will not also hit an arbitrary parser implementation detail > hardcoded limit? > How shall the user have confidence that the parser is better at > handling multiple commas than multiple parens? > > > As an application developer, you test it, especially if your application > has the potential to generate insanely large queries. For the record, > the catch point on my system using Postgres 17 seems to be around 8.4 > *MILLION* items for IN() lists: > > greg=# \set VERBOSITY terse > greg=# do $$begin execute format('SELECT 1 WHERE 1 IN (%s1)', > repeat('123,',8_385_000));end$$; > DO > greg=# do $$begin execute format('SELECT 1 WHERE 1 IN (%s1)', > repeat('123,',8_390_000));end$$; > ERROR: invalid memory alloc request size 1073741824 > > > None of that seems to be documented anywhere > > > Documentation patches are always welcome. Perhaps at https:// > www.postgresql.org/docs/current/limits.html <https://www.postgresql.org/ > docs/current/limits.html> ? > FWIW I don't think it's practical to document such limits in detail. Everyone knows the limits exist, and there's a lot of them - various libraries we rely on (and we have plenty of them) may have a couple of them, etc. It'd be a huge effort, and it's not clear where to draw the line. And I'm not aware of other projects documenting such stuff in detail - e.g. kernel certainly has a lot of such limits. > In absence of such docs, one must build systems that later fail at > arbitrary limits when e.g. the user clicks some larger number of > checkboxes that construct a batch query. > > > That's a bit of a strawman argument: queries this large are not going to > be caused by checkboxes being clicked. > Not sure I understand the suggestion that "one must build systems that later fail at arbitrary limits". It's a trade off, and it's perfectly reasonable to not spend time optimizing for weirdly complex queries when there's a much better/faster way to write the query, not hitting those limits ... > If I build some workaround today, e.g. splitting the query into > multiple ones of max length N, how do I know it will still work in > the future, e.g. if Postgres changes the Bison version or switches > to a different parser? > > > You don't, that's the nature of software. But it's fairly unlikely we'd > switch to something that was MORE contraining than in the past. Still, > don't play on the edge. > Right. A lot of the mailing list traffic is discussions about possible regressions. But even with that we have little control over changes in the dependencies. This just means it's important to set baselines and do testing as part of an upgrade. > > The hardcodedness of arbitrary small limits that don't scale itself > is also a problem: > One cannot configure postgres to allow queries that the hardware is > perfectly able of handling. > > It would be a bit like GCC giving up to compile a file if it > contains more than 10000 words. > > > No, that's not an apt comparison at all. We cannot simply push the > parser to accept anything, regardless of the impact on other parts of > the system. Software engineering is all about tradeoffs. I agree with > your point about documentation, but this seems like trying to pick a fight. > Yeah, it's a matter of trade offs. Not just technical ones, but also (and maybe in the first place) economical - which improvements to spend time on to get the most benefit. Time is a finite resource. regards -- Tomas Vondra
pgsql-bugs by date: