Michael Paquier <michael@paquier.xyz> writes:
>>> Is there some downside to XML_PARSE_HUGE?

> If one looks at the libxml2 sources, such as the mirror at [1],
> it is possible to
> see that the flag is only used to lift internal hard limits, for stuff
> like XML_MAX_TEXT_LENGTH and XML_MAX_NAME_LENGTH for size control, or
> max node depths.

I dug through those sources and I concur that mostly, setting that
flag just results in replacing arbitrary hard-coded limits with higher
arbitrary hard-coded limits. The one place I found where it looks
like the clamps come off entirely is that the XML_MAX_DICTIONARY_LIMIT
on the number of entries in a dictionary is replaced by "unlimited".
Given that in our usage the input string will be limited to 1GB, the
number of entries you could possibly create is still pretty finite.

> Knowing that we have full control of the memory contexts for the XML
> nodes, just enforcing the huge flag does not seem to have any
> downside for us. (Right?)

Blowing out a backend's memory or CPU consumption is not something
we try hard to prevent, so I'm not terribly worried on that score.

The one thing I'm concerned about is that raising these limits could
make bugs (like integer overflow problems) reachable that were not
otherwise, and that such bugs might rise to the level of security
problems. They've had such issues before (CVE-2022-40303) and it'd be
foolish to be sure that none remain. Still, that's clearly their bug
not our bug.
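
To make the overflow worry concrete, here is a minimal C sketch. It is not taken from libxml2: the function name and the 10-million figure are made up for illustration, and a 32-bit int is assumed. It checks whether a typical buffer-growth computation, len * 2 + extra, still fits in an int. With a small length cap the arithmetic has ample headroom; with lengths up to our 1GB input bound it does not.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>

/* Illustrative values only; not claimed to be libxml2's actual limits. */
#define SMALL_CAP    10000000       /* hypothetical low cap on a length */
#define ONE_GIGABYTE 1073741824     /* 2^30, the input-size bound above */

/*
 * Would "len * 2 + extra" fit in an int?  Code that tracks buffer sizes
 * in int often grows buffers with arithmetic of this shape.  The test is
 * done without performing the multiplication, so it cannot itself
 * overflow.  Assumes a 32-bit int for the figures in the lead-in.
 */
static bool
growth_fits_in_int(int len, int extra)
{
    assert(len >= 0 && extra >= 0);
    return len <= (INT_MAX - extra) / 2;
}
```

With these numbers, growth_fits_in_int(SMALL_CAP, 4096) is true, while growth_fits_in_int(ONE_GIGABYTE, 0) is false: doubling a 1GB length already exceeds INT_MAX. Any such computation that lacks a check of this kind only becomes reachable once the lower cap is lifted.
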
On the whole I'm not too worried, and even if I were, I doubt that
an enabling GUC would be the answer. We'd have to make it SUSET
and default to off for it to be a credible security defense, and that
seems like an excessive amount of paranoia. Besides, I believe that
downstream packagers who don't trust libxml2 are already just building
PG without XML support.

regards, tom lane