Re: BUG #18080: to_tsvector fails for long text input - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #18080: to_tsvector fails for long text input
Date
Msg-id 3300287.1694786029@sss.pgh.pa.us
In response to Re: BUG #18080: to_tsvector fails for long text input  (Alvaro Herrera <alvherre@alvh.no-ip.org>)
Responses Re: BUG #18080: to_tsvector fails for long text input
List pgsql-bugs
Alvaro Herrera <alvherre@alvh.no-ip.org> writes:
> On 2023-Sep-04, PG Bug reporting form wrote:
>> SELECT to_tsvector('english'::regconfig, (REPEAT('<Long123456789/>'::text,
>> 20000000)));
>> results in
>> ERROR:  invalid memory alloc request size 2133333320

> This is because to_tsvector_byid does this:
>     prs.lenwords = VARSIZE_ANY_EXHDR(in) / 6;    /* just estimation of word's
>                                                  * number */
>     if (prs.lenwords < 2)
>         prs.lenwords = 2;

Yeah.  My thought about blocking the error had been to limit
prs.lenwords to MaxAllocSize/sizeof(ParsedWord) in this code.
I doubt that switching over to MCXT_ALLOC_HUGE is a good idea.
(Would we not also have to touch the places that repalloc that
array?)
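
A minimal sketch of that clamp, written as a standalone illustration rather than a tested patch. MAX_ALLOC_SIZE copies the value of PostgreSQL's MaxAllocSize (0x3fffffff). The ParsedWordSketch layout and the estimate_lenwords helper are stand-ins assumed here, not names from the tree; the real code assigns prs.lenwords inline in to_tsvector_byid.

```c
#include <stddef.h>
#include <stdint.h>

/* Same value as PostgreSQL's MaxAllocSize (1 GB - 1), redefined here so
 * the sketch compiles on its own. */
#define MAX_ALLOC_SIZE ((size_t) 0x3fffffff)

/* Assumed stand-in for ParsedWord.  With typical 64-bit alignment it is
 * 40 bytes, which is consistent with the request size in the report. */
typedef struct
{
    uint16_t    len;
    uint16_t    nvariant;
    union
    {
        uint16_t    pos;
        uint16_t   *apos;
    }           pos;
    uint16_t    flags;
    char       *word;
    uint32_t    alen;
} ParsedWordSketch;

/* Hypothetical helper: estimate the initial word-array length from the
 * input size in bytes, keeping the resulting allocation at or below
 * MAX_ALLOC_SIZE. */
size_t
estimate_lenwords(size_t input_bytes)
{
    size_t      lenwords = input_bytes / 6; /* same heuristic as the original */
    size_t      max_words = MAX_ALLOC_SIZE / sizeof(ParsedWordSketch);

    if (lenwords < 2)
        lenwords = 2;
    else if (lenwords > max_words)
        lenwords = max_words;   /* the proposed clamp */

    return lenwords;
}
```

For the reported input (16 bytes repeated 20000000 times, so 320000000 bytes) the unclamped estimate is 53333333 words; assuming a 40-byte array entry, that is exactly the 2133333320-byte request in the error message. The clamp only bounds the initial allocation, so it does not by itself answer the question about later repalloc growth.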

            regards, tom lane


