BUG #17145: Invalid memory alloc request size, when searching in text column with big content > 250MB - Mailing list pgsql-bugs

From PG Bug reporting form
Subject BUG #17145: Invalid memory alloc request size, when searching in text column with big content > 250MB
Date
Msg-id 17145-b7bf85f59bf37b13@postgresql.org
Responses Re: BUG #17145: Invalid memory alloc request size, when searching in text column with big content > 250MB  (Julien Rouhaud <rjuju123@gmail.com>)
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      17145
Logged by:          James Inform
Email address:      james.inform@pharmapp.de
PostgreSQL version: 13.4
Operating system:   Ubuntu 18.04 LTS / MacOS 10.15.7
Description:

Hi,

While importing some log data into a PostgreSQL table, I came across this
issue.

When a text field referenced in a WHERE clause holds more than 250 MB of
text, PostgreSQL runs out of memory for the request.

You can reproduce this behaviour with the following SQL statement:

with
    q1 as 
    (
                -- 260*1024*1024 = 260MB
        select repeat('x',260*1024*1024) as mydata
    )
select count(*) from q1 where lower(mydata) like '%a%'
;

ERROR:  invalid memory alloc request size 1090519044

Using just a 250MB string:

with
    q1 as 
    (
                -- 250*1024*1024 = 250MB
        select repeat('x',250*1024*1024) as mydata
    )
select count(*) from q1 where lower(mydata) like '%a%'
;

Everything is fine!

The alloc request size seems to be 4 times the length of the text field.
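The numbers line up with that guess. A minimal sanity check (assuming the 4x factor comes from PostgreSQL expanding each byte into a 4-byte wide character for case-folding, plus a 4-byte varlena header, and that palloc's MaxAllocSize limit is 1 GB minus 1 byte):

```python
# Sanity check of the reported numbers. Assumptions (not confirmed here):
# the 4x factor is PostgreSQL converting each byte to a 4-byte wide char,
# the extra 4 bytes are a varlena header, and MaxAllocSize is 1 GB - 1.
MAX_ALLOC = 0x3FFFFFFF          # assumed MaxAllocSize: 1 GB - 1 byte

def alloc_request(nbytes):
    """Estimated allocation for lower()/LIKE on an nbytes-long string."""
    return nbytes * 4 + 4       # 4 bytes per character + 4-byte header

failing = alloc_request(260 * 1024 * 1024)
ok = alloc_request(250 * 1024 * 1024)

print(failing)                  # 1090519044 -- matches the error message
print(failing > MAX_ALLOC)      # True: this request would be rejected
print(ok, ok <= MAX_ALLOC)      # 1048576004 True: 250 MB still fits
```

Under these assumptions, the practical cutoff for a text value usable with lower() would be just under 256 MB, which matches the observed 250 MB success and 260 MB failure.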

The issue is also reproducible when using a table with a text column and
filling it with a string like above.

create table mytest as
 with
    q1 as 
    (
                -- 260*1024*1024 = 260MB
        select repeat('x',260*1024*1024) as mydata
    )
 select * from q1;

 select count(*) from mytest where lower(mydata) like '%a%'
 ;
 ERROR:  invalid memory alloc request size 1090519044

Also, creating an index, e.g.

> create extension pg_trgm;
> create index on mytest using gin(lower(mydata) gin_trgm_ops);

gives the error.

Is this expected behaviour?
Why is PostgreSQL allocating 4 times the column's memory when doing a
search like the above?

It seems that currently nobody can use a text field with more than 256 MB
of content in a function-based index, nor search for a record using that
field in a WHERE clause.

If the behaviour of using 4 times the memory is due to performance
considerations, then maybe the underlying algorithm must be divided into an
efficient one for < 256 MB and a less efficient one, working without error,
for >= 256 MB.

If we cannot change the behaviour, we should state this in the
documentation: you can store up to 1 GB of string data in a text column,
but only up to 256 MB if you want to use the column in more complex WHERE
clauses.

Cheers, 
James

