Thread: [BUGS] BUG #14654: With high statistics targets on ts_vector, unexpectedly high memory use & OOM are triggered

The following bug has been logged on the website:

Bug reference:      14654
Logged by:          James C
Email address:      james+postgres@carbocation.com
PostgreSQL version: 9.5.5
Operating system:   Linux
Description:

I am running a mirror of PubMed, which contains about 27 million articles.
One of my tables (article) includes a unique ID for each article, the
article title, a tsvector of the abstract, some timestamps, and a numeric
column. There are (unsurprisingly) about 27 million rows in this table.
Because the abstracts contain scientific terms, the number of lexemes is
quite large.
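
For concreteness, the table looks roughly like this (column names
simplified for illustration; the real DDL differs):

    CREATE TABLE article (
        article_id   bigint PRIMARY KEY,  -- unique ID for each article
        title        text,
        abstract_tsv tsvector,            -- tsvector of the abstract
        created_at   timestamptz,
        updated_at   timestamptz,
        score        numeric
    );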

With some prototyping on a local machine with 64 GB of RAM, I found that
queries worked quite well with the statistics target on the tsvector
column set to 10,000. The production box has about 12 GB of RAM, but
initially I didn't notice any issues.

Today, I attempted to execute 'ANALYZE article;'. However, the OOM killer
kept taking down postgres. I reviewed numerous system settings and
postgres settings, all of which appeared to make roughly 10 GB of RAM
available to postgres, and which should have constrained postgres to
using less than that limit.

RhodiumToad in Freenode#postgres helped me troubleshoot, and suggested I
try reducing the statistics target on the tsvector column. Dropping it
from 10,000 to 1,000 permitted ANALYZE to run without consuming large
amounts of RAM.
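
Concretely, the fix on my end was just (abstract_tsv being the
illustrative column name from above):

    ALTER TABLE article ALTER COLUMN abstract_tsv SET STATISTICS 1000;
    ANALYZE article;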

It seems that ANALYZE on a tsvector column can consume about 300 *
(statistics target) * (average size of the data in the field) of memory,
which in my case ended up being well above 10 gigabytes. I wonder if this
might be considered a bug (either in code or in documentation), as this
memory usage seems not to obey other limits, or at least wasn't
documented in a way that might have helped me guess at the underlying
problem.
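
To put rough numbers on that: ANALYZE samples 300 * (statistics target)
rows, so a target of 10,000 means 3,000,000 sampled rows, and all of
their tsvectors were apparently held detoasted at once. Assuming an
average detoasted tsvector of around 4 kB (a guess for illustration):

    300 * 10,000 rows * ~4 kB/row ≈ 12 GB

which is more than the production box's total RAM.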


On 05/14/2017 11:06 PM, james+postgres@carbocation.com wrote:
> It seems that ANALYZE on a tsvector column can consume about 300 *
> (statistics target) * (average size of the data in the field) of memory,
> which in my case ended up being well above 10 gigabytes. I wonder if this
> might be considered a bug (either in code or in documentation), as this
> memory usage seems not to obey other limits, or at least wasn't
> documented in a way that might have helped me guess at the underlying
> problem.

Yes, I can see that happening here too. The problem seems to be that the
analyze function detoasts every tsvector in the sample. Tsvectors can be
very large, so it adds up.

That's pretty easy to fix: the analyze function needs to free the
detoasted copies as it goes. But in order to do that, it needs to make
copies of all the lexemes stored in the hash table, instead of pointing
directly at the detoasted copies.
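
In outline, the pattern in the analyze function
(compute_tsvector_stats() in src/backend/tsearch/ts_typanalyze.c)
becomes something like this (a simplified sketch of the idea, not the
literal patch; variable names are illustrative):

    /* detoast the tsvector for this sample row */
    vector = DatumGetTSVector(value);

    /* ... for each lexeme: if it creates a new hash table entry,
     * copy it, so the entry doesn't point into 'vector' ... */
    if (!found)
    {
        item->key.lexeme = palloc(hash_key.length);
        memcpy(item->key.lexeme, hash_key.lexeme, hash_key.length);
        item->frequency = 1;
    }

    /* free the detoasted copy, if any, before the next sample row */
    if ((Pointer) vector != DatumGetPointer(value))
        pfree(vector);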

Patch attached. I think this counts as a bug, and we should backport this.

- Heikki


Heikki Linnakangas <hlinnaka@iki.fi> writes:
> Yes, I can see that happening here too. The problem seems to be that the
> analyze function detoasts every tsvector in the sample. Tsvectors can be
> very large, so it adds up.

> That's pretty easy to fix: the analyze function needs to free the
> detoasted copies as it goes. But in order to do that, it needs to make
> copies of all the lexemes stored in the hash table, instead of pointing
> directly at the detoasted copies.

> Patch attached. I think this counts as a bug, and we should backport this.

+1.  I didn't test the patch, but it looks sane to the eyeball.

        regards, tom lane


On 07/12/2017 06:30 PM, Tom Lane wrote:
> Heikki Linnakangas <hlinnaka@iki.fi> writes:
>> Yes, I can see that happening here too. The problem seems to be that the
>> analyze function detoasts every tsvector in the sample. Tsvectors can be
>> very large, so it adds up.
>
>> That's pretty easy to fix: the analyze function needs to free the
>> detoasted copies as it goes. But in order to do that, it needs to make
>> copies of all the lexemes stored in the hash table, instead of pointing
>> directly at the detoasted copies.
>
>> Patch attached. I think this counts as a bug, and we should backport this.
>
> +1.  I didn't test the patch, but it looks sane to the eyeball.

Ok, committed.

In some quick testing on my laptop, the extra palloc+pfree adds about
10% of overhead in the worst-case scenario, where every tsvector in the
sample consists of totally unique lexemes. That's a bit unfortunate, but
it's a lot better than consuming gigabytes of memory.
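
That worst case is easy to produce with something along these lines
(illustrative, not my exact test):

    -- every lexeme is unique, so every hash-table insert takes the
    -- new palloc+memcpy path
    CREATE TABLE unique_lexemes AS
      SELECT to_tsvector('simple', md5(i::text)) AS v
      FROM generate_series(1, 100000) AS i;
    ANALYZE unique_lexemes;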

- Heikki


