Re: [BUGS] BUG #14654: With high statistics targets on ts_vector,unexpectedly high memory use & OOM are triggered - Mailing list pgsql-bugs

From Heikki Linnakangas
Subject Re: [BUGS] BUG #14654: With high statistics targets on ts_vector,unexpectedly high memory use & OOM are triggered
Msg-id dc65ba89-46d1-07f2-3f94-51ba00446931@iki.fi
In response to [BUGS] BUG #14654: With high statistics targets on ts_vector,unexpectedly high memory use & OOM are triggered  (james+postgres@carbocation.com)
Responses Re: [BUGS] BUG #14654: With high statistics targets on ts_vector, unexpectedly high memory use & OOM are triggered
List pgsql-bugs
On 05/14/2017 11:06 PM, james+postgres@carbocation.com wrote:
> It seems that ANALYZE on a ts_vector column can consume 300 * (statistics
> target) * (size of data in field), which in my case ended up being well
> above 10 gigabytes. I wonder if this might be considered a bug (either in
> code, or of documentation), as this memory usage seems not to obey other
> limits, or at least wasn't documented in a way that might have helped me
> guess at the underlying problem.

Yes, I can see that happening here too. The problem seems to be that the
analyze function detoasts every tsvector in the sample and never frees the
detoasted copies. Tsvectors can be very large, so it adds up.
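
For a rough sense of scale, the estimate from the report can be written out
as a small calculation. This is only a back-of-the-envelope sketch: the
300-rows-per-target factor is the one quoted in the report, while the
function names and the example numbers (target 10000, 4096 bytes per
detoasted value) are hypothetical and not taken from the reporter's setup.

```c
#include <stdint.h>

/* Rows sampled for one column, per the report: 300 * statistics target. */
uint64_t sample_rows(uint64_t stats_target)
{
    return 300 * stats_target;
}

/*
 * Rough upper bound on memory held if every detoasted value stays
 * allocated until the end of the analyze run.
 * avg_detoasted_bytes is an assumed average, not a measured figure.
 */
uint64_t retained_bytes(uint64_t stats_target, uint64_t avg_detoasted_bytes)
{
    return sample_rows(stats_target) * avg_detoasted_bytes;
}
```

With those made-up inputs, retained_bytes(10000, 4096) comes to 3,000,000
rows times 4096 bytes, or about 12.3 GB.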

That's pretty easy to fix: the analyze function needs to free the
detoasted copies as it goes. But in order to do that, it needs to make
copies of all the lexemes stored in the hash table, instead of pointing
directly into the detoasted copies.
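
The idea can be illustrated with a standalone sketch. This is not the
attached patch and does not use PostgreSQL's internal APIs: fake_detoast(),
TrackTable, and the other names are hypothetical stand-ins, rows are modeled
as space-separated strings, and a small linear array takes the place of the
real hash table. The point it shows is only the ownership change: because
the table stores its own copy of each lexeme, the per-row buffer can be
freed as soon as that row has been processed.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define MAX_TRACKED 64          /* illustrative cap for this sketch only */

typedef struct {
    char  *lexeme;              /* private copy owned by the table */
    size_t len;
    int    count;
} TrackItem;

typedef struct {
    TrackItem items[MAX_TRACKED];
    int       nitems;
} TrackTable;

/* Count one occurrence; copy the bytes the first time a lexeme is seen. */
int track_lexeme(TrackTable *t, const char *lexeme, size_t len)
{
    assert(lexeme != NULL);
    for (int i = 0; i < t->nitems; i++) {
        if (t->items[i].len == len &&
            memcmp(t->items[i].lexeme, lexeme, len) == 0) {
            t->items[i].count++;
            return 0;
        }
    }
    if (t->nitems == MAX_TRACKED)
        return -1;
    char *copy = malloc(len + 1);
    if (copy == NULL)
        return -1;
    memcpy(copy, lexeme, len);
    copy[len] = '\0';
    t->items[t->nitems] = (TrackItem){copy, len, 1};
    t->nitems++;
    return 0;
}

/* Stand-in for detoasting: returns a freshly allocated copy of the row. */
char *fake_detoast(const char *stored)
{
    size_t n = strlen(stored) + 1;
    char *buf = malloc(n);
    if (buf != NULL)
        memcpy(buf, stored, n);
    return buf;
}

/* Process each sample row, then release its buffer before the next one. */
int analyze_rows(TrackTable *t, const char **rows, int nrows)
{
    for (int r = 0; r < nrows; r++) {
        char *row = fake_detoast(rows[r]);
        if (row == NULL)
            return -1;
        const char *p = row;
        while (*p != '\0') {
            size_t len = strcspn(p, " ");
            if (len > 0 && track_lexeme(t, p, len) != 0) {
                free(row);
                return -1;
            }
            p += len;
            while (*p == ' ')
                p++;
        }
        free(row);              /* safe: the table holds copies, not pointers into row */
    }
    return 0;
}

/* Look up how often a lexeme was seen (0 if never). */
int lexeme_count(const TrackTable *t, const char *lexeme)
{
    size_t len = strlen(lexeme);
    for (int i = 0; i < t->nitems; i++)
        if (t->items[i].len == len &&
            memcmp(t->items[i].lexeme, lexeme, len) == 0)
            return t->items[i].count;
    return 0;
}

/* Release the copies owned by the table. */
void free_table(TrackTable *t)
{
    for (int i = 0; i < t->nitems; i++)
        free(t->items[i].lexeme);
    t->nitems = 0;
}
```

In this sketch, peak memory is bounded by one row buffer plus the distinct
lexemes kept in the table, rather than by the sum of all row buffers. If the
table stored pointers into row instead, the free(row) call would leave them
dangling, which is why the copy is needed.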

Patch attached. I think this counts as a bug, and we should backport this.

- Heikki


