On 05/14/2017 11:06 PM, james+postgres@carbocation.com wrote:
> It seems that ANALYZE on a tsvector column can consume 300 * (statistics
> target) * (size of data in field), which in my case ended up being well
> above 10 gigabytes. I wonder if this might be considered a bug (either in
> code, or of documentation), as this memory usage seems not to obey other
> limits, or at least wasn't documented in a way that might have helped me
> guess at the underlying problem.
Yes, I can see that happening here too. The problem seems to be that the
analyze function detoasts every tsvector in the sample and keeps all the
detoasted copies around for the whole scan. Tsvectors can be very large,
so that adds up quickly.
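For example, ANALYZE samples 300 * statistics_target rows, so with the
statistics target raised to 1000 that's 300,000 sampled tsvectors; if the
detoasted values average only ~35 kB each, holding all of them in memory
at once is already roughly 10 GB, in the ballpark of what was reported.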
That's pretty easy to fix: the analyze function needs to free each
detoasted copy as it goes. But in order to do that, it needs to make
copies of the lexemes stored in the hash table, instead of having the
hash table entries point directly into the detoasted tsvectors.
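To make that concrete, here's roughly what the per-row loop ends up
looking like with that change. This is a simplified sketch from memory,
not the patch itself: the variable names follow ts_typanalyze.c, 'value'
is the Datum fetched for the current sample row, and the lossy-counting
bookkeeping the real function does is left out.

	/* Detoast the tsvector for this sample row, if needed. */
	TSVector	vector = DatumGetTSVector(value);
	char	   *lexemesptr = STRPTR(vector);
	WordEntry  *curentryptr = ARRPTR(vector);
	int			j;

	for (j = 0; j < vector->size; j++)
	{
		LexemeHashKey hash_key;
		TrackItem  *item;
		bool		found;

		hash_key.lexeme = lexemesptr + curentryptr->pos;
		hash_key.length = curentryptr->len;

		item = (TrackItem *) hash_search(lexemes_tab, &hash_key,
										 HASH_ENTER, &found);
		if (!found)
		{
			/*
			 * New lexeme: copy it into separately palloc'd memory, so
			 * the hash entry stays valid after the detoasted tsvector
			 * is freed below.
			 */
			item->key.lexeme = palloc(hash_key.length);
			memcpy(item->key.lexeme, hash_key.lexeme, hash_key.length);
			item->frequency = 1;
		}
		else
			item->frequency++;

		curentryptr++;
	}

	/* If the vector was toasted, free the detoasted copy right away. */
	if (PointerGetDatum(vector) != value)
		pfree(vector);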
Patch attached. I think this counts as a bug, and we should backport this.
- Heikki