
Re: GIN improvements part 1: additional information - Mailing list pgsql-hackers

From	Alexander Korotkov
Subject	Re: GIN improvements part 1: additional information
Date	January 15, 2014 06:47:37
Msg-id	CAPpHfdv9Pheu5atEEUk75f_S1nf6vCoRqge2yUQ1v3xgZ4UP3g@mail.gmail.com
In response to	Re: GIN improvements part 1: additional information (Tomas Vondra <tv@fuzzy.cz>)
Responses	Re: GIN improvements part 1: additional information
List	pgsql-hackers


On Wed, Jan 15, 2014 at 5:17 AM, Tomas Vondra <tv@fuzzy.cz> wrote:

On 14.1.2014 00:38, Tomas Vondra wrote:
> On 13.1.2014 18:07, Alexander Korotkov wrote:
>> On Sat, Jan 11, 2014 at 6:15 AM, Tomas Vondra <tv@fuzzy.cz> wrote:
>>
>> On 8.1.2014 22:58, Alexander Korotkov wrote:
>> > Thanks for reporting. Fixed version is attached.
>>
>> I've tried to rerun the 'archie' benchmark with the current patch, and
>> once again I got
>>
>> PANIC: could not split GIN page, didn't fit
>>
>> I reran it with '--enable-cassert' and with that I got
>>
>> TRAP: FailedAssertion("!(ginCompareItemPointers(&items[i - 1],
>> &items[i]) < 0)", File: "gindatapage.c", Line: 149)
>> LOG: server process (PID 5364) was terminated by signal 6: Aborted
>> DETAIL: Failed process was running: INSERT INTO messages ...
>>
>> so the assert in GinDataLeafPageGetUncompressed fails for some reason.
>>
>> I can easily reproduce it, but my knowledge in this area is rather
>> limited so I'm not entirely sure what to look for.
>>
>>
>> I've fixed this bug and many other bugs. Now the patch passes the test
>> suite that I've used earlier. The results are as follows:
>
> OK, it seems the bug is gone. However now there's a memory leak
> somewhere. I'm loading pgsql mailing list archives (~600k messages)
> using this script
>
> https://bitbucket.org/tvondra/archie/src/1bbeb920/bin/load.py
>
> And after loading about 1/5 of the data, all the memory gets filled by
> the pgsql backends (loading the data in parallel) and the DB gets killed
> by the OOM killer.

I've spent a fair amount of time trying to locate the memory leak, but
so far no luck. I'm not sufficiently familiar with the GIN code.

I can however demonstrate that it's there, and I have a rather simple test
case to reproduce it - basically just a CREATE INDEX on a table with ~1M
email message bodies (in a tsvector column). The data is available here
(360MB compressed, 1GB raw):

http://www.fuzzy.cz/tmp/message-b.data.gz

Simply create a single-column table, load data and create the index

CREATE TABLE test ( body_tsvector TSVECTOR );
COPY test FROM '/tmp/message-b.data';
CREATE INDEX test_idx ON test USING gin ( body_tsvector );

I'm running this on a machine with 8GB of RAM, with these settings

shared_buffers=1GB
maintenance_work_mem=1GB

According to top, CREATE INDEX from the current HEAD never consumes more
than ~25% of RAM:

PID USER PR NI VIRT RES SHR %CPU %MEM COMMAND
32091 tomas 20 0 2026032 1,817g 1,040g 56,2 23,8 postgres

which is about right, as (shared_buffers + maintenance_work_mem) is
about 1/4 of RAM.

With the v5 patch version applied, the CREATE INDEX process eventually
goes crazy and allocates almost all the available memory (but sometimes
finishes, mostly by pure luck). This is what I was able to get from top

PID USER PR NI VIRT RES SHR S %CPU %MEM COMMAND
14090 tomas 20 0 7913820 6,962g 955036 D 4,3 91,1 postgres

while the system was still reasonably responsive.

Thanks a lot for your help! I believe the problem is that each decompressed array of item pointers is palloc'd but never freed. I hope to fix it today.
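
A minimal sketch of that kind of leak, for illustration only. This is not the GIN code: malloc/free stand in for palloc/pfree, and ItemPtr, decompress_items and peak_live_arrays are hypothetical names. It counts how many decompressed arrays are alive at once when each one is freed after use versus when they are all kept until the end of the build.

```c
#include <stddef.h>
#include <stdlib.h>

/* Illustrative stand-in for an item pointer; not PostgreSQL's real layout. */
typedef struct { unsigned block; unsigned short offset; } ItemPtr;

/* Hypothetical stand-in for decompressing one posting list: always returns
 * a freshly allocated array that the caller owns. */
static ItemPtr *decompress_items(size_t nitems)
{
    ItemPtr *items = malloc(nitems * sizeof(ItemPtr));
    if (items == NULL && nitems > 0)
        abort();
    for (size_t i = 0; i < nitems; i++) {
        items[i].block = (unsigned) i;
        items[i].offset = 1;
    }
    return items;
}

/* Processes npages pages and returns the peak number of decompressed arrays
 * alive at once. With free_each_array == 0 the arrays pile up until the end
 * of the build, where they are released in one sweep; with a nonzero value
 * each array is released right after use. */
size_t peak_live_arrays(size_t npages, size_t nitems, int free_each_array)
{
    ItemPtr **kept = malloc(npages * sizeof(ItemPtr *));
    size_t nkept = 0, live = 0, peak = 0;

    if (kept == NULL && npages > 0)
        abort();
    for (size_t page = 0; page < npages; page++) {
        ItemPtr *items = decompress_items(nitems);
        live++;
        if (live > peak)
            peak = live;
        /* ... the decompressed items would be used here ... */
        if (free_each_array) {
            free(items);
            live--;
        } else {
            kept[nkept++] = items;  /* never released inside the loop */
        }
    }
    for (size_t i = 0; i < nkept; i++)  /* end-of-build cleanup */
        free(kept[i]);
    free(kept);
    return peak;
}
```

With the per-array free the peak stays at one array regardless of the number of pages; without it the peak grows linearly with the number of pages processed, which matches the steadily growing RES figure reported above.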

------
With best regards,
Alexander Korotkov.
