Re: GIN improvements part 1: additional information - Mailing list pgsql-hackers

From Alexander Korotkov
Subject Re: GIN improvements part 1: additional information
Msg-id CAPpHfdvmYZqh=QdV+wrNyR+GJxS4gEH0gmKG4sTxeOfVxbcV_g@mail.gmail.com
In response to Re: GIN improvements part 1: additional information  (Tomas Vondra <tv@fuzzy.cz>)
Responses Re: GIN improvements part 1: additional information  (Tomas Vondra <tv@fuzzy.cz>)
List pgsql-hackers
On Sat, Oct 12, 2013 at 1:55 AM, Tomas Vondra <tv@fuzzy.cz> wrote:
On 10.10.2013 13:57, Heikki Linnakangas wrote:
> On 09.10.2013 02:04, Tomas Vondra wrote:
>> On 8.10.2013 21:59, Heikki Linnakangas wrote:
>>> On 08.10.2013 17:47, Alexander Korotkov wrote:
>>>> Hi, Tomas!
>>>>
>>>> On Sun, Oct 6, 2013 at 3:58 AM, Tomas Vondra <tv@fuzzy.cz> wrote:
>>>>
>>>>> I've attempted to rerun the benchmark tests I did a few weeks ago,
>>>>> but I got repeated crashes when loading the data (into a table with
>>>>> a tsvector+gin index).
>>>>>
>>>>> Right before a crash, there's this message in the log:
>>>>>
>>>>>      PANIC:  not enough space in leaf page!
>>>>>
>>>>
>>>> Thanks for testing. Heikki's version of the patch doesn't work for me
>>>> either, even on much simpler examples. I can try to get it working if
>>>> he answers my question about the GinDataLeafPageGetPostingList* macros.
>>>
>>> The new macros in that patch version were quite botched. Here's a new
>>> attempt.
>>
>> Nope, still the same errors :-(
>>
>> PANIC:  not enough space in leaf page!
>> LOG:  server process (PID 29722) was terminated by signal 6: Aborted
>> DETAIL:  Failed process was running: autovacuum: ANALYZE public.messages
>
> I've continued hacking away at the patch, here's yet another version.
> I've done a lot of cleanup and refactoring to make the code more
> readable (I hope). I'm not sure what caused the panic you saw, but it's
> probably fixed now.  Let me know if not.

Yup, this version fixed the issues. I haven't been able to do any
benchmarks yet; all I have is some basic stats:

               |    HEAD | patched
==================================
load duration  |  1084 s |  1086 s
subject index  |   96 MB |   96 MB
body index     | 2349 MB | 2051 MB

So there's virtually no difference in speed (which is expected, AFAIK)
and the large index on full message bodies is significantly smaller.

Yes, there should be no significant difference in speed. But the difference in index sizes seems too small. Could you share a database dump somewhere?
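
To put the numbers above in relative terms, here is a minimal sketch in Python. It is plain arithmetic on the figures Tomas reported; the function name relative_change is just an illustrative helper, not anything from the patch.

```python
def relative_change(before, after):
    """Return the change from `before` to `after` as a percentage of `before`."""
    return (after - before) / before * 100.0


# Figures copied from the stats table in this thread (HEAD vs. patched).
stats = {
    "load duration (s)": (1084, 1086),
    "subject index (MB)": (96, 96),
    "body index (MB)": (2349, 2051),
}

for name, (head, patched) in stats.items():
    print(f"{name}: {relative_change(head, patched):+.1f}%")
```

On these figures the body index shrinks by roughly 13%, while load time and the subject index are essentially unchanged.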

------
With best regards,
Alexander Korotkov.
