Re: [PROPOSAL] Shared Ispell dictionaries - Mailing list pgsql-hackers

From Arthur Zakirov
Subject Re: [PROPOSAL] Shared Ispell dictionaries
Date
Msg-id 20180302111924.GB18933@zakirov.localdomain
In response to Re: [PROPOSAL] Shared Ispell dictionaries  (Andres Freund <andres@anarazel.de>)
List pgsql-hackers
Hello,

Thank you for your comments.

On Thu, Mar 01, 2018 at 08:31:49PM -0800, Andres Freund wrote:
> Hi,
> 
> On 2018-02-07 19:28:29 +0300, Arthur Zakirov wrote:
> > +    {
> > +        {"max_shared_dictionaries_size", PGC_POSTMASTER, RESOURCES_MEM,
> > +            gettext_noop("Sets the maximum size of all text search dictionaries loaded into shared memory."),
> > +            gettext_noop("Currently controls only loading of Ispell dictionaries. "
> > +                         "If total size of simultaneously loaded dictionaries "
> > +                         "reaches the maximum allowed size then a new dictionary "
> > +                         "will be loaded into local memory of a backend."),
> > +            GUC_UNIT_KB,
> > +        },
> > +        &max_shared_dictionaries_size,
> > +        100 * 1024, 0, MAX_KILOBYTES,
> > +        NULL, NULL, NULL
> > +    },
> 
> So this uses shared memory, allocated at server start?  That doesn't
> seem right. Wouldn't it make more sense to have a
> 'num_shared_dictionaries' GUC, and then allocate them with dsm? Or even
> better not have any such limit and use a dshash table to point to
> individual loaded tables?

The patch already uses dsm and a dshash table.
The 'max_shared_dictionaries_size' GUC was introduced after a discussion with
Tomas [1], to limit the amount of memory consumed by loaded dictionaries and
to prevent possible memory bloat. Its default value is 100MB.
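
To illustrate the intended effect of the limit, here is a minimal sketch. It
is not code from the patch: DictMemBudget and dict_budget_reserve() are
hypothetical names, and the exact boundary check (reject a dictionary that
would push the total over the limit) is my assumption for illustration. When
the reservation fails, the caller would load the dictionary into backend-local
memory instead.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical bookkeeping for illustration; not the patch's structures. */
typedef struct DictMemBudget
{
    size_t      max_bytes;      /* limit, e.g. max_shared_dictionaries_size */
    size_t      used_bytes;     /* total size of dictionaries already shared */
} DictMemBudget;

/*
 * Try to account for a dictionary of dict_bytes in the shared budget.
 * Returns true if it fits (the caller may place it in shared memory),
 * false if the caller should fall back to backend-local memory.
 */
static bool
dict_budget_reserve(DictMemBudget *budget, size_t dict_bytes)
{
    assert(budget != NULL);
    assert(budget->used_bytes <= budget->max_bytes);

    /* Compare against the remaining room to avoid overflow in the sum. */
    if (dict_bytes > budget->max_bytes - budget->used_bytes)
        return false;

    budget->used_bytes += dict_bytes;
    return true;
}
```

With the default of 100MB, a 60MB dictionary would be shared, a following
50MB one would go to local memory, and a following 40MB one would still fit.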

There was a 'shared_dictionaries' GUC before; it was introduced because
regular hash tables were used at that time, not dshash. I replaced the regular
hash tables with dshash, removed 'shared_dictionaries' and added
'max_shared_dictionaries_size'.

> Is there any chance we can instead convert dictionaries into a form
> we can just mmap() into memory?  That'd scale a lot higher and more
> dynamically?

I think the new IspellDictData structure (in 0003-Store-ispell-structures-in-shmem-v5.patch)
can already be stored in a binary file and mapped into memory. But
mmap() is not used in this patch yet.

I can do some experiments and make a prototype.
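
As a rough idea of what such a prototype could look like, here is a sketch
using plain POSIX calls. It is not from the patch and does not use
IspellDictData: DictFileHeader, DICT_FILE_MAGIC, dict_file_write() and
dict_file_map() are hypothetical names, and the file layout (a small header
followed by a flat payload) is assumed only for illustration.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

#define DICT_FILE_MAGIC 0x44494354u     /* hypothetical marker, "DICT" */

/* Hypothetical on-disk header; the payload bytes follow it directly. */
typedef struct DictFileHeader
{
    uint32_t    magic;
    uint32_t    payload_len;
} DictFileHeader;

/* Write header and payload to path. Returns 0 on success, -1 on error. */
static int
dict_file_write(const char *path, const void *payload, uint32_t len)
{
    DictFileHeader hdr = {DICT_FILE_MAGIC, len};
    int         fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);

    if (fd < 0)
        return -1;
    if (write(fd, &hdr, sizeof(hdr)) != (ssize_t) sizeof(hdr) ||
        write(fd, payload, len) != (ssize_t) len)
    {
        close(fd);
        return -1;
    }
    return close(fd);
}

/*
 * Map the file read-only. Returns a pointer to the header, or NULL on
 * error; *map_len receives the mapping size to pass to munmap() later.
 */
static const DictFileHeader *
dict_file_map(const char *path, size_t *map_len)
{
    struct stat st;
    void       *base;
    const DictFileHeader *hdr;
    int         fd;

    assert(path != NULL && map_len != NULL);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) != 0 || (size_t) st.st_size < sizeof(DictFileHeader))
    {
        close(fd);
        return NULL;
    }

    base = mmap(NULL, (size_t) st.st_size, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);                  /* the mapping stays valid after close() */
    if (base == MAP_FAILED)
        return NULL;

    /* Reject files with a wrong marker or a payload length past the end. */
    hdr = (const DictFileHeader *) base;
    if (hdr->magic != DICT_FILE_MAGIC ||
        (size_t) hdr->payload_len > (size_t) st.st_size - sizeof(DictFileHeader))
    {
        munmap(base, (size_t) st.st_size);
        return NULL;
    }

    *map_len = (size_t) st.st_size;
    return hdr;
}
```

Because the mapping is read-only and file-backed, every backend that maps the
same file can share the same pages through the OS page cache. This only works
if the stored structure contains no pointers, only offsets.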


1 - https://www.postgresql.org/message-id/d12d9395-922c-64c9-c87d-dd0e1d31440e%402ndquadrant.com

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

