[PROPOSAL] Shared Ispell dictionaries - Mailing list pgsql-hackers

From Arthur Zakirov
Subject [PROPOSAL] Shared Ispell dictionaries
Date
Msg-id 20171226164825.GA29922@zakirov.localdomain
List pgsql-hackers
Hello, hackers!

Introduction
------------

I'm going to implement a patch that stores Ispell dictionaries in shared memory.

There is an extension, shared_ispell [1], developed by Tomas Vondra, but it is a bad candidate for inclusion in
contrib because it has to know a lot about the IspellDict struct in order to copy it into shared memory.

Why
---

A shared Ispell dictionary gives the following improvements:
- less memory is consumed: currently an Ispell dictionary is loaded into memory by every backend, and some
  dictionaries require more than 100 MB
- there is no overhead during the first call of a full text search function (such as to_tsvector(), to_tsquery())

Implementation
--------------

It is necessary to change all structures related to IspellDict: SPNode, AffixNode, AFFIX, CMPDAffix, and IspellDict
itself. None of them should use pointers, because they will be stored in shared memory, which can be mapped at a
different address in each backend. The other structures are used only during dictionary building.
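
To illustrate the pointer-free idea, here is a minimal sketch of a prefix tree whose nodes refer to each other by
array index instead of by pointer. All names (FlatNode, FlatDict, flat_insert, flat_lookup) are hypothetical and this
is not the real SPNode layout; it only shows that such a structure stays valid after being copied to another address.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MAX_NODES 64            /* arbitrary capacity for the sketch */
#define NO_NODE   UINT32_MAX    /* plays the role of a NULL pointer */

/* Hypothetical node: links are indexes into FlatDict.nodes, not pointers. */
typedef struct FlatNode
{
    char     ch;        /* character on the edge leading to this node */
    uint8_t  is_word;   /* nonzero if a word ends at this node */
    uint32_t child;     /* index of the first child, or NO_NODE */
    uint32_t sibling;   /* index of the next sibling, or NO_NODE */
} FlatNode;

typedef struct FlatDict
{
    uint32_t nnodes;            /* number of nodes in use */
    FlatNode nodes[MAX_NODES];  /* node 0 is the root */
} FlatDict;

static void
flat_init(FlatDict *d)
{
    memset(d, 0, sizeof(*d));
    d->nodes[0].child = NO_NODE;
    d->nodes[0].sibling = NO_NODE;
    d->nnodes = 1;
}

/* Find the child of "parent" labelled "ch", or NO_NODE. */
static uint32_t
flat_find_child(const FlatDict *d, uint32_t parent, char ch)
{
    uint32_t c = d->nodes[parent].child;

    while (c != NO_NODE && d->nodes[c].ch != ch)
        c = d->nodes[c].sibling;
    return c;
}

static void
flat_insert(FlatDict *d, const char *word)
{
    uint32_t cur = 0;

    for (; *word; word++)
    {
        uint32_t c = flat_find_child(d, cur, *word);

        if (c == NO_NODE)
        {
            assert(d->nnodes < MAX_NODES);
            c = d->nnodes++;
            d->nodes[c].ch = *word;
            d->nodes[c].is_word = 0;
            d->nodes[c].child = NO_NODE;
            /* push the new node onto the front of the child list */
            d->nodes[c].sibling = d->nodes[cur].child;
            d->nodes[cur].child = c;
        }
        cur = c;
    }
    d->nodes[cur].is_word = 1;
}

static int
flat_lookup(const FlatDict *d, const char *word)
{
    uint32_t cur = 0;

    for (; *word; word++)
    {
        cur = flat_find_child(d, cur, *word);
        if (cur == NO_NODE)
            return 0;
    }
    return d->nodes[cur].is_word;
}
```

Because the links are indexes, a plain memcpy() of a FlatDict to another location gives a working copy, which is the
property needed for data that different backends may see at different addresses.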
 
It would be good to store the StopList struct in shared memory too.

All fields of the IspellDict struct that are used only during dictionary building will be moved into a new
IspellDictBuild struct to decrease the required shared memory size. They will be released together with buildCxt.
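
The split between build-time and runtime data could look roughly like the sketch below. The names (DemoDictBuild,
DemoDict, build_add, build_finish, demo_word) are hypothetical, the "dictionary" is just a word list, and plain
malloc()/free() stand in for the memory context; the only point is that build-only fields live in a separate struct
that is thrown away once a compact, pointer-free result has been produced.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Build-time state (hypothetical): may use pointers and growable arrays. */
typedef struct DemoDictBuild
{
    char  **words;      /* individually allocated copies of the words */
    size_t  nwords;
    size_t  capacity;
    size_t  total_len;  /* bytes needed for all words, including NULs */
} DemoDictBuild;

/* Runtime state (hypothetical): one contiguous chunk, no pointers. */
typedef struct DemoDict
{
    uint32_t nwords;
    uint32_t offsets[]; /* byte offsets from the start of the struct;
                         * the string data follows this array */
} DemoDict;

/* Append a word to the build state. Returns 0 on success, -1 on OOM. */
static int
build_add(DemoDictBuild *b, const char *word)
{
    size_t len = strlen(word) + 1;
    char  *copy;

    if (b->nwords == b->capacity)
    {
        size_t newcap = b->capacity ? b->capacity * 2 : 8;
        char **tmp = realloc(b->words, newcap * sizeof(char *));

        if (tmp == NULL)
            return -1;
        b->words = tmp;
        b->capacity = newcap;
    }
    copy = malloc(len);
    if (copy == NULL)
        return -1;
    memcpy(copy, word, len);
    b->words[b->nwords++] = copy;
    b->total_len += len;
    return 0;
}

/*
 * Pack the words into a single allocation and release the build state.
 * The returned chunk (size reported in *size_out) is what would be copied
 * into shared memory. Returns NULL if the allocation fails.
 */
static DemoDict *
build_finish(DemoDictBuild *b, size_t *size_out)
{
    size_t    header = sizeof(DemoDict) + b->nwords * sizeof(uint32_t);
    size_t    size = header + b->total_len;
    size_t    pos = header;
    size_t    i;
    DemoDict *d = malloc(size);

    if (d != NULL)
    {
        d->nwords = (uint32_t) b->nwords;
        for (i = 0; i < b->nwords; i++)
        {
            size_t len = strlen(b->words[i]) + 1;

            d->offsets[i] = (uint32_t) pos;
            memcpy((char *) d + pos, b->words[i], len);
            pos += len;
        }
        assert(pos == size);
        if (size_out != NULL)
            *size_out = size;
    }

    /* The build-only data is no longer needed, whether or not we succeeded. */
    for (i = 0; i < b->nwords; i++)
        free(b->words[i]);
    free(b->words);
    memset(b, 0, sizeof(*b));
    return d;
}

static const char *
demo_word(const DemoDict *d, uint32_t i)
{
    assert(i < d->nwords);
    return (const char *) d + d->offsets[i];
}
```

With this layout the size of the final chunk depends only on the data that lookups need, not on the temporary arrays
used while parsing.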
 

Each dictionary will be stored in its own DSM segment. Structures for regular expressions won't be stored in shared
memory. They will be compiled in every backend.
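
A rough sketch of the per-backend compilation: the shared data keeps only the pattern text, and each process compiles
it lazily into a local cache on first use. The names (DemoSharedAffix, DemoLocalAffix, demo_affix_matches) and the
fixed-size cache are hypothetical, and POSIX regcomp()/regexec() are used here only as a stand-in for the regular
expression code that the dictionary actually uses.

```c
#include <assert.h>
#include <regex.h>
#include <stddef.h>

#define DEMO_NAFFIX 2           /* arbitrary number of affixes for the sketch */

/* Would live in the shared segment (hypothetical): plain bytes only. */
typedef struct DemoSharedAffix
{
    char pattern[32];           /* regular expression source text */
} DemoSharedAffix;

/* Process-local (hypothetical): the compiled form is never shared. */
typedef struct DemoLocalAffix
{
    int     compiled;           /* has "re" been compiled in this process? */
    regex_t re;
} DemoLocalAffix;

static DemoLocalAffix local_cache[DEMO_NAFFIX];

/*
 * Returns 1 if "word" matches the pattern of affix "i", 0 if it does not,
 * and -1 if the pattern fails to compile. The pattern is compiled at most
 * once per process.
 */
static int
demo_affix_matches(const DemoSharedAffix *shared, size_t i, const char *word)
{
    DemoLocalAffix *loc;

    assert(i < DEMO_NAFFIX);
    loc = &local_cache[i];

    if (!loc->compiled)
    {
        if (regcomp(&loc->re, shared[i].pattern,
                    REG_EXTENDED | REG_NOSUB) != 0)
            return -1;
        loc->compiled = 1;
    }
    return regexec(&loc->re, word, 0, NULL, 0) == 0;
}
```

The cost of this approach is that every backend still pays for compiling the expressions it uses, but the compiled
structures, which contain pointers, never have to be placed in shared memory.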
 

The patch will be ready and added to the 2018-03 commitfest.

Thank you for your attention. Any thoughts?


[1] - github.com/tvondra/shared_ispell or github.com/postgrespro/shared_ispell

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

