Re: [PROPOSAL] Shared Ispell dictionaries - Mailing list pgsql-hackers

From Tomas Vondra
Subject Re: [PROPOSAL] Shared Ispell dictionaries
Date
Msg-id d4d52484-e269-9d7b-cef5-c1e95a03d9c6@2ndquadrant.com
Whole thread Raw
In response to Re: [PROPOSAL] Shared Ispell dictionaries  (Arthur Zakirov <a.zakirov@postgrespro.ru>)
Responses Re: [PROPOSAL] Shared Ispell dictionaries  (Andres Freund <andres@anarazel.de>)
Re: [PROPOSAL] Shared Ispell dictionaries  (Arthur Zakirov <a.zakirov@postgrespro.ru>)
List pgsql-hackers
On 03/17/2018 05:43 AM, Arthur Zakirov wrote:
> Hello Tomas,
> 
>     Arthur, what are your plans with this patch in the current CF?
> 
> 
> I think dsm-based approach is in good shape already and works nice.
> I've planned only to improve the documentation a little. Also it seems I
> should change 0004 part, I found that extension upgrade scripts may be
> made in wrong way.
> In my opinion RELOAD and UNLOAD commands can be made in next commitfest
> (2018-09).
> Did you look it? Have you arguments about how shared memory allocation
> and releasing functions are made?
>  
> 
> 
>     It does not seem to be moving towards RFC very much, and reworking the
>     patch to use mmap() seems like a quite significant change late in the
>     CF. Which means it's likely to cause the patch get get bumped to the
>     next CF (2018-09).
> 
> 
> Agree. I have a draft version for mmap-based approach which works in
> platforms with mmap. In Windows it is necessary to use another API
> (CreateFileMapping, etc). But this approach requires more work on
> handling processed dictionary files (how name them, when remove).
>  
> 
> 
>     FWIW I am not quite sure if the mmap() approach is better than what was
>     implemented by the patch. I'm not sure how exactly will it behave under
>     memory pressure (AFAIK it goes through page cache, which means random
>     parts of dictionaries might get evicted) or how well is it supported on
>     various platforms (say, Windows).
> 
> 
> Yes, as I wrote mmap-based approach requires more work. The only
> benefit I see is that you don't need to process a dictionary after
> server restart.  I'd vote for dsm-based approach.
> 

I do agree with that. We have a working well-understood dsm-based
solution, addressing the goals initially explained in this thread.

I don't see a reason to stall this patch based on a mere assumption that
the mmap-based approach might be magically better in some unknown
aspects. It might be, but we may as well leave that as a future work.

I wonder how much of this patch would be affected by the switch from dsm
to mmap? I guess the memory limit would get mostly irrelevant (mmap
would rely on the OS to page the memory in/out depending on memory
pressure), and so would the UNLOAD/RELOAD commands (because each backend
would do it's own mmap).

In any case, I suggest to polish the dsm-based patch, and see if we can
get that one into PG11.

regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: ECPG installcheck tests fail if PGDATABASE is set
Next
From: Craig Ringer
Date:
Subject: Re: HELP