Re: [PROPOSAL] Shared Ispell dictionaries - Mailing list pgsql-hackers

From Arthur Zakirov
Subject Re: [PROPOSAL] Shared Ispell dictionaries
Date
Msg-id 20180325095340.GA1370@arthur.localdomain
Whole thread Raw
In response to Re: [PROPOSAL] Shared Ispell dictionaries  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [PROPOSAL] Shared Ispell dictionaries  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sat, Mar 24, 2018 at 04:56:36PM -0400, Tom Lane wrote:
> Arthur Zakirov <a.zakirov@postgrespro.ru> writes:
> > [ v10 patch versions ]

Thank you for the review, Tom!

Tomas Vondra wrote:
> Tom Lane wrote:
>> * I cannot imagine a use-case for setting max_shared_dictionaries_size
>> to anything except "unlimited".  If it's not that, and you exceed it,
>> then subsequent backends load private copies of the dictionary, making
>> your memory situation rapidly worse not better.  I think we should lose
>> that GUC altogether and just load dictionaries automatically.
>
> Introduction of that limit is likely my fault. It came from from an
> extension I wrote a long time ago, but back then it was a necessity
> because we did not have DSM. So in retrospect I agree with you - it's
> not particularly useful and we should ditch it.
> 
> Arthur, let this be a lesson for you! You have to start fight against
> bogus feature requests from other people ;-)

Yeah, in this sense max_shared_dictionaries_size is pointless. I'll
remove it then :).

> * Similarly, I see no point in a "sharable" option on individual
> dictionaries, especially when there's only one allowed setting for
> most dictionary types.  Let's lose that too.

I think "Shareable" option could be useful if a shared dictionary
building time was much longer than a non-shared dictionary building
time. It is slightly longer because of additional memcpy(), but isn't
noticable I think. So it is worth to remove it.

> * And that leads us to not particularly need a view telling which
> dictionaries are loaded, either.  It's just an implementation detail
> that users don't need to worry about.

If all dictionaries will be shareable then this view could be removed.
Unfortunately I think it can't help with leaked segments, I didn't find
a way to iterate dshash entries. That's why pg_ts_shared_dictionaries()
scans pg_ts_dict table instead of scanning dshash table.

> I do think it's required that changing the dictionary's options with
> ALTER TEXT SEARCH DICTIONARY automatically cause a reload; but if that's
> happening with this patch, I don't see where.  (It might work to use
> the combination of dictionary OID and TID of the dictionary's pg_ts_dict
> tuple as the lookup key for shared dictionaries.  Oh, and have you
> thought about the possibility of conflicting OIDs in different DBs?
> Probably the database OID has to be part of the key, as well.)

Yes unfortunately ALTER TEXT SEARCH DICTIONARY doesn't reload a
dictionary. TID can help here. I thought about using XID too when I
started to work on RELOAD command. But I'm not sure that it is a good
idea, anyway XID isn't needed in current version.

> Also, the scheme for releasing the dictionary DSM during
> RemoveTSDictionaryById is uncertain and full of race conditions:
> the DROP might roll back later, or someone might come along and
> start using the dictionary (causing a fresh DSM load) before the
> DROP commits and makes the dictionary invisible to other sessions.
> I don't think that either of those are necessarily fatal objections,
> but there needs to be some commentary there explaining what happens.

I missed this case. As you wrote below ROLLBACK case is ok. But I
haven't a soluton for the second case for now. If I won't solve it I'll
add additional comments in RemoveTSConfigurationById() and maybe in the
documentation if it's appropriate.

> BTW, I was going to complain that this patch alters the API for
> dictionary template init functions without any documentation updates;
> but then I realized that there isn't any documentation to update.
> That pretty well sucks, but I suppose it's not the job of this patch
> to improve that situation.  Still, you could spend a bit more effort on
> the commentary in ts_public.h in 0002, because that commentary is as
> close to an API spec as we've got.

I'll fix the comments.

-- 
Arthur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company


pgsql-hackers by date:

Previous
From: Michael Paquier
Date:
Subject: Re: Using base backup exclusion filters to reduce data transferredwith pg_rewind
Next
From: Arthur Zakirov
Date:
Subject: Re: [PROPOSAL] Shared Ispell dictionaries