Re: patch: preload dictionary new version - Mailing list pgsql-hackers
From | Pavel Stehule |
---|---|
Subject | Re: patch: preload dictionary new version |
Date | |
Msg-id | AANLkTimgw4N_rNFpJANboidG9O5oCdkzGqdKHO0O2jCG@mail.gmail.com |
In response to | Re: patch: preload dictionary new version (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-hackers |
2010/7/8 Tom Lane <tgl@sss.pgh.pa.us>:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> 2010/7/8 Robert Haas <robertmhaas@gmail.com>:
>>> A precompiler can give you all the same memory management benefits.
>
>> I use mmap(). And with mmap the precompiler are not necessary.
>> Dictionary is loaded only one time - in original ispell format. I
>> think, it is much more simple for administration - just copy ispell
>> files. There are not some possible problems with binary
>> incompatibility, you don't need to solve serialisation,
>> deserialiasation, ...you don't need to copy TSearch ispell parser code
>> to client application - probably we would to support not compiled
>> ispell dictionaries still. Using a precompiler means a new questions
>> for upgrade!
>
> You're inventing a bunch of straw men to attack. There's no reason that
> a precompiler approach would have to put any new requirements on the
> user. For example, the dictionary-load code could automatically execute
> the precompile step if it observed that the precompiled copy of the
> dictionary was missing or had an older file timestamp than the source.

Uff, just activating the precompiler safely needs a lot of low-level
code. But maybe I see it wrong, since I don't work directly with files
inside pg. Still, I can't see it as a simple solution.

>
> I like the idea of a precompiler step mainly because it still gives you
> most of the benefits of the patch on platforms without mmap. (Instead
> of mmap'ing, just open and read() the precompiled file.) In particular,
> you would still have a creditable improvement for Windows users without
> writing any Windows-specific code.
>

Loading about 10 MB takes about 30 ms on my computer. That is better
than 90 ms, but it isn't a win.

>> I think we can divide this problem to three parts
>
>> a) simple allocator - it can be used not only for TSearch dictionaries.
>
> I think that's a waste of time, frankly. There aren't enough potential
> use cases.
>
>> b) sharing a data - it is important for large dictionaries
>
> Useful but not really essential.
>
>> c) preloading - it decrease load time of first TSearch query
>
> This is the part that is the make-or-break benefit of the patch.
> You need a solution that cuts load time even when mmap isn't
> available.
>

I am not sure whether such a solution exists, or whether it is
necessary. The main problem is probably with the Czech language; we
have a few specialities. For the Czech environment, UNIX and Windows
are the most important platforms. I have no information about Postgres
and full-text search being used on other platforms here. So the
solution probably doesn't need to be in core. I am now thinking about a
pgfoundry project, something like an ispell dictionary preload.

I can send just a simplified version without preloading and sharing,
solving only the memory issue. I think there are no differing opinions
about that.

best regards
Pavel Stehule

> regards, tom lane
>
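
For illustration only, the check Tom Lane describes in the quoted text (run the precompile step when the precompiled copy is missing or older than the source, then load it with plain open() and read() where mmap is not available) could be sketched in C roughly as below. This is a minimal sketch under stated assumptions: the names dict_needs_precompile and dict_read_file are hypothetical and come from neither the patch nor the PostgreSQL source, timestamps are compared at one-second st_mtime resolution, and the locking and error reporting a real backend would need are left out.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper, not PostgreSQL code.
 * Returns 1 if the precompiled copy must be (re)built, 0 if it is up to
 * date, and -1 if the source dictionary itself cannot be stat()ed.
 */
static int
dict_needs_precompile(const char *source_path, const char *compiled_path)
{
    struct stat src;
    struct stat bin;

    assert(source_path != NULL && compiled_path != NULL);

    if (stat(source_path, &src) != 0)
        return -1;              /* no source file: nothing to compile */
    if (stat(compiled_path, &bin) != 0)
        return 1;               /* precompiled copy is missing */

    /* Stale when the precompiled copy is older than the source. */
    return (bin.st_mtime < src.st_mtime) ? 1 : 0;
}

/*
 * Hypothetical helper: load a whole file with open()/read(), the fallback
 * for platforms without mmap.  Returns a malloc'ed buffer and stores its
 * length in *len, or returns NULL on any failure.
 */
static char *
dict_read_file(const char *path, size_t *len)
{
    struct stat st;
    char       *buf;
    size_t      total;
    size_t      done = 0;
    int         fd;

    assert(path != NULL && len != NULL);

    fd = open(path, O_RDONLY);
    if (fd < 0)
        return NULL;
    if (fstat(fd, &st) != 0)
    {
        close(fd);
        return NULL;
    }
    total = (size_t) st.st_size;

    buf = malloc(total > 0 ? total : 1);
    if (buf == NULL)
    {
        close(fd);
        return NULL;
    }

    /* read() may return fewer bytes than requested, so loop. */
    while (done < total)
    {
        ssize_t n = read(fd, buf + done, total - done);

        if (n <= 0)
            break;              /* error or unexpected end of file */
        done += (size_t) n;
    }
    close(fd);

    if (done != total)
    {
        free(buf);
        return NULL;
    }
    *len = done;
    return buf;
}
```

A caller would run the precompile step when dict_needs_precompile() returns 1 and then hand the precompiled path to dict_read_file(). Whether that extra step pays off against loading the original ispell files depends on the timings discussed in the message.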