Re: WIP: shared ispell dictionary - Mailing list pgsql-hackers
From | Pavel Stehule |
---|---|
Subject | Re: WIP: shared ispell dictionary |
Date | |
Msg-id | 162867791003180808p49a047cfj72d1d89ce5121d9e@mail.gmail.com |
In response to | Re: WIP: shared ispell dictionary (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses | Re: WIP: shared ispell dictionary |
List | pgsql-hackers |
2010/3/18 Tom Lane <tgl@sss.pgh.pa.us>:
> Pavel Stehule <pavel.stehule@gmail.com> writes:
>> I know so Tom worries about using of share memory.
>
> You're right, and if I have any say in the matter no patch like this
> will ever go in.
>
> What I would suggest looking into is some way of preprocessing the raw
> text dictionary file into a format that can be slurped into memory
> quickly.  The main problem compared to the way things are done now
> is that the current internal format relies heavily on pointers.
> Maybe you could replace those by offsets?

Then you have to maintain a new application :( and that can bring a new
kind of bug. I am playing with a preload solution now, and I found a new
issue. I don't know why, but when I preload a library that uses a lot of
memory, like ispell, all subsequent operations are more than ten times
slower :(

[pavel@nemesis tsearch]$ psql-dev3 postgres
Timing is on.
psql-dev3 (9.0devel)
Type "help" for help.

postgres=# select 10;
 ?column?
----------
       10
(1 row)

Time: 0,611 ms
postgres=# select 10;
 ?column?
----------
       10
(1 row)

Time: 0,277 ms
postgres=# select 10;
 ?column?
----------
       10
(1 row)

Time: 0,266 ms
postgres=# select 10;
 ?column?
----------
       10
(1 row)

Time: 0,348 ms
postgres=# select * from ts_debug('cs','Jmenuji se Pavel Stěhule a bydlím ve Skalici');
   alias   |    description    |  token  |       dictionaries        |    dictionary    |     lexemes
-----------+-------------------+---------+---------------------------+------------------+-----------------
 asciiword | Word, all ASCII   | Jmenuji | {preloaded_cspell,simple} | preloaded_cspell | {jmenovat}
 blank     | Space symbols     |         | {}                        |                  |
 asciiword | Word, all ASCII   | se      | {preloaded_cspell,simple} | preloaded_cspell | {}
 blank     | Space symbols     |         | {}                        |                  |
 asciiword | Word, all ASCII   | Pavel   | {preloaded_cspell,simple} | preloaded_cspell | {pavel,pavla}
 blank     | Space symbols     |         | {}                        |                  |
 word      | Word, all letters | Stěhule | {preloaded_cspell,simple} | preloaded_cspell | {stěhule}
 blank     | Space symbols     |         | {}                        |                  |
 asciiword | Word, all ASCII   | a       | {preloaded_cspell,simple} | preloaded_cspell | {}
 blank     | Space symbols     |         | {}                        |                  |
 word      | Word, all letters | bydlím  | {preloaded_cspell,simple} | preloaded_cspell | {bydlet,bydlit}
 blank     | Space symbols     |         | {}                        |                  |
 asciiword | Word, all ASCII   | ve      | {preloaded_cspell,simple} | preloaded_cspell | {}
 blank     | Space symbols     |         | {}                        |                  |
 asciiword | Word, all ASCII   | Skalici | {preloaded_cspell,simple} | preloaded_cspell | {skalice}
(15 rows)

Time: 24,495 ms
postgres=# select * from ts_debug('cs','Jmenuji se Pavel Stěhule a bydlím ve Skalici');
   alias   |    description    |  token  |       dictionaries        |    dictionary    |     lexemes
-----------+-------------------+---------+---------------------------+------------------+-----------------
 asciiword | Word, all ASCII   | Jmenuji | {preloaded_cspell,simple} | preloaded_cspell | {jmenovat}
 blank     | Space symbols     |         | {}                        |                  |
 asciiword | Word, all ASCII   | se      | {preloaded_cspell,simple} | preloaded_cspell | {}
 blank     | Space symbols     |         | {}                        |                  |
 asciiword | Word, all ASCII   | Pavel   | {preloaded_cspell,simple} | preloaded_cspell | {pavel,pavla}
 blank     | Space symbols     |         | {}                        |                  |
 word      | Word, all letters | Stěhule | {preloaded_cspell,simple} | preloaded_cspell | {stěhule}
 blank     | Space symbols     |         | {}                        |                  |
 asciiword | Word, all ASCII   | a       | {preloaded_cspell,simple} | preloaded_cspell | {}
 blank     | Space symbols     |         | {}                        |                  |
 word      | Word, all letters | bydlím  | {preloaded_cspell,simple} | preloaded_cspell | {bydlet,bydlit}
 blank     | Space symbols     |         | {}                        |                  |
 asciiword | Word, all ASCII   | ve      | {preloaded_cspell,simple} | preloaded_cspell | {}
 blank     | Space symbols     |         | {}                        |                  |
 asciiword | Word, all ASCII   | Skalici | {preloaded_cspell,simple} | preloaded_cspell | {skalice}
(15 rows)

Time: 18,426 ms
postgres=# select 10;
 ?column?
----------
       10
(1 row)

Time: 12,700 ms
postgres=# select 10;
 ?column?
----------
       10
(1 row)

Time: 12,465 ms
postgres=# select 10;
 ?column?
----------
       10
(1 row)

Time: 12,603 ms
postgres=# select 10;
 ?column?
----------
       10
(1 row)

Time: 12,901 ms
postgres=# select 10;
 ?column?
----------
       10
(1 row)

Time: 12,642 ms

When I reduce memory usage with a simple allocator, the issue goes away,
but it is strange.

Pavel

>
>                        regards, tom lane
>
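For readers following the pointers-versus-offsets suggestion quoted above, here is a minimal sketch of the idea in C. It is not the tsearch ispell code: the names `FlatHeader`, `FlatEntry`, `flat_build` and `flat_lookup` are hypothetical, and a sorted word table stands in for the real dictionary structures. The only point it illustrates is that when every reference is stored as a byte offset from the start of the image, the image can be copied or read from a file to any address and used without fix-ups.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical flat image layout: header, entry table, string pool.
 * All references are byte offsets from the start of the image. */
typedef struct {
    uint32_t nentries;    /* number of FlatEntry records after the header */
    uint32_t total_size;  /* size of the whole image in bytes */
} FlatHeader;

typedef struct {
    uint32_t word_off;    /* offset of the NUL-terminated word */
    uint32_t lexeme_off;  /* offset of the NUL-terminated lexeme */
} FlatEntry;

/* Build an image from n (word, lexeme) pairs.  The words must already be
 * sorted in strcmp() order.  Returns a malloc'd buffer, or NULL. */
char *flat_build(const char **words, const char **lexemes, uint32_t n)
{
    size_t size = sizeof(FlatHeader) + (size_t) n * sizeof(FlatEntry);
    for (uint32_t i = 0; i < n; i++)
        size += strlen(words[i]) + 1 + strlen(lexemes[i]) + 1;

    char *image = malloc(size);
    if (image == NULL)
        return NULL;

    FlatHeader *hdr = (FlatHeader *) image;
    FlatEntry *ent = (FlatEntry *) (image + sizeof(FlatHeader));
    uint32_t pos = (uint32_t) (sizeof(FlatHeader) + (size_t) n * sizeof(FlatEntry));

    hdr->nentries = n;
    hdr->total_size = (uint32_t) size;
    for (uint32_t i = 0; i < n; i++) {
        size_t wlen = strlen(words[i]) + 1;
        size_t llen = strlen(lexemes[i]) + 1;

        ent[i].word_off = pos;            /* store an offset, not a pointer */
        memcpy(image + pos, words[i], wlen);
        pos += (uint32_t) wlen;

        ent[i].lexeme_off = pos;
        memcpy(image + pos, lexemes[i], llen);
        pos += (uint32_t) llen;
    }
    return image;
}

/* Binary search over the entry table.  Offsets are turned into pointers
 * only at the moment of use, relative to wherever the image now lives. */
const char *flat_lookup(const char *image, const char *word)
{
    assert(image != NULL && word != NULL);

    const FlatHeader *hdr = (const FlatHeader *) image;
    const FlatEntry *ent = (const FlatEntry *) (image + sizeof(FlatHeader));
    uint32_t lo = 0, hi = hdr->nentries;

    while (lo < hi) {
        uint32_t mid = lo + (hi - lo) / 2;
        int cmp = strcmp(word, image + ent[mid].word_off);

        if (cmp == 0)
            return image + ent[mid].lexeme_off;
        if (cmp < 0)
            hi = mid;
        else
            lo = mid + 1;
    }
    return NULL;  /* word not in the image */
}
```

An image built this way could be written out once by a preprocessing step and later loaded with a single read. That preprocessing tool is the extra application that would then have to be maintained.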