Re: patch: preload dictionary new version - Mailing list pgsql-hackers

From Tom Lane
Subject Re: patch: preload dictionary new version
Msg-id 24354.1278598720@sss.pgh.pa.us
In response to Re: patch: preload dictionary new version  (Pavel Stehule <pavel.stehule@gmail.com>)
List pgsql-hackers
Pavel Stehule <pavel.stehule@gmail.com> writes:
> 2010/7/8 Robert Haas <robertmhaas@gmail.com>:
>> A precompiler can give you all the same memory management benefits.

> I use mmap(), and with mmap a precompiler is not necessary.
> The dictionary is loaded only once, in the original ispell format. I
> think this is much simpler to administer: just copy the ispell
> files. There are no possible problems with binary
> incompatibility, and you don't need to solve serialisation,
> deserialisation, ... You don't need to copy the TSearch ispell parser
> code into a client application, since we would probably still want to
> support uncompiled ispell dictionaries. Using a precompiler raises new
> questions for upgrades!

You're inventing a bunch of straw men to attack.  There's no reason that
a precompiler approach would have to put any new requirements on the
user.  For example, the dictionary-load code could automatically execute
the precompile step if it observed that the precompiled copy of the
dictionary was missing or had an older file timestamp than the source.

I like the idea of a precompiler step mainly because it still gives you
most of the benefits of the patch on platforms without mmap.  (Instead
of mmap'ing, just open and read() the precompiled file.)  In particular,
you would still have a creditable improvement for Windows users without
writing any Windows-specific code.
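
The non-mmap path could look roughly like the sketch below. The function
name load_precompiled is hypothetical, the buffer comes from plain malloc
rather than a backend memory context, and the POSIX open/fstat/read calls
stand in for whatever portable file wrappers the real code would use.

```c
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read an entire precompiled dictionary file into
 * one malloc'd buffer.  On success returns the buffer and stores its
 * length in *len_out; on any failure returns NULL.  For brevity an
 * interrupted read is treated as a failure rather than retried.
 */
void *
load_precompiled(const char *path, size_t *len_out)
{
    struct stat st;
    char       *buf;
    size_t      total;
    size_t      done = 0;
    int         fd = open(path, O_RDONLY);

    if (fd < 0)
        return NULL;

    if (fstat(fd, &st) != 0 || st.st_size <= 0)
    {
        close(fd);
        return NULL;
    }
    total = (size_t) st.st_size;

    buf = malloc(total);
    if (buf == NULL)
    {
        close(fd);
        return NULL;
    }

    /* read() may return fewer bytes than requested, so loop */
    while (done < total)
    {
        ssize_t n = read(fd, buf + done, total - done);

        if (n <= 0)             /* error or unexpected end of file */
        {
            free(buf);
            close(fd);
            return NULL;
        }
        done += (size_t) n;
    }

    close(fd);
    *len_out = done;
    return buf;
}
```

Unlike the mmap variant, each backend gets a private copy here, so this
keeps the load-time benefit but not the memory sharing.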

> I think we can divide this problem into three parts

> a) simple allocator - it can be used not only for TSearch dictionaries.

I think that's a waste of time, frankly.  There aren't enough potential
use cases.

> b) sharing data - it is important for large dictionaries

Useful but not really essential.

> c) preloading - it decreases the load time of the first TSearch query

This is the part that is the make-or-break benefit of the patch.
You need a solution that cuts load time even when mmap isn't
available.
        regards, tom lane

