Re: daitch_mokotoff module - Mailing list pgsql-hackers

From Paul Ramsey
Subject Re: daitch_mokotoff module
Date
Msg-id CACowWR0Lg+49Z4ncN2-U0-fUYLpAVQYupRr=0UaSLYxgrAWVHQ@mail.gmail.com
In response to Re: daitch_mokotoff module  (Dag Lem <dag@nimrod.no>)
List pgsql-hackers
On Mon, Jan 2, 2023 at 2:03 PM Dag Lem <dag@nimrod.no> wrote:

> I also improved on the documentation example (using Full Text Search).
> AFAIK you can't make general queries like that using arrays, however in
> any case I must admit that text arrays seem like more natural building
> blocks than space delimited text here.

This is a fun addition to fuzzystrmatch.

While it's a little late in the game, I'll just put it out there:
daitch_mokotoff() is way harder to type than soundex_dm(). Not sure
how you feel about that.

On the documentation, I found the leap directly into the tsquery
example a bit too big. Maybe start with a very simple example,

--
dm=# SELECT daitch_mokotoff('Schwartzenegger'),
            daitch_mokotoff('Swartzenegger');

 daitch_mokotoff | daitch_mokotoff
-----------------+-----------------
 {479465}        | {479465}
--

Then transition into a more complex example that illustrates the GIN
index technique you mention in the text, but do not show:

--
CREATE TABLE dm_gin (source text, dm text[]);

INSERT INTO dm_gin (source) VALUES
    ('Swartzenegger'),
    ('John'),
    ('James'),
    ('Steinman'),
    ('Steinmetz');

UPDATE dm_gin SET dm = daitch_mokotoff(source);

CREATE INDEX dm_gin_x ON dm_gin USING GIN (dm);

SELECT * FROM dm_gin WHERE dm && daitch_mokotoff('Schwartzenegger');
--
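
As an aside, the same lookup could probably be done without the extra
dm column by indexing the expression itself. This is only a sketch: it
assumes fuzzystrmatch is installed and that daitch_mokotoff() is
declared IMMUTABLE (needed for an expression index, and I have not
checked that against the patch). The dm_expr and dm_expr_x names are
made up for illustration.

--
-- Hypothetical table; no separate column for the codes
CREATE TABLE dm_expr (source text);

INSERT INTO dm_expr (source) VALUES
    ('Swartzenegger'),
    ('Steinman');

-- Index the function result directly (assumes the function is IMMUTABLE)
CREATE INDEX dm_expr_x ON dm_expr USING GIN (daitch_mokotoff(source));

-- The WHERE clause repeats the indexed expression so the planner can match it
SELECT * FROM dm_expr
 WHERE daitch_mokotoff(source) && daitch_mokotoff('Schwartzenegger');
--

The trade-off is that the codes are computed at index time rather than
stored, so there is no UPDATE step to forget when new rows arrive.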

And only then go into the tsearch example. Incidentally, what does the
tsearch approach provide that the simple GIN approach does not?
Ideally explain that briefly before launching into the example. With
all the custom functions and so on, it's a little involved, so if
there's not a huge win in using that approach, maybe drop it entirely?

ATB,
P


