On Tue, Oct 13, 2015 at 11:54 AM, Alvaro Herrera
<alvherre@2ndquadrant.com> wrote:
> Lele Gaifax wrote:
>> Alvaro Herrera <alvherre@2ndquadrant.com> writes:
>>
>> > So 10% of your rows in the master_l10n table start with "quattro"?
>> > That's pretty odd, isn't it? How did you manufacture these data?
>>
>> Well, not a real scenario for sure, but definitely not odd: I just needed an
>> "extremely" big dataset to test out several different strategies, both on
>> table layout and indexes. The tables are populated by mechanically translating
>> the integer primary key into the corresponding "in words" string (1 -> "one")...
>
> I imagined it would be something like that. It's not the most useful
> set of test data, precisely because it doesn't accurately reflect what
> you're going to have in practice. I suggest you enter some actual text,
> even if it's just text from Don Camillo or whatever.
>
>> > How often are you going to look for translated text without specifying a
>> > language?
>>
>> Never. The most frequently used criteria is «LIKE '%word%'» in the context of
>> a user session, and thus with a "preferred language".
>
> Be very careful with a % at the left. The index is not going to work at
> all there. It is not the same as looking for stuff without a % at the
> left.
Note, you can optimize LIKE '%foo%' with pg_trgm/gin indexing.
Trigram based indexing is kind of a mixed bag but is about to get a
lot faster with recent enhancements so that it should mostly match or
beat the brute force search. This is the preferred solution if you
need to do partial string matching -- for most other cases of
attribute searching I'd be looking at jsonb. Welcome to postgres OP!
merlin