Re: CREATE TEXT SEARCH DICTIONARY segfaulting on 9.6+ - Mailing list pgsql-hackers

From	Tomas Vondra
Subject	Re: CREATE TEXT SEARCH DICTIONARY segfaulting on 9.6+
Date	October 13, 2019 21:38:08
Msg-id	20191013213808.wfhtvkqcvlvbdkmz@development
In response to	CREATE TEXT SEARCH DICTIONARY segfaulting on 9.6+ (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
List	pgsql-hackers

I spent a bit of time investigating this, and it seems the new code is
somewhat too trusting when it comes to data from the affix/dict files.
In this particular case, it boils down to this code in NISortDictionary:

    if (Conf->useFlagAliases)
    {
        for (i = 0; i < Conf->nspell; i++)
        {
            char   *end;

            if (*Conf->Spell[i]->p.flag != '\0')
            {
                curaffix = strtol(Conf->Spell[i]->p.flag, &end, 10);
                if (Conf->Spell[i]->p.flag == end || errno == ERANGE)
                    ereport(ERROR,
                            (errcode(ERRCODE_CONFIG_FILE_ERROR),
                             errmsg("invalid affix alias \"%s\"",
                                    Conf->Spell[i]->p.flag)));
            }
            ...
            Conf->Spell[i]->p.d.affix = curaffix;
            ...
        }
        ...
    }

So it simply grabs whatever it finds in the dict file, parses it, and
then (later) we use the result as an index into the AffixData array,
even if the value is way out of bounds.
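
To illustrate what the strtol() call does with such a field, here is a
minimal standalone sketch. parse_leading_long() is a hypothetical helper,
not something that exists in PostgreSQL. It also resets errno before the
call, which the excerpt above does not show; without that, the ERANGE
test can be confused by a stale errno value.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/*
 * Hypothetical helper (not part of PostgreSQL): parse the leading decimal
 * integer of a flag field. Returns 0 on success, -1 if no digits were
 * consumed or the value does not fit in a long.
 */
int
parse_leading_long(const char *field, long *value, const char **rest)
{
    char   *end;

    assert(field != NULL && value != NULL && rest != NULL);

    errno = 0;                  /* strtol sets errno but never clears it */
    *value = strtol(field, &end, 10);
    if (end == field || errno == ERANGE)
        return -1;

    *rest = end;                /* first character strtol did not consume */
    return 0;
}
```

Note that nothing in this step knows how large the target array is, so a
syntactically valid number passes regardless of its magnitude.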

In the failing example, hunspell_sample_long.affix contains about
10 affixes, but then we parse the hunspell_sample_num.dict file and
stumble upon

    book/302,301,202,303

We parse the flags as integers and interpret them as indexes into the
AffixData array. Clearly, 303 is way out of bounds, which triggers the
segfault.

So I think we need some sort of cross-check here. We certainly need to
make NISortDictionary() check that the affix value is within the
AffixData bounds, and error out when the index is nonsensical (maybe
negative and/or exceeding nAffixData). Maybe there's also a simple way
to check whether the affix/dict files match. The failing affix file has

    FLAG num

while with

    FLAG long

it works just fine. But I'm not sure such a check is actually possible,
because I don't see anything in hunspell_sample_num.dict that would allow
us to decide that it expects "FLAG num" and not "FLAG long". Furthermore,
we certainly can't rely on this - we still need to check the range.
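
A rough sketch of the kind of range check I have in mind, outside of the
actual spell.c code. AffixTable and affix_lookup() are made-up names for
illustration only, and the valid range of 0 .. nAffixData - 1 is an
assumption; the real lower bound may need to be different.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the dictionary state; not the real IspellDict. */
typedef struct
{
    const char **AffixData;     /* affix flag sets loaded from the affix file */
    int          nAffixData;    /* number of valid entries in AffixData */
} AffixTable;

/*
 * Return the entry for idx, or NULL when idx is out of bounds.
 * Assumption: valid indexes are 0 .. nAffixData - 1.
 */
const char *
affix_lookup(const AffixTable *tab, long idx)
{
    assert(tab != NULL);

    if (idx < 0 || idx >= tab->nAffixData)
        return NULL;            /* caller reports an error instead of crashing */
    return tab->AffixData[idx];
}
```

In the real code the NULL case would presumably turn into an ereport(ERROR)
like the one already used for unparsable aliases, rather than a return
value.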


regards

-- 
Tomas Vondra                  http://www.2ndQuadrant.com
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
