Thread: BUG #11770: Segfault on spell.c when there are more than one characters as suffix flag

From: emre@hasegeli.com
The following bug has been logged on the website:

Bug reference:      11770
Logged by:          Emre Hasegeli
Email address:      emre@hasegeli.com
PostgreSQL version: 9.0.18
Operating system:   Mac OS X Version 10.9.5
Description:

How to reproduce:

create text search dictionary hunspell_tr (template = ispell, dictfile = tr,
afffile = tr);
select ts_lexize('hunspell_tr', 'veriye');


$SHAREDIR/tsearch_data/tr.affix:

ONLYINCOMPOUND L

SFX 100 N 1
SFX 100 0 ye .


$SHAREDIR/tsearch_data/tr.dict:

veri/100


Note that the affix file is not correct without "FLAG num" which is not
supported by PostgreSQL.  Also, the code fails to report the proper error
on spell.c:673 when "FLAG num" is used.

Tested on master and REL9_0_STABLE.

Backtrace:

* thread #1: tid = 0x62bf2, 0x0000000101a7413a postgres`SplitToVariants
[inlined] CheckCompoundAffixes(ptr=<unavailable>, word=0x00007f99e200a8f1,
len=8) at spell.c:1522, queue = 'com.apple.main-thread', stop reason =
EXC_BAD_ACCESS (code=1, address=0x0)
  * frame #0: 0x0000000101a7413a postgres`SplitToVariants [inlined]
CheckCompoundAffixes(ptr=<unavailable>, word=0x00007f99e200a8f1, len=8) at
spell.c:1522
    frame #1: 0x0000000101a7413a
postgres`SplitToVariants(Conf=0x00007f99e2012448, snode=<unavailable>,
orig=<unavailable>, word=0x00007f99e200a8f0, wordlen=9,
startpos=<unavailable>, minpos=<unavailable>) + 442 at spell.c:1598
    frame #2: 0x0000000101a73402
postgres`NINormalizeWord(Conf=0x00007f99e2012448, word=0x00007f99e200a8f0) +
290 at spell.c:1764
    frame #3: 0x0000000101a700b1
postgres`dispell_lexize(fcinfo=<unavailable>) + 49 at dict_ispell.c:124
    frame #4: 0x0000000101b621ec
postgres`FunctionCall4Coll(flinfo=<unavailable>, collation=<unavailable>,
arg1=<unavailable>, arg2=<unavailable>, arg3=<unavailable>,
arg4=<unavailable>) + 124 at fmgr.c:1376
    frame #5: 0x0000000101a69922 postgres`LexizeExec(ld=<unavailable>,
correspondLexem=<unavailable>) + 642 at ts_parse.c:207
    frame #6: 0x0000000101a69542 postgres`parsetext(cfgId=<unavailable>,
prs=<unavailable>, buf=<unavailable>, buflen=<unavailable>) + 466 at
ts_parse.c:405
    frame #7: 0x0000000101a7510c
postgres`to_tsvector_byid(fcinfo=0x00007f99e20085c0) + 124 at
to_tsany.c:224
    frame #8: 0x000000010195b811
postgres`ExecMakeFunctionResultNoSets(fcache=0x00007f99e2008550,
econtext=<unavailable>, isNull=0x00007fff5e425387, isDone=<unavailable>) +
209 at execQual.c:1992
    frame #9: 0x0000000101955b9b
postgres`ExecEvalExprSwitchContext(expression=<unavailable>,
econtext=<unavailable>, isNull=<unavailable>, isDone=<unavailable>) + 27 at
execQual.c:4320
    frame #10: 0x00000001019dfac9 postgres`evaluate_expr(expr=<unavailable>,
result_type=3614, result_typmod=-1, result_collation=0) + 121 at
clauses.c:4568
    frame #11: 0x00000001019df22e postgres`simplify_function [inlined]
evaluate_function(funcid=<unavailable>, result_type=<unavailable>,
result_typmod=<unavailable>, result_collid=<unavailable>,
input_collid=<unavailable>, args=<unavailable>, funcvariadic=<unavailable>,
func_tuple=<unavailable>, context=<unavailable>) + 332 at clauses.c:4129
    frame #12: 0x00000001019df0e2
postgres`simplify_function(funcid=<unavailable>, result_type=<unavailable>,
result_typmod=<unavailable>, result_collid=<unavailable>,
input_collid=<unavailable>, args_p=<unavailable>,
funcvariadic=<unavailable>, process_args=<unavailable>,
allow_non_const='\b', context=<unavailable>) + 162 at clauses.c:3768
    frame #13: 0x00000001019dd16c
postgres`eval_const_expressions_mutator(node=0x00007f99e2006660,
context=0x00007fff5e425748) + 620 at clauses.c:2459
    frame #14: 0x000000010198baa5
postgres`expression_tree_mutator(node=<unavailable>, mutator=<unavailable>,
context=<unavailable>) + 581 at nodeFuncs.c:2374
    frame #15: 0x00000001019dd03b
postgres`eval_const_expressions_mutator(node=<unavailable>,
context=<unavailable>) + 315 at clauses.c:3418
    frame #16: 0x000000010198b98b
postgres`expression_tree_mutator(node=<unavailable>,
mutator=0x00000001019dcf00, context=0x00007fff5e425748) + 299 at
nodeFuncs.c:2570
    frame #17: 0x00000001019dd03b
postgres`eval_const_expressions_mutator(node=<unavailable>,
context=<unavailable>) + 315 at clauses.c:3418
    frame #18: 0x00000001019dcefa
postgres`eval_const_expressions(root=<unavailable>, node=<unavailable>) + 74
at clauses.c:2301
    frame #19: 0x00000001019c9eea postgres`subquery_planner [inlined]
preprocess_expression(expr=<unavailable>, kind=<unavailable>) + 34 at
planner.c:678
    frame #20: 0x00000001019c9ec8
postgres`subquery_planner(glob=0x00007f99e183c1e0, parse=0x00007f99e183bf38,
parent_root=<unavailable>, hasRecursion=<unavailable>, tuple_fraction=0,
subroot=0x00007fff5e4259b0) + 1112 at planner.c:426
    frame #21: 0x00000001019c9878
postgres`standard_planner(parse=0x00007f99e183bf38,
cursorOptions=<unavailable>, boundParams=<unavailable>) + 232 at
planner.c:211
    frame #22: 0x0000000101a5f8ba
postgres`pg_plan_query(querytree=0x00007f99e183bf38, cursorOptions=0,
boundParams=0x0000000000000000) + 106 at postgres.c:750
    frame #23: 0x0000000101a62e1f postgres`PostgresMain [inlined]
pg_plan_queries(querytrees=<unavailable>, cursorOptions=<unavailable>,
boundParams=<unavailable>) + 50 at postgres.c:809
    frame #24: 0x0000000101a62ded postgres`PostgresMain [inlined]
exec_simple_query(query_string=<unavailable>) + 13 at postgres.c:974
    frame #25: 0x0000000101a62de0 postgres`PostgresMain(argc=<unavailable>,
argv=<unavailable>, dbname=0x00007f99e1004e10, username=<unavailable>) +
8992 at postgres.c:4010
    frame #26: 0x00000001019fa4d8 postgres`PostmasterMain [inlined]
BackendRun + 7768 at postmaster.c:4118
    frame #27: 0x00000001019fa4b2 postgres`PostmasterMain [inlined]
BackendStartup at postmaster.c:3793
    frame #28: 0x00000001019fa4b2 postgres`PostmasterMain [inlined]
ServerLoop at postmaster.c:1572
    frame #29: 0x00000001019fa4b2
postgres`PostmasterMain(argc=<unavailable>, argv=<unavailable>) + 7730 at
postmaster.c:1219
    frame #30: 0x0000000101989909 postgres`main(argc=<unavailable>,
argv=<unavailable>) + 1081 at main.c:220
    frame #31: 0x00007fff90aaa5fd libdyld.dylib`start + 1

emre@hasegeli.com writes:
> How to reproduce:
> ...
> Note that the affix file is not correct without "FLAG num" which is not
> supported by PostgreSQL.  Also, the code fails to report the proper error
> on spell.c:673 when "FLAG num" is used.

Hmm.  The immediate cause of the crash is that the code is expecting some
"CompoundAffix"es to exist when usecompound = true, but they don't.
I'm not sure whether we should flag the affix file as invalid, but in
any case it'd be a good idea for CheckCompoundAffixes to defend itself
against *ptr being NULL.
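
A minimal sketch of such a guard, for illustration only.  The struct and
function below are simplified, hypothetical stand-ins (CompoundAffixSketch,
check_compound_affixes_sketch), not the actual definitions in spell.c; the
only point is the early return when no compound-affix array exists.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one entry of the compound-affix array. */
typedef struct
{
    const char *affix;          /* NULL marks the end of the array */
    int         len;            /* length of affix in bytes */
} CompoundAffixSketch;

/*
 * Scan forward from *ptr and return the length of the first affix that is
 * a proper prefix of word, advancing *ptr past it.  Return -1 if nothing
 * matches, including the case where no array was ever built.
 */
int
check_compound_affixes_sketch(const CompoundAffixSketch **ptr,
                              const char *word, int len)
{
    /* The guard under discussion: no array at all, so nothing can match. */
    if (ptr == NULL || *ptr == NULL)
        return -1;

    while ((*ptr)->affix != NULL)
    {
        if (len > (*ptr)->len &&
            strncmp((*ptr)->affix, word, (size_t) (*ptr)->len) == 0)
        {
            int         matched = (*ptr)->len;

            (*ptr)++;
            return matched;
        }
        (*ptr)++;
    }
    return -1;
}
```

With the guard in place, a dictionary that enables compounding but defines
no compound affixes would report "no match" instead of dereferencing a NULL
pointer.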

As for whether the error is proper ... this code is woefully
underdocumented, but it looks to me like NIImportAffixes() is designed
to import the original ispell affix file format, and if it decides that
the file is not that but MySpell/Hunspell format then it sends control
off to NIImportOOAffixes to re-parse the whole thing.  The difficulty
is that FLAG commands exist in both formats and it's not being careful
about whether the FLAG command is new or old format.  Probably we should
insist that the FLAG argument not contain multiple letters in order to
deem it old format.
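
One way that check could look, as a rough sketch rather than a proposed
patch.  The function name flag_arg_looks_old_format is hypothetical, and the
sketch assumes an old-format argument is a single character, optionally
preceded by '*' or '~' and optionally followed by ':'; anything longer
(such as "num", "long" or "UTF-8") is treated as new format.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Decide whether the text following a FLAG keyword looks like an
 * old-format (single-character) flag.  Assumption: '*' and '~' are
 * optional one-character modifiers in front of the flag.
 */
bool
flag_arg_looks_old_format(const char *arg)
{
    if (arg == NULL)
        return false;

    /* Skip whitespace between the keyword and its argument. */
    while (*arg != '\0' && isspace((unsigned char) *arg))
        arg++;

    /* Skip the assumed optional modifier. */
    if (*arg == '*' || *arg == '~')
        arg++;

    /* There has to be a flag character at all. */
    if (*arg == '\0' || *arg == ':' || isspace((unsigned char) *arg))
        return false;
    arg++;

    /* Old format only if that one character ends the argument. */
    return *arg == '\0' || *arg == ':' || isspace((unsigned char) *arg);
}
```

Under these assumptions "FLAG num" would be classified as new format and
handed to the MySpell/Hunspell parser, while "flag *A:" would still be
treated as old format.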

            regards, tom lane