Re: Empty string in lexeme for tsvector - Mailing list pgsql-hackers

From Jean-Christophe Arnu
Subject Re: Empty string in lexeme for tsvector
Msg-id CAHZmTm2RNbzrQnPupvXfT_TgLn1Vj67g4dPE1_uD9p-cogYh=Q@mail.gmail.com
In response to Re: Empty string in lexeme for tsvector  (Ranier Vilela <ranier.vf@gmail.com>)
Responses Re: Empty string in lexeme for tsvector  (Ranier Vilela <ranier.vf@gmail.com>)
Re: Empty string in lexeme for tsvector  (Artur Zakirov <zaartur@gmail.com>)
List pgsql-hackers


On Fri, Sep 24, 2021 at 13:03, Ranier Vilela <ranier.vf@gmail.com> wrote:

Comments are more than welcome!
1. It would be better to add this test-and-error before the tsvector_bsearch call.

+    if (lex_len == 0)
+        ereport(ERROR,
+                (errcode(ERRCODE_ZERO_LENGTH_CHARACTER_STRING),
+                 errmsg("lexeme array may not contain empty strings")));
+

If lex_len is equal to zero, it is better to bail out early.

2. The second test-and-error can use lex_len, just like the first test.
I don't see the point in recalculating the lexeme length if that has already been done.

+    if (VARSIZE(dlexemes[i]) - VARHDRSZ == 0)
+        ereport(ERROR,
+                (errcode(ERRCODE_ZERO_LENGTH_CHARACTER_STRING),
+                 errmsg("lexeme array may not contain empty strings")));
+

Hello Ranier,
Thank you for your comments.
Here's a new patch file taking your comments into account.

I am still wondering whether the empty-string rejection is done in the right place.
As you rightly pointed out, lex_len is calculated later (again, for a good
reason), whereas my code checks for empty strings as early as possible.
To me that seems to be the right thing to do (it prevents any further processing
of the lexemes once an empty one is found), but I may be missing something.
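
To make the "check as early as possible" ordering concrete, here is a minimal standalone C sketch. It is not the patch itself and does not use PostgreSQL headers: plain C strings stand in for the text datums in dlexemes, a return value stands in for ereport(ERROR, ...), and the function name first_empty_lexeme is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/*
 * Hypothetical stand-in for the empty-lexeme check discussed in this thread.
 * Assumptions: lexemes are NUL-terminated C strings rather than varlena
 * datums, and the caller reports the error instead of ereport(ERROR, ...).
 *
 * Returns the index of the first empty lexeme, or -1 if none is empty.
 */
static int
first_empty_lexeme(const char *const *lexemes, int nlexemes)
{
    /* A NULL array is only acceptable when it is declared empty. */
    assert(lexemes != NULL || nlexemes == 0);

    for (int i = 0; i < nlexemes; i++)
    {
        /* Same condition as "lex_len == 0" in the quoted hunk. */
        if (strlen(lexemes[i]) == 0)
            return i;
    }
    return -1;
}
```

A caller would run this scan before any sorting or searching of the array and stop with an error as soon as the result is not -1, so no later step ever sees an empty lexeme.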

Regards

 
Jean-Christophe Arnu
