Re: BUG #17691: Unexpected behaviour using ts_headline() - Mailing list pgsql-bugs

From sebastian.patino-lang@posteo.net
Subject Re: BUG #17691: Unexpected behaviour using ts_headline()
Date
Msg-id 76717862-8936-41cc-80b0-f8bc0593a0c4@Spark
In response to BUG #17691: Unexpected behaviour using ts_headline()  (PG Bug reporting form <noreply@postgresql.org>)
List pgsql-bugs
Hi all,

Tom Lane made me aware that the link is not working and that it's better anyway to include all data directly in the bug report. Please find the file attached.

@Tom: thanks for the hint!

Regards,
Sebastian
On 19. Nov 2022, 14:05 +0100, PG Bug reporting form <noreply@postgresql.org>, wrote:
The following bug has been logged on the website:

Bug reference: 17691
Logged by: Sebastian Patino-Lang
Email address: sebastian.patino-lang@posteo.net
PostgreSQL version: 13.9
Operating system: x86_64-apple-darwin19.6.0
Description:

I experience unexpected behaviour when using ts_headline() in general, but
especially when changing MaxFragments. Given the data in
ts_headline_report.sql [1]

SELECT id,
ts_headline('english', "texts"."fulltext", to_tsquery('english', 'amazon &
world'), 'StartSel=<<, StopSel=>>') AS "preview"
FROM texts;

id=1: Highlight word is the first one in the result. Expectation: highlight
word is somewhere in the middle.
id=2: No highlight word at all.
id=3: Highlight words are the first and last one in the result. Not ideal
but ok-ish.

SELECT id,
ts_headline('english', "texts"."fulltext", to_tsquery('english', 'amazon &
world'), 'MaxFragments=1, StartSel=<<, StopSel=>>') AS "preview"
FROM texts;

id=1: Highlight word is now in the middle of the result. This is ok.
id=2: No highlight word at all.
id=3: Highlight words are in the middle part of the result. This is ok.

SELECT id,
ts_headline('english', "texts"."fulltext", to_tsquery('english', 'amazon &
world'), 'MaxFragments=2, StartSel=<<, StopSel=>>') AS "preview"
FROM texts;

id=1: Correct number of fragments (2) with highlight words is returned.
id=2: No highlight word at all.
id=3: Correct number of fragments (2) with highlight words is returned.

SELECT id,
ts_headline('english', "texts"."fulltext", to_tsquery('english', 'amazon &
world'), 'MaxFragments=3, StartSel=<<, StopSel=>>') AS "preview"
FROM texts;

id=1: Wrong number of fragments (2) with highlight words is returned.
id=2: No highlight word at all.
id=3: Correct number of fragments (3) with highlight words is returned.

SELECT id,
ts_headline('english', "texts"."fulltext", to_tsquery('english', 'amazon &
world'), 'MaxFragments=4, StartSel=<<, StopSel=>>') AS "preview"
FROM texts;

id=1: Wrong number of fragments (2) with highlight words is returned.
id=2: No highlight word at all.
id=3: Correct number of fragments (4) with highlight words is returned.

... and so on, until MaxFragments=6, where id=1 suddenly returns more
fragments (4).

SELECT id,
ts_headline('english', "texts"."fulltext", to_tsquery('english', 'amazon &
world'), 'MaxFragments=6, StartSel=<<, StopSel=>>') AS "preview"
FROM texts;

id=1: Wrong number of fragments (4) with highlight words is returned.
id=2: No highlight word at all.
id=3: Correct number of fragments (6) with highlight words is returned.

... and so on, until MaxFragments=11, where the number of fragments
returned for id=1 changes again (5).

SELECT id,
ts_headline('english', "texts"."fulltext", to_tsquery('english', 'amazon &
world'), 'MaxFragments=11, StartSel=<<, StopSel=>>') AS "preview"
FROM texts;

id=1: Wrong number of fragments (5) with highlight words is returned.
id=2: No highlight word at all.
id=3: Correct number of fragments (11) with highlight words is returned.

... and so on, until MaxFragments=14, where id=2 suddenly returns more
fragments (2).

SELECT id,
ts_headline('english', "texts"."fulltext", to_tsquery('english', 'amazon &
world'), 'MaxFragments=14, StartSel=<<, StopSel=>>') AS "preview"
FROM texts;

id=1: Wrong number of fragments (5) with highlight words is returned.
id=2: Wrong number of fragments (2) with highlight words is returned.
id=3: Correct number of fragments (11) with highlight words is returned.

I stopped testing here, but I'm sure the strange behaviour and jumps in
fragment count will continue.
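
The fragment counts above could also be computed in SQL instead of by eye.
The following is only a sketch under two assumptions: that the default
FragmentDelimiter (" ... ") is in effect, and that the stored text does not
itself contain that string, which would inflate the count. The column
aliases fragment_count and highlight_count are made up for illustration.

```sql
-- Count fragments and highlighted words per row for one MaxFragments value.
-- Assumes the default FragmentDelimiter ' ... ' separates fragments.
SELECT t.id,
       -- number of pieces after splitting on the delimiter;
       -- NULL if the headline is an empty string
       array_length(string_to_array(h.preview, ' ... '), 1) AS fragment_count,
       -- one regexp match per StartSel marker
       (SELECT count(*) FROM regexp_matches(h.preview, '<<', 'g')) AS highlight_count,
       h.preview
FROM texts AS t
CROSS JOIN LATERAL (
    SELECT ts_headline('english', t."fulltext",
                       to_tsquery('english', 'amazon & world'),
                       'MaxFragments=3, StartSel=<<, StopSel=>>') AS preview
) AS h
ORDER BY t.id;
```

Changing the MaxFragments value in the options string and re-running should
make the jumps in fragment count easier to compare across rows.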

Any ideas?

[1] https://1drv.ms/u/s!AqRGi9iGBWddgZVU_M2iuoRdTzM6tg?e=0OHHnB

Attachment
