Thread: [pgsql-www] Searching for pgweb
Right now our main website search uses plainto_tsquery() to generate the searches.
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
Should we consider switching that to phraseto_tsquery() now that we have phrase searching?
//Magnus
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On Wed, Mar 22, 2017 at 4:51 PM, Magnus Hagander <magnus@hagander.net> wrote: > Right now our main website search uses plainto_tsquery() to generate the > searches. > > Should we consider switching that to phraseto_tsquery() now that we have > phrase searching? +1 -- Dave Page Blog: http://pgsnake.blogspot.com Twitter: @pgsnake EnterpriseDB UK: http://www.enterprisedb.com The Enterprise PostgreSQL Company
On 03/22/2017 09:55 AM, Dave Page wrote: > On Wed, Mar 22, 2017 at 4:51 PM, Magnus Hagander <magnus@hagander.net> wrote: >> Right now our main website search uses plainto_tsquery() to generate the >> searches. >> >> Should we consider switching that to phraseto_tsquery() now that we have >> phrase searching? > > +1 > +1 -- Command Prompt, Inc. http://the.postgres.company/ +1-503-667-4564 PostgreSQL Centered full stack support, consulting and development. Everyone appreciates your honesty, until you are honest with them. Unless otherwise stated, opinions are my own.
> On Mar 22, 2017, at 09:51, Magnus Hagander <magnus@hagander.net> wrote: > Should we consider switching that to phraseto_tsquery() now that we have phrase searching? +1. -- -- Christophe Pettus xof@thebuild.com
Ok, I'll ask the silly questions: Will it actually give better results? Does it need user documentation added beyond just changing the query?
On Wed, Mar 22, 2017 at 11:20 AM, Christophe Pettus <xof@thebuild.com> wrote:
> On Mar 22, 2017, at 09:51, Magnus Hagander <magnus@hagander.net> wrote:
> Should we consider switching that to phraseto_tsquery() now that we have phrase searching?
+1.
--
-- Christophe Pettus
xof@thebuild.com
--
Sent via pgsql-www mailing list (pgsql-www@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-www
Peter van Hardenberg
San Francisco, California
"Everything was beautiful, and nothing hurt."—Kurt Vonnegut
San Francisco, California
"Everything was beautiful, and nothing hurt."—Kurt Vonnegut
I think so. For example:
search=# SELECT count(*) FROM webpages WHERE suburl LIKE '/docs/current/%' AND fti @@ plainto_tsquery('create index');
count
-------
342
(1 row)
search=# SELECT count(*) FROM webpages WHERE suburl LIKE '/docs/current/%' AND fti @@ phraseto_tsquery('create index');
count
-------
106
(1 row)
But yes, that is kind of the core of the question.
I don't think it needs any special user docs -- it's not like we have documented how it works now, it's supposed to "just work".
//Magnus
On Thu, Mar 23, 2017 at 9:11 AM, Peter van Hardenberg <pvh@pvh.ca> wrote:
Ok, I'll ask the silly questions: Will it actually give better results? Does it need user documentation added beyond just changing the query?On Wed, Mar 22, 2017 at 11:20 AM, Christophe Pettus <xof@thebuild.com> wrote:
> On Mar 22, 2017, at 09:51, Magnus Hagander <magnus@hagander.net> wrote:
> Should we consider switching that to phraseto_tsquery() now that we have phrase searching?
+1.
--
-- Christophe Pettus
xof@thebuild.com
On Wed, Mar 22, 2017 at 7:51 PM, Magnus Hagander <magnus@hagander.net> wrote:
+1
2.
select ts_parse('tsparser', 'utc-5');
ts_parse
------------
(15,utc-5)
(11,utc)
(12,-)
(9,5)
(4 rows)
select ts_parse('default', 'utc-5');
ts_parse
----------
(1,utc)
(21,-5)
(2 rows)
https://postgrespro.ru/search/?area=version&q=utc-5&product=postgresql&version=9.6
https://www.postgresql.org/search/?u=%2Fdocs%2F9.6%2F&q=utc-5
Right now our main website search uses plainto_tsquery() to generate the searches.Should we consider switching that to phraseto_tsquery() now that we have phrase searching?
+1
Also, I suggest to use new parser, which better works _ and -, for example:
1.
select ts_parse('tsparser', 'btree_gin');
ts_parse
----------------
(16,btree_gin)
(11,btree)
(12,_)
(11,gin)
(4 rows)
select ts_parse('default', 'btree_gin');
ts_parse
-----------
(1,btree)
(12,_)
(1,gin)
(3 rows)
1.
select ts_parse('tsparser', 'btree_gin');
ts_parse
----------------
(16,btree_gin)
(11,btree)
(12,_)
(11,gin)
(4 rows)
select ts_parse('default', 'btree_gin');
ts_parse
-----------
(1,btree)
(12,_)
(1,gin)
(3 rows)
Default parser produces too much noise, just check the difference:
https://postgrespro.ru/search/?area=version&q=btree_gin&product=postgresql&version=9.6
https://www.postgresql.org/search/?u=%2Fdocs%2F9.6%2F&q=btree_gin
https://postgrespro.ru/search/?area=version&q=btree_gin&product=postgresql&version=9.6
https://www.postgresql.org/search/?u=%2Fdocs%2F9.6%2F&q=btree_gin
2.
select ts_parse('tsparser', 'utc-5');
ts_parse
------------
(15,utc-5)
(11,utc)
(12,-)
(9,5)
(4 rows)
select ts_parse('default', 'utc-5');
ts_parse
----------
(1,utc)
(21,-5)
(2 rows)
again, compare
https://postgrespro.ru/search/?area=version&q=utc-5&product=postgresql&version=9.6
https://www.postgresql.org/search/?u=%2Fdocs%2F9.6%2F&q=utc-5
We have also better parsing of email, but I'm not sure we need it on postgres site.
We'll publish soon on github, let me know if you know it.
Oleg
//Magnus
--
Magnus Hagander
Me: http://www.hagander.net/
Work: http://www.redpill-linpro.com/
On Fri, Mar 24, 2017 at 10:56 AM, Oleg Bartunov <obartunov@gmail.com> wrote:
On Wed, Mar 22, 2017 at 7:51 PM, Magnus Hagander <magnus@hagander.net> wrote:Right now our main website search uses plainto_tsquery() to generate the searches.Should we consider switching that to phraseto_tsquery() now that we have phrase searching?
+1Also, I suggest to use new parser, which better works _ and -, for example:
1.
select ts_parse('tsparser', 'btree_gin');
ts_parse
----------------
(16,btree_gin)
(11,btree)
(12,_)
(11,gin)
(4 rows)
select ts_parse('default', 'btree_gin');
ts_parse
-----------
(1,btree)
(12,_)
(1,gin)
(3 rows)Default parser produces too much noise, just check the difference:
https://postgrespro.ru/search/?area=version&q=btree_gin& product=postgresql&version=9.6
https://www.postgresql.org/search/?u=%2Fdocs%2F9.6%2F&q= btree_gin
2.
select ts_parse('tsparser', 'utc-5');
ts_parse
------------
(15,utc-5)
(11,utc)
(12,-)
(9,5)
(4 rows)
select ts_parse('default', 'utc-5');
ts_parse
----------
(1,utc)
(21,-5)
(2 rows)again, compare
https://postgrespro.ru/search/?area=version&q=utc-5&product= postgresql&version=9.6
https://www.postgresql.org/search/?u=%2Fdocs%2F9.6%2F&q= utc-5 We have also better parsing of email, but I'm not sure we need it on postgres site.We'll publish soon on github, let me know if you know it.
On Fri, Mar 24, 2017 at 8:56 AM, Oleg Bartunov <obartunov@gmail.com> wrote:
On Wed, Mar 22, 2017 at 7:51 PM, Magnus Hagander <magnus@hagander.net> wrote:Right now our main website search uses plainto_tsquery() to generate the searches.Should we consider switching that to phraseto_tsquery() now that we have phrase searching?
+1Also, I suggest to use new parser, which better works _ and -, for example:
1.
select ts_parse('tsparser', 'btree_gin');
ts_parse
----------------
(16,btree_gin)
(11,btree)
(12,_)
(11,gin)
(4 rows)
select ts_parse('default', 'btree_gin');
ts_parse
-----------
(1,btree)
(12,_)
(1,gin)
(3 rows)Default parser produces too much noise, just check the difference:
https://postgrespro.ru/search/?area=version&q=btree_gin& product=postgresql&version=9.6
https://www.postgresql.org/search/?u=%2Fdocs%2F9.6%2F&q= btree_gin
2.
select ts_parse('tsparser', 'utc-5');
ts_parse
------------
(15,utc-5)
(11,utc)
(12,-)
(9,5)
(4 rows)
select ts_parse('default', 'utc-5');
ts_parse
----------
(1,utc)
(21,-5)
(2 rows)again, compare
https://postgrespro.ru/search/?area=version&q=utc-5&product= postgresql&version=9.6
https://www.postgresql.org/search/?u=%2Fdocs%2F9.6%2F&q= utc-5 We have also better parsing of email, but I'm not sure we need it on postgres site.We'll publish soon on github, let me know if you know it.
That sounds interesting. Two questions:
1. Do you have plans for contributing this one for upstream postgres, or is it intended to be run separately?
2. Would it still make sense to do phrase searching? Sounds like it would?
On 29 Mar 2017 09:49, "Magnus Hagander" <magnus@hagander.net> wrote:
On Fri, Mar 24, 2017 at 8:56 AM, Oleg Bartunov <obartunov@gmail.com> wrote:On Wed, Mar 22, 2017 at 7:51 PM, Magnus Hagander <magnus@hagander.net> wrote:Right now our main website search uses plainto_tsquery() to generate the searches.Should we consider switching that to phraseto_tsquery() now that we have phrase searching?
+1Also, I suggest to use new parser, which better works _ and -, for example:
1.
select ts_parse('tsparser', 'btree_gin');
ts_parse
----------------
(16,btree_gin)
(11,btree)
(12,_)
(11,gin)
(4 rows)
select ts_parse('default', 'btree_gin');
ts_parse
-----------
(1,btree)
(12,_)
(1,gin)
(3 rows)Default parser produces too much noise, just check the difference:
https://postgrespro.ru/search/?area=version&q=btree_gin&prod uct=postgresql&version=9.6
https://www.postgresql.org/search/?u=%2Fdocs%2F9.6%2F&q=btre e_gin
2.
select ts_parse('tsparser', 'utc-5');
ts_parse
------------
(15,utc-5)
(11,utc)
(12,-)
(9,5)
(4 rows)
select ts_parse('default', 'utc-5');
ts_parse
----------
(1,utc)
(21,-5)
(2 rows)again, compare
https://postgrespro.ru/search/?area=version&q=utc-5&product= postgresql&version=9.6
https://www.postgresql.org/search/?u=%2Fdocs%2F9.6%2F&q=utc- 5 We have also better parsing of email, but I'm not sure we need it on postgres site.We'll publish soon on github, let me know if you know it.That sounds interesting. Two questions:1. Do you have plans for contributing this one for upstream postgres, or is it intended to be run separately?
We would love to do this, but currently it's there
2. Would it still make sense to do phrase searching? Sounds like it would?
Yes, it would.
On Wed, Mar 29, 2017 at 3:55 PM, Oleg Bartunov <obartunov@gmail.com> wrote:
On 29 Mar 2017 09:49, "Magnus Hagander" <magnus@hagander.net> wrote:On Fri, Mar 24, 2017 at 8:56 AM, Oleg Bartunov <obartunov@gmail.com> wrote:On Wed, Mar 22, 2017 at 7:51 PM, Magnus Hagander <magnus@hagander.net> wrote:Right now our main website search uses plainto_tsquery() to generate the searches.Should we consider switching that to phraseto_tsquery() now that we have phrase searching?
+1Also, I suggest to use new parser, which better works _ and -, for example:
1.
select ts_parse('tsparser', 'btree_gin');
ts_parse
----------------
(16,btree_gin)
(11,btree)
(12,_)
(11,gin)
(4 rows)
select ts_parse('default', 'btree_gin');
ts_parse
-----------
(1,btree)
(12,_)
(1,gin)
(3 rows)Default parser produces too much noise, just check the difference:
https://postgrespro.ru/search/?area=version&q=btree_gin&prod uct=postgresql&version=9.6
https://www.postgresql.org/search/?u=%2Fdocs%2F9.6%2F&q=btre e_gin
2.
select ts_parse('tsparser', 'utc-5');
ts_parse
------------
(15,utc-5)
(11,utc)
(12,-)
(9,5)
(4 rows)
select ts_parse('default', 'utc-5');
ts_parse
----------
(1,utc)
(21,-5)
(2 rows)again, compare
https://postgrespro.ru/search/?area=version&q=utc-5&product= postgresql&version=9.6
https://www.postgresql.org/search/?u=%2Fdocs%2F9.6%2F&q=utc- 5 We have also better parsing of email, but I'm not sure we need it on postgres site.We'll publish soon on github, let me know if you know it.That sounds interesting. Two questions:1. Do you have plans for contributing this one for upstream postgres, or is it intended to be run separately?We would love to do this, but currently it's there
Right, found that one. But if your long term plan is to contribute it upstream, that makes it easier to rely on :)
2. Would it still make sense to do phrase searching? Sounds like it would?Yes, it would.
On Fri, Mar 31, 2017 at 8:04 AM, Magnus Hagander <magnus@hagander.net> wrote:
On Wed, Mar 29, 2017 at 3:55 PM, Oleg Bartunov <obartunov@gmail.com> wrote:On 29 Mar 2017 09:49, "Magnus Hagander" <magnus@hagander.net> wrote:On Fri, Mar 24, 2017 at 8:56 AM, Oleg Bartunov <obartunov@gmail.com> wrote:On Wed, Mar 22, 2017 at 7:51 PM, Magnus Hagander <magnus@hagander.net> wrote:Right now our main website search uses plainto_tsquery() to generate the searches.Should we consider switching that to phraseto_tsquery() now that we have phrase searching?
+1Also, I suggest to use new parser, which better works _ and -, for example:
1.
select ts_parse('tsparser', 'btree_gin');
ts_parse
----------------
(16,btree_gin)
(11,btree)
(12,_)
(11,gin)
(4 rows)
select ts_parse('default', 'btree_gin');
ts_parse
-----------
(1,btree)
(12,_)
(1,gin)
(3 rows)Default parser produces too much noise, just check the difference:
https://postgrespro.ru/search/?area=version&q=btree_gin&prod uct=postgresql&version=9.6
https://www.postgresql.org/search/?u=%2Fdocs%2F9.6%2F&q=btre e_gin
2.
select ts_parse('tsparser', 'utc-5');
ts_parse
------------
(15,utc-5)
(11,utc)
(12,-)
(9,5)
(4 rows)
select ts_parse('default', 'utc-5');
ts_parse
----------
(1,utc)
(21,-5)
(2 rows)again, compare
https://postgrespro.ru/search/?area=version&q=utc-5&product= postgresql&version=9.6
https://www.postgresql.org/search/?u=%2Fdocs%2F9.6%2F&q=utc- 5 We have also better parsing of email, but I'm not sure we need it on postgres site.We'll publish soon on github, let me know if you know it.That sounds interesting. Two questions:1. Do you have plans for contributing this one for upstream postgres, or is it intended to be run separately?We would love to do this, but currently it's thereRight, found that one. But if your long term plan is to contribute it upstream, that makes it easier to rely on :)
I'd love if you test it, give us feedback what to improve, what to fix. Then we could try to convince community to accept it.
2. Would it still make sense to do phrase searching? Sounds like it would?Yes, it would.Cool, thanks.--
On Fri, Mar 31, 2017 at 2:46 PM, Oleg Bartunov <obartunov@gmail.com> wrote:
-- On Fri, Mar 31, 2017 at 8:04 AM, Magnus Hagander <magnus@hagander.net> wrote:On Wed, Mar 29, 2017 at 3:55 PM, Oleg Bartunov <obartunov@gmail.com> wrote:On 29 Mar 2017 09:49, "Magnus Hagander" <magnus@hagander.net> wrote:On Fri, Mar 24, 2017 at 8:56 AM, Oleg Bartunov <obartunov@gmail.com> wrote:On Wed, Mar 22, 2017 at 7:51 PM, Magnus Hagander <magnus@hagander.net> wrote:Right now our main website search uses plainto_tsquery() to generate the searches.Should we consider switching that to phraseto_tsquery() now that we have phrase searching?
+1Also, I suggest to use new parser, which better works _ and -, for example:
1.
select ts_parse('tsparser', 'btree_gin');
ts_parse
----------------
(16,btree_gin)
(11,btree)
(12,_)
(11,gin)
(4 rows)
select ts_parse('default', 'btree_gin');
ts_parse
-----------
(1,btree)
(12,_)
(1,gin)
(3 rows)Default parser produces too much noise, just check the difference:
https://postgrespro.ru/search/?area=version&q=btree_gin&prod uct=postgresql&version=9.6
https://www.postgresql.org/search/?u=%2Fdocs%2F9.6%2F&q=btre e_gin
2.
select ts_parse('tsparser', 'utc-5');
ts_parse
------------
(15,utc-5)
(11,utc)
(12,-)
(9,5)
(4 rows)
select ts_parse('default', 'utc-5');
ts_parse
----------
(1,utc)
(21,-5)
(2 rows)again, compare
https://postgrespro.ru/search/?area=version&q=utc-5&product= postgresql&version=9.6
https://www.postgresql.org/search/?u=%2Fdocs%2F9.6%2F&q=utc- 5 We have also better parsing of email, but I'm not sure we need it on postgres site.We'll publish soon on github, let me know if you know it.That sounds interesting. Two questions:1. Do you have plans for contributing this one for upstream postgres, or is it intended to be run separately?We would love to do this, but currently it's thereRight, found that one. But if your long term plan is to contribute it upstream, that makes it easier to rely on :)I'd love if you test it, give us feedback what to improve, what to fix. Then we could try to convince community to accept it.
I've applied this one for testing on the main website search.
At the same time I realized we didn't setweight() on the title on regular webpages, so I fixed that too (setting title to weight A).
Basically the conf is:
CREATE TEXT SEARCH DICTIONARY english_ispell (
TEMPLATE = pg_catalog.ispell,
dictfile = 'en_us', afffile = 'en_us', stopwords = 'english' );
CREATE TEXT SEARCH DICTIONARY pg_dict (
TEMPLATE = pg_catalog.synonym,
synonyms = 'pg_dict' );
CREATE TEXT SEARCH CONFIGURATION pg (
PARSER = tsparser );
ALTER TEXT SEARCH CONFIGURATION pg
ALTER MAPPING FOR asciiword, asciihword, hword_asciipart,
word, hword, hword_part
WITH pg_dict, english_ispell, english_stem;
ALTER TEXT SEARCH CONFIGURATION pg
ALTER MAPPING FOR email, file, float, host, hword_numpart, int, numhword, numword, sfloat, uint, url, url_path, version WITH simple;
If you have any other suggestions of things we should change there, please let me know!
So far, this is on the main website search and *not* on the archives search. Let's try it there first, but in the long run we should use similar configurations.
On Sun, Apr 2, 2017 at 9:37 AM, Magnus Hagander <magnus@hagander.net> wrote:
On Fri, Mar 31, 2017 at 2:46 PM, Oleg Bartunov <obartunov@gmail.com> wrote:On Fri, Mar 31, 2017 at 8:04 AM, Magnus Hagander <magnus@hagander.net> wrote:On Wed, Mar 29, 2017 at 3:55 PM, Oleg Bartunov <obartunov@gmail.com> wrote:On 29 Mar 2017 09:49, "Magnus Hagander" <magnus@hagander.net> wrote:On Fri, Mar 24, 2017 at 8:56 AM, Oleg Bartunov <obartunov@gmail.com> wrote:On Wed, Mar 22, 2017 at 7:51 PM, Magnus Hagander <magnus@hagander.net> wrote:Right now our main website search uses plainto_tsquery() to generate the searches.Should we consider switching that to phraseto_tsquery() now that we have phrase searching?
+1Also, I suggest to use new parser, which better works _ and -, for example:
1.
select ts_parse('tsparser', 'btree_gin');
ts_parse
----------------
(16,btree_gin)
(11,btree)
(12,_)
(11,gin)
(4 rows)
select ts_parse('default', 'btree_gin');
ts_parse
-----------
(1,btree)
(12,_)
(1,gin)
(3 rows)Default parser produces too much noise, just check the difference:
https://postgrespro.ru/search/?area=version&q=btree_gin&prod uct=postgresql&version=9.6
https://www.postgresql.org/search/?u=%2Fdocs%2F9.6%2F&q=btre e_gin
2.
select ts_parse('tsparser', 'utc-5');
ts_parse
------------
(15,utc-5)
(11,utc)
(12,-)
(9,5)
(4 rows)
select ts_parse('default', 'utc-5');
ts_parse
----------
(1,utc)
(21,-5)
(2 rows)again, compare
https://postgrespro.ru/search/?area=version&q=utc-5&product= postgresql&version=9.6
https://www.postgresql.org/search/?u=%2Fdocs%2F9.6%2F&q=utc- 5 We have also better parsing of email, but I'm not sure we need it on postgres site.We'll publish soon on github, let me know if you know it.That sounds interesting. Two questions:1. Do you have plans for contributing this one for upstream postgres, or is it intended to be run separately?We would love to do this, but currently it's thereRight, found that one. But if your long term plan is to contribute it upstream, that makes it easier to rely on :)I'd love if you test it, give us feedback what to improve, what to fix. Then we could try to convince community to accept it.I've applied this one for testing on the main website search.At the same time I realized we didn't setweight() on the title on regular webpages, so I fixed that too (setting title to weight A).Basically the conf is:CREATE TEXT SEARCH DICTIONARY english_ispell (TEMPLATE = pg_catalog.ispell,dictfile = 'en_us', afffile = 'en_us', stopwords = 'english' );CREATE TEXT SEARCH DICTIONARY pg_dict (TEMPLATE = pg_catalog.synonym,synonyms = 'pg_dict' );CREATE TEXT SEARCH CONFIGURATION pg (PARSER = tsparser );ALTER TEXT SEARCH CONFIGURATION pgALTER MAPPING FOR asciiword, asciihword, hword_asciipart,word, hword, hword_partWITH pg_dict, english_ispell, english_stem;ALTER TEXT SEARCH CONFIGURATION pgALTER MAPPING FOR email, file, float, host, hword_numpart, int, numhword, numword, sfloat, uint, url, url_path, version WITH simple;If you have any other suggestions of things we should change there, please let me know!
Depending on the load I'd use also shared ispell https://github.com/postgrespro/shared_ispell, which will save memory a lot.
--So far, this is on the main website search and *not* on the archives search. Let's try it there first, but in the long run we should use similar configurations.
On Mon, Apr 3, 2017 at 3:37 PM, Oleg Bartunov <obartunov@gmail.com> wrote:
-- On Sun, Apr 2, 2017 at 9:37 AM, Magnus Hagander <magnus@hagander.net> wrote:On Fri, Mar 31, 2017 at 2:46 PM, Oleg Bartunov <obartunov@gmail.com> wrote:On Fri, Mar 31, 2017 at 8:04 AM, Magnus Hagander <magnus@hagander.net> wrote:On Wed, Mar 29, 2017 at 3:55 PM, Oleg Bartunov <obartunov@gmail.com> wrote:On 29 Mar 2017 09:49, "Magnus Hagander" <magnus@hagander.net> wrote:On Fri, Mar 24, 2017 at 8:56 AM, Oleg Bartunov <obartunov@gmail.com> wrote:On Wed, Mar 22, 2017 at 7:51 PM, Magnus Hagander <magnus@hagander.net> wrote:Right now our main website search uses plainto_tsquery() to generate the searches.Should we consider switching that to phraseto_tsquery() now that we have phrase searching?
+1Also, I suggest to use new parser, which better works _ and -, for example:
1.
select ts_parse('tsparser', 'btree_gin');
ts_parse
----------------
(16,btree_gin)
(11,btree)
(12,_)
(11,gin)
(4 rows)
select ts_parse('default', 'btree_gin');
ts_parse
-----------
(1,btree)
(12,_)
(1,gin)
(3 rows)Default parser produces too much noise, just check the difference:
https://postgrespro.ru/search/?area=version&q=btree_gin&prod uct=postgresql&version=9.6
https://www.postgresql.org/search/?u=%2Fdocs%2F9.6%2F&q=btre e_gin
2.
select ts_parse('tsparser', 'utc-5');
ts_parse
------------
(15,utc-5)
(11,utc)
(12,-)
(9,5)
(4 rows)
select ts_parse('default', 'utc-5');
ts_parse
----------
(1,utc)
(21,-5)
(2 rows)again, compare
https://postgrespro.ru/search/?area=version&q=utc-5&product= postgresql&version=9.6
https://www.postgresql.org/search/?u=%2Fdocs%2F9.6%2F&q=utc- 5 We have also better parsing of email, but I'm not sure we need it on postgres site.We'll publish soon on github, let me know if you know it.That sounds interesting. Two questions:1. Do you have plans for contributing this one for upstream postgres, or is it intended to be run separately?We would love to do this, but currently it's thereRight, found that one. But if your long term plan is to contribute it upstream, that makes it easier to rely on :)I'd love if you test it, give us feedback what to improve, what to fix. Then we could try to convince community to accept it.I've applied this one for testing on the main website search.At the same time I realized we didn't setweight() on the title on regular webpages, so I fixed that too (setting title to weight A).Basically the conf is:CREATE TEXT SEARCH DICTIONARY english_ispell (TEMPLATE = pg_catalog.ispell,dictfile = 'en_us', afffile = 'en_us', stopwords = 'english' );CREATE TEXT SEARCH DICTIONARY pg_dict (TEMPLATE = pg_catalog.synonym,synonyms = 'pg_dict' );CREATE TEXT SEARCH CONFIGURATION pg (PARSER = tsparser );ALTER TEXT SEARCH CONFIGURATION pgALTER MAPPING FOR asciiword, asciihword, hword_asciipart,word, hword, hword_partWITH pg_dict, english_ispell, english_stem;ALTER TEXT SEARCH CONFIGURATION pgALTER MAPPING FOR email, file, float, host, hword_numpart, int, numhword, numword, sfloat, uint, url, url_path, version WITH simple;If you have any other suggestions of things we should change there, please let me know!Depending on the load I'd use also shared ispell https://github.com/postgrespro/shared_ispell, which will save memory a lot.
Our load is pretty low, so we don't really have need for that one at this point. But I'll try to remember it :)