Thread: Slides for PGCon2016; "FTS is dead ? Long live FTS !"

Slides for PGCon2016; "FTS is dead ? Long live FTS !"

From
Andreas Joseph Krogh
Date:
Hi.
 
Any news about when slides for $subject will be available?
 
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963

Re: Slides for PGCon2016; "FTS is dead ? Long live FTS !"

From
Oleg Bartunov
Date:


On Thu, May 26, 2016 at 11:26 PM, Andreas Joseph Krogh <andreas@visena.com> wrote:
> Hi.
>
> Any news about when slides for $subject will be available?

I submitted the slides to the PGCon site, but it usually takes a while, so you can download our presentation directly: http://www.sai.msu.su/~megera/postgres/talks/pgcon-2016-fts.pdf

Some features are still missing from the RUM index, but I hope we'll update the GitHub repository really soon.
 
 

Re: Slides for PGCon2016; "FTS is dead ? Long live FTS !"

From
Andreas Joseph Krogh
Date:
On Saturday, 28 May 2016 at 23:59:55, Oleg Bartunov <obartunov@gmail.com> wrote:
> On Thu, May 26, 2016 at 11:26 PM, Andreas Joseph Krogh <andreas@visena.com> wrote:
>> Hi.
>>
>> Any news about when slides for $subject will be available?
>
> I submitted the slides to the PGCon site, but it usually takes a while, so you can download our presentation directly: http://www.sai.msu.su/~megera/postgres/talks/pgcon-2016-fts.pdf
>
> Some features are still missing from the RUM index, but I hope we'll update the GitHub repository really soon.
 
This is simply amazing!
 
I want to run 9.6 beta in production right now because of this:-)
 
Hats off guys, congrats to PostgresPro, and huge thanks!!
 
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
 

Re: Slides for PGCon2016; "FTS is dead ? Long live FTS !"

From
Stefan Keller
Date:
Hi,

Nice work from you postgrespro.ru guys! Especially the RUM index, which demonstrates the power of 9.6 to let third-party software create access methods as extensions: https://github.com/postgrespro/rum

1. I don't understand the benchmarks on slide 25 "20 mln descriptions" (and the one before "6.7 mln classifieds"): What does "Queries in 8 h 9.2 +patch (9.6 rum)" mean?

2. What does R-U-M mean? (It can't mean "Range Usage Metadata", since that range index ended up being named BRIN.)

:Stefan, co-organizer of Swiss PGDay



Re: Slides for PGCon2016; "FTS is dead ? Long live FTS !"

From
Oleg Bartunov
Date:


On Sun, May 29, 2016 at 12:29 PM, Andreas Joseph Krogh <andreas@visena.com> wrote:
 
> I want to run 9.6 beta in production right now because of this:-)

Wait, wait :) We'd be happy to get feedback from production, of course, but please wait a bit. We are adding support for sorting the posting list/tree not by item pointer, as in GIN, but by additional information, for example a timestamp, which will provide a further speedup on top of the existing one. Also, we are sure there are some bugs :)
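
A rough sketch of what ordering by additional information could look like. It assumes the rum extension is installed; the table, index name and sample query are made up for illustration, and the opclass name `rum_tsvector_addon_ops` and the `attach`/`to` options are assumptions based on the rum README of later releases, so check them against the version you install.

```sql
-- Assumption: the rum extension is available on this server.
CREATE EXTENSION IF NOT EXISTS rum;

-- Hypothetical table: a tsvector column plus a timestamp to order by.
CREATE TABLE messages (
    id      serial PRIMARY KEY,
    body    tsvector,
    sent_at timestamp
);

-- Attach sent_at to the postings of body, so each index entry
-- carries the timestamp as additional information.
CREATE INDEX messages_rum_idx ON messages
    USING rum (body rum_tsvector_addon_ops, sent_at)
    WITH (attach = 'sent_at', to = 'body');

-- Matching rows nearest to a given timestamp; <=> is rum's distance operator.
SELECT id, sent_at
  FROM messages
 WHERE body @@ to_tsquery('english', 'rum & index')
 ORDER BY sent_at <=> '2016-05-29 19:49:06'::timestamp
 LIMIT 10;
```

The point of such an index is that the search and the ordering can both be served from the index, instead of fetching every matching row and sorting afterwards.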
 
 
> Hats off guys, congrats to PostgresPro, and huge thanks!!
 

Re: Slides for PGCon2016; "FTS is dead ? Long live FTS !"

From
Andreas Joseph Krogh
Date:
On Sunday, 29 May 2016 at 19:49:06, Oleg Bartunov <obartunov@gmail.com> wrote:
[snip]
>> I want to run 9.6 beta in production right now because of this:-)
>
> Wait, wait :) We'd be happy to get feedback from production, of course, but please wait a bit. We are adding support for sorting the posting list/tree not by item pointer, as in GIN, but by additional information, for example a timestamp, which will provide a further speedup on top of the existing one.

Awesome!

> Also, we are sure there are some bugs :)
 
He he, I reported the first issue: https://github.com/postgrespro/rum/issues/1
 
Would be cool to see this fixed so I actually could have a sip of the rum:-)
 
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
 

Re: Slides for PGCon2016; "FTS is dead ? Long live FTS !"

From
Oleg Bartunov
Date:


On Sun, May 29, 2016 at 2:43 PM, Stefan Keller <sfkeller@gmail.com> wrote:
> Hi,
>
> Nice work from you postgrespro.ru guys! Especially the RUM index, which demonstrates the power of 9.6 to let third-party software create access methods as extensions: https://github.com/postgrespro/rum
>
> 1. I don't understand the benchmarks on slide 25 "20 mln descriptions" (and the one before "6.7 mln classifieds"): What does "Queries in 8 h 9.2 +patch (9.6 rum)" mean?

We ran queries for 8 hours and recorded the number of executed queries. Four years ago, when Alexander and I developed an initial version of the patch, we got the results marked "9.2+patch"; now we have run the same queries on the same database and put the RUM results in parentheses. I wouldn't read too much into these numbers, since we used queries from the 6 mln database. We'd be happy if somebody ran independent benchmarks.
 

> 2. What does R-U-M mean? (can't mean "Range Usage Metadata" which was finally coined range index BRIN)?


We chose RUM simply because there are already GIN and VODKA :) But some people have already suggested several meanings, like Really Useful iMdex :) We are open to suggestions.
 


Re: Slides for PGCon2016; "FTS is dead ? Long live FTS !"

From
Karsten Hilbert
Date:
>> I submitted slides to pgcon site, but it usually takes awhile, so you can
>> download our presentation directly
>> http://www.sai.msu.su/~megera/postgres/talks/pgcon-2016-fts.pdf

Looking at slide 39 (attached) I get the impression that I
should be able to do the following:


- turn a coding system (say, ICD-10) into a dictionary
  by splitting the terms into single words

    say, "diabetes mellitus" -> "diabetes", "mellitus"

- define stop words like "left", "right", ...

    say, "fracture left ulna" -> the "left" doesn't
    matter as far as coding is concerned

- also turn that coding system into queries by splitting
  the terms into single words, concatenating them
  with "&", and setting the ICD 10 code as tag on them

    say, "diabetes mellitus" -> "diabetes & mellitus [E11]"

- run an inverse FTS (FQS) against a user supplied string
  thereby finding queries (= tags = ICD10 codes) likely
  relevant to the input

    say, to_tsvector("patient was suspected to suffer from diabetes mellitus")
    -> tag = E11


Possible, not possible, insane, unintended use ?

Thanks,
Karsten
--
GPG key ID E4071346 @ eu.pool.sks-keyservers.net
E167 67FD A291 2BEA 73BD  4537 78B9 A9F9 E407 1346


Re: Slides for PGCon2016; "FTS is dead ? Long live FTS !"

From
Oleg Bartunov
Date:
On Sun, May 29, 2016 at 10:04 PM, Karsten Hilbert
<Karsten.Hilbert@gmx.net> wrote:
>>> I submitted slides to pgcon site, but it usually takes awhile, so you can
>>> download our presentation directly
>>> http://www.sai.msu.su/~megera/postgres/talks/pgcon-2016-fts.pdf
>
> Looking at slide 39 (attached) I get the impression that I
> should be able to do the following:
>
>
> - turn a coding system (say, ICD-10) into a dictionary
>   by splitting the terms into single words
>
>         say, "diabetes mellitus" -> "diabetes", "mellitus"
>
> - define stop words like "left", "right", ...
>
>         say, "fracture left ulna" -> the "left" doesn't
>         matter as far as coding is concerned
>
> - also turn that coding system into queries by splitting
>   the terms into single words, concatenating them
>   with "&", and setting the ICD 10 code as tag on them
>
>         say, "diabetes mellitus" -> "diabetes & mellitus [E11]"
>
> - run an inverse FTS (FQS) against a user supplied string
>   thereby finding queries (= tags = ICD10 codes) likely
>   relevant to the input
>
>         say, to_tsvector("patient was suspected to suffer from diabetes mellitus")
>         -> tag = E11
>
>
> Possible, not possible, insane, unintended use ?

Why not? It's the same kind of usage I showed on slide #39.

create table icd10 (q tsquery, code text);
insert into icd10 values (to_tsquery('diabetes & mellitus'), '[E11]');
select * from icd10
 where to_tsvector('patient was suspected to suffer from diabetes mellitus') @@ q;
           q           | code
-----------------------+-------
 'diabet' & 'mellitus' | [E11]
(1 row)
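
Extending that example a little: a minimal sketch, assuming the built-in `english` configuration, of how several codes could be stored as queries and matched against free text. The table name `icd10_queries` and the second row are hypothetical; only E11 comes from this thread, and the other code is a placeholder that has not been checked against ICD-10.

```sql
-- Hypothetical table: one stored query per code.
CREATE TABLE icd10_queries (
    code text PRIMARY KEY,
    q    tsquery NOT NULL
);

INSERT INTO icd10_queries (code, q) VALUES
    ('E11', to_tsquery('english', 'diabetes & mellitus')),
    -- Placeholder code for illustration only.
    ('S52', to_tsquery('english', 'fracture & ulna'));

-- Inverse search: the user's text is the document,
-- and the stored queries are what gets matched against it.
SELECT code, q
  FROM icd10_queries
 WHERE to_tsvector('english',
                   'patient was suspected to suffer from diabetes mellitus') @@ q;
```

Because the user's text plays the role of the document, extra words such as "left" in "fracture left ulna" do not prevent a match as long as the stored query does not require them. Stop words only become necessary if the queries are generated mechanically from terms that contain such words.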





Re: Slides for PGCon2016; "FTS is dead ? Long live FTS !"

From
Oleg Bartunov
Date:


On Sun, May 29, 2016 at 12:59 AM, Oleg Bartunov <obartunov@gmail.com> wrote:
> On Thu, May 26, 2016 at 11:26 PM, Andreas Joseph Krogh <andreas@visena.com> wrote:
>> Hi.
>>
>> Any news about when slides for $subject will be available?
>
> I submitted the slides to the PGCon site, but it usually takes a while, so you can download our presentation directly: http://www.sai.msu.su/~megera/postgres/talks/pgcon-2016-fts.pdf


Please download the new version of the slides. I added CREATE INDEX commands to the examples.

 

Re: Slides for PGCon2016; "FTS is dead ? Long live FTS !"

From
Andreas Joseph Krogh
Date:
On Monday, 30 May 2016 at 22:27:11, Oleg Bartunov <obartunov@gmail.com> wrote:
> Please download the new version of the slides. I added CREATE INDEX commands to the examples.
 
Great!
 
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
 

Re: Slides for PGCon2016; "FTS is dead ? Long live FTS !"

From
Stefan Keller
Date:
Hi Oleg

2016-05-29 19:54 GMT+02:00 Oleg Bartunov <obartunov@gmail.com>:

> We chose RUM simply because there are already GIN and VODKA :)
> But some people have already suggested several meanings, like Really Useful iMdex :)
> We are open to suggestions.

iMdex LOL :-)

OK. What's new about the index?
* AFAIK it's an access method implemented as an extension
* it's inspired by the inverted index
* and it uses position information to calculate rank and order results
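
A minimal sketch of that last point, assuming the rum extension is installed. The table, column and index names are made up for illustration; the `rum_tsvector_ops` opclass and the `<=>` distance operator are taken from the rum README, so verify them against the version in use.

```sql
-- Assumption: the rum extension is available on this server.
CREATE EXTENSION IF NOT EXISTS rum;

-- Hypothetical table with a precomputed tsvector column.
CREATE TABLE docs (
    id   serial PRIMARY KEY,
    body text,
    fts  tsvector
);

INSERT INTO docs (body, fts)
SELECT b, to_tsvector('english', b)
  FROM (VALUES ('Full text search is dead'),
               ('Long live full text search with rum')) AS v(b);

CREATE INDEX docs_rum_idx ON docs USING rum (fts rum_tsvector_ops);

-- Match and order in one query; a smaller distance means a better match.
SELECT id, body,
       fts <=> to_tsquery('english', 'text & search') AS distance
  FROM docs
 WHERE fts @@ to_tsquery('english', 'text & search')
 ORDER BY fts <=> to_tsquery('english', 'text & search')
 LIMIT 10;
```

With GIN, the ranking step has to fetch each matching row to read the positions from the tsvector; keeping positions in the index is what lets the ordering be driven by the index itself.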

So I propose: "Ranking UMdex" ;-)

:Stefan



Re: Slides for PGCon2016; "FTS is dead ? Long live FTS !"

From
Oleg Bartunov
Date:


On Sun, May 29, 2016 at 8:53 PM, Andreas Joseph Krogh <andreas@visena.com> wrote:
[snip]
> He he, I reported the first issue: https://github.com/postgrespro/rum/issues/1
>
> Would be cool to see this fixed so I actually could have a sip of the rum:-)


It's not easy to fix this. We don't want rum to depend on btree_gin, so probably the easiest way is to have a separate <=> operator in rum.
 

Re: Slides for PGCon2016; "FTS is dead ? Long live FTS !"

From
Andreas Joseph Krogh
Date:
On Tuesday, 31 May 2016 at 16:12:52, Oleg Bartunov <obartunov@gmail.com> wrote:
[snip]
>> He he, I reported the first issue: https://github.com/postgrespro/rum/issues/1
>>
>> Would be cool to see this fixed so I actually could have a sip of the rum:-)
>
> It's not easy to fix this. We don't want rum to depend on btree_gin, so probably the easiest way is to have a separate <=> operator in rum.
 
+1 for separate operator!
 
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
 

Re: Slides for PGCon2016; "FTS is dead ? Long live FTS !"

From
Kaare Rasmussen
Date:
On 2016-05-31 13:24, Stefan Keller wrote:
>
> > We chose RUM simply because there are already GIN and VODKA :)
> > But some people have already suggested several meanings, like Really Useful iMdex :)
> > We are open to suggestions.
>
> So I propose: "Ranking UMdex" ;-)
>

How about "Russian Unbelievable Magic"? Or just "RUssian Magic" if you
do believe...

/kaare