Thread: Slides for PGCon2016; "FTS is dead ? Long live FTS !"
Hi.
Any news about when slides for $subject will be available?
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
On Thu, May 26, 2016 at 11:26 PM, Andreas Joseph Krogh <andreas@visena.com> wrote:
> Hi. Any news about when slides for $subject will be available?
I submitted the slides to the PGCon site, but it usually takes a while, so you can download our presentation directly: http://www.sai.msu.su/~megera/postgres/talks/pgcon-2016-fts.pdf
There are some missing features in the RUM index, but I hope we'll update the GitHub repository really soon.
On Saturday, 28 May 2016 at 23:59:55, Oleg Bartunov <obartunov@gmail.com> wrote:
> I submitted the slides to the PGCon site, but it usually takes a while, so you can download our presentation directly: http://www.sai.msu.su/~megera/postgres/talks/pgcon-2016-fts.pdf
> There are some missing features in the RUM index, but I hope we'll update the GitHub repository really soon.
This is simply amazing!
I want to run 9.6 beta in production right now because of this:-)
Hats off guys, congrats to PostgresPro, and huge thanks!!
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
Hi,
Nice work from you postgrespro.ru guys! Especially the RUM index, which demonstrates the power of 9.6 to let third-party software create access methods as extensions: https://github.com/postgrespro/rum
1. I don't understand the benchmarks on slide 25 "20 mln descriptions" (and the one before "6.7 mln classifieds"): What does "Queries in 8 h 9.2 +patch (9.6 rum)" mean?
2. What does R-U-M mean? (It can't mean "Range Usage Metadata", which was eventually named BRIN, the block range index.)
:Stefan, co-organizer of Swiss PGDay
On Sun, May 29, 2016 at 12:29 PM, Andreas Joseph Krogh <andreas@visena.com> wrote:
> I want to run 9.6 beta in production right now because of this :-)
Wait, wait :) We'd be happy to have feedback from production, of course, but please wait a bit. We are adding support for sorting the posting list/tree not by item pointer, as in GIN, but by additional information, for example a timestamp, which will provide a further speedup on top of the existing one. Also, we are sure there are some bugs :)
On Sunday, 29 May 2016 at 19:49:06, Oleg Bartunov <obartunov@gmail.com> wrote:
[snip]
>> I want to run 9.6 beta in production right now because of this :-)
> Wait, wait :) We'd be happy to have feedback from production, of course, but please wait a bit. We are adding support for sorting the posting list/tree not by item pointer, as in GIN, but by additional information, for example a timestamp, which will provide a further speedup on top of the existing one.
Awesome!
> Also, we are sure there are some bugs :)
He he, I reported the first issue: https://github.com/postgrespro/rum/issues/1
It would be cool to see this fixed so I could actually have a sip of the rum :-)
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
On Sun, May 29, 2016 at 2:43 PM, Stefan Keller <sfkeller@gmail.com> wrote:
> Hi,
> Nice work from you postgrespro.ru guys! Especially the RUM index, which demonstrates the power of 9.6 to let third-party software create access methods as extensions: https://github.com/postgrespro/rum
> 1. I don't understand the benchmarks on slide 25 "20 mln descriptions" (and the one before "6.7 mln classifieds"): What does "Queries in 8 h 9.2 +patch (9.6 rum)" mean?
We ran queries for 8 hours and recorded the number of executed queries. Four years ago, when Alexander and I developed an initial version of the patch, we got the results marked "9.2+patch"; now we have run the same queries on the same database and put the RUM results in parentheses. I wouldn't rely on these numbers, since we used queries from the 6 mln database. We'd be happy if somebody ran independent benchmarks.
> 2. What does R-U-M mean? (It can't mean "Range Usage Metadata", which was eventually named BRIN, the block range index.)
We chose RUM just because there are GIN and VODKA :) But some people have already suggested several meanings, like Really Useful iMdex :) We are open to suggestions.
>> I submitted slides to pgcon site, but it usually takes awhile, so you can
>> download our presentation directly
>> http://www.sai.msu.su/~megera/postgres/talks/pgcon-2016-fts.pdf

Looking at slide 39 (attached) I get the impression that I
should be able to do the following:

- turn a coding system (say, ICD-10) into a dictionary
  by splitting the terms into single words

  say, "diabetes mellitus" -> "diabetes", "mellitus"

- define stop words like "left", "right", ...

  say, "fracture left ulna" -> the "left" doesn't
  matter as far as coding is concerned

- also turn that coding system into queries by splitting
  the terms into single words, concatenating them
  with "&", and setting the ICD 10 code as tag on them

  say, "diabetes mellitus" -> "diabetes & mellitus [E11]"

- run an inverse FTS (FQS) against a user supplied string
  thereby finding queries (= tags = ICD10 codes) likely
  relevant to the input

  say, to_tsvector("patient was suspected to suffer from diabetes mellitus")
  -> tag = E11

Possible, not possible, insane, unintended use ?

Thanks,
Karsten
--
GPG key ID E4071346 @ eu.pool.sks-keyservers.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
On Sun, May 29, 2016 at 10:04 PM, Karsten Hilbert <Karsten.Hilbert@gmx.net> wrote:
> Looking at slide 39 (attached) I get the impression that I
> should be able to do the following:
> [snip]
> Possible, not possible, insane, unintended use ?

Why not? It's the same kind of usage I showed on slide #39.

create table icd10 (q tsquery, code text);
insert into icd10 values(to_tsquery('diabetes & mellitus'), '[E11]');
select * from icd10 where to_tsvector('patient was suspected to suffer from diabetes mellitus') @@ q;
           q           | code
-----------------------+-------
 'diabet' & 'mellitus' | [E11]
(1 row)
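A slightly fuller sketch of the same idea, assuming a stock PostgreSQL with the built-in english text search configuration. The table name icd10_queries and the second row (including its XX.X placeholder code) are made up for illustration and do not come from the slides. plainto_tsquery() covers the "split the term into words and join them with &" step from the list above:

```sql
-- Hypothetical table: one stored tsquery per code.
CREATE TABLE icd10_queries (
    code text PRIMARY KEY,
    q    tsquery NOT NULL
);

-- plainto_tsquery() normalizes the words and ANDs them together.
INSERT INTO icd10_queries (code, q) VALUES
    ('E11',  plainto_tsquery('english', 'diabetes mellitus')),
    ('XX.X', plainto_tsquery('english', 'fracture ulna'));  -- placeholder code, not a real ICD-10 entry

-- Inverse search: find the stored queries that match a piece of free text.
SELECT code
FROM icd10_queries
WHERE to_tsvector('english', 'patient was suspected to suffer from diabetes mellitus') @@ q;

SELECT code
FROM icd10_queries
WHERE to_tsvector('english', 'fracture left ulna') @@ q;
```

Because & does not require the words to be adjacent, the second stored query should also match "fracture left ulna" without "left" having to be declared a stop word. A custom stop-word list would only matter if "left" or "right" appeared in the stored terms themselves.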
On Sun, May 29, 2016 at 12:59 AM, Oleg Bartunov <obartunov@gmail.com> wrote:
> I submitted the slides to the PGCon site, but it usually takes a while, so you can download our presentation directly: http://www.sai.msu.su/~megera/postgres/talks/pgcon-2016-fts.pdf
Please download the new version of the slides. I added CREATE INDEX commands to the examples.
On Monday, 30 May 2016 at 22:27:11, Oleg Bartunov <obartunov@gmail.com> wrote:
> Please download the new version of the slides. I added CREATE INDEX commands to the examples.
Great!
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
Hi Oleg
2016-05-29 19:54 GMT+02:00 Oleg Bartunov <obartunov@gmail.com>:
> We chose RUM just because there are GIN and VODKA :)
> But some people already suggested several meanings like Really Useful iMdex :)
> We are open to suggestions.
iMdex LOL :-)
Ok. What's new about the index?
* AFAIK it's implemented as an access method in an extension
* it's inspired by the inverted index
* and uses position information to calculate rank and order results
So I propose: "Ranking UMdex" ;-)
:Stefan
On Sun, May 29, 2016 at 8:53 PM, Andreas Joseph Krogh <andreas@visena.com> wrote:
> He he, I reported the first issue: https://github.com/postgrespro/rum/issues/1
> It would be cool to see this fixed so I could actually have a sip of the rum :-)
It's not easy to fix this. We don't want RUM to depend on btree_gin, so probably the easiest way is to have a separate <=> operator in RUM.
On Tuesday, 31 May 2016 at 16:12:52, Oleg Bartunov <obartunov@gmail.com> wrote:
[snip]
> It's not easy to fix this. We don't want RUM to depend on btree_gin, so probably the easiest way is to have a separate <=> operator in RUM.
+1 for separate operator!
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
On 2016-05-31 13:24, Stefan Keller wrote:
> > We chose RUM just because there are GIN and VODKA :)
> > But some people already suggested several meanings like Really Useful iMdex :)
> > We are open to suggestions.
>
> So I propose: "Ranking UMdex" ;-)

How about "Russian Unbelievable Magic"?

Or just "RUssian Magic" if you do believe...

/kaare