Slides for PGCon2016; "FTS is dead ? Long live FTS !"

Started by Andreas Joseph Krogh, about 10 years ago, 15 messages, general
#1 Andreas Joseph Krogh
andreas@visena.com

Hi.
 
Any news about when slides for $subject will be available?
 
-- Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andreas@visena.com
www.visena.com <https://www.visena.com>

#2 Oleg Bartunov
oleg@sai.msu.su
In reply to: Andreas Joseph Krogh (#1)
Re: Slides for PGCon2016; "FTS is dead ? Long live FTS !"

On Thu, May 26, 2016 at 11:26 PM, Andreas Joseph Krogh <andreas@visena.com>
wrote:

Hi.

Any news about when slides for $subject will be available?

I submitted the slides to the PGCon site, but it usually takes a while for
them to appear, so you can download our presentation directly:
http://www.sai.msu.su/~megera/postgres/talks/pgcon-2016-fts.pdf

The RUM index is still missing some features, but I hope we'll update the
GitHub repository really soon.

#3 Andreas Joseph Krogh
andreas@visena.com
In reply to: Oleg Bartunov (#2)
Re: Slides for PGCon2016; "FTS is dead ? Long live FTS !"

On Saturday 28 May 2016 at 23:59:55, Oleg Bartunov <obartunov@gmail.com> wrote:

I submitted slides to pgcon site, but it usually takes awhile, so you can
download our presentation directly
http://www.sai.msu.su/~megera/postgres/talks/pgcon-2016-fts.pdf

There are some missing features in rum index, but I hope we'll update github
repository really soon.

 
This is simply amazing!
 
I want to run 9.6 beta in production right now because of this:-)
 
Hats off guys, congrats to PostgresPro, and huge thanks!!
 

#4 Stefan Keller
sfkeller@gmail.com
In reply to: Andreas Joseph Krogh (#3)
Re: Slides for PGCon2016; "FTS is dead ? Long live FTS !"

Hi,

Nice work from you postgrespro.ru guys! Especially the RUM index, which
demonstrates the power of 9.6 to let third-party software create access
methods as extensions: https://github.com/postgrespro/rum

1. I don't understand the benchmarks on slide 25, "20 mln descriptions" (and
the one before, "6.7 mln classifieds"): what does "Queries in 8 h 9.2 +patch
(9.6 rum)" mean?

2. What does R-U-M stand for? (It can't mean "Range Usage Metadata", which
ended up being named the range index BRIN.)

:Stefan, co-organizer of Swiss PGDay

#5 Oleg Bartunov
oleg@sai.msu.su
In reply to: Andreas Joseph Krogh (#3)
Re: Slides for PGCon2016; "FTS is dead ? Long live FTS !"

On Sun, May 29, 2016 at 12:29 PM, Andreas Joseph Krogh <andreas@visena.com>
wrote:

This is simply amazing!

I want to run 9.6 beta in production right now because of this:-)

wait-wait :) We'd be happy to have feedback from production, of course,
but please wait a bit. We are adding support for sorting the posting
list/tree not by item pointer, as in GIN, but by additional information, for
example a timestamp, which will provide a further speedup on top of the
existing one. Also, we are sure there are some bugs :)

#6 Andreas Joseph Krogh
andreas@visena.com
In reply to: Oleg Bartunov (#5)
Re: Slides for PGCon2016; "FTS is dead ? Long live FTS !"

On Sunday 29 May 2016 at 19:49:06, Oleg Bartunov <obartunov@gmail.com> wrote:
[snip]
I want to run 9.6 beta in production right now because of this:-)
 
wait-wait :)  We'd be happy to have feedback from production, of course, but
please, wait a bit. We are adding support of sorting posting list/tree not by
item pointer as in gin, but make use of additional information, for example,
timestamp, which will provide additional speedup to the existing one.

 
Awesome!
 
 
Also, we are sure there are some bugs :)

 
He he, I reported the first issue: https://github.com/postgrespro/rum/issues/1

It would be cool to see this fixed so I could actually have a sip of the rum :-)

 

#7 Oleg Bartunov
oleg@sai.msu.su
In reply to: Stefan Keller (#4)
Re: Slides for PGCon2016; "FTS is dead ? Long live FTS !"

On Sun, May 29, 2016 at 2:43 PM, Stefan Keller <sfkeller@gmail.com> wrote:

Hi,

Nice work from you postgrespro.ru guys! Especially the RUM index which
demonstrates the power of 9.6 to let third party SW create access methods
as extension: https://github.com/postgrespro/rum

1. I don't understand the benchmarks on slide 25 "20 mln descriptions"
(and the one before "6.7 mln classifieds"): What does "Queries in 8 h 9.2
+patch (9.6 rum)" mean?

We ran the queries for 8 hours and recorded the number of executed queries.
Four years ago, when Alexander and I developed the initial version of the
patch, we got the results marked "9.2+patch"; now we have run the same
queries on the same database and put the RUM results in parentheses. I
wouldn't read too much into these numbers, since we used queries from the
6 mln database. We'd be happy if somebody ran independent benchmarks.

2. What does R-U-M mean? (can't mean "Range Usage Metadata" which was
finally coined range index BRIN)?

We chose RUM just because there are GIN and VODKA :) But some people have
already suggested several meanings, like Really Useful iMdex :) We are open
to suggestions.

#8 Karsten Hilbert
Karsten.Hilbert@gmx.net
In reply to: Oleg Bartunov (#5)
Re: Slides for PGCon2016; "FTS is dead ? Long live FTS !"

I submitted slides to pgcon site, but it usually takes awhile, so you can
download our presentation directly
http://www.sai.msu.su/~megera/postgres/talks/pgcon-2016-fts.pdf

Looking at slide 39 (attached) I get the impression that I
should be able to do the following:

- turn a coding system (say, ICD-10) into a dictionary
  by splitting the terms into single words

  say, "diabetes mellitus" -> "diabetes", "mellitus"

- define stop words like "left", "right", ...

  say, "fracture left ulna" -> the "left" doesn't
  matter as far as coding is concerned

- also turn that coding system into queries by splitting
  the terms into single words, concatenating them
  with "&", and setting the ICD-10 code as a tag on them

  say, "diabetes mellitus" -> "diabetes & mellitus [E11]"

- run an inverse FTS (FQS) against a user-supplied string,
  thereby finding queries (= tags = ICD-10 codes) likely
  relevant to the input

  say, to_tsvector("patient was suspected to suffer from diabetes mellitus")
  -> tag = E11

Possible, not possible, insane, unintended use ?

Thanks,
Karsten
--
GPG key ID E4071346 @ eu.pool.sks-keyservers.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346

Attachments:

pgcon-2016-fts-Seite-39.pdf (application/pdf)

#9 Oleg Bartunov
oleg@sai.msu.su
In reply to: Karsten Hilbert (#8)
Re: Slides for PGCon2016; "FTS is dead ? Long live FTS !"

On Sun, May 29, 2016 at 10:04 PM, Karsten Hilbert
<Karsten.Hilbert@gmx.net> wrote:

Possible, not possible, insane, unintended use ?

Why not? It's the same kind of usage I showed on slide 39.

-- store the queries, each tagged with its code
create table icd10 (q tsquery, code text);
insert into icd10 values (to_tsquery('diabetes & mellitus'), '[E11]');
-- inverse search: find the stored queries that match the given text
select * from icd10
 where to_tsvector('patient was suspected to suffer from diabetes mellitus') @@ q;
           q           | code
-----------------------+-------
 'diabet' & 'mellitus' | [E11]
(1 row)
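
The same pattern extends to several codes and to noise words such as "left" or "right". The following is only a minimal sketch: the table name icd10_queries, the second code, and the choice to leave "left" out of the stored query by hand (instead of configuring a stop-word file) are assumptions made for illustration, not something taken from the slides.

```sql
-- Hypothetical table: one stored tsquery per code (sample rows are illustrative).
CREATE TABLE icd10_queries (
    code text PRIMARY KEY,
    q    tsquery NOT NULL
);

INSERT INTO icd10_queries (code, q) VALUES
    ('E11', to_tsquery('english', 'diabetes & mellitus')),
    -- "left"/"right" are simply omitted from the stored query here,
    -- so the side of the fracture does not affect matching.
    ('S52', to_tsquery('english', 'fracture & ulna'));

-- Inverse search: the free text is the probe, the stored queries are the data.
SELECT code, q
FROM icd10_queries
WHERE to_tsvector('english', 'patient fell and has a fracture of the left ulna') @@ q;
```

Because the stored queries and the probe text are processed with the same text search configuration, their stems line up; here only the fracture query has all of its terms present in the text, so only that row should match.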

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#10 Oleg Bartunov
oleg@sai.msu.su
In reply to: Oleg Bartunov (#2)
Re: Slides for PGCon2016; "FTS is dead ? Long live FTS !"

On Sun, May 29, 2016 at 12:59 AM, Oleg Bartunov <obartunov@gmail.com> wrote:

I submitted slides to pgcon site, but it usually takes awhile, so you can
download our presentation directly
http://www.sai.msu.su/~megera/postgres/talks/pgcon-2016-fts.pdf

Please download the new version of the slides. I added CREATE INDEX commands
to the examples.

#11 Andreas Joseph Krogh
andreas@visena.com
In reply to: Oleg Bartunov (#10)
Re: Slides for PGCon2016; "FTS is dead ? Long live FTS !"

On Monday 30 May 2016 at 22:27:11, Oleg Bartunov <obartunov@gmail.com> wrote:

Please, download new version of slides. I added CREATE INDEX commands in
examples.

 
Great!
 

#12 Stefan Keller
sfkeller@gmail.com
In reply to: Andreas Joseph Krogh (#11)
Re: Slides for PGCon2016; "FTS is dead ? Long live FTS !"

Hi Oleg

2016-05-29 19:54 GMT+02:00 Oleg Bartunov <obartunov@gmail.com>:

We chose RUM just because there are GIN and VODKA :)
But some people already suggested several meanings like Really Useful iMdex :)
We are open for suggestion.

iMdex LOL :-)

OK. What's new about the index?
* AFAIK it's implemented as an access method shipped as an extension
* it's inspired by the inverted index
* and it uses position information to calculate rank and order results

So I propose: "Ranking UMdex" ;-)

:Stefan

#13 Oleg Bartunov
oleg@sai.msu.su
In reply to: Andreas Joseph Krogh (#6)
Re: Slides for PGCon2016; "FTS is dead ? Long live FTS !"

On Sun, May 29, 2016 at 8:53 PM, Andreas Joseph Krogh <andreas@visena.com>
wrote:

He he, I reported 1st issue: https://github.com/postgrespro/rum/issues/1

Would be cool to see this fixed so I actually could have a sip of the
rum:-)

It's not easy to fix this. We don't want RUM to depend on btree_gin, so
probably the easiest way is to have a separate <=> operator in RUM.
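
For readers unfamiliar with the idea, here is a minimal sketch of what ordering by a distance operator named <=> could look like. It assumes a RUM build that provides the rum_tsvector_ops operator class and a <=> operator between tsvector and tsquery; the table, index name, and query terms are hypothetical, and the sketch is not taken from this thread.

```sql
-- Assumes the rum extension is installed and ships rum_tsvector_ops and <=>.
CREATE EXTENSION IF NOT EXISTS rum;

-- Hypothetical table holding documents and their precomputed tsvector.
CREATE TABLE docs (
    id   serial PRIMARY KEY,
    body text,
    fts  tsvector
);

CREATE INDEX docs_fts_rum ON docs USING rum (fts rum_tsvector_ops);

-- Filter with @@ and order by the distance between document and query;
-- smaller distance is treated as a better match.
SELECT id, body
FROM docs
WHERE fts @@ to_tsquery('english', 'index & rank')
ORDER BY fts <=> to_tsquery('english', 'index & rank')
LIMIT 10;
```

Keeping the operator inside the extension itself is what would let a query like this work without installing btree_gin alongside it.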

#14 Andreas Joseph Krogh
andreas@visena.com
In reply to: Oleg Bartunov (#13)
Re: Slides for PGCon2016; "FTS is dead ? Long live FTS !"

On Tuesday 31 May 2016 at 16:12:52, Oleg Bartunov <obartunov@gmail.com> wrote:
[snip] He he, I reported 1st issue: https://github.com/postgrespro/rum/issues/1
 
Would be cool to see this fixed so I actually could have a sip of the rum:-)

 
It's not easy to fix this. We don't want rum depends on  btree_gin, so
probably the easiest way is to have separate operator <=> in rum.

 
+1 for separate operator!
 

#15 Kaare Rasmussen
kaare@jasonic.dk
In reply to: Stefan Keller (#12)
Re: Slides for PGCon2016; "FTS is dead ? Long live FTS !"

On 2016-05-31 13:24, Stefan Keller wrote:

We chose RUM just because there are GIN and VODKA :)
But some people already suggested several meanings like Really Useful iMdex :)
We are open for suggestion.

So I propose: "Ranking UMdex" ;-)

How about "Russian Unbelievable Magic"? Or just "RUssian Magic" if you
do believe...

/kaare
