Slides for PGCon2016; "FTS is dead ? Long live FTS !"
Hi.
Any news about when slides for $subject will be available?
-- Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andreas@visena.com
www.visena.com
On Thu, May 26, 2016 at 11:26 PM, Andreas Joseph Krogh <andreas@visena.com> wrote:
> Hi.
> Any news about when slides for $subject will be available?
I submitted the slides to the PGCon site, but it usually takes a while, so you can
download our presentation directly:
http://www.sai.msu.su/~megera/postgres/talks/pgcon-2016-fts.pdf
There are some missing features in the RUM index, but I hope we'll update the
GitHub repository really soon.
On Saturday, 28 May 2016 at 23:59:55, Oleg Bartunov <obartunov@gmail.com> wrote:
> I submitted slides to pgcon site, but it usually takes awhile, so you can
> download our presentation directly
> http://www.sai.msu.su/~megera/postgres/talks/pgcon-2016-fts.pdf
> There are some missing features in rum index, but I hope we'll update github
> repository really soon.
This is simply amazing!
I want to run 9.6 beta in production right now because of this:-)
Hats off guys, congrats to PostgresPro, and huge thanks!!
Hi,
Nice work from you postgrespro.ru guys! Especially the RUM index, which
demonstrates the power of 9.6 to let third-party software create access methods
as extensions: https://github.com/postgrespro/rum
1. I don't understand the benchmarks on slide 25 "20 mln descriptions" (and
the one before "6.7 mln classifieds"): What does "Queries in 8 h 9.2 +patch
(9.6 rum)" mean?
2. What does R-U-M mean? (can't mean "Range Usage Metadata" which was
finally coined range index BRIN)?
:Stefan, co-organizer of Swiss PGDay
On Sun, May 29, 2016 at 12:29 PM, Andreas Joseph Krogh <andreas@visena.com> wrote:
> This is simply amazing!
> I want to run 9.6 beta in production right now because of this:-)
wait-wait :) We'd be happy to have feedback from production, of course,
but please wait a bit. We are adding support for sorting the posting list/tree
not by item pointer, as in GIN, but by additional information, for
example a timestamp, which will provide a further speedup on top of the
existing one. Also, we are sure there are some bugs :)
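A minimal sketch of what querying with attached information could look like. This is not from the thread: the operator class rum_tsvector_addon_ops, the attach/to index options, and the <=> distance operator on timestamps follow later versions of the RUM README as I understand them, and may not match the code that was available when this message was written. The table docs and its columns are hypothetical.

```sql
-- Assumes the rum extension is installed; all object names are hypothetical.
CREATE EXTENSION IF NOT EXISTS rum;

CREATE TABLE docs (
    id  serial PRIMARY KEY,
    tsv tsvector,
    ts  timestamp
);

-- Attach the ts column as additional information to the tsv index entries,
-- so the ORDER BY below can be answered from the index.
CREATE INDEX docs_tsv_ts_idx ON docs
    USING rum (tsv rum_tsvector_addon_ops, ts)
    WITH (attach = 'ts', to = 'tsv');

-- Matching documents, closest to the given moment first.
SELECT id, ts
  FROM docs
 WHERE tsv @@ to_tsquery('english', 'rum & index')
 ORDER BY ts <=> '2016-05-29 19:49:06'::timestamp
 LIMIT 10;
```

Whether the posting lists themselves are kept sorted by the attached value depends on the extension version and its index options, so treat this only as an illustration of the idea.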
On Sunday, 29 May 2016 at 19:49:06, Oleg Bartunov <obartunov@gmail.com> wrote:
[snip]
>> I want to run 9.6 beta in production right now because of this:-)
> wait-wait :) We'd be happy to have feedback from production, of course, but
> please, wait a bit. We are adding support of sorting posting list/tree not by
> item pointer as in gin, but make use of additional information, for example,
> timestamp, which will provide additional speedup to the existing one.
Awesome!
> Also, we are sure there are some bugs :)
He he, I reported 1st issue: https://github.com/postgrespro/rum/issues/1
Would be cool to see this fixed so I actually could have a sip of the rum:-)
On Sun, May 29, 2016 at 2:43 PM, Stefan Keller <sfkeller@gmail.com> wrote:
> Hi,
> Nice work from you postgrespro.ru guys! Especially the RUM index which
> demonstrates the power of 9.6 to let third party SW create access methods
> as extension: https://github.com/postgrespro/rum
> 1. I don't understand the benchmarks on slide 25 "20 mln descriptions"
> (and the one before "6.7 mln classifieds"): What does "Queries in 8 h 9.2
> +patch (9.6 rum)" mean?
We ran queries for 8 hours and recorded the number of executed queries.
Four years ago, when Alexander and I developed an initial version of the patch,
we got the results marked "9.2+patch"; now we ran the same queries on
the same database and put the RUM results in parentheses. I wouldn't rely on
these numbers, since we used queries from the 6 mln database. We'd be happy if
somebody ran independent benchmarks.
> 2. What does R-U-M mean? (can't mean "Range Usage Metadata" which was
> finally coined range index BRIN)?
We chose RUM just because there are GIN and VODKA :) But some people have
already suggested several meanings, like Really Useful iMdex :) We are open
to suggestions.
> I submitted slides to pgcon site, but it usually takes awhile, so you can
> download our presentation directly
> http://www.sai.msu.su/~megera/postgres/talks/pgcon-2016-fts.pdf
Looking at slide 39 (attached) I get the impression that I
should be able to do the following:
- turn a coding system (say, ICD-10) into a dictionary
  by splitting the terms into single words
    say, "diabetes mellitus" -> "diabetes", "mellitus"
- define stop words like "left", "right", ...
    say, "fracture left ulna" -> the "left" doesn't
    matter as far as coding is concerned
- also turn that coding system into queries by splitting
  the terms into single words, concatenating them
  with "&", and setting the ICD-10 code as tag on them
    say, "diabetes mellitus" -> "diabetes & mellitus [E11]"
- run an inverse FTS (FQS) against a user-supplied string,
  thereby finding queries (= tags = ICD-10 codes) likely
  relevant to the input
    say, to_tsvector('patient was suspected to suffer from diabetes mellitus')
    -> tag = E11
Possible, not possible, insane, unintended use?
Thanks,
Karsten
--
GPG key ID E4071346 @ eu.pool.sks-keyservers.net
E167 67FD A291 2BEA 73BD 4537 78B9 A9F9 E407 1346
On Sun, May 29, 2016 at 10:04 PM, Karsten Hilbert <Karsten.Hilbert@gmx.net> wrote:
> Looking at slide 39 (attached) I get the impression that I
> should be able to do the following:
[snip]
> Possible, not possible, insane, unintended use ?
Why not? It's the same kind of usage I showed on slide #39.
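-- store each coding term as a tsquery together with its ICD-10 code
create table icd10 (q tsquery, code text);
insert into icd10 values (to_tsquery('diabetes & mellitus'), '[E11]');
-- find the stored queries that match the user-supplied text
select * from icd10
 where to_tsvector('patient was suspected to suffer from diabetes mellitus') @@ q;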
q | code
-----------------------+-------
'diabet' & 'mellitus' | [E11]
(1 row)
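With many stored queries, the lookup above would have to test every row of icd10. A hedged sketch of indexing the tsquery column with RUM follows. It assumes the rum extension is installed and that it provides an operator class for tsquery (rum_tsquery_ops in later versions of the RUM README, as I understand it); the index name is hypothetical and the text search configuration is spelled out as 'english' only to make the example deterministic.

```sql
-- Assumes the icd10 table from the example above already exists.
CREATE EXTENSION IF NOT EXISTS rum;

-- Index the stored queries so that matching need not test every row.
CREATE INDEX icd10_q_idx ON icd10 USING rum (q);

-- Same lookup as above; the planner may now choose the index.
SELECT code
  FROM icd10
 WHERE to_tsvector('english',
                   'patient was suspected to suffer from diabetes mellitus') @@ q;
```

Stop words such as "left" and "right" are not covered by this sketch; in stock PostgreSQL they would come from a stop-word file referenced by a custom text search dictionary.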
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general
On Sun, May 29, 2016 at 12:59 AM, Oleg Bartunov <obartunov@gmail.com> wrote:
> I submitted slides to pgcon site, but it usually takes awhile, so you can
> download our presentation directly
> http://www.sai.msu.su/~megera/postgres/talks/pgcon-2016-fts.pdf
Please download the new version of the slides. I added CREATE INDEX commands to
the examples.
On Monday, 30 May 2016 at 22:27:11, Oleg Bartunov <obartunov@gmail.com> wrote:
> Please, download new version of slides. I added CREATE INDEX commands in
> examples.
Great!
Hi Oleg
2016-05-29 19:54 GMT+02:00 Oleg Bartunov <obartunov@gmail.com>:
> We chose RUM just because there are GIN and VODKA :)
> But some people already suggested several meanings like Really Useful
> iMdex :)
> We are open for suggestion.
iMdex LOL :-)
OK. What's new about the index?
* AFAIK it's an access method provided as an extension
* it's inspired by the inverted index
* and it uses position information to calculate rank and order results
So I propose: "Ranking UMdex" ;-)
:Stefan
On Sun, May 29, 2016 at 8:53 PM, Andreas Joseph Krogh <andreas@visena.com> wrote:
> He he, I reported 1st issue: https://github.com/postgrespro/rum/issues/1
> Would be cool to see this fixed so I actually could have a sip of the
> rum:-)
It's not easy to fix this. We don't want rum to depend on btree_gin, so
probably the easiest way is to have a separate <=> operator in rum.
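For context, a sketch of how a distance operator of this kind is used for ranked ordering. It follows later versions of the RUM README as I understand them, where <=> between a tsvector and a tsquery returns a distance and a smaller value means a better match. It does not show how the issue discussed here was resolved, and the thread does not say which data types the proposed operator would cover. The table articles is hypothetical.

```sql
-- Assumes the rum extension is installed; object names are hypothetical.
CREATE EXTENSION IF NOT EXISTS rum;

CREATE TABLE articles (body text, tsv tsvector);
CREATE INDEX articles_tsv_idx ON articles USING rum (tsv rum_tsvector_ops);

-- Return matching rows ordered by distance to the query (closest first).
SELECT body,
       tsv <=> to_tsquery('english', 'beautiful | place') AS distance
  FROM articles
 WHERE tsv @@ to_tsquery('english', 'beautiful | place')
 ORDER BY tsv <=> to_tsquery('english', 'beautiful | place');
```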
On Tuesday, 31 May 2016 at 16:12:52, Oleg Bartunov <obartunov@gmail.com> wrote:
[snip]
>> He he, I reported 1st issue: https://github.com/postgrespro/rum/issues/1
>> Would be cool to see this fixed so I actually could have a sip of the rum:-)
> It's not easy to fix this. We don't want rum depends on btree_gin, so
> probably the easiest way is to have separate operator <=> in rum.
+1 for separate operator!
On 2016-05-31 13:24, Stefan Keller wrote:
>> We chose RUM just because there are GIN and VODKA :)
>> But some people already suggested several meanings like Really Useful iMdex :)
>> We are open for suggestion.
> So I propose: "Ranking UMdex" ;-)
How about "Russian Unbelievable Magic"? Or just "RUssian Magic" if you
do believe...
/kaare