Updated RUM-index and support for bigint as part of index

Started by Andreas Joseph Kroghalmost 10 years ago9 messagesgeneral
Jump to latest
#1Andreas Joseph Krogh
andreas@visena.com

Hi.
 
I see the RUM-index is updated, which is great!
 
I wonder, to be able to sort by timestamp one has to create the index like
this:
 
CREATE INDEX rumidx ON origo_email_delivery USING rum (fts_all
rum_tsvector_timestamp_ops, received_timestamp)WITH (attach =
'received_timestamp', TO = 'fts_all', order_by_attach = TRUE );
Then, to be able to use the index for sorting by the
"received_timestamp"-column one has to issue a query like this:
EXPLAIN ANALYZE SELECT del.entity_id, del.subject, del.received_timestamp,
fts_all <=>to_tsquery('simple', 'andreas&kr') AS rank FROM origo_email_delivery
delWHERE del.fts_all @@ to_tsquery('simple', 'andreas&kr') ORDER BY '2000-01-01'
::TIMESTAMP <=> del.received_timestamp LIMIT 10;
QUERY PLAN
--------------------------------------------------------------------------------------------------------------------------------------------------
Limit(cost=14.40..26.47 rows=10 width=89) (actual time=10.908..10.952 rows=10
loops=1) -> Index Scan using rumidx on origo_email_delivery del (cost
=14.40..3221.22rows=2657 width=89) (actual time=10.906..10.947 rows=10 loops=1)
IndexCond: (fts_all @@ '''andreas'' & ''kr'''::tsquery) Order By:
(received_timestamp <=>'2000-01-01 00:00:00'::timestamp without time zone)
Planningtime: 0.491 ms Execution time: 11.010 ms (6 rows)
 
The ORDER BY part seems strange; It seems one has to find a value "lower than
any other value" to use as a kind of base, why is this necessary? It also seems
that in order to be able to sort DESC one has to provide a timestamp value
"higher than any other value", is this correct?
 
It would be great if the docs explained this.
 
I really miss the opportunity to include a BIGINT as part of the index, so
that the WHERE-clause could be like this:
 
WHERE del.fts_all @@ to_tsquery('simple', 'andreas&kr') AND del.folder_id IN (1
,2,3)
 
Having this would be perfect for my use-case searching in email in folders,
sorted by received_date, and having it use ONE index.
 
Will this be supported?
 
Thanks.
 
-- Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andreas@visena.com <mailto:andreas@visena.com>
www.visena.com <https://www.visena.com&gt;
<https://www.visena.com&gt;

#2Arthur Zakirov
a.zakirov@postgrespro.ru
In reply to: Andreas Joseph Krogh (#1)
Re: Updated RUM-index and support for bigint as part of index

Hello,

2016-08-02 21:08 GMT+03:00 Andreas Joseph Krogh <andreas@visena.com>:

The ORDER BY part seems strange; It seems one has to find a value "lower than any other value" to use as a kind of base, why is this necessary? It also seems that in order to be able to sort DESC one has to provide a timestamp value "higher than any other value", is this correct?

It would be great if the docs explained this.

We will write more detailed documentation for RUM.

I really miss the opportunity to include a BIGINT as part of the index, so
that the WHERE-clause could be like this:

WHERE del.fts_all @@ to_tsquery('simple', 'andreas&kr') AND del.folder_id IN
(1,2,3)

Having this would be perfect for my use-case searching in email in
folders, sorted by received_date, and having it use ONE index.

Will this be supported?

We have a plan to use generic types to able to include bigint, timestamp
and other types as part of index. But I cant tell date of it.

--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

#3Oleg Bartunov
oleg@sai.msu.su
In reply to: Andreas Joseph Krogh (#1)
Re: Updated RUM-index and support for bigint as part of index

On Tue, Aug 2, 2016 at 9:08 PM, Andreas Joseph Krogh <andreas@visena.com>
wrote:

Hi.

I see the RUM-index is updated, which is great!

I wonder, to be able to sort by timestamp one has to create the index like
this:

CREATE INDEX rumidx ON origo_email_delivery USING rum (fts_all rum_tsvector_timestamp_ops, received_timestamp)
WITH (attach = 'received_timestamp', TO = 'fts_all', order_by_attach = TRUE );

Then, to be able to use the index for sorting by the
"received_timestamp"-column one has to issue a query like this:

EXPLAIN ANALYZE SELECT del.entity_id,
del.subject,
del.received_timestamp,
fts_all <=> to_tsquery('simple', 'andreas&kr') AS rank
FROM origo_email_delivery del
WHERE del.fts_all @@ to_tsquery('simple', 'andreas&kr')
ORDER BY '2000-01-01' :: TIMESTAMP <=> del.received_timestamp
LIMIT 10;

QUERY PLAN -------------------------------------------------------------------------------------------------------------------------------------------------- Limit (cost=14.40..26.47 rows=10 width=89) (actual time=10.908..10.952 rows=10 loops=1)
-> Index Scan using rumidx on origo_email_delivery del (cost=14.40..3221.22 rows=2657 width=89) (actual time=10.906..10.947 rows=10 loops=1)
Index Cond: (fts_all @@ '''andreas'' & ''kr'''::tsquery)
Order By: (received_timestamp <=> '2000-01-01 00:00:00'::timestamp without time zone)
Planning time: 0.491 ms
Execution time: 11.010 ms
(6 rows)

The ORDER BY part seems strange; It seems one has to find a value
"lower than any other value" to use as a kind of base, why is this
necessary? It also seems that in order to be able to sort DESC one has to
provide a timestamp value "higher than any other value", is this correct?

have you considered <=| and |=> operators ? <=> in ORDER BY works like KNN.

Show quoted text

It would be great if the docs explained this.

I really miss the opportunity to include a BIGINT as part of the index, so
that the WHERE-clause could be like this:

WHERE del.fts_all @@ to_tsquery('simple', 'andreas&kr') AND del.folder_id IN
(1,2,3)

Having this would be perfect for my use-case searching in email in
folders, sorted by received_date, and having it use ONE index.

Will this be supported?

Thanks.

--
*Andreas Joseph Krogh*
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andreas@visena.com
www.visena.com
<https://www.visena.com&gt;

#4Andreas Joseph Krogh
andreas@visena.com
In reply to: Arthur Zakirov (#2)
Re: Updated RUM-index and support for bigint as part of index

På lørdag 06. august 2016 kl. 20:54:32, skrev Artur Zakirov <
a.zakirov@postgrespro.ru <mailto:a.zakirov@postgrespro.ru>>:
Hello,   2016-08-02 21:08 GMT+03:00 Andreas Joseph Krogh <andreas@visena.com
<mailto:andreas@visena.com>>:  The ORDER BY part seems strange; It seems one
has to find a value "lower than any other value" to use as a kind of base, why
is this necessary? It also seems that in order to be able to sort DESC one has
to provide a timestamp value "higher than any other value", is this correct?
 
It would be great if the docs explained this.
 
We will write more detailed documentation for RUM.

 
Great!
 
 
I really miss the opportunity to include a BIGINT as part of the index, so
that the WHERE-clause could be like this:
 
WHERE del.fts_all @@ to_tsquery('simple', 'andreas&kr') AND del.folder_id IN (1
,2,3)
 
Having this would be perfect for my use-case searching in email in folders,
sorted by received_date, and having it use ONE index.
 
Will this be supported?

We have a plan to use generic types to able to include bigint, timestamp and
other types as part of index.

 
Does this eliminate the need for a btree_rum equivalent of btree_gin, being
that the RUM-index will handle all "btree-able" datatypes?
 
 
But I cant tell date of it.

 
I understand.
Do you think it will be done by the time 9.6 is released?
 
Thanks.
 

-- Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andreas@visena.com <mailto:andreas@visena.com>
www.visena.com <https://www.visena.com&gt;
<https://www.visena.com&gt;

 

#5Andreas Joseph Krogh
andreas@visena.com
In reply to: Oleg Bartunov (#3)
Re: Updated RUM-index and support for bigint as part of index

På søndag 07. august 2016 kl. 08:27:06, skrev Oleg Bartunov <obartunov@gmail.com
<mailto:obartunov@gmail.com>>:
[snip]
have you considered <=| and |=> operators ? <=> in ORDER BY works like KNN.

 
I don't get how these operators should work. Neither give me the expected
results.
 
Using <=>
 
SELECT del.entity_id, del.folder_id, del.received_timestamp FROM
origo_email_delivery delWHERE del.fts_all @@ to_tsquery('simple',
'andreas:*&jose:*') ORDER BY '2000-01-01' :: TIMESTAMP <=>
del.received_timestampLIMIT 10;
 entity_id | folder_id |   received_timestamp     
-----------+-----------+-------------------------
  1224278 |   1068087 | 2015-08-17 23:53:26
  1224382 |   1068087 | 2015-08-18 03:07:55
  1224404 |   1068087 | 2015-08-18 03:49:02
  1505713 |     48496 | 2015-10-27 14:51:45
   142132 |     66658 | 2012-12-03 14:14:05.488
   122565 |     90115 | 2012-11-20 15:41:04.936
   200744 |     66655 | 2013-01-28 21:47:44.561
  1445927 |    888665 | 2015-09-29 00:26:56
   123671 |     83509 | 2012-11-21 14:16:26.448
  1129928 |     66658 | 2015-05-09 08:39:14.128
(10 rows)

 
Using <=|
SELECT del.entity_id, del.folder_id, del.received_timestamp FROM
origo_email_delivery delWHERE del.fts_all @@ to_tsquery('simple',
'andreas:*&jose:*') ORDER BY '2000-01-01' :: TIMESTAMP <=|
del.received_timestampLIMIT 10;
 
 entity_id | folder_id |   received_timestamp     
-----------+-----------+-------------------------
  1224278 |   1068087 | 2015-08-17 23:53:26
  1224382 |   1068087 | 2015-08-18 03:07:55
  1224404 |   1068087 | 2015-08-18 03:49:02
  1505713 |     48496 | 2015-10-27 14:51:45
   142132 |     66658 | 2012-12-03 14:14:05.488
   122565 |     90115 | 2012-11-20 15:41:04.936
   200744 |     66655 | 2013-01-28 21:47:44.561
  1445927 |    888665 | 2015-09-29 00:26:56
   123671 |     83509 | 2012-11-21 14:16:26.448
  1129928 |     66658 | 2015-05-09 08:39:14.128
(10 rows)

 
Neither are ordered by received_timestamp
 
Can you explain how to get ORDER BY received_timestamp DESC?
 
Thanks.
 
-- Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andreas@visena.com <mailto:andreas@visena.com>
www.visena.com <https://www.visena.com&gt;
<https://www.visena.com&gt;

 

#6Arthur Zakirov
a.zakirov@postgrespro.ru
In reply to: Andreas Joseph Krogh (#5)
Re: Updated RUM-index and support for bigint as part of index

On 07.08.2016 11:05, Andreas Joseph Krogh wrote:

På søndag 07. august 2016 kl. 08:27:06, skrev Oleg Bartunov
<obartunov@gmail.com <mailto:obartunov@gmail.com>>:

[snip]
have you considered <=| and |=> operators ? <=> in ORDER BY works
like KNN.

I don't get how these operators should work. Neither give me the
expected results.

Using <=>

SELECT del.entity_id, del.folder_id, del.received_timestamp FROM
origo_email_delivery del WHERE del.fts_all @@ to_tsquery('simple',
'andreas:*&jose:*') ORDER BY '2000-01-01' :: TIMESTAMP <=>
del.received_timestamp LIMIT 10;

entity_id | folder_id | received_timestamp
-----------+-----------+-------------------------
1224278 | 1068087 | 2015-08-17 23:53:26
1224382 | 1068087 | 2015-08-18 03:07:55
1224404 | 1068087 | 2015-08-18 03:49:02
1505713 | 48496 | 2015-10-27 14:51:45
142132 | 66658 | 2012-12-03 14:14:05.488
122565 | 90115 | 2012-11-20 15:41:04.936
200744 | 66655 | 2013-01-28 21:47:44.561
1445927 | 888665 | 2015-09-29 00:26:56
123671 | 83509 | 2012-11-21 14:16:26.448
1129928 | 66658 | 2015-05-09 08:39:14.128
(10 rows)

Using <=|

SELECT del.entity_id, del.folder_id, del.received_timestamp FROM
origo_email_delivery del WHERE del.fts_all @@ to_tsquery('simple',
'andreas:*&jose:*') ORDER BY '2000-01-01' :: TIMESTAMP <=|
del.received_timestamp LIMIT 10;

entity_id | folder_id | received_timestamp
-----------+-----------+-------------------------
1224278 | 1068087 | 2015-08-17 23:53:26
1224382 | 1068087 | 2015-08-18 03:07:55
1224404 | 1068087 | 2015-08-18 03:49:02
1505713 | 48496 | 2015-10-27 14:51:45
142132 | 66658 | 2012-12-03 14:14:05.488
122565 | 90115 | 2012-11-20 15:41:04.936
200744 | 66655 | 2013-01-28 21:47:44.561
1445927 | 888665 | 2015-09-29 00:26:56
123671 | 83509 | 2012-11-21 14:16:26.448
1129928 | 66658 | 2015-05-09 08:39:14.128
(10 rows)

Neither are ordered by received_timestamp

Can you explain how to get ORDER BY received_timestamp DESC?

Thanks.

--
*Andreas Joseph Krogh*
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andreas@visena.com <mailto:andreas@visena.com>
www.visena.com <https://www.visena.com&gt;
<https://www.visena.com&gt;

Do you need simple ordering by received_timestamp column? Not ordering
by distance between received_timestamp and some date?

Then you can use simple "ORDER BY received_timestamp". For example, we
have data:

=# SELECT * FROM test;
id | fts | received
----+-------------+-------------------------
1 | 'andreas':1 | 2015-08-17 23:53:26
2 | 'andreas':1 | 2015-08-18 03:07:55
3 | 'andreas':1 | 2015-08-18 03:49:02
4 | 'andreas':1 | 2012-12-03 14:14:05.488
5 | 'andreas':1 | 2012-11-20 15:41:04.936
6 | 'andreas':1 | 2013-01-28 21:47:44.561
6 | 'andreas':1 | 2015-09-29 00:26:56
7 | 'andreas':1 | 2012-11-21 14:16:26.448
8 | 'andreas':1 | 2015-05-09 08:39:14.128
(9 rows)

I created index:

CREATE INDEX rumidx ON test USING rum (fts rum_tsvector_timestamp_ops,
received) WITH (attach = 'received', to = 'fts');

Then we can execute queries:

=# SELECT id, received FROM test WHERE fts @@ to_tsquery('simple',
'andreas') ORDER BY received LIMIT 8;
id | received
----+-------------------------
5 | 2012-11-20 15:41:04.936
7 | 2012-11-21 14:16:26.448
4 | 2012-12-03 14:14:05.488
6 | 2013-01-28 21:47:44.561
8 | 2015-05-09 08:39:14.128
1 | 2015-08-17 23:53:26
2 | 2015-08-18 03:07:55
3 | 2015-08-18 03:49:02
(8 rows)

=# SELECT id, received FROM test WHERE fts @@ to_tsquery('simple',
'andreas') ORDER BY received DESC LIMIT 8;
id | received
----+-------------------------
6 | 2015-09-29 00:26:56
3 | 2015-08-18 03:49:02
2 | 2015-08-18 03:07:55
1 | 2015-08-17 23:53:26
8 | 2015-05-09 08:39:14.128
6 | 2013-01-28 21:47:44.561
4 | 2012-12-03 14:14:05.488
7 | 2012-11-21 14:16:26.448
(8 rows)

Operators <=>, |=>, <=| you can use to order by nearest date to specific
date:

=# SELECT id, received, received <=> '2013-01-01' AS rank FROM test
WHERE fts @@ to_tsquery('simple', 'andreas') ORDER BY received <=>
'2013-01-01' LIMIT 8;
id | received | rank
----+-------------------------+--------------
6 | 2013-01-28 21:47:44.561 | 2411264.561
4 | 2012-12-03 14:14:05.488 | 2454354.512
7 | 2012-11-21 14:16:26.448 | 3491013.552
5 | 2012-11-20 15:41:04.936 | 3572335.064
8 | 2015-05-09 08:39:14.128 | 74162354.128
1 | 2015-08-17 23:53:26 | 82857206
2 | 2015-08-18 03:07:55 | 82868875
3 | 2015-08-18 03:49:02 | 82871342
(8 rows)

=# SELECT id, received, received <=> '2013-01-01' AS rank FROM test
WHERE fts @@ to_tsquery('simple', 'andreas') ORDER BY received <=>
'2013-01-01' DESC LIMIT 8;
id | received | rank
----+-------------------------+--------------
6 | 2015-09-29 00:26:56 | 86488016
3 | 2015-08-18 03:49:02 | 82871342
2 | 2015-08-18 03:07:55 | 82868875
1 | 2015-08-17 23:53:26 | 82857206
8 | 2015-05-09 08:39:14.128 | 74162354.128
5 | 2012-11-20 15:41:04.936 | 3572335.064
7 | 2012-11-21 14:16:26.448 | 3491013.552
4 | 2012-12-03 14:14:05.488 | 2454354.512
(8 rows)

I hope this is what you want.

--
Artur Zakirov
Postgres Professional: http://www.postgrespro.com
Russian Postgres Company

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#7Andreas Joseph Krogh
andreas@visena.com
In reply to: Arthur Zakirov (#6)
Re: Updated RUM-index and support for bigint as part of index

På torsdag 11. august 2016 kl. 19:13:10, skrev Artur Zakirov <
a.zakirov@postgrespro.ru <mailto:a.zakirov@postgrespro.ru>>:
On 07.08.2016 11:05, Andreas Joseph Krogh wrote:

På søndag 07. august 2016 kl. 08:27:06, skrev Oleg Bartunov
<obartunov@gmail.com <mailto:obartunov@gmail.com>>:

     [snip]
     have you considered <=| and |=> operators ? <=> in ORDER BY works
     like KNN.

I don't get how these operators should work. Neither give me the
expected results.

Using <=>

SELECT del.entity_id, del.folder_id, del.received_timestamp FROM
origo_email_delivery del WHERE del.fts_all @@ to_tsquery('simple',
'andreas:*&jose:*') ORDER BY '2000-01-01' :: TIMESTAMP <=>
del.received_timestamp LIMIT 10;

  entity_id | folder_id |   received_timestamp
-----------+-----------+-------------------------
   1224278 |   1068087 | 2015-08-17 23:53:26
   1224382 |   1068087 | 2015-08-18 03:07:55
   1224404 |   1068087 | 2015-08-18 03:49:02
   1505713 |     48496 | 2015-10-27 14:51:45
    142132 |     66658 | 2012-12-03 14:14:05.488
    122565 |     90115 | 2012-11-20 15:41:04.936
    200744 |     66655 | 2013-01-28 21:47:44.561
   1445927 |    888665 | 2015-09-29 00:26:56
    123671 |     83509 | 2012-11-21 14:16:26.448
   1129928 |     66658 | 2015-05-09 08:39:14.128
(10 rows)

Using <=|

SELECT del.entity_id, del.folder_id, del.received_timestamp FROM
origo_email_delivery del WHERE del.fts_all @@ to_tsquery('simple',
'andreas:*&jose:*') ORDER BY '2000-01-01' :: TIMESTAMP <=|
del.received_timestamp LIMIT 10;

  entity_id | folder_id |   received_timestamp
-----------+-----------+-------------------------
   1224278 |   1068087 | 2015-08-17 23:53:26
   1224382 |   1068087 | 2015-08-18 03:07:55
   1224404 |   1068087 | 2015-08-18 03:49:02
   1505713 |     48496 | 2015-10-27 14:51:45
    142132 |     66658 | 2012-12-03 14:14:05.488
    122565 |     90115 | 2012-11-20 15:41:04.936
    200744 |     66655 | 2013-01-28 21:47:44.561
   1445927 |    888665 | 2015-09-29 00:26:56
    123671 |     83509 | 2012-11-21 14:16:26.448
   1129928 |     66658 | 2015-05-09 08:39:14.128
(10 rows)

Neither are ordered by received_timestamp

Can you explain how to get ORDER BY received_timestamp DESC?

Thanks.

--
*Andreas Joseph Krogh*
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andreas@visena.com <mailto:andreas@visena.com>
www.visena.com <https://www.visena.com&gt;
<https://www.visena.com&gt;

Do you need simple ordering by received_timestamp column? Not ordering
by distance between received_timestamp and some date?

Then you can use simple "ORDER BY received_timestamp". For example, we
have data:

=# SELECT * FROM test;
  id |     fts     |        received
----+-------------+-------------------------
   1 | 'andreas':1 | 2015-08-17 23:53:26
   2 | 'andreas':1 | 2015-08-18 03:07:55
   3 | 'andreas':1 | 2015-08-18 03:49:02
   4 | 'andreas':1 | 2012-12-03 14:14:05.488
   5 | 'andreas':1 | 2012-11-20 15:41:04.936
   6 | 'andreas':1 | 2013-01-28 21:47:44.561
   6 | 'andreas':1 | 2015-09-29 00:26:56
   7 | 'andreas':1 | 2012-11-21 14:16:26.448
   8 | 'andreas':1 | 2015-05-09 08:39:14.128
(9 rows)

I created index:

CREATE INDEX rumidx ON test USING rum (fts rum_tsvector_timestamp_ops,
received) WITH (attach = 'received', to = 'fts');

Then we can execute queries:

=# SELECT id, received FROM test WHERE fts @@ to_tsquery('simple',
'andreas') ORDER BY received LIMIT 8;
  id |        received
----+-------------------------
   5 | 2012-11-20 15:41:04.936
   7 | 2012-11-21 14:16:26.448
   4 | 2012-12-03 14:14:05.488
   6 | 2013-01-28 21:47:44.561
   8 | 2015-05-09 08:39:14.128
   1 | 2015-08-17 23:53:26
   2 | 2015-08-18 03:07:55
   3 | 2015-08-18 03:49:02
(8 rows)

=# SELECT id, received FROM test WHERE fts @@ to_tsquery('simple',
'andreas') ORDER BY received DESC LIMIT 8;
  id |        received
----+-------------------------
   6 | 2015-09-29 00:26:56
   3 | 2015-08-18 03:49:02
   2 | 2015-08-18 03:07:55
   1 | 2015-08-17 23:53:26
   8 | 2015-05-09 08:39:14.128
   6 | 2013-01-28 21:47:44.561
   4 | 2012-12-03 14:14:05.488
   7 | 2012-11-21 14:16:26.448
(8 rows)
 
Yes, this gives the correct result, but the whole motivation for using
RUM-index is for the query to use the same index for ORDER BY, as it seems to
do using the <=> operator.
 
The query you gave above does not the index for sorting AFAIU.
 
Operators <=>, |=>, <=| you can use to order by nearest date to specific
date:
[snip]
 
I hope this is what you want.
 
I still don't understand how my query which had ORDER BY '2000-01-01' ::
TIMESTAMP <=> del.received_timestamp
can produce the following ordering:
 
 entity_id | folder_id |   received_timestamp     
-----------+-----------+-------------------------
  1224278 |   1068087 | 2015-08-17 23:53:26
  1224382 |   1068087 | 2015-08-18 03:07:55
  1224404 |   1068087 | 2015-08-18 03:49:02
  1505713 |     48496 | 2015-10-27 14:51:45
   142132 |     66658 | 2012-12-03 14:14:05.488
   122565 |     90115 | 2012-11-20 15:41:04.936
   200744 |     66655 | 2013-01-28 21:47:44.561
  1445927 |    888665 | 2015-09-29 00:26:56
   123671 |     83509 | 2012-11-21 14:16:26.448
  1129928 |     66658 | 2015-05-09 08:39:14.128
 
How can "nearest date to specific date" produce this ordering when the
specific date si 2000-01-01?
 
Thanks for explaining.
 
-- Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andreas@visena.com <mailto:andreas@visena.com>
www.visena.com <https://www.visena.com&gt;
<https://www.visena.com&gt;

 

#8Oleg Bartunov
oleg@sai.msu.su
In reply to: Andreas Joseph Krogh (#5)
Re: Updated RUM-index and support for bigint as part of index

Andreas,

sorry for delay,
it looks like a bug to me, could you please, share your dataset with me, so
I could reproduce the behaviour.

Regards,
Oleg

On Sun, Aug 7, 2016 at 11:05 AM, Andreas Joseph Krogh <andreas@visena.com>
wrote:

Show quoted text

På søndag 07. august 2016 kl. 08:27:06, skrev Oleg Bartunov <
obartunov@gmail.com>:

[snip]
have you considered <=| and |=> operators ? <=> in ORDER BY works like KNN.

I don't get how these operators should work. Neither give me the expected
results.

Using <=>

SELECT del.entity_id,
del.folder_id,
del.received_timestampFROM origo_email_delivery delWHERE del.fts_all @@ to_tsquery('simple', 'andreas:*&jose:*')ORDER BY '2000-01-01' :: TIMESTAMP <=> del.received_timestampLIMIT 10;

entity_id | folder_id | received_timestamp
-----------+-----------+-------------------------
1224278 | 1068087 | 2015-08-17 23:53:26
1224382 | 1068087 | 2015-08-18 03:07:55
1224404 | 1068087 | 2015-08-18 03:49:02
1505713 | 48496 | 2015-10-27 14:51:45
142132 | 66658 | 2012-12-03 14:14:05.488
122565 | 90115 | 2012-11-20 15:41:04.936
200744 | 66655 | 2013-01-28 21:47:44.561
1445927 | 888665 | 2015-09-29 00:26:56
123671 | 83509 | 2012-11-21 14:16:26.448
1129928 | 66658 | 2015-05-09 08:39:14.128
(10 rows)

Using <=|

SELECT del.entity_id,
del.folder_id,
del.received_timestampFROM origo_email_delivery delWHERE del.fts_all @@ to_tsquery('simple', 'andreas:*&jose:*')ORDER BY '2000-01-01' :: TIMESTAMP <=| del.received_timestampLIMIT 10;

entity_id | folder_id | received_timestamp
-----------+-----------+-------------------------
1224278 | 1068087 | 2015-08-17 23:53:26
1224382 | 1068087 | 2015-08-18 03:07:55
1224404 | 1068087 | 2015-08-18 03:49:02
1505713 | 48496 | 2015-10-27 14:51:45
142132 | 66658 | 2012-12-03 14:14:05.488
122565 | 90115 | 2012-11-20 15:41:04.936
200744 | 66655 | 2013-01-28 21:47:44.561
1445927 | 888665 | 2015-09-29 00:26:56
123671 | 83509 | 2012-11-21 14:16:26.448
1129928 | 66658 | 2015-05-09 08:39:14.128
(10 rows)

Neither are ordered by received_timestamp

Can you explain how to get ORDER BY received_timestamp DESC?

Thanks.

--
*Andreas Joseph Krogh*
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andreas@visena.com
www.visena.com
<https://www.visena.com&gt;

#9Andreas Joseph Krogh
andreas@visena.com
In reply to: Oleg Bartunov (#8)
Re: Updated RUM-index and support for bigint as part of index

På torsdag 25. august 2016 kl. 18:12:34, skrev Oleg Bartunov <
obartunov@gmail.com <mailto:obartunov@gmail.com>>:
Andreas,
 
sorry for delay,
it looks like a bug to me, could you please, share your dataset with me, so I
could reproduce the behaviour.

 
 
I'll send you a Google Drive link on your email
 
-- Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963
andreas@visena.com <mailto:andreas@visena.com>
www.visena.com <https://www.visena.com&gt;
<https://www.visena.com&gt;