Re: Updated RUM-index and support for bigint as part of index - Mailing list pgsql-general

From Oleg Bartunov
Subject Re: Updated RUM-index and support for bigint as part of index
Date
Msg-id CAF4Au4zBDdTFV5ThffnLbV36zof9Xmy6R4mKoW=_q6RjJPXWHg@mail.gmail.com
Whole thread Raw
In response to Updated RUM-index and support for bigint as part of index  (Andreas Joseph Krogh <andreas@visena.com>)
Responses Re: Updated RUM-index and support for bigint as part of index  (Andreas Joseph Krogh <andreas@visena.com>)
List pgsql-general


On Tue, Aug 2, 2016 at 9:08 PM, Andreas Joseph Krogh <andreas@visena.com> wrote:
Hi.
 
I see the RUM-index is updated, which is great!
 
I wonder, to be able to sort by timestamp one has to create the index like this:
 
CREATE INDEX rumidx ON origo_email_delivery USING rum (fts_all rum_tsvector_timestamp_ops, received_timestamp)     WITH (attach = 'received_timestamp', TO = 'fts_all', order_by_attach = TRUE );

Then, to be able to use the index for sorting by the "received_timestamp"-column one has to issue a query like this:
EXPLAIN ANALYZE SELECT
                    del.entity_id,                    del.subject,                    del.received_timestamp,                    fts_all <=> to_tsquery('simple', 'andreas&kr') AS rank                FROM origo_email_delivery del                WHERE del.fts_all @@ to_tsquery('simple', 'andreas&kr')                ORDER BY '2000-01-01' :: TIMESTAMP <=> del.received_timestamp                LIMIT 10;
                                                                    QUERY PLAN                                                                    
--------------------------------------------------------------------------------------------------------------------------------------------------
 Limit  (cost=14.40..26.47 rows=10 width=89) (actual time=10.908..10.952 rows=10 loops=1)   ->  Index Scan using rumidx on origo_email_delivery del  (cost=14.40..3221.22 rows=2657 width=89) (actual time=10.906..10.947 rows=10 loops=1)         Index Cond: (fts_all @@ '''andreas'' & ''kr'''::tsquery)         Order By: (received_timestamp <=> '2000-01-01 00:00:00'::timestamp without time zone) Planning time: 0.491 ms Execution time: 11.010 ms
(6 rows)
 
The ORDER BY part seems strange; It seems one has to find a value "lower than any other value" to use as a kind of base, why is this necessary? It also seems that in order to be able to sort DESC one has to provide a timestamp value "higher than any other value", is this correct?

have you considered <=| and |=> operators ? <=> in ORDER BY works like KNN.
 
 
It would be great if the docs explained this.
 
I really miss the opportunity to include a BIGINT as part of the index, so that the WHERE-clause could be like this:
 
WHERE del.fts_all @@ to_tsquery('simple', 'andreas&kr') AND del.folder_id IN (1,2,3)
 
Having this would be perfect for my use-case searching in email in folders, sorted by received_date, and having it use ONE index.
 
Will this be supported?
 
Thanks.
 
--
Andreas Joseph Krogh
CTO / Partner - Visena AS
Mobile: +47 909 56 963

Attachment

pgsql-general by date:

Previous
From: Venkata Balaji N
Date:
Subject: Re: Logical Decoding Failover
Next
From: Andreas Joseph Krogh
Date:
Subject: Re: Updated RUM-index and support for bigint as part of index