Thread: Prefix search on all hstore values
Hi,
I have an hstore column that stores a string in several arbitrary languages, so something like this:
"en" => "string in english", "zh" => "string in chinese", "fr" => "string in french"
Is it possible to construct an index that can be used to determine if a query string is a prefix of ANY of the values in the hstore? From reading the documentation the closest I've gotten is a gin index after converting the values to an array, but that doesn't seem to work with prefix searching. Any pointers would be much appreciated!
Thanks,
Albert
On Wed, Nov 27, 2013 at 6:49 PM, Albert Chern <albert.chern@gmail.com> wrote: > I have an hstore column that stores a string in several arbitrary languages, > so something like this: > > "en" => "string in english", "zh" => "string in chinese", "fr" => "string in > french" > > Is it possible to construct an index that can be used to determine if a > query string is a prefix of ANY of the values in the hstore? From reading > the documentation the closest I've gotten is a gin index after converting > the values to an array, but that doesn't seem to work with prefix searching. > Any pointers would be much appreciated! The idea is to de-normalize the hstore_column to an assisting table with 2 columns: original_record_id, hstore_column_value. And to create a btree index on hstore_column_value that will effectively be used in prefix search. -- Kind regards, Sergey Konoplev PostgreSQL Consultant and DBA http://www.linkedin.com/in/grayhemp +1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979 gray.ru@gmail.com
Hi! Full-text search has this feature. # select to_tsvector('en_name=>yes, fr_name=>oui'::hstore::text) @@ 'en:*'; ?column? ---------- t or (index only keys) select to_tsvector(akeys('en_name=>yes, fr_name=>oui'::hstore)::text) @@ 'en:*'; ?column? ---------- t To speed up this queries you use functional indexes. Albert Chern wrote: > Hi, > > I have an hstore column that stores a string in several arbitrary languages, so > something like this: > > "en" => "string in english", "zh" => "string in chinese", "fr" => "string in french" > > Is it possible to construct an index that can be used to determine if a query > string is a prefix of ANY of the values in the hstore? From reading the > documentation the closest I've gotten is a gin index after converting the values > to an array, but that doesn't seem to work with prefix searching. Any pointers > would be much appreciated! > > Thanks, > Albert -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/
On Thu, Nov 28, 2013 at 12:44 AM, Teodor Sigaev <teodor@sigaev.ru> wrote: > Full-text search has this feature. > > # select to_tsvector('en_name=>yes, fr_name=>oui'::hstore::text) @@ 'en:*'; > ?column? > ---------- > t > > or (index only keys) > > select to_tsvector(akeys('en_name=>yes, fr_name=>oui'::hstore)::text) @@ > 'en:*'; > ?column? > ---------- > t > > To speed up this queries you use functional indexes. It wont work. The OP needs to search by values prefixes, not by any separate word in the hstore. # select to_tsvector('en_name=>"oh yes", fr_name=>oui'::hstore::text) @@ 'ye:*'; ?column? ---------- t -- Kind regards, Sergey Konoplev PostgreSQL Consultant and DBA http://www.linkedin.com/in/grayhemp +1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979 gray.ru@gmail.com
Thanks for the suggestions!
My requirements can be relaxed to full text search, but the problem I had with that approach is I have strings in Chinese, and postgres doesn't seem to support it. Calling to_tsvector() on Chinese characters always returns an empty vector.
A separate table will definitely work, but I was hoping for something more straightforward. I'll use that approach if necessary though.
On Thu, Nov 28, 2013 at 4:51 PM, Sergey Konoplev <gray.ru@gmail.com> wrote:
On Thu, Nov 28, 2013 at 12:44 AM, Teodor Sigaev <teodor@sigaev.ru> wrote:It wont work. The OP needs to search by values prefixes, not by any
> Full-text search has this feature.
>
> # select to_tsvector('en_name=>yes, fr_name=>oui'::hstore::text) @@ 'en:*';
> ?column?
> ----------
> t
>
> or (index only keys)
>
> select to_tsvector(akeys('en_name=>yes, fr_name=>oui'::hstore)::text) @@
> 'en:*';
> ?column?
> ----------
> t
>
> To speed up this queries you use functional indexes.
separate word in the hstore.
# select to_tsvector('en_name=>"oh yes", fr_name=>oui'::hstore::text) @@ 'ye:*';
?column?
----------t
--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA
http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray.ru@gmail.com
> My requirements can be relaxed to full text search, but the problem I had with > that approach is I have strings in Chinese, and postgres doesn't seem to support > it. Calling to_tsvector() on Chinese characters always returns an empty vector. > Hm, check your locale settings. AFAIK, somebody uses FTS with Chinese language. -- Teodor Sigaev E-mail: teodor@sigaev.ru WWW: http://www.sigaev.ru/