Thread: Prefix search on all hstore values

Prefix search on all hstore values

From
Albert Chern
Date:
Hi,

I have an hstore column that stores a string in several arbitrary languages, so something like this:

"en" => "string in english", "zh" => "string in chinese", "fr" => "string in french"

Is it possible to construct an index that can be used to determine if a query string is a prefix of ANY of the values in the hstore?  From reading the documentation the closest I've gotten is a gin index after converting the values to an array, but that doesn't seem to work with prefix searching.  Any pointers would be much appreciated!

Thanks,
Albert

Re: Prefix search on all hstore values

From
Sergey Konoplev
Date:
On Wed, Nov 27, 2013 at 6:49 PM, Albert Chern <albert.chern@gmail.com> wrote:
> I have an hstore column that stores a string in several arbitrary languages,
> so something like this:
>
> "en" => "string in english", "zh" => "string in chinese", "fr" => "string in
> french"
>
> Is it possible to construct an index that can be used to determine if a
> query string is a prefix of ANY of the values in the hstore?  From reading
> the documentation the closest I've gotten is a gin index after converting
> the values to an array, but that doesn't seem to work with prefix searching.
> Any pointers would be much appreciated!

The idea is to de-normalize the hstore_column to an assisting table
with 2 columns: original_record_id, hstore_column_value. And to create
a btree index on hstore_column_value that will effectively be used in
prefix search.

--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray.ru@gmail.com


Re: Prefix search on all hstore values

From
Teodor Sigaev
Date:
Hi!

Full-text search has this feature.

# select to_tsvector('en_name=>yes, fr_name=>oui'::hstore::text) @@ 'en:*';
  ?column?
----------
  t


or (index only keys)

select to_tsvector(akeys('en_name=>yes, fr_name=>oui'::hstore)::text) @@ 'en:*';
  ?column?
----------
  t

To speed up this queries you use functional indexes.


Albert Chern wrote:
> Hi,
>
> I have an hstore column that stores a string in several arbitrary languages, so
> something like this:
>
> "en" => "string in english", "zh" => "string in chinese", "fr" => "string in french"
>
> Is it possible to construct an index that can be used to determine if a query
> string is a prefix of ANY of the values in the hstore?  From reading the
> documentation the closest I've gotten is a gin index after converting the values
> to an array, but that doesn't seem to work with prefix searching.  Any pointers
> would be much appreciated!
>
> Thanks,
> Albert

--
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/


Re: Prefix search on all hstore values

From
Sergey Konoplev
Date:
On Thu, Nov 28, 2013 at 12:44 AM, Teodor Sigaev <teodor@sigaev.ru> wrote:
> Full-text search has this feature.
>
> # select to_tsvector('en_name=>yes, fr_name=>oui'::hstore::text) @@ 'en:*';
>  ?column?
> ----------
>  t
>
> or (index only keys)
>
> select to_tsvector(akeys('en_name=>yes, fr_name=>oui'::hstore)::text) @@
> 'en:*';
>  ?column?
> ----------
>  t
>
> To speed up this queries you use functional indexes.

It wont work. The OP needs to search by values prefixes, not by any
separate word in the hstore.

# select to_tsvector('en_name=>"oh yes", fr_name=>oui'::hstore::text) @@ 'ye:*';
 ?column?
----------
 t

--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray.ru@gmail.com


Re: Prefix search on all hstore values

From
Albert Chern
Date:
Thanks for the suggestions!

My requirements can be relaxed to full text search, but the problem I had with that approach is I have strings in Chinese, and postgres doesn't seem to support it.  Calling to_tsvector() on Chinese characters always returns an empty vector.

A separate table will definitely work, but I was hoping for something more straightforward.  I'll use that approach if necessary though.


On Thu, Nov 28, 2013 at 4:51 PM, Sergey Konoplev <gray.ru@gmail.com> wrote:
On Thu, Nov 28, 2013 at 12:44 AM, Teodor Sigaev <teodor@sigaev.ru> wrote:
> Full-text search has this feature.
>
> # select to_tsvector('en_name=>yes, fr_name=>oui'::hstore::text) @@ 'en:*';
>  ?column?
> ----------
>  t
>
> or (index only keys)
>
> select to_tsvector(akeys('en_name=>yes, fr_name=>oui'::hstore)::text) @@
> 'en:*';
>  ?column?
> ----------
>  t
>
> To speed up this queries you use functional indexes.

It wont work. The OP needs to search by values prefixes, not by any
separate word in the hstore.

# select to_tsvector('en_name=>"oh yes", fr_name=>oui'::hstore::text) @@ 'ye:*';
 ?column?
----------
 t

--
Kind regards,
Sergey Konoplev
PostgreSQL Consultant and DBA

http://www.linkedin.com/in/grayhemp
+1 (415) 867-9984, +7 (901) 903-0499, +7 (988) 888-1979
gray.ru@gmail.com

Re: Prefix search on all hstore values

From
Teodor Sigaev
Date:
> My requirements can be relaxed to full text search, but the problem I had with
> that approach is I have strings in Chinese, and postgres doesn't seem to support
> it.  Calling to_tsvector() on Chinese characters always returns an empty vector.
>

Hm, check your locale settings. AFAIK, somebody uses FTS with Chinese language.


--
Teodor Sigaev                                   E-mail: teodor@sigaev.ru
                                                    WWW: http://www.sigaev.ru/