Thread: hstore - Implementation and performance issues around its operators

hstore - Implementation and performance issues around its operators

From
Stefan Keller
Date:
Hi,

We did a benchmark comparing a Key-Value-Pairs stored as EAV db schema
versus hstore.
The results are promising in favor of hstore but there are some
question which remain.

1. Obviously the '@>' has to be used in order to let use the GiST index.
Why is the '->' operator not supported by GiST ('->' is actually
mentioned in all examples of the doc.)?

2. Currently the hstore elements are stored in order as they are
coming from the insert statement / constructor.
Why are the elements not ordered i.e. why is the hstore not cached in
all hstore functions (like hstore_fetchval etc.)?

3. In the source code 'hstore_io.c' one finds the following enigmatic
note: "... very large hstore values can't be output. this could be
fixed, but many other data types probably have the same issue."
What is the max. length of a hstore (i.e. the max. length of the sum
of all elements in text representation)?

4. Last, I don't fully understand the following note in the hstore
doc. (http://www.postgresql.org/docs/current/interactive/hstore.html
):
> Notice that the old names are reversed from the convention
> formerly followed by the core geometric data types!

Why names? Why not rather 'operators' or 'functions'?
What does this "reversed from the convention" mean concretely?

Yours, Stefan

Re: hstore - Implementation and performance issues around its operators

From
Robert Haas
Date:
On Sun, Jun 19, 2011 at 2:59 PM, Stefan Keller <sfkeller@gmail.com> wrote:
> 1. Obviously the '@>' has to be used in order to let use the GiST index.
> Why is the '->' operator not supported by GiST ('->' is actually
> mentioned in all examples of the doc.)?

Because it's not a comparison operator.

> 2. Currently the hstore elements are stored in order as they are
> coming from the insert statement / constructor.
> Why are the elements not ordered i.e. why is the hstore not cached in
> all hstore functions (like hstore_fetchval etc.)?

Putting the elements in order wouldn't really help, would it?  I mean,
you'd need some kind of an index inside the hstore... which there
isn't.

> 3. In the source code 'hstore_io.c' one finds the following enigmatic
> note: "... very large hstore values can't be output. this could be
> fixed, but many other data types probably have the same issue."
> What is the max. length of a hstore (i.e. the max. length of the sum
> of all elements in text representation)?

I think that anything of half a gigabyte or more is at risk of falling
down there.  But probably it's not smart to use such big hstores
anyway.

> 4. Last, I don't fully understand the following note in the hstore
> doc. (http://www.postgresql.org/docs/current/interactive/hstore.html
> ):
>> Notice that the old names are reversed from the convention
>> formerly followed by the core geometric data types!
>
> Why names? Why not rather 'operators' or 'functions'?

It's referring to the operator names.

> What does this "reversed from the convention" mean concretely?

That comment could be a little more clear, but I think what it's
saying is that hstore's old @ is like the core geometic types old ~,
and visca versa.

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

Re: hstore - Implementation and performance issues around its operators

From
Stefan Keller
Date:
Hi Robert

Many thanks for your answers.

2011/7/19 Robert Haas <robertmhaas@gmail.com>:
> Putting the elements in order wouldn't really help, would it?  I mean,
> you'd need some kind of an index inside the hstore... which there
> isn't.

Sorry for my inprecise question. In fact elements of a hstore are
stored in order of (keylength,key) with the key comparison done
bytewise (not locale-dependent). See e.g. function hstoreUniquePairs
in http://doxygen.postgresql.org/ . This ordered property is being
used by some hstore functions but not all - and I'm still wondering
why.

Yours, Stefan


2011/7/19 Robert Haas <robertmhaas@gmail.com>:
> On Sun, Jun 19, 2011 at 2:59 PM, Stefan Keller <sfkeller@gmail.com> wrote:
>> 1. Obviously the '@>' has to be used in order to let use the GiST index.
>> Why is the '->' operator not supported by GiST ('->' is actually
>> mentioned in all examples of the doc.)?
>
> Because it's not a comparison operator.
>
>> 2. Currently the hstore elements are stored in order as they are
>> coming from the insert statement / constructor.
>> Why are the elements not ordered i.e. why is the hstore not cached in
>> all hstore functions (like hstore_fetchval etc.)?
>
> Putting the elements in order wouldn't really help, would it?  I mean,
> you'd need some kind of an index inside the hstore... which there
> isn't.
>
>> 3. In the source code 'hstore_io.c' one finds the following enigmatic
>> note: "... very large hstore values can't be output. this could be
>> fixed, but many other data types probably have the same issue."
>> What is the max. length of a hstore (i.e. the max. length of the sum
>> of all elements in text representation)?
>
> I think that anything of half a gigabyte or more is at risk of falling
> down there.  But probably it's not smart to use such big hstores
> anyway.
>
>> 4. Last, I don't fully understand the following note in the hstore
>> doc. (http://www.postgresql.org/docs/current/interactive/hstore.html
>> ):
>>> Notice that the old names are reversed from the convention
>>> formerly followed by the core geometric data types!
>>
>> Why names? Why not rather 'operators' or 'functions'?
>
> It's referring to the operator names.
>
>> What does this "reversed from the convention" mean concretely?
>
> That comment could be a little more clear, but I think what it's
> saying is that hstore's old @ is like the core geometic types old ~,
> and visca versa.
>
> --
> Robert Haas
> EnterpriseDB: http://www.enterprisedb.com
> The Enterprise PostgreSQL Company
>

Re: hstore - Implementation and performance issues around its operators

From
Robert Haas
Date:
On Tue, Jul 19, 2011 at 5:06 PM, Stefan Keller <sfkeller@gmail.com> wrote:
> 2011/7/19 Robert Haas <robertmhaas@gmail.com>:
>> Putting the elements in order wouldn't really help, would it?  I mean,
>> you'd need some kind of an index inside the hstore... which there
>> isn't.
>
> Sorry for my inprecise question. In fact elements of a hstore are
> stored in order of (keylength,key) with the key comparison done
> bytewise (not locale-dependent). See e.g. function hstoreUniquePairs
> in http://doxygen.postgresql.org/ . This ordered property is being
> used by some hstore functions but not all - and I'm still wondering
> why.

Not sure, honestly.  Is there some place where it would permit an
optimization we're not currently doing?

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company