Thread: hstore improvements?

hstore improvements?

From
Andrew Gierth
Date:
I have a patch almost done that adds some obvious but currently
missing functionality to hstore, specifically the ability to construct
an hstore from a record, and the ability to construct one from a pair
of arrays.

If there's any other features that people find notably missing from
hstore, I could stick them in too; any requests?

Also, hstore has an (undocumented) limit of 65535 bytes for keys and
values, and it does not behave very cleanly when given longer values
(it truncates them mod 2^16, rather than erroring). That gives rise to
two obvious questions: (1) are those lengths reasonable? they strike
me as being rather long for keys and rather short for values; and (2)
should exceeding the lengths throw an error?

-- 
Andrew (irc:RhodiumToad)


Re: hstore improvements?

From
Alvaro Herrera
Date:
Andrew Gierth wrote:
> I have a patch almost done that adds some obvious but currently
> missing functionality to hstore, specifically the ability to construct
> an hstore from a record, and the ability to construct one from a pair
> of arrays.
> 
> If there's any other features that people find notably missing from
> hstore, I could stick them in too; any requests?

Optionally compressing the values would be nice ...


-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: hstore improvements?

From
Heikki Linnakangas
Date:
Alvaro Herrera wrote:
> Andrew Gierth wrote:
>> I have a patch almost done that adds some obvious but currently
>> missing functionality to hstore, specifically the ability to construct
>> an hstore from a record, and the ability to construct one from a pair
>> of arrays.
>>
>> If there's any other features that people find notably missing from
>> hstore, I could stick them in too; any requests?
> 
> Optionally compressing the values would be nice ...

The whole Datum will be toasted/compressed if it gets large enough. Is 
that not enough?

--   Heikki Linnakangas  EnterpriseDB   http://www.enterprisedb.com


Re: hstore improvements?

From
Alvaro Herrera
Date:
Heikki Linnakangas wrote:
> Alvaro Herrera wrote:
>> Andrew Gierth wrote:
>>> I have a patch almost done that adds some obvious but currently
>>> missing functionality to hstore, specifically the ability to construct
>>> an hstore from a record, and the ability to construct one from a pair
>>> of arrays.
>>>
>>> If there's any other features that people find notably missing from
>>> hstore, I could stick them in too; any requests?
>>
>> Optionally compressing the values would be nice ...
>
> The whole Datum will be toasted/compressed if it gets large enough. Is  
> that not enough?

It doesn't always gets large enough, and there's no way to control that.

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
PostgreSQL Replication, Consulting, Custom Development, 24x7 support


Re: hstore improvements?

From
"David E. Wheeler"
Date:
On Mar 13, 2009, at 11:06 AM, Andrew Gierth wrote:

> If there's any other features that people find notably missing from
> hstore, I could stick them in too; any requests?

Can you create slices? That is, create a new hstore as a subset of an  
existing hstore?

> Also, hstore has an (undocumented) limit of 65535 bytes for keys and
> values, and it does not behave very cleanly when given longer values
> (it truncates them mod 2^16, rather than erroring). That gives rise to
> two obvious questions: (1) are those lengths reasonable? they strike
> me as being rather long for keys and rather short for values; and (2)
> should exceeding the lengths throw an error?

I agree. The keys can be much shorter without any threat of loss. Can  
the value not essentially be TEXT, and thus theoretically unlimited in  
size?

Best,

David


Re: hstore improvements?

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Heikki Linnakangas wrote:
>> Alvaro Herrera wrote:
>>> Optionally compressing the values would be nice ...
>> 
>> The whole Datum will be toasted/compressed if it gets large enough. Is  
>> that not enough?

> It doesn't always gets large enough, and there's no way to control that.

Maybe not, but putting compression into a datatype is NOT NOT NOT the
answer.  You only want one layer of compression in a system, and that
means if you want more control you need to speak to the TOAST code.
        regards, tom lane


Re: hstore improvements?

From
Tom Lane
Date:
"David E. Wheeler" <david@kineticode.com> writes:
> On Mar 13, 2009, at 11:06 AM, Andrew Gierth wrote:
>> Also, hstore has an (undocumented) limit of 65535 bytes for keys and
>> values, and it does not behave very cleanly when given longer values
>> (it truncates them mod 2^16, rather than erroring). That gives rise to
>> two obvious questions: (1) are those lengths reasonable? they strike
>> me as being rather long for keys and rather short for values; and (2)
>> should exceeding the lengths throw an error?

> I agree. The keys can be much shorter without any threat of loss. Can  
> the value not essentially be TEXT, and thus theoretically unlimited in  
> size?

Well, TEXT is limited to 1GB by the toastable-datum rules, as is the
whole hstore datum, so there's no point in worrying about "huge"
values.  I agree though that 64K is on the small side for a data limit.
If we wanted to keep the lengths in the same 32 bits they presumably
occupy now, what about splitting 8/24 (=> 255 bytes for key, 24MB for
value)?

As for truncation rather than throwing an error, I'd argue that that's
a flat-out bug and the fix deserves back-patching.
        regards, tom lane


Re: hstore improvements?

From
Andrew Gierth
Date:
>>>>> "David" == "David E Wheeler" <david@kineticode.com> writes:
> On Mar 13, 2009, at 11:06 AM, Andrew Gierth wrote:>> If there's any other features that people find notably missing>>
fromhstore, I could stick them in too; any requests?
 
David> Can you create slices? That is, create a new hstore as aDavid> subset of an existing hstore?

ooh. good point. What would be a good operator for that?

I'm thinking that (hstore -> text[]) should probably return text[],
and maybe (hstore => text[]) returning hstore?

i.e.

select ('a=>1,b=>2,c=>3'::hstore) -> ARRAY['a','b']; -- returns '{1,2}'

select ('a=>1,b=>2,c=>3'::hstore) => ARRAY['a','b']; -- returns 'a=>1,b=>2'

(by analogy to the existing operators -> for lookup and => for
construction)

-- 
Andrew (irc:RhodiumToad)


Re: hstore improvements?

From
Andrew Gierth
Date:
>>>>> "Tom" == Tom Lane <tgl@sss.pgh.pa.us> writes:
Tom> As for truncation rather than throwing an error, I'd argue thatTom> that's a flat-out bug and the fix deserves
back-patching.

Separate patch for that part then?

-- 
Andrew (irc:RhodiumToad)


Re: hstore improvements?

From
"David E. Wheeler"
Date:
On Mar 13, 2009, at 1:21 PM, Andrew Gierth wrote:

>>>>>> I'm thinking that (hstore -> text[]) should probably return  
>>>>>> text[],
> and maybe (hstore => text[]) returning hstore?
>
> i.e.
>
> select ('a=>1,b=>2,c=>3'::hstore) -> ARRAY['a','b'];
>  -- returns '{1,2}'
>
> select ('a=>1,b=>2,c=>3'::hstore) => ARRAY['a','b'];
>  -- returns 'a=>1,b=>2'
>
> (by analogy to the existing operators -> for lookup and => for
> construction)

Is a more Perlish syntax out of the question?

SELECT ('a=>1,b=>2,c=>3'::hstore)['a', 'b']; -- returns '{1,2}'
select ('a=>1,b=>2,c=>3'::hstore){'a','b'}; -- returns 'a=>1,b=>2'

Best,

David


Re: hstore improvements?

From
Andrew Gierth
Date:
>>>>> "David" == "David E Wheeler" <david@kineticode.com> writes:
>> select ('a=>1,b=>2,c=>3'::hstore) -> ARRAY['a','b'];>> -- returns '{1,2}'>> >> select ('a=>1,b=>2,c=>3'::hstore) =>
ARRAY['a','b'];>>-- returns 'a=>1,b=>2'>> >> (by analogy to the existing operators -> for lookup and => for>>
construction)
David> Is a more Perlish syntax out of the question?

Yes. Sorry.
David> SELECT ('a=>1,b=>2,c=>3'::hstore)['a', 'b'];David>   -- returns '{1,2}'

That would require integrating hstore into core - array subscripting
isn't a user-definable operation.
David> select ('a=>1,b=>2,c=>3'::hstore){'a','b'};David>   -- returns 'a=>1,b=>2'

And that would require changing the parser...

-- 
Andrew (irc:RhodiumToad)


Re: hstore improvements?

From
"David E. Wheeler"
Date:
On Mar 13, 2009, at 2:26 PM, Andrew Gierth wrote:

> David> Is a more Perlish syntax out of the question?
>
> Yes. Sorry.
>
> David> SELECT ('a=>1,b=>2,c=>3'::hstore)['a', 'b'];
> David>   -- returns '{1,2}'
>
> That would require integrating hstore into core - array subscripting
> isn't a user-definable operation.
>
> David> select ('a=>1,b=>2,c=>3'::hstore){'a','b'};
> David>   -- returns 'a=>1,b=>2'
>
> And that would require changing the parser...

How functionS, then?
 SELECT slice(('a=>1,b=>2,c=>3'::hstore), ARRAY['a', 'b']); -- returns '{1,2}'
 SELECT hslice(('a=>1,b=>2,c=>3'::hstore), ARRAY['a','b']); -- returns 'a=>1,b=>2'

Better names welcome, of course. But there isn't even a slice  
interface for array, is there?

SELECT slice(ARRAY[ 'a', 'b', 'c' ], ARRAY[1, 3]); -- returns '{a,c}'

Best,

David


Re: hstore improvements?

From
Tom Lane
Date:
"David E. Wheeler" <david@kineticode.com> writes:
> Is a more Perlish syntax out of the question?

Yes.  SQL is not Perl.
        regards, tom lane


Re: hstore improvements?

From
"David E. Wheeler"
Date:
On Mar 13, 2009, at 2:35 PM, Tom Lane wrote:

>> Is a more Perlish syntax out of the question?
>
> Yes.  SQL is not Perl.

You mean all this time I thought I was writing Perl when I was using  
PostgreSQL, and it turns out that it's *not* Perl? That explains the  
strange lack of sigils.

Thanks for setting me straight, Tom.

:-P

D


Re: hstore improvements?

From
Tom Lane
Date:
I wrote:
> If we wanted to keep the lengths in the same 32 bits they presumably
> occupy now, what about splitting 8/24 (=> 255 bytes for key, 24MB for
> value)?

Sigh, fingers faster than brain today.  A 24-bit length field could
represent lengths up to 16MB, not 24MB.  Still, it seems like a
reasonable maximum.

Or we could increase the size of hstore values so as to provide more
than 32 bits total for this, but that would presumably be pessimal for
all existing applications; there is evidently no one using more than
64K, or we'd have heard complaints before.
        regards, tom lane


Re: hstore improvements?

From
Alvaro Herrera
Date:
Tom Lane wrote:

> Or we could increase the size of hstore values so as to provide more
> than 32 bits total for this, but that would presumably be pessimal for
> all existing applications; there is evidently no one using more than
> 64K, or we'd have heard complaints before.

Would it work to allow storing toast pointers for values?

-- 
Alvaro Herrera                                http://www.CommandPrompt.com/
The PostgreSQL Company - Command Prompt, Inc.


Re: hstore improvements?

From
Tom Lane
Date:
Alvaro Herrera <alvherre@commandprompt.com> writes:
> Tom Lane wrote:
>> Or we could increase the size of hstore values so as to provide more
>> than 32 bits total for this, but that would presumably be pessimal for
>> all existing applications; there is evidently no one using more than
>> 64K, or we'd have heard complaints before.

> Would it work to allow storing toast pointers for values?

Given that there is nobody using this for values > 64K, that seems like
far too much complication.  (Hint: how you gonna vacuum toast pointers
embedded within datums?  Especially within a datatype that isn't even
known to the core code?)
        regards, tom lane


Re: hstore improvements?

From
decibel
Date:
On Mar 13, 2009, at 4:47 PM, Tom Lane wrote:

> Or we could increase the size of hstore values so as to provide more
> than 32 bits total for this, but that would presumably be pessimal for
> all existing applications; there is evidently no one using more than
> 64K, or we'd have heard complaints before.


Unless they haven't realized that we've been pulling a MySQL and  
silently truncating their data. :(

On another point, I agree that compression would be nice, and the way  
to fix that is to expose knobs for controlling TOAST thresholds  
(something I've wanted forever).
-- 
Decibel!, aka Jim C. Nasby, Database Architect  decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828




Re: hstore improvements?

From
Josh Berkus
Date:
Tom Lane wrote:
> I wrote:
>> If we wanted to keep the lengths in the same 32 bits they presumably
>> occupy now, what about splitting 8/24 (=> 255 bytes for key, 24MB for
>> value)?
> 
> Sigh, fingers faster than brain today.  A 24-bit length field could
> represent lengths up to 16MB, not 24MB.  Still, it seems like a
> reasonable maximum.
> 
> Or we could increase the size of hstore values so as to provide more
> than 32 bits total for this, but that would presumably be pessimal for
> all existing applications; there is evidently no one using more than
> 64K, or we'd have heard complaints before.

Yeah, I have to say that it would never have occurred to me to use 
hstore for large values like that; 64K is pretty much a whole page of 
text.  If you need to store that much data, use a real table.  Or maybe 
CouchDB.

As an hstore user, I'd be fine with simply limiting it to 64K (or, heck, 
8K) and throwing an error.  I'd also be fine with limiting keys to 255 
bytes, although we'd have to warn people.

--Josh


Re: hstore improvements?

From
Tom Lane
Date:
Josh Berkus <josh@agliodbs.com> writes:
> As an hstore user, I'd be fine with simply limiting it to 64K (or, heck, 
> 8K) and throwing an error.  I'd also be fine with limiting keys to 255 
> bytes, although we'd have to warn people.

Yeah, 255 might well be more of a problem than the other limit.  We
could move to something like 10/22 or 12/20 split, which would give
us 1KB/4MB or 4KB/1MB limits.
        regards, tom lane


Re: hstore improvements?

From
Josh Berkus
Date:
Tom Lane wrote:
> Josh Berkus <josh@agliodbs.com> writes:
>> As an hstore user, I'd be fine with simply limiting it to 64K (or, heck, 
>> 8K) and throwing an error.  I'd also be fine with limiting keys to 255 
>> bytes, although we'd have to warn people.
> 
> Yeah, 255 might well be more of a problem than the other limit.  We
> could move to something like 10/22 or 12/20 split, which would give
> us 1KB/4MB or 4KB/1MB limits.

Anything you like.  What I'm saying is that I think I use hstore more 
heavily than most people, and that if the limits were as low as 255b/8K 
it wouldn't hurt me any.

I suppose 1K/4MB would allow OO-types to use hstore as an object store, 
so you'll make them happy with a new foot gun.  Why not?

--Josh


Re: hstore improvements?

From
Andrew Gierth
Date:
>>>>> "Ron" == Ron Mayer <rm_pg@cheapcomplexdevices.com> writes:
Ron> Currently hstore gives me an indexed operator to query if aRon> hstore contains a single key.  It'd be nice if
therewere as wayRon> to extend this so that I could ask for only records that haveRon> all or any the keys in a query.
 
Ron>   'a=>1, b=>1'::hstore ? 'a,b'
Ron> In one database I put ids of each of the keys in a hstore into aRon> largely redundant intarray to be able to do
fastqueries forRon> rows containing all the hstore-keys in a set.
 

I think trying to overload this onto ? would be possibly a bit too
confusing; would  hstore ? text[]  mean "any match", or "all match"?

How about  hstore ?& text[]  for "all match", and hstore ?| text[]
for "any match"?
Ron> Even cooler might be extending the hstore '?' operator toRon> allow expressions similar to intarray's queries:Ron>
 'a=>1, b=>1'::hstore ? 'a|b'Ron>   'a=>1, b=>1'::hstore ? 'a&b'Ron>   'a=>1, b=>1'::hstore ? 'a&(b|c)'Ron> I don't
havea need for the more general expressions, but ifRon> the code can be borrowed from intarray to handle both,
that'dRon>be sweet.
 

While that is possible, it involves adding a lot more than I planned
to at this stage.
Ron> I once wanted a variation of hstore where a key could haveRon> multiple values(and the ability to query them).Ron>
 'a=>x, a=>y'::hstore @> 'a=>x'Ron> I imagine most EAV systems allow multiple values for eachRon> attribute - and a
hstorethat supported this could probablyRon> be a pretty nice general solution for many EAV systems.  IIRCRon> other
peopleasked about similar on the lists before.
 

This is definitely going too far - such a thing would have to be a
new type entirely.

-- 
Andrew.


Re: hstore improvements?

From
Ron Mayer
Date:
Andrew Gierth wrote:
> I have a patch almost done that adds some obvious but currently
> missing functionality to hstore... 
> If there's any other features that people find notably missing from
> hstore, I could stick them in too; any requests?

Currently hstore gives me an indexed operator to query if
a hstore contains a single key.    It'd be nice if there were
as way to extend this so that I could ask for only records
that have all or any the keys in a query. 'a=>1, b=>1'::hstore ? 'a,b'
In one database I put ids of each of the keys in a
hstore into a largely redundant intarray to be able to
do fast queries for rows containing all the hstore-keys
in a set.

Even cooler might be extending the hstore '?' operator to
allow expressions similar to intarray's queries: 'a=>1, b=>1'::hstore ? 'a|b' 'a=>1, b=>1'::hstore ? 'a&b' 'a=>1,
b=>1'::hstore? 'a&(b|c)'
 
I don't have a need for the more general expressions, but if
the code can be borrowed from intarray to handle both, that'd
be sweet.


I once wanted a variation of hstore where a key could have
multiple values(and the ability to query them). 'a=>x, a=>y'::hstore @> 'a=>x'
I imagine most EAV systems allow multiple values for each
attribute - and a hstore that supported this could probably
be a pretty nice general solution for many EAV systems.  IIRC
other people asked about similar on the lists before.


> Also, hstore has an (undocumented) limit of 65535 bytes for keys and
> values, and it does not behave very cleanly when given longer values
> (it truncates them mod 2^16, rather than erroring). That gives rise to
> two obvious questions: (1) are those lengths reasonable? they strike
> me as being rather long for keys and rather short for values; and (2)
> should exceeding the lengths throw an error?



Re: hstore improvements?

From
Josh Berkus
Date:
Ron,

> I imagine most EAV systems allow multiple values for each
> attribute - and a hstore that supported this could probably
> be a pretty nice general solution for many EAV systems.  IIRC
> other people asked about similar on the lists before.

Well, not usually.  Generally, since EAV systems are 
entity-attribute-value, the key for the table is generally 
entity-attribute.  I've seen a few which were 
entity-attribute-attributenumber-value, but those are rare.

Overally, I think the ability to stick an array into an hstore value 
would take care of most real uses.  Beyond that, we're encouraging 
people to use hstore in ways it won't actually scale.

--Josh



Re: hstore improvements?

From
Andrew Gierth
Date:
>>>>> "Andrew" == Andrew Gierth <andrew@tao11.riddles.org.uk> writes:
Andrew> I have a patch almost done that adds some obvious butAndrew> currently missing functionality to hstore,
specificallytheAndrew> ability to construct an hstore from a record, and the abilityAndrew> to construct one from a
pairof arrays.
 
Andrew> If there's any other features that people find notablyAndrew> missing from hstore, I could stick them in too;
anyrequests?
 

One request I've had is to construct a record (of some supplied
composite type) from an hstore.

I'm not sure if this is even possible; I'm certainly not seeing a way
to implement it. Am I missing something?

-- 
Andrew (irc:RhodiumToad)


Re: hstore improvements?

From
Josh Berkus
Date:
Andrew,

> One request I've had is to construct a record (of some supplied
> composite type) from an hstore.
>
> I'm not sure if this is even possible; I'm certainly not seeing a way
> to implement it. Am I missing something?

Well, presumably you'd try to match hstore tags against the "columns" of 
the composite type, and where a tag didn't exist, return NULL,and where 
one isn't in the composite type, ignore it.  All data would be TEXT.

--Josh




Re: hstore improvements?

From
Josh Berkus
Date:
On 3/19/09 1:23 PM, Josh Berkus wrote:
> Andrew,
>
>> One request I've had is to construct a record (of some supplied
>> composite type) from an hstore.
>>
>> I'm not sure if this is even possible; I'm certainly not seeing a way
>> to implement it. Am I missing something?
>
> Well, presumably you'd try to match hstore tags against the "columns" of
> the composite type, and where a tag didn't exist, return NULL,and where
> one isn't in the composite type, ignore it. All data would be TEXT.

Oh, and in features I'd want, mostly an UNROLL for hstore ... that is, 
rather than a RECORD, a set of exploded rows.

--Josh



Re: hstore improvements?

From
Dimitri Fontaine
Date:
Hi,

Le 19 mars 09 à 21:23, Josh Berkus a écrit :
>> One request I've had is to construct a record (of some supplied
>> composite type) from an hstore.
>>
>> I'm not sure if this is even possible; I'm certainly not seeing a way
>> to implement it. Am I missing something?
>
> Well, presumably you'd try to match hstore tags against the
> "columns" of the composite type, and where a tag didn't exist,
> return NULL,and where one isn't in the composite type, ignore it.
> All data would be TEXT.


The problem is more how to have the parser know which data type to
target, because you want to avoid having to create a new cast per each
composite type you want to target.

A solution could maybe look like this:  SELECT hstore_to_composite(hstore_value, null::my_composite_type);

Regards,
--
dim

from the IRC-to-List bridge dept ;)

Re: hstore improvements?

From
Andrew Gierth
Date:
>>>>> "Josh" == Josh Berkus <josh@agliodbs.com> writes:
> Tom Lane wrote:>> Josh Berkus <josh@agliodbs.com> writes:>>> As an hstore user, I'd be fine with simply limiting it
to64K (or,>>> heck, 8K) and throwing an error.  I'd also be fine with limiting>>> keys to 255 bytes, although we'd have
towarn people.>> Yeah, 255 might well be more of a problem than the other limit.  We>> could move to something like
10/22or 12/20 split, which would give>> us 1KB/4MB or 4KB/1MB limits.
 
Josh> Anything you like.  What I'm saying is that I think I useJosh> hstore more heavily than most people, and that if
thelimitsJosh> were as low as 255b/8K it wouldn't hurt me any.
 
Josh> I suppose 1K/4MB would allow OO-types to use hstore as anJosh> object store, so you'll make them happy with a new
footgun.Josh> Why not?
 

I decided to obviate the entire question and remove the limits
altogether (while still keeping the overhead the same, i.e. 8 bytes
per entry).

-- 
Andrew.


Re: hstore improvements?

From
Josh Berkus
Date:
On 3/19/09 2:40 PM, Dimitri Fontaine wrote:
> Hi,
>
> Le 19 mars 09 à 21:23, Josh Berkus a écrit :
>>> One request I've had is to construct a record (of some supplied
>>> composite type) from an hstore.
>>>
>>> I'm not sure if this is even possible; I'm certainly not seeing a way
>>> to implement it. Am I missing something?
>>
>> Well, presumably you'd try to match hstore tags against the "columns"
>> of the composite type, and where a tag didn't exist, return NULL,and
>> where one isn't in the composite type, ignore it. All data would be TEXT.
>
>
> The problem is more how to have the parser know which data type to
> target, because you want to avoid having to create a new cast per each
> composite type you want to target.

Or you could just have it fail if there's no cast between TEXT and the 
target type.

--Josh