Thread: Managing Key Value tags on rows

Managing Key Value tags on rows

From: Tim Uckun
What is the most efficient way to model key-value tags on records? The keys and values will be variable and will change over time. The values may be of various types (numbers, dates, strings, etc.). There will be filters and GROUP BY selects based on tag names and maybe even tag values.

Postgres gives me the option of using hstore, JSON, or just a linked table with all the keys and values, and I am wondering which would be most efficient in terms of SELECT speed. The writes are done in batches, so this is mostly for analytical purposes.

Thanks.

Re: Managing Key Value tags on rows

From: Merlin Moncure
On Wed, Nov 12, 2014 at 7:17 PM, Tim Uckun <timuckun@gmail.com> wrote:
> What is the most efficient way to model key value tags on records. The keys
> and values will be variable and changing over time.  The values may be of
> various types (numbers, dates, strings etc).   There will be filters and
> group by selects based on tag names and maybe even tag values.
>
> Postgres gives me the option of using hstore, JSON or just a linked table
> with all the keys and values and I am wondering which would be most
> efficient in terms of SELECT speeds. The writes are done in batches so this
> is mostly for analytical purposes.

For pre-9.4 you should be contrasting hstore with the EAV model, where each
key-value pair is stored in its own record.  Text-mode json (that is,
not jsonb) is generally not a good choice for arbitrary searching.

If you'll permit a gross generalization: if the data is mostly static,
hstore should work pretty well; otherwise you may want to use the 'old
school' wasteful-but-flexible EAV mechanic, doubly so if you have to
store describing criteria other than the type itself.

On 9.4 and above, I'd be looking pretty intently at jsonb for this type
of work, with its fancier searching options, better type handling, and
ability to store complicated structures.  Frequent updates will still
be a pain point, however, especially if there are a large number of
keys per object.
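A minimal sketch of what the jsonb approach could look like on 9.4 (the table and tag names here are made up for illustration, not taken from the thread):

```sql
-- Hypothetical table: one jsonb column holding all of a row's tags.
create table foo (
  id   serial primary key,
  tags jsonb
);

-- A GIN index supports the jsonb containment operator @>.
create index foo_tags_idx on foo using gin (tags);

-- Find rows whose tags contain color = 'red'; this can use the index.
select * from foo where tags @> '{"color": "red"}';
```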

merlin


Re: Managing Key Value tags on rows

From: Tim Uckun
My main worry is that the tag values will be of different types, and ideally I would be able to search using type-specific ranges. For example, if the tag value is a date, I'd do a date-interval search, but if the tag values are strings, I'd do an ILIKE search.

I was thinking of creating different columns for different types so that I can do a search like WHERE tag_name = 'blah' AND date_value BETWEEN .... In other words, I would have string_value, integer_value, numeric_value, date_value, ... columns.
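A sketch of that typed-columns layout (all names hypothetical), assuming each row populates only the column matching its type:

```sql
-- Hypothetical EAV table with one value column per type.
create table row_tags (
  row_id        int not null,
  tag_name      text not null,
  string_value  text,
  integer_value int,
  numeric_value numeric,
  date_value    date
);

-- Date-interval search against date-typed tags:
select row_id from row_tags
where tag_name = 'due_date'
  and date_value between date '2014-01-01' and date '2014-12-31';

-- Partial string match against string-typed tags:
select row_id from row_tags
where tag_name = 'title' and string_value ilike '%report%';
```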

 

On Tue, Nov 18, 2014 at 5:40 AM, Merlin Moncure <mmoncure@gmail.com> wrote:
> On Wed, Nov 12, 2014 at 7:17 PM, Tim Uckun <timuckun@gmail.com> wrote:
>> What is the most efficient way to model key value tags on records. The keys
>> and values will be variable and changing over time.  The values may be of
>> various types (numbers, dates, strings etc).   There will be filters and
>> group by selects based on tag names and maybe even tag values.
>>
>> Postgres gives me the option of using hstore, JSON or just a linked table
>> with all the keys and values and I am wondering which would be most
>> efficient in terms of SELECT speeds. The writes are done in batches so this
>> is mostly for analytical purposes.
>
> For pre-9.4 you should be contrasting hstore to EAV model, where each
> key value pair is stored in its own record.  Text mode json (that is,
> not jsonb) is generally not a good choice for arbitrary searching.
>
> If you'll permit a gross generalization, if the data is mostly static
> hstore should work pretty well otherwise you may want to use the 'old
> school' wasteful but flexible EAV mechanic, double so if you have to
> store other describing criteria than the type itself.
>
> 9.4 and above, I'd be looking pretty intently at jsonb with its
> fancier searching options, better type handling, and ability to store
> complicated structures for this type of work.  Frequent updates will
> still be a pain point however especially if there are a large number
> of keys per object.
>
> merlin

Re: Managing Key Value tags on rows

From: Merlin Moncure
On Mon, Nov 17, 2014 at 3:43 PM, Tim Uckun <timuckun@gmail.com> wrote:
> My Main worry is that the tag values will be of different types and ideally
> I would be able to search for using type specific ranges. For example if the
> tag value is a date then be able to do a date interval search but if the tag
> values are strings then do an ilike search.
>
> I was thinking of creating different columns for different types so that I
> can do a search like 'WHERE tag_name = 'blah' and date_value between ....' .
> In other words I would have a string_value, integer_value, numeric_value,
> date_value ... columns.

Yeah -- noted, you're solidly in the EAV camp here, at least for 9.3.
Personally, I'd just use a single value column unless indexing 'value'
for range searches is very important:

create table foo_attribute
(
  foo_id int references foo on delete cascade,
  type text,  -- 'text', 'bool', 'int', etc.
  key text,
  value text
);

Without that, you can just use a cast to get what you want:
select * from foo_attribute where value::int between ...
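If indexed range searches on the cast value do turn out to matter, a partial expression index is one way to get them without extra typed columns (a sketch building on the table above; the 'int' type label is an assumption):

```sql
-- Partial expression index covering only integer-typed attributes.
create index foo_attribute_int_idx
  on foo_attribute ((value::int))
  where type = 'int';

-- A query must repeat both the predicate and the cast to use it:
select *
from foo_attribute
where type = 'int' and value::int between 10 and 20;
```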

Even with jsonb, EAV remains the most flexible (although not the best)
approach.  jsonb with jsquery gives incredibly fast searching, but it
doesn't handle partial string matching at all -- at least not the last
time I looked.
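One workaround for partial matching (a sketch, assuming a jsonb column named tags on a table foo) is a trigram expression index over the extracted text, via the pg_trgm extension:

```sql
create extension if not exists pg_trgm;

-- Index the extracted 'name' key for LIKE/ILIKE searches.
create index foo_tags_name_trgm_idx
  on foo using gin ((tags ->> 'name') gin_trgm_ops);

select * from foo where tags ->> 'name' ilike '%smith%';
```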

merlin