Re: Disabling nested loops - worst case performance - Mailing list pgsql-performance

From Anssi Kääriäinen
Subject Re: Disabling nested loops - worst case performance
Date
Msg-id 4D834DBC.3090009@thl.fi
Whole thread Raw
In response to Re: Disabling nested loops - worst case performance  (Thomas Kellerer <spam_eater@gmx.net>)
List pgsql-performance
On 03/18/2011 01:14 PM, Thomas Kellerer wrote:
> Did you consider using hstore instead?
>
> I think in the PostgreSQL world, this is a better alternative than EAV and most probably faster as well.
No, we did not. The reason is that we want to track each attribute with
bi-temporal timestamps. The actual database schema for the attribute
value table is:

CREATE TABLE attr_value (
     id SERIAL PRIMARY KEY,
     olio_id INTEGER NOT NULL REFERENCES base_olio, -- entity identifier
     attr_tunniste VARCHAR(20) NOT NULL REFERENCES base_attr, -- attr
identifier
     kieli_tunniste VARCHAR(20) REFERENCES kieli, -- lang identifier
     arvo_number DECIMAL(18, 9), -- value number
     arvo_ts timestamptz, -- value timestamp
     arvo_text TEXT, -- value text
     arvo_valinta_tunniste VARCHAR(20), -- for choice lists:
"value_choice_identifier"
     real_valid_from TIMESTAMPTZ NOT NULL, -- real_valid_from -
real_valid_until define when things have been in "real" world
     real_valid_until TIMESTAMPTZ NOT NULL,
     db_valid_from TIMESTAMPTZ NOT NULL, -- db_valid_* defines when
things have been in the database
     db_valid_until TIMESTAMPTZ NOT NULL,
     tx_id_insert INTEGER default txid_current(),
     tx_id_delete INTEGER,
     -- foreign keys & checks skipped
);

Naturally, we have other tables defining the objects, joins between
objects and metadata for the EAV. All data modifications are done
through procedures, which ensure uniqueness etc. for the attributes and
joins.

The data set is small, and performance in general is not that important,
as long as the UI is responsive and data can be transferred to other
systems in reasonable time. Insert performance is at least 10x worse
than when using traditional schema, but it doesn't matter (we have
somewhere around 1000 inserts / updates a day max). The only real
problem so far is the chained nested loop problem, which really kills
performance for some queries.

Surprisingly (at least to me) this schema has worked really well,
although sometimes there is a feeling that we are implementing a
database using a database...

  - Anssi

pgsql-performance by date:

Previous
From: Anssi Kääriäinen
Date:
Subject: Re: Disabling nested loops - worst case performance
Next
From: Jeff
Date:
Subject: Re: Xeon twice the performance of opteron