Thread: Is it nonsense (read: stupid) to keep count of child entries via triggers and a custom table?

Greetings,
I need to keep various information related to a parent entity (the Electronic Health Record for a single patient) and its children (partitions of the EHR, since it can get very big).
It is important that I know how many partitions exist for an EHR, and I'd like to avoid a query that'll count the partitions (which'll probably go to hundreds of millions) with a parent id as criteria.
Other than count, there is also significant info such as some metadata for the last added partition which must be accessed along with the count.

Can I simply adopt the naive approach of updating an EHR metadata table within a transaction in every partition addition/deletion operation? (update/decrease count, write last added partition's metadata to this table or remove it etc) Any hidden nasty surprises you can see? A better way of doing this?

Best regards
Seref

On 08/28/2012 08:56 PM, Seref Arikan wrote:
> Can I simply adopt the naive approach of updating an EHR metadata table
> within a transaction in every partition addition/deletion operation?

Absolutely. That's a classic trade-off; pay the cost of maintaining a
materialized view at INSERT/UPDATE/DELETE time, in exchange for faster
access in frequent queries that're otherwise unacceptably expensive.

It *is* a trade-off, like any performance choice. Careful work is also
required to handle concurrency issues correctly.

I do the same thing in much smaller (tiny, even) databases where I have
expensive queries I want to respond before the user noticed they were
waiting. For example, in a parent->child relationship I sometimes
maintain a summary table with a 1:1 relationship with the parent that
summarizes the children.

It's usually a good idea to keep your summary tables clearly separate as
trigger-maintained materialized views, rather than updating "real"
entities with summary info too. You avoid churn on your "real" tables,
avoid some interesting lock ordering issues, etc.

Some explicit locking with `SELECT ... FOR UPDATE` can be important to
avoid unexpected concurrency issues.

--
Craig Ringer


Ah, thanks a lot for the Select .. For Update clue/reminder! The partitions are actually going to arrive from different systems (lab reports coming from lab equipment and nurses entering bedside data at the same time etc). So I'll have to be defensive about concurrency.

Best regards
Seref


On Tue, Aug 28, 2012 at 2:14 PM, Craig Ringer <ringerc@ringerc.id.au> wrote:
On 08/28/2012 08:56 PM, Seref Arikan wrote:
Can I simply adopt the naive approach of updating an EHR metadata table
within a transaction in every partition addition/deletion operation?

Absolutely. That's a classic trade-off; pay the cost of maintaining a materialized view at INSERT/UPDATE/DELETE time, in exchange for faster access in frequent queries that're otherwise unacceptably expensive.

It *is* a trade-off, like any performance choice. Careful work is also required to handle concurrency issues correctly.

I do the same thing in much smaller (tiny, even) databases where I have expensive queries I want to respond before the user noticed they were waiting. For example, in a parent->child relationship I sometimes maintain a summary table with a 1:1 relationship with the parent that summarizes the children.

It's usually a good idea to keep your summary tables clearly separate as trigger-maintained materialized views, rather than updating "real" entities with summary info too. You avoid churn on your "real" tables, avoid some interesting lock ordering issues, etc.

Some explicit locking with `SELECT ... FOR UPDATE` can be important to avoid unexpected concurrency issues.

--
Craig Ringer