Re: Trigger or Function - Mailing list pgsql-performance

From Robert Klemme
Subject Re: Trigger or Function
Date
Msg-id CAM9pMnP5aCEcjNPOGOJKuuDhL9OO9ZKojQmeWv4n3xm7LQ1Ltw@mail.gmail.com
Whole thread Raw
In response to Trigger or Function  (alan <alan.miller3@gmail.com>)
List pgsql-performance
On Tue, Jul 12, 2011 at 9:41 AM, alan <alan.miller3@gmail.com> wrote:
> Hello,
> I'm a postgres newbie and am wondering what's the best way to do this.
>
> I am gathering some data and will be inserting to a table once daily.
> The table is quite simple but I want the updates to be as efficient as
> possible since
> this db is part of a big data project.
>
> Say I have a table with these columns:
> | Date | Hostname | DayVal | WeekAvg | MonthAvg |
>
> When I insert a new row I have the values for Date, Hostname, DayVal.
> Is it possible to define the table is such a way that the WeekAvg and
> MonthAvg
> are automatically updated as follows?
>  WeekAvg = current rows DayVal plus the sum of DayVal for the
> previous 6 rows.
>  MonthAvg = current row's DayVal plus the sum of DayVal for the
> previous 29 rows.
>
> Should I place the logic in a Trigger or in a Function?
> Does someone have an example or a link showing how I could set this
> up?

IMHO that design does not fit the relational model well because you
are trying to store multirow aggregate values in individual rows.  For
example, your values will be wrong if you insert rows in the wrong
order (i.e. today's data before yesterday's data).

My first approach would be to remove WeekAvg and MonthAvg from the
table and create a view which calculates appropriate values.

If that proves too inefficient (e.g. because the data set is too huge
and too much data is queried for individual queries) we can start
optimizing.  One approach to optimizing would be to have secondary
tables

| Week | Hostname | WeekAvg |
| Month | Hostname | MonthAvg |

and update them with an insert trigger and probably also with an
update and delete trigger.

If you actually need increasing values (i.e. running totals) you can
use windowing functions (analytic SQL in Oracle).  View definitions
then of course need to change.
http://www.postgresql.org/docs/9.0/interactive/queries-table-expressions.html#QUERIES-WINDOW

Kind regards

robert

--
remember.guy do |as, often| as.you_can - without end
http://blog.rubybestpractices.com/

pgsql-performance by date:

Previous
From: Clem Dickey
Date:
Subject: Re: Planner choosing NestedLoop, although it is slower...
Next
From: "Kevin Grittner"
Date:
Subject: Re: UPDATEDs slowing SELECTs in a fully cached database