Re: Another Plpgsql trigger example - summary table - Mailing list pgsql-patches
From | Mark Kirkwood |
---|---|
Subject | Re: Another Plpgsql trigger example - summary table |
Date | |
Msg-id | 41E3339D.2020200@coretech.co.nz Whole thread Raw |
In response to | Re: Another Plpgsql trigger example - summary table (Mark Kirkwood <markir@coretech.co.nz>) |
Responses |
Re: Another Plpgsql trigger example - summary table
Re: Another Plpgsql trigger example - summary table |
List | pgsql-patches |
Mark Kirkwood wrote: > > 2) Perhaps leave the trigger + plpgsql function as a plpgsql example, > and refer to it in the (new) data warehouse section/chapter. > Looking at option 2, it seems reasonable to add a trimmed trigger example into the plpgsql examples section now, and leave the data warehouse introductory stuff for its own chapter at some later stage. --- plpgsql.sgml.orig Tue Jan 11 12:39:17 2005 +++ plpgsql.sgml Tue Jan 11 13:43:35 2005 @@ -2646,6 +2646,162 @@ FOR EACH ROW EXECUTE PROCEDURE process_emp_audit(); </programlisting> </example> + + <para> + An area where triggers can be useful is maintaining a summary table + of another table. The resulting summary can be used in place of the + original table for certain queries - with often vastly reduced run + times. + </para> + + <para> + This technique is commonly used in Data Warehousing, where the tables + of measured or observed data (called fact tables) can be extremely large. + <xref linkend="plpgsql-trigger-summary-example"> shows an example of a + trigger procedure in <application>PL/pgSQL</application> that maintains + a summary table for a fact table in a data warehouse. + </para> + + + <example id="plpgsql-trigger-summary-example"> + <title>A <application>PL/pgSQL</application> Trigger Procedure For Maintaining A Summary Table</title> + + <para> + The schema detailed here is partly based on the <emphasis>Grocery Store + </emphasis> example from <emphasis>The Data Warehouse Toolkit</emphasis> + by Ralph Kimball. + </para> + +<programlisting> +-- +-- Main tables - time dimension and sales fact. +-- +CREATE TABLE time_dimension ( + time_key integer NOT NULL, + day_of_week integer NOT NULL, + day_of_month integer NOT NULL, + month integer NOT NULL, + quarter integer NOT NULL, + year integer NOT NULL +); +CREATE UNIQUE INDEX time_dimension_key ON time_dimension(time_key); + +CREATE TABLE sales_fact ( + time_key integer NOT NULL, + product_key integer NOT NULL, + store_key integer NOT NULL, + amount_sold numeric(12,2) NOT NULL, + units_sold integer NOT NULL, + amount_cost numeric(12,2) NOT NULL +); +CREATE INDEX sales_fact_time ON sales_fact(time_key); + +-- +-- Summary table - sales by time. +-- +CREATE TABLE sales_summary_bytime ( + time_key integer NOT NULL, + amount_sold numeric(15,2) NOT NULL, + units_sold numeric(12) NOT NULL, + amount_cost numeric(15,2) NOT NULL +); +CREATE UNIQUE INDEX sales_summary_bytime_key ON sales_summary_bytime(time_key); + +-- +-- Function and trigger to amend summarized column(s) on UPDATE, INSERT, DELETE. +-- +CREATE OR REPLACE FUNCTION maint_sales_summary_bytime() RETURNS TRIGGER AS $maint_sales_summary_bytime$ + DECLARE + delta_time_key integer; + delta_amount_sold numeric(15,2); + delta_units_sold numeric(12); + delta_amount_cost numeric(15,2); + BEGIN + + -- Work out the increment/decrement amount(s). + IF (TG_OP = 'DELETE') THEN + + delta_time_key = OLD.time_key; + delta_amount_sold = -1 * OLD.amount_sold; + delta_units_sold = -1 * OLD.units_sold; + delta_amount_cost = -1 * OLD.amount_cost; + + ELSIF (TG_OP = 'UPDATE') THEN + + -- forbid updates that change the time_key - + -- (probably not too onerous, as DELETE + INSERT is how most + -- changes will be made). + IF ( OLD.time_key != NEW.time_key) THEN + RAISE EXCEPTION 'Update of time_key : % -> % not allowed', OLD.time_key, NEW.time_key; + END IF; + + delta_time_key = OLD.time_key; + delta_amount_sold = NEW.amount_sold - OLD.amount_sold; + delta_units_sold = NEW.units_sold - OLD.units_sold; + delta_amount_cost = NEW.amount_cost - OLD.amount_cost; + + ELSIF (TG_OP = 'INSERT') THEN + + delta_time_key = NEW.time_key; + delta_amount_sold = NEW.amount_sold; + delta_units_sold = NEW.units_sold; + delta_amount_cost = NEW.amount_cost; + + END IF; + + + -- Update the summary row with the new values. + UPDATE sales_summary_bytime + SET amount_sold = amount_sold + delta_amount_sold, + units_sold = units_sold + delta_units_sold, + amount_cost = amount_cost + delta_amount_cost + WHERE time_key = delta_time_key; + + + -- There might have been no row with this time_key (e.g new data!). + IF (NOT FOUND) THEN + BEGIN + INSERT INTO sales_summary_bytime ( + time_key, + amount_sold, + units_sold, + amount_cost) + SELECT f.time_key, + sum(f.amount_sold), + sum(f.units_sold), + sum(f.amount_cost) + FROM sales_fact f + WHERE f.time_key = delta_time_key + GROUP BY f.time_key; + -- This query can potentially be very expensive if the trigger + -- is created on sales_fact without the time_key indexes. + -- Some care is needed to ensure that this situation does + -- *not* occur. + EXCEPTION + -- + -- Catch race condition when two transactions are adding data + -- for a new time_key. + -- + WHEN UNIQUE_VIOLATION THEN + UPDATE sales_summary_bytime + SET amount_sold = amount_sold + delta_amount_sold, + units_sold = units_sold + delta_units_sold, + amount_cost = amount_cost + delta_amount_cost + WHERE time_key = delta_time_key; + + END; + END IF; + RETURN NULL; + + END; +$maint_sales_summary_bytime$ LANGUAGE plpgsql; + +CREATE TRIGGER maint_sales_summary_bytime +AFTER INSERT OR UPDATE OR DELETE ON sales_fact + FOR EACH ROW EXECUTE PROCEDURE maint_sales_summary_bytime(); +</programlisting> + </example> + </sect1> <!-- **** Porting from Oracle PL/SQL **** -->
pgsql-patches by date: