Re: Replacing old tuples on insert - Mailing list pgsql-novice
| From | Michael Fuhr |
|---|---|
| Subject | Re: Replacing old tuples on insert |
| Date | |
| Msg-id | 20041127075717.GA76802@winnie.fuhr.org |
| In response to | Replacing old tuples on insert ("Pradeepkumar, Pyatalo (IE10)" <Pradeepkumar.Pyatalo@honeywell.com>) |
| List | pgsql-novice |
On Wed, Nov 24, 2004 at 02:20:42AM -0700, Pradeepkumar, Pyatalo (IE10) wrote:
>
> I have a table as follows -
> table TrendData( TrendID bigserial,
>                  TrendName varchar(30),
>                  .....
>                  .....);
>
> Now for this table i want a maximum of only 5000 tuples. After the maximum
> is reached the new tuples that are inserted should replace the older ones.
> Could anyone help me out in writing the query or a trigger would also be
> fine.

How accurate does the tuple count need to be? A simple but flawed
approach would be to use a rule like the following:

CREATE RULE TrendTrunc AS ON INSERT TO TrendData
    DO ALSO DELETE FROM TrendData
    WHERE TrendID <= currval('trenddata_trendid_seq') - 5000;

This rule is flawed because it doesn't account for gaps in the
sequence that could exist due to deleted rows or rolled-back
transactions. Such gaps could cause the rule to truncate the table
to a lower row count than the desired 5000.

Here's a rule that handles gaps but is also flawed:

CREATE RULE TrendTrunc AS ON INSERT TO TrendData
    DO ALSO DELETE FROM TrendData
    WHERE TrendID < (SELECT MIN(TrendID)
                     FROM (SELECT TrendID FROM TrendData
                           ORDER BY TrendID DESC LIMIT 5000) AS s);

This rule could leave the table with more than 5000 rows if multiple
transactions are inserting at the same time.

The following trigger solution should handle gaps and strictly
enforce the 5000-row limit by locking the table before deleting
excess records, thus waiting for other transactions that have
inserted or deleted records to complete (unfortunately this will
hurt performance if concurrent transactions are doing inserts or
deletes). It uses a trigger because rules don't allow LOCK. The
trigger fires after each statement instead of after each row so
INSERTs that insert multiple rows will delete excess rows only
once, after all new rows have been inserted.

I haven't considered all possible situations and I've done only
simple testing, so the solution could have flaws. An improvement
would be to make the function reusable for multiple tables with
differing field names and maximum row counts; this is left as an
exercise for the reader.

CREATE FUNCTION TrendTrunc() RETURNS TRIGGER AS '
BEGIN
    LOCK TABLE TrendData IN SHARE ROW EXCLUSIVE MODE;

    DELETE FROM TrendData
    WHERE TrendID < (SELECT MIN(TrendID)
                     FROM (SELECT TrendID FROM TrendData
                           ORDER BY TrendID DESC LIMIT 5000) AS s);

    RETURN NULL;
END;
' LANGUAGE plpgsql;

CREATE TRIGGER TrendAfterInsert AFTER INSERT ON TrendData
    FOR EACH STATEMENT EXECUTE PROCEDURE TrendTrunc();

If anybody sees problems or has suggestions for improvements then
please follow up. In particular, is there a more efficient way to
get the cutoff TrendID?

--
Michael Fuhr
http://www.fuhr.org/~mfuhr/
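One way to sanity-check the trigger (a sketch, not from the original
post; it assumes the columns elided from the table definition are
nullable or have defaults, and that generate_series() is available,
i.e. PostgreSQL 8.0 or later):

    -- Insert 6000 rows in one statement; the statement-level trigger
    -- fires once afterward and trims the table back to 5000 rows.
    INSERT INTO TrendData (TrendName)
        SELECT 'sample' FROM generate_series(1, 6000);

    SELECT count(*) FROM TrendData;  -- expect 5000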
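A rough sketch of the generalization left as an exercise above (the
name GenericTrunc and the argument convention are invented here, and
the code is untested): pass the table name, key column, and maximum
row count as trigger arguments and build the statements dynamically
with EXECUTE. The identifiers are passed in lowercase because unquoted
identifiers fold to lowercase and quote_ident() preserves the case it
is given.

CREATE FUNCTION GenericTrunc() RETURNS TRIGGER AS '
BEGIN
    -- TG_ARGV[0] = table name, TG_ARGV[1] = key column,
    -- TG_ARGV[2] = maximum row count
    EXECUTE ''LOCK TABLE '' || quote_ident(TG_ARGV[0])
        || '' IN SHARE ROW EXCLUSIVE MODE'';

    EXECUTE ''DELETE FROM '' || quote_ident(TG_ARGV[0])
        || '' WHERE '' || quote_ident(TG_ARGV[1])
        || '' < (SELECT MIN('' || quote_ident(TG_ARGV[1]) || '')''
        || '' FROM (SELECT '' || quote_ident(TG_ARGV[1])
        || '' FROM '' || quote_ident(TG_ARGV[0])
        || '' ORDER BY '' || quote_ident(TG_ARGV[1])
        || '' DESC LIMIT '' || TG_ARGV[2] || '') AS s)'';

    RETURN NULL;
END;
' LANGUAGE plpgsql;

-- Replaces the TrendData-specific trigger above.
CREATE TRIGGER TrendAfterInsert AFTER INSERT ON TrendData
    FOR EACH STATEMENT
    EXECUTE PROCEDURE GenericTrunc('trenddata', 'trendid', '5000');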
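On the closing question about a more efficient cutoff: one possible
alternative (an untested assumption, not from the original post) is to
fetch the 5000th-highest TrendID directly with LIMIT/OFFSET instead of
taking MIN() over a 5000-row subselect; whether it is actually faster
would need to be measured.

    -- Skip the 4999 highest TrendIDs and take the next one, i.e. the
    -- 5000th-highest; everything below it is excess. If the table has
    -- 5000 rows or fewer, the subquery returns no row, the comparison
    -- against NULL is never true, and nothing is deleted.
    DELETE FROM TrendData
    WHERE TrendID < (SELECT TrendID FROM TrendData
                     ORDER BY TrendID DESC
                     LIMIT 1 OFFSET 4999);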