Thread: keeping last 30 entries of a log table
I need to write a function which inserts a log entry into a log table and
only keeps the last 30 records. I was thinking of using a subquery similar
to the following:

insert into log (account_id, message) values (1, 'this is a test');

delete from log where account_id = 1 and id not in (
    select id from log
    where account_id = 1
    order by timestamp desc
    limit 30);

I'm wondering if there is a more performance-oriented method of doing the
delete that I'm not thinking of.
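Wrapped up as the function I have in mind, it would look roughly like this
(just a sketch -- it assumes a table along the lines of log(id serial,
account_id integer, timestamp timestamptz default now(), message text),
and the function name is only a placeholder):

CREATE OR REPLACE FUNCTION add_log_entry(p_account_id integer,
                                         p_message text)
RETURNS void AS $$
BEGIN
    -- insert the new entry ...
    INSERT INTO log (account_id, message)
    VALUES (p_account_id, p_message);

    -- ... then trim the account back to its 30 most recent rows
    DELETE FROM log
     WHERE account_id = p_account_id
       AND id NOT IN (SELECT id
                        FROM log
                       WHERE account_id = p_account_id
                       ORDER BY timestamp DESC
                       LIMIT 30);
END;
$$ LANGUAGE plpgsql;

--
Jeff Frost, Owner   <jeff@frostconsultingllc.com>
Frost Consulting, LLC   http://www.frostconsultingllc.com/
Phone: 650-780-7908     FAX: 650-649-1954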
> I need to write a function which inserts a log entry into a log table
> and only keeps the last 30 records. I was thinking of using a subquery
> similar to the following:
>
> insert into log (account_id, message) values (1, 'this is a test');
> delete from log where account_id = 1 and id not in (
>     select id from log where account_id = 1
>     order by timestamp desc limit 30);
>
> I'm wondering if there is a more performance-oriented method of doing
> the delete that I'm not thinking of.

Assuming id is a kind of auto-incremented column that never cycles, I
would suggest something like:

DELETE FROM log
WHERE account_id = 1
  AND id < (
      SELECT MIN(id)
      FROM (SELECT id
            FROM log
            WHERE account_id = 1
            ORDER BY timestamp DESC
            LIMIT 30) AS recent);

(The derived table is needed because an aggregate like MIN() cannot be
combined with ORDER BY ... LIMIT in the same SELECT.)

I think there will be a performance difference between this and your
method when the number of records to be deleted is huge.

--
Daniel
On Sat, 17 Jun 2006, Daniel CAUNE wrote:
>> insert into log (account_id, message) values (1, 'this is a test');
>> delete from log where account_id = 1 and id not in (
>>     select id from log where account_id = 1
>>     order by timestamp desc limit 30);
>>
>> I'm wondering if there is a more performance-oriented method of doing
>> the delete that I'm not thinking of.
>
> Assuming id is a kind of auto-incremented column that never cycles, I
> would suggest something like:
>
> DELETE FROM log
> WHERE account_id = 1
>   AND id < (
>       SELECT MIN(id)
>       FROM (SELECT id
>             FROM log
>             WHERE account_id = 1
>             ORDER BY timestamp DESC
>             LIMIT 30) AS recent);
>
> I think there will be a performance difference between this and your
> method when the number of records to be deleted is huge.

Thanks Daniel, I'll try to benchmark them both and see if < turns out to
be faster than NOT IN. I guess there's no way to get around the subselect,
though.
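(For the benchmark I'll probably just run both forms under EXPLAIN ANALYZE
and compare the reported runtimes -- keeping in mind that EXPLAIN ANALYZE
really executes the DELETE, so it wants to be wrapped in a transaction
that gets rolled back:

BEGIN;
EXPLAIN ANALYZE
DELETE FROM log
 WHERE account_id = 1
   AND id NOT IN (SELECT id
                    FROM log
                   WHERE account_id = 1
                   ORDER BY timestamp DESC
                   LIMIT 30);
ROLLBACK;

and then the same for Daniel's < variant.)

--
Jeff Frost, Owner   <jeff@frostconsultingllc.com>
Frost Consulting, LLC   http://www.frostconsultingllc.com/
Phone: 650-780-7908     FAX: 650-649-1954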
> >> insert into log (account_id, message) values (1, 'this is a test');
> >> delete from log where account_id = 1 and id not in (
> >>     select id from log where account_id = 1
> >>     order by timestamp desc limit 30);
> >>
> >> I'm wondering if there is a more performance-oriented method of doing
> >> the delete that I'm not thinking of.
> >
> > Assuming id is a kind of auto-incremented column that never cycles, I
> > would suggest something like:
> >
> > DELETE FROM log
> > WHERE account_id = 1
> >   AND id < (
> >       SELECT MIN(id)
> >       FROM (SELECT id
> >             FROM log
> >             WHERE account_id = 1
> >             ORDER BY timestamp DESC
> >             LIMIT 30) AS recent);
> >
> > I think there will be a performance difference between this and your
> > method when the number of records to be deleted is huge.
>
> Thanks Daniel, I'll try to benchmark them both and see if < turns out
> to be faster than NOT IN. I guess there's no way to get around the
> subselect, though.

Column id should be indexed, indeed. Anyway, I'm not sure about any
performance improvement from that last method, as the most expensive
operation is likely to be the DELETE itself, not the SELECT, when dealing
with a huge volume of data.
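(A sketch of the indexes I have in mind -- the index names are arbitrary,
and the column names are taken from Jeff's query:

CREATE INDEX log_account_id_idx ON log (account_id, id);
CREATE INDEX log_account_ts_idx ON log (account_id, timestamp);

The first supports the id comparison and the NOT IN lookup; the second
could let the "30 most recent per account" subselect be answered by an
index scan instead of a sort.)

--
Daniel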
> Column id should be indexed, indeed. Anyway, I'm not sure about any
> performance improvement from that last method, as the most expensive
> operation is likely to be the DELETE itself, not the SELECT, when
> dealing with a huge volume of data.

Why worry? There's a maximum of 30 entries in this table anyway, so even
the most unoptimized select and delete combinations should be very fast...

--
---> Dirk Jagdmann
----> http://cubic.org/~doj
-----> http://llg.cubic.org
On Tue, 20 Jun 2006, Dirk Jagdmann wrote:
>> Column id should be indexed, indeed. Anyway, I'm not sure about any
>> performance improvement from that last method, as the most expensive
>> operation is likely to be the DELETE itself, not the SELECT, when
>> dealing with a huge volume of data.
>
> Why worry? There's a maximum of 30 entries in this table anyway, so
> even the most unoptimized select and delete combinations should be
> very fast...

Well, it's 30 entries per account and likely 1000s of accounts, but still
that would only be 30,000-90,000 entries, so it shouldn't be too bad.

--
Jeff Frost, Owner   <jeff@frostconsultingllc.com>
Frost Consulting, LLC   http://www.frostconsultingllc.com/
Phone: 650-780-7908     FAX: 650-649-1954
On Wed, 21 Jun 2006, Ben K. wrote:
> Just for the sake of alternatives -
>
> create sequence cy30 maxvalue 30 cycle;
>
> insert into log select generate_series(1,30), 'dummy';
> INSERT 0 30
>
> update log set des='....' where id=(select nextval('cy30'));
> UPDATE 1
>
> There are details to consider, I guess. For example, what if an update
> fails and the sequence has already advanced... Also, since we cycle the
> id, we'll need to add a timestamp or something similar for sorting.

The only problem with this approach is that I need to be able to keep
track of the last 30 items per account_id... so each account_id will have
its own last 30 messages in the table.
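(To adapt the idea per account, I suppose the cycle would have to live in
an explicit slot column rather than a shared sequence -- a rough sketch,
with made-up table and column names, seeding 30 slots per account up front
and then overwriting the oldest one:

-- one row per (account_id, slot); slot plays the role of the cycled id
CREATE TABLE log_ring (
    account_id integer     NOT NULL,
    slot       integer     NOT NULL,               -- 0..29
    ts         timestamptz NOT NULL DEFAULT now(),
    message    text,
    PRIMARY KEY (account_id, slot)
);

-- "insert" = overwrite the oldest slot for that account
-- (not safe under concurrent writers for the same account)
UPDATE log_ring
   SET message = 'this is a test', ts = now()
 WHERE account_id = 1
   AND slot = (SELECT slot
                 FROM log_ring
                WHERE account_id = 1
                ORDER BY ts ASC
                LIMIT 1);

Not sure it buys much over the plain insert-then-delete, though.)

--
Jeff Frost, Owner   <jeff@frostconsultingllc.com>
Frost Consulting, LLC   http://www.frostconsultingllc.com/
Phone: 650-780-7908     FAX: 650-649-1954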
> I need to write a function which inserts a log entry into a log table
> and only keeps the last 30 records. I was thinking of using a subquery
> similar to the following:
>
> insert into log (account_id, message) values (1, 'this is a test');
> delete from log where account_id = 1 and id not in (
>     select id from log where account_id = 1
>     order by timestamp desc limit 30);
>
> I'm wondering if there is a more performance-oriented method of doing
> the delete that I'm not thinking of.

Just for the sake of alternatives -

create sequence cy30 maxvalue 30 cycle;

insert into log select generate_series(1,30), 'dummy';
INSERT 0 30

update log set des='....' where id=(select nextval('cy30'));
UPDATE 1

There are details to consider, I guess. For example, what if an update
fails and the sequence has already advanced... Also, since we cycle the
id, we'll need to add a timestamp or something similar for sorting.

My 2 pence...

P.S. This A) failed me, and I wonder if this is supposed to be so or if
it's just a place where no one has trodden?? B) works fine except it
doesn't advance the sequence.

A)
update tc set des='b' where id=nextval('cy30')::int;
UPDATE 30

B)
update tc set des='c' where id=currval('cy30');
UPDATE 1

Regards,

Ben K.
Developer
http://benix.tamu.edu
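A likely explanation for the P.S.: nextval() is a volatile function, so in
a bare WHERE clause it is re-evaluated for every row the UPDATE scans --
each row is compared against a fresh sequence value, which is how A) can
end up matching all 30 rows (and burning through the sequence as it goes).
currval() in B) simply returns the last value nextval() produced in the
session without advancing the sequence, which is its documented behaviour.
Wrapping nextval() in a scalar subquery forces a single evaluation and
gives the intended one-row update:

update tc set des='b' where id = (select nextval('cy30'));
UPDATE 1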