Thread: keeping last 30 entries of a log table

From
Jeff Frost
Date:
I need to write a function which inserts a log entry in a log table and only 
keeps the last 30 records.  I was thinking of using a subquery similar to the 
following:

insert into log (account_id, message) values (1, 'this is a test');
delete from log where account_id = 1 and id not in ( select id from log
   where account_id = 1 order by timestamp desc limit 30);

I'm wondering if there is a more performance oriented method of doing the 
delete that I'm not thinking of.
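
For concreteness, here is that insert-then-trim pattern as a self-contained sketch. Python's sqlite3 stands in for PostgreSQL (both engines accept this NOT IN form), and the table layout is guessed from the queries above:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE log (
    id INTEGER PRIMARY KEY AUTOINCREMENT,
    account_id INTEGER,
    message TEXT,
    timestamp INTEGER)""")

KEEP = 30

def add_log(conn, account_id, message, ts):
    # Insert the new entry, then delete every row for this account that
    # is not among its KEEP most recent ones.
    conn.execute(
        "INSERT INTO log (account_id, message, timestamp) VALUES (?, ?, ?)",
        (account_id, message, ts))
    conn.execute(
        """DELETE FROM log
            WHERE account_id = ?
              AND id NOT IN (SELECT id FROM log
                              WHERE account_id = ?
                           ORDER BY timestamp DESC LIMIT ?)""",
        (account_id, account_id, KEEP))

for ts in range(100):
    add_log(conn, 1, "msg %d" % ts, ts)

count = conn.execute(
    "SELECT COUNT(*) FROM log WHERE account_id = 1").fetchone()[0]
print(count)  # 30 -- only the newest 30 rows per account survive
```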

-- 
Jeff Frost, Owner     <jeff@frostconsultingllc.com>
Frost Consulting, LLC     http://www.frostconsultingllc.com/
Phone: 650-780-7908    FAX: 650-649-1954


Re: keeping last 30 entries of a log table

From
Daniel CAUNE
Date:
> I need to write a function which inserts a log entry in a log table and only
> keeps the last 30 records.  I was thinking of using a subquery similar to the
> following:
> 
> insert into log (account_id, message) values (1, 'this is a test');
> delete from log where account_id = 1 and id not in ( select id from log
>    where account_id = 1 order by timestamp desc limit 30);
> 
> I'm wondering if there is a more performance oriented method of doing the
> delete that I'm not thinking of.
>

Depending on whether id is a kind of auto-incremented column that never cycles, I would suggest something like:

DELETE FROM log
 WHERE account_id = 1
   AND id < (
     SELECT MIN(id)
       FROM log
       WHERE account_id = 1
       ORDER BY timestamp DESC
       LIMIT 30);

I think there will be a performance difference with your method when the number of records to be deleted is huge.
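
One caveat worth noting: MIN() aggregates to a single row before LIMIT applies, so as written the subquery computes the account's overall minimum id (or errors out, since timestamp is neither grouped nor aggregated), and the LIMIT never bites. The intended cutoff, the smallest id among the 30 newest rows, needs a derived table. A sketch of that variant, again with Python's sqlite3 standing in for PostgreSQL:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE log "
             "(id INTEGER PRIMARY KEY, account_id INTEGER, timestamp INTEGER)")
conn.executemany("INSERT INTO log VALUES (?, 1, ?)",
                 [(i, i) for i in range(100)])

# Compute the cutoff in a derived table so LIMIT applies before MIN().
conn.execute("""
    DELETE FROM log
     WHERE account_id = 1
       AND id < (SELECT MIN(id)
                   FROM (SELECT id FROM log
                          WHERE account_id = 1
                       ORDER BY timestamp DESC LIMIT 30) AS newest)""")

remaining = conn.execute("SELECT COUNT(*) FROM log").fetchone()[0]
print(remaining)  # 30 -- only ids 70..99 are left
```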

--
Daniel



Re: keeping last 30 entries of a log table

From
Jeff Frost
Date:
On Sat, 17 Jun 2006, Daniel CAUNE wrote:

>> insert into log (account_id, message) values (1, 'this is a test');
>> delete from log where account_id = 1 and id not in ( select id from log
>>    where account_id = 1 order by timestamp desc limit 30);
>>
>> I'm wondering if there is a more performance oriented method of doing the
>> delete that I'm not thinking of.
>>
>
> Depending on whether id is a kind of auto-incremented column that never cycles, I would suggest something like:
>
> DELETE FROM log
>  WHERE account_id = 1
>    AND id < (
>      SELECT MIN(id)
>        FROM log
>        WHERE account_id = 1
>        ORDER BY timestamp DESC
>        LIMIT 30);
>
> I think there will be a performance difference with your method when the number of records to be deleted is huge.

Thanks Daniel, I'll try and benchmark them both and see if < turns out to be 
faster than NOT IN.  I guess there's no way to get around the subselect 
though.

-- 
Jeff Frost, Owner     <jeff@frostconsultingllc.com>
Frost Consulting, LLC     http://www.frostconsultingllc.com/
Phone: 650-780-7908    FAX: 650-649-1954


Re: keeping last 30 entries of a log table

From
Daniel CAUNE
Date:
> >> insert into log (account_id, message) values (1, 'this is a test');
> >> delete from log where account_id = 1 and id not in ( select id from log
> >>    where account_id = 1 order by timestamp desc limit 30);
> >>
> >> I'm wondering if there is a more performance oriented method of doing the
> >> delete that I'm not thinking of.
> >>
> >
> > Depending on whether id is a kind of auto-incremented column that never
> > cycles, I would suggest something like:
> >
> > DELETE FROM log
> >  WHERE account_id = 1
> >    AND id < (
> >      SELECT MIN(id)
> >        FROM log
> >        WHERE account_id = 1
> >        ORDER BY timestamp DESC
> >        LIMIT 30);
> >
> > I think there will be a performance difference with your method when the
> > number of records to be deleted is huge.
> 
> Thanks Daniel, I'll try and benchmark them both and see if < turns out to be
> faster than NOT IN.  I guess there's no way to get around the subselect
> though.
> 

Column id should be indexed indeed.  Anyway, I'm not sure about any
performance improvement using that last method, as the most consuming
operation might be the DELETE operation, not really the SELECT operation,
when dealing with a huge volume of data.


--
Daniel



Re: keeping last 30 entries of a log table

From
"Dirk Jagdmann"
Date:
> Column id should be indexed indeed.  Anyway, I'm not sure about any
> performance improvement using that last method, as the most consuming
> operation might be the DELETE operation, not really the SELECT operation,
> when dealing with a huge volume of data.

why worry, there are a maximum of 30 entries in this table anyway. So
even the most unoptimized select and delete combination should be
very fast...

-- 
---> Dirk Jagdmann
----> http://cubic.org/~doj
-----> http://llg.cubic.org


Re: keeping last 30 entries of a log table

From
Jeff Frost
Date:
On Tue, 20 Jun 2006, Dirk Jagdmann wrote:

>> Column id should be indexed indeed.  Anyway, I'm not sure about any 
>> performance improvement using that last method, as the most consuming 
>> operation might be the DELETE operation, not really the SELECT operation, 
>> when dealing with a huge volume of data.
>
> why worry, there are a maximum of 30 entries in this table anyway. So
> even the most unoptimized select and delete combination should be
> very fast...

Well, it's 30 entries per account and likely 1000s of accounts, but still that 
would only be 30,000-90,000 entries, so shouldn't be too bad.

-- 
Jeff Frost, Owner     <jeff@frostconsultingllc.com>
Frost Consulting, LLC     http://www.frostconsultingllc.com/
Phone: 650-780-7908    FAX: 650-649-1954


Re: keeping last 30 entries of a log table

From
Jeff Frost
Date:
On Wed, 21 Jun 2006, Ben K. wrote:

> Just for the sake of alternatives -
>
> create sequence cy30 maxvalue 30 cycle;
>
> insert into log select generate_series(1,30), 'dummy';
> INSERT 0 30
>
> update log set des='....' where account_id=(select nextval('cy30'));
> UPDATE 1
>
>
> There are details to consider I guess. For example what if an update fails 
> and the sequence already advanced... Also, since we cycle the id, for 
> sorting, we'll need to add timestamp or something similar.
>

Only problem with this approach is that I need to be able to keep track of the 
last 30 items per each account_id...so each account_id will have his last 30 
messages in the table.


-- 
Jeff Frost, Owner     <jeff@frostconsultingllc.com>
Frost Consulting, LLC     http://www.frostconsultingllc.com/
Phone: 650-780-7908    FAX: 650-649-1954


Re: keeping last 30 entries of a log table

From
"Ben K."
Date:
> I need to write a function which inserts a log entry in a log table and only 
> keeps the last 30 records.  I was thinking of using a subquery similar to the 
> following:
>
> insert into log (account_id, message) values (1, 'this is a test');
> delete from log where account_id = 1 and id not in ( select id from log
>  where account_id = 1 order by timestamp desc limit 30);
>
> I'm wondering if there is a more performance oriented method of doing the 
> delete that I'm not thinking of.


Just for the sake of alternatives -

create sequence cy30 maxvalue 30 cycle;

insert into log select generate_series(1,30), 'dummy';
INSERT 0 30

update log set des='....' where account_id=(select nextval('cy30'));
UPDATE 1


There are details to consider I guess. For example what if an update fails 
and the sequence already advanced... Also, since we cycle the id, for 
sorting, we'll need to add timestamp or something similar.
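
Sketched end to end (Python's sqlite3 once more as a stand-in; a cycling Python counter plays the role of the cy30 sequence, since the point is the overwrite-in-place ring buffer rather than the sequence machinery, and the slot column and add_log helper are invented for the illustration):

```python
import sqlite3
from itertools import cycle

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE log (
    slot INTEGER, account_id INTEGER, des TEXT, timestamp INTEGER,
    PRIMARY KEY (account_id, slot))""")

KEEP = 30
# Pre-fill the 30 dummy slots (the generate_series step above).
conn.executemany("INSERT INTO log VALUES (?, 1, 'dummy', 0)",
                 [(i,) for i in range(1, KEEP + 1)])

slots = cycle(range(1, KEEP + 1))  # stand-in for nextval('cy30')

def add_log(conn, account_id, message, ts):
    # Overwrite the oldest slot in place instead of INSERT + DELETE.
    conn.execute(
        "UPDATE log SET des = ?, timestamp = ? "
        "WHERE account_id = ? AND slot = ?",
        (message, ts, account_id, next(slots)))

for ts in range(1, 101):
    add_log(conn, 1, "msg %d" % ts, ts)

row_count = conn.execute("SELECT COUNT(*) FROM log").fetchone()[0]
oldest = conn.execute("SELECT MIN(timestamp) FROM log").fetchone()[0]
print(row_count, oldest)  # 30 71 -- the table never grows past 30 rows
```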

My 2 pence...

P.S.

This A) failed me, and I wonder if it's supposed to be so or if it's
just a place where no one has trodden yet.

B) works fine, except it doesn't advance the sequence.

A) update  tc set des='b' where id=nextval('cy30')::int;
UPDATE 30

B) update tc set des='c' where id=currval('cy30');
UPDATE 1


Regards,

Ben K.
Developer
http://benix.tamu.edu