On Fri, 10 Jan 2014 13:06:21 +0100 Francisco Olarte <folarte@peoplecall.com> wrote:
> Hi:
>
> On Thu, Jan 9, 2014 at 7:50 PM, Tirthankar Barari <tbarari@verizon.com> wrote:
> > We have a table where we insert about 10 million rows everyday. We keep 14
> > day's worth of entries (i.e. 140 mil). A scheduled task wakes up every day
> > and deletes all entries past the 14 day window (i.e. deletes entries from
> > the 15th day in the past).
I missed the early part of this thread, so I apologize if I'm repeating anything
that was previously stated.
In my experience, rolling datasets like this are far better maintained
with more frequent purging. Most database systems struggle when large
amounts of data are loaded and then large amounts are deleted in one
pass. What has helped immensely for me is to increase the frequency of
the purging process: instead of purging once a day, purge once an hour,
or possibly even more frequently. Even though the purge happens more
often, each run is less overhead since it affects fewer rows (a rough
sketch is below).

I have one system that collects about 60 GPS coordinates per second and
keeps 2 weeks' worth of data -- trial and error has found that the most
efficient ongoing maintenance is to purge about every 10 seconds. (In
the case of that particular system, I actually wrote a program that
estimates the current load on the database and purges more or less
frequently based on what else is going on, but over the course of a day
it averages out to about once every 10 seconds.)
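To make that concrete, here's a minimal sketch (the table and column
names are invented for the example, and the batch size is just a
starting point to tune). PostgreSQL's DELETE has no LIMIT clause, so a
ctid subquery is one way to cap how much work each run does:

    -- delete at most 10000 expired rows; repeat until nothing remains
    DELETE FROM gps_points
     WHERE ctid IN (SELECT ctid
                      FROM gps_points
                     WHERE recorded_at < now() - interval '14 days'
                     LIMIT 10000);

Run that from cron (or whatever scheduler you like) every few minutes
and the table never builds up a 10-million-row backlog of dead tuples
for autovacuum to chew through in one go.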
YMMV, and the suggestion about partitioning is something that might work in
your case as well.
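If you do partition, the usual shape (on current 9.x releases, using
inheritance; the names are again illustrative, and you'd still need a
trigger or application logic to route inserts to the right child) is
one child table per day, so expiring a day becomes a near-instant
DROP TABLE instead of a 10-million-row DELETE:

    CREATE TABLE gps_points_20140110 (
        CHECK (recorded_at >= '2014-01-10'
           AND recorded_at <  '2014-01-11')
    ) INHERITS (gps_points);
    -- fourteen days later, retiring the day is metadata-only:
    DROP TABLE gps_points_20140110;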
--
Bill Moran <wmoran@potentialtech.com>