Greg Stark wrote:
>>Actually, each record will be incremented probably only thousands of times a
>>day. But there are many banners. Each record has a (bannerid, campaignid,
>>websiteid, date, countrycode) "dimensions" and (impression, click) "measures".
>
> In the past when I had a very similar situation we kept the raw impression and
> click event data. Ie, one record per impression in the impression table and
> one record per click in the click data.
> That makes the tables insert-only which is efficient and not prone to locking
> contention. They would never have to be vacuumed except after purging old data.
Assuming there are 10 millions of impressions per day, the impression
table will grow at least 200-400MB per day, is that correct? What do you
do and how often do you purge old data? Do you do a mass DELETE on the
impression table itself or do you switch to another table? I've found
that deleting tens/hundreds of thousands of row, at least in InnoDB,
takes long, long time (plus it sucks CPU and slows other queries).
> Then to accelerate queries we had denormalized aggregate tables with a cron
> job that did the equivalent of
>
> insert into agg_clicks (
> select count(*),bannerid
> from clicks
> where date between ? and ?
> group by bannerid
> )
--
dave