Re: how many record versions - Mailing list pgsql-general

From David Garamond
Subject Re: how many record versions
Date
Msg-id 40B1904A.3090307@zara.6.isreserved.com
Whole thread Raw
In response to Re: how many record versions  (Greg Stark <gsstark@mit.edu>)
Responses Re: how many record versions  (Greg Stark <gsstark@mit.edu>)
List pgsql-general
Greg Stark wrote:
>>Actually, each record will be incremented probably only thousands of times a
>>day. But there are many banners. Each record has a (bannerid, campaignid,
>>websiteid, date, countrycode) "dimensions" and (impression, click) "measures".
>
> In the past when I had a very similar situation we kept the raw impression and
> click event data. Ie, one record per impression in the impression table and
> one record per click in the click data.

> That makes the tables insert-only which is efficient and not prone to locking
> contention. They would never have to be vacuumed except after purging old data.

Assuming there are 10 millions of impressions per day, the impression
table will grow at least 200-400MB per day, is that correct? What do you
do and how often do you purge old data? Do you do a mass DELETE on the
impression table itself or do you switch to another table? I've found
that deleting tens/hundreds of thousands of row, at least in InnoDB,
takes long, long time (plus it sucks CPU and slows other queries).

> Then to accelerate queries we had denormalized aggregate tables with a cron
> job that did the equivalent of
>
> insert into agg_clicks (
>   select count(*),bannerid
>     from clicks
>    where date between ? and ?
>    group by bannerid
>   )

--
dave


pgsql-general by date:

Previous
From: David Garamond
Date:
Subject: Re: how many record versions
Next
From: Marc Slemko
Date:
Subject: Re: how many record versions