> [tgl@sss.pgh.pa.us]
>
> > indexes:
> > stats_min_pkey primary key btree (ip, "start")
> > stats_min_start btree ("start")
> > stats_hr_pkey primary key btree (ip, "start")
> > stats_hr_start btree ("start")
>
> > ip is of type "inet" in all tables.
> > start is of type "timestamp without time zone" in all tables.
>
> Okay, so a pkey index entry will take 32 bytes counting overhead ...
> you've got about 10:1 bloat on the stats_min indexes and 2:1 in stats_hr.
> Definitely bad :-(
The only difference between the way stats_min and stats_hr are updated
stems from the fact that stats_min only holds records for the last 1440
minutes (because of its killer time granularity), whereas stats_hr
holds its data until we decide some of it is obsolete enough and
issue a "delete from" by hand.
Thus, the updates look like (pardon the pseudosql please):
stats_min:
begin;
delete from stats_min where \
[data is current minute but yesterday's, or even older,
or from the future];
[repeat for all ips]
insert into stats_min [data for the current minute];
[end repeat]
commit;
stats_hr:
begin;
[repeat for all ips]
update stats_hr set [data += increment] where \
[data is for current hour];
[if no_of_rows_updated < 1]
insert into stats_hr [current hour's first increment];
[endif]
[end repeat]
commit;
> I expect the bloat is coming from the fact that the interesting range of
> "start" changes over time. 7.4 should be able to recycle index space
> in that situation, but 7.3 and before can't.
OK, I'll definitely try 7.4 once I'm confident with it.
Thanks for your time,
--
Tomas Szepe <szepe@pinerecords.com>