Re: Postgresql 'eats' all mi data partition - Mailing list pgsql-bugs
| From | Tomas Szepe |
|---|---|
| Subject | Re: Postgresql 'eats' all mi data partition |
| Date | |
| Msg-id | 20030927113127.GC32507@louise.pinerecords.com Whole thread Raw |
| In response to | Re: Postgresql 'eats' all mi data partition (Gaetano Mendola <mendola@bigfoot.com>) |
| List | pgsql-bugs |
> [mendola@bigfoot.com]
>
> >>>>>indexes:
> >>>>>stats_min_pkey primary key btree (ip, "start")
> >>>>>stats_min_start btree ("start")
> >>>>>stats_hr_pkey primary key btree (ip, "start")
> >>>>>stats_hr_start btree ("start")
> >>>>
> >>>>>ip is of type "inet" in all tables.
> >>>>>start is of type "timestamp without time zone" in all tables.
> >>>>
> >>>>Okay, so a pkey index entry will take 32 bytes counting overhead ...
> >>>>you've got about 10:1 bloat on the stats_min indexes and 2:1 in
> >>>>stats_hr.
> >>>>Definitely bad :-(
> >>>
> >>>
> >>>The only difference between the way stats_min and stats_hr are updated
> >>>stems from the fact that stats_min only holds records for the last 1440
> >>>minutes (because of its killer time granularity), whereas stats_hr
> >>>holds its data until we decide some of it is obsolete enough and
> >>>issue a "delete from" by hand.
> >>
> >>Are you sure that all indexes are needed and that a partial index could
> >>not help ? What about the statistics on these indexes ? Are they really
> >>used ?
> >
> >
> >Yup, they're all essential. :(
>
> May I see yours tipical queries where these indexes are involved ?
A very typical query (apart from those I've already posted in my "how the
updates work" mail) would be:
select ip, start::time,
(in_tcp_web + in_tcp_mail + in_udp_and_icmp
+ in_tcp_rest + in_rest) as d_in,
(out_tcp_web + out_tcp_mail + out_udp_and_icmp
+ out_tcp_rest + out_rest) as d_out,
(in_tcp_web + in_tcp_mail + in_udp_and_icmp
+ in_tcp_rest + in_rest
+ out_tcp_web + out_tcp_mail + out_udp_and_icmp
+ out_tcp_rest + out_rest) as d_sum,
((in_tcp_web + in_tcp_mail + in_udp_and_icmp
+ in_tcp_rest + in_rest
+ out_tcp_web + out_tcp_mail + out_udp_and_icmp
+ out_tcp_rest + out_rest) / intlen / 128) as rate_sum
from stats_hr
where start=(select start from stats_hr order by start desc limit 1)
order by (in_tcp_web + in_tcp_mail + in_udp_and_icmp
+ in_tcp_rest + in_rest + out_tcp_web + out_tcp_mail
+ out_udp_and_icmp + out_tcp_rest + out_rest)
desc
limit 20;
->
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Limit (cost=5152.26..5152.31 rows=20 width=104)
InitPlan
-> Limit (cost=0.00..1.11 rows=1 width=8)
-> Index Scan Backward using stats_hr_start on stats_hr (cost=0.00..12059890.93 rows=10847279 width=8)
-> Sort (cost=5152.26..5162.55 rows=4115 width=104)
Sort Key: (((((((((in_tcp_web + in_tcp_mail) + in_udp_and_icmp) + in_tcp_rest) + in_rest) + out_tcp_web) +
out_tcp_mail)+ out_udp_and_icmp) + out_tcp_rest) + out_rest)
-> Index Scan using stats_hr_start on stats_hr (cost=0.00..4905.22 rows=4115 width=104)
Index Cond: ("start" = $0)
(done in 0.079s.)
--
Tomas Szepe <szepe@pinerecords.com>
pgsql-bugs by date: