Re: Postgresql 'eats' all mi data partition - Mailing list pgsql-bugs

From Tomas Szepe
Subject Re: Postgresql 'eats' all mi data partition
Msg-id 20030927113127.GC32507@louise.pinerecords.com
In response to Re: Postgresql 'eats' all mi data partition  (Gaetano Mendola <mendola@bigfoot.com>)
List pgsql-bugs
> [mendola@bigfoot.com]
>
> >>>>>indexes:
> >>>>>stats_min_pkey primary key btree (ip, "start")
> >>>>>stats_min_start btree ("start")
> >>>>>stats_hr_pkey primary key btree (ip, "start")
> >>>>>stats_hr_start btree ("start")
> >>>>
> >>>>>ip is of type "inet" in all tables.
> >>>>>start is of type "timestamp without time zone" in all tables.
> >>>>
> >>>>Okay, so a pkey index entry will take 32 bytes counting overhead ...
> >>>>you've got about 10:1 bloat on the stats_min indexes and 2:1 in
> >>>>stats_hr.
> >>>>Definitely bad :-(
> >>>
> >>>
> >>>The only difference between the way stats_min and stats_hr are updated
> >>>stems from the fact that stats_min only holds records for the last 1440
> >>>minutes (because of its killer time granularity), whereas stats_hr
> >>>holds its data until we decide some of it is obsolete enough and
> >>>issue a "delete from" by hand.
> >>
> >>Are you sure that all indexes are needed and that a partial index could
> >>not help ? What about the statistics on these indexes ? Are they really
> >>used ?
> >
> >
> >Yup, they're all essential. :(
>
> May I see yours tipical queries where these indexes are involved ?

A very typical query (apart from those I've already posted in my "how the
updates work" mail) would be:

select ip, start::time,
    (in_tcp_web + in_tcp_mail + in_udp_and_icmp
        + in_tcp_rest + in_rest) as d_in,
    (out_tcp_web + out_tcp_mail + out_udp_and_icmp
        + out_tcp_rest + out_rest) as d_out,
    (in_tcp_web + in_tcp_mail + in_udp_and_icmp
        + in_tcp_rest + in_rest
        + out_tcp_web + out_tcp_mail + out_udp_and_icmp
        + out_tcp_rest + out_rest) as d_sum,
    ((in_tcp_web + in_tcp_mail + in_udp_and_icmp
        + in_tcp_rest + in_rest
        + out_tcp_web + out_tcp_mail + out_udp_and_icmp
        + out_tcp_rest + out_rest) / intlen / 128) as rate_sum
    from stats_hr
    where start=(select start from stats_hr order by start desc limit 1)
    order by (in_tcp_web + in_tcp_mail + in_udp_and_icmp
        + in_tcp_rest + in_rest + out_tcp_web + out_tcp_mail
        + out_udp_and_icmp + out_tcp_rest + out_rest)
        desc
    limit 20;

->

                                  QUERY PLAN
--------------------------------------------------------------------------------
 Limit  (cost=5152.26..5152.31 rows=20 width=104)
   InitPlan
     ->  Limit  (cost=0.00..1.11 rows=1 width=8)
           ->  Index Scan Backward using stats_hr_start on stats_hr  (cost=0.00..12059890.93 rows=10847279 width=8)
   ->  Sort  (cost=5152.26..5162.55 rows=4115 width=104)
         Sort Key: (((((((((in_tcp_web + in_tcp_mail) + in_udp_and_icmp) + in_tcp_rest) + in_rest) + out_tcp_web) + out_tcp_mail) + out_udp_and_icmp) + out_tcp_rest) + out_rest)
         ->  Index Scan using stats_hr_start on stats_hr  (cost=0.00..4905.22 rows=4115 width=104)
               Index Cond: ("start" = $0)

(done in 0.079s.)
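
For readers who find the repeated sums hard to follow, the same query could probably be written with a subselect in the FROM clause, so that the inbound and outbound totals are each spelled out once. This is only a sketch: the alias "latest" is made up for illustration, and it assumes the counter columns are integer types, so that adding d_in and d_out gives exactly the same value as the long sum. The inner "where" is unchanged, so the two index scans on stats_hr_start shown in the plan above should still apply, though that has not been checked against this data.

```sql
-- Sketch only: intended to return the same rows as the query above.
-- "latest" is a hypothetical alias; table and column names are the ones
-- used in the original query.
select ip, start::time, d_in, d_out,
    (d_in + d_out) as d_sum,
    ((d_in + d_out) / intlen / 128) as rate_sum
    from (
        -- compute each directional total once, for the newest hour only
        select ip, start, intlen,
            (in_tcp_web + in_tcp_mail + in_udp_and_icmp
                + in_tcp_rest + in_rest) as d_in,
            (out_tcp_web + out_tcp_mail + out_udp_and_icmp
                + out_tcp_rest + out_rest) as d_out
        from stats_hr
        where start=(select start from stats_hr order by start desc limit 1)
    ) as latest
    order by d_sum desc
    limit 20;
```

The "order by start desc limit 1" subselect is kept as is rather than replaced with max(start), since it is what produces the backward index scan in the plan.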

--
Tomas Szepe <szepe@pinerecords.com>
