Estimated rows way off - Mailing list pgsql-performance

From Michael Guerin
Subject Estimated rows way off
Date
Msg-id 406AFD42.7050208@rentec.com
Whole thread Raw
List pgsql-performance
*statistics  target = 100
*INFO:  index "timeseries_tsid" now contains *16,677,521* row versions
in 145605 pages
DETAIL:  76109 index pages have been deleted, 20000 are currently reusable.
CPU 12.00s/2.83u sec elapsed 171.26 sec.
INFO:  "timeseries": found 0 removable, 16677521 nonremovable row
versions in 1876702 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were *18,894,051* unused item pointers.
0 pages are entirely empty.
CPU 138.74s/28.96u sec elapsed 1079.43 sec.
INFO:  vacuuming "pg_toast.pg_toast_1286079786"
INFO:  index "pg_toast_1286079786_index" now contains 4846282 row
versions in 29319 pages
DETAIL:  10590 index pages have been deleted, 10590 are currently reusable.
CPU 2.23s/0.55u sec elapsed 28.34 sec.
INFO:  "pg_toast_1286079786": found 0 removable, 4846282 nonremovable
row versions in 1379686 pages
DETAIL:  0 dead row versions cannot be removed yet.
There were 2824978 unused item pointers.
0 pages are entirely empty.
CPU 112.92s/19.53u sec elapsed 731.99 sec.
INFO:  analyzing "public.timeseries"
INFO:  "timeseries": 1876702 pages, *30,000* rows sampled, *41,762,188*
estimated total rows


setting the default statistics target higher makes the estimate worse:
*statistics target = 500*
INFO:  index "timeseries_tsid" now contains *16,953,429 *row versions in
145605 pages
INFO:  "timeseries": 1891940 pages, *150,000* rows sampled, *64,803,483*
estimated total rows

*statistics target = 1000 *
INFO:  index "timeseries_tsid" now contains *17,216,139* row versions in
145605 pages
INFO:  "timeseries": 1937484 pages, *300,000* rows sampled, *68,544,295*
estimated total rows


I'm trying to understand why the estimated row count is so off.  I'm
assuming this is b/c we do very large deletes and we're leaving around a
large number of almost empty pages.  Is this the reason?

Let me know if you need more info.

Thanks
Michael





>
>
>> INFO:  index "timeseries_tsid" now contains *16677521* row versions
>> in 145605 pages
>> DETAIL:  76109 index pages have been deleted, 20000 are currently
>> reusable.
>> CPU 12.00s/2.83u sec elapsed 171.26 sec.
>> INFO:  "timeseries": found 0 removable, 16677521 nonremovable row
>> versions in 1876702 pages
>> DETAIL:  0 dead row versions cannot be removed yet.
>> There were 18894051 unused item pointers.
>> 0 pages are entirely empty.
>> CPU 138.74s/28.96u sec elapsed 1079.43 sec.
>> INFO:  vacuuming "pg_toast.pg_toast_1286079786"
>> INFO:  index "pg_toast_1286079786_index" now contains 4846282 row
>> versions in 29319 pages
>> DETAIL:  10590 index pages have been deleted, 10590 are currently
>> reusable.
>> CPU 2.23s/0.55u sec elapsed 28.34 sec.
>> INFO:  "pg_toast_1286079786": found 0 removable, 4846282 nonremovable
>> row versions in 1379686 pages
>> DETAIL:  0 dead row versions cannot be removed yet.
>> There were 2824978 unused item pointers.
>> 0 pages are entirely empty.
>> CPU 112.92s/19.53u sec elapsed 731.99 sec.
>> INFO:  analyzing "public.timeseries"
>> INFO:  "timeseries": 1876702 pages, *30,000* rows sampled,
>> *41,762,188* estimated total rows
>>
>>
>>
>
> setting the default statistics target higher made the estimate worse:
> (changed from 100 to 500)
> *
> statistics target = 500*
> INFO:  index "timeseries_tsid" now contains *16,953,429 *row versions
> in 145605 pages
> INFO:  "timeseries": 1891940 pages, *150,000* rows sampled,
> *64,803,483* estimated total rows
>
> *statistics target = 1000
> *INFO:  index "timeseries_tsid" now contains *17,216,139* row versions
> in 145605 pages
> INFO:  "timeseries": 1937484 pages,* 300,000* rows sampled,
> *68,544,295* estimated total rows
>
>
>
>
>
> This probably has something to do with the large deletes we do.  I'm
> looking around to get some more info on statistics collection.
>
> -mike
>
>
>

pgsql-performance by date:

Previous
From: Chris Kratz
Date:
Subject: Re: Delete performance on delete from table with inherited tables
Next
From: "Magnus Naeslund(t)"
Date:
Subject: Re: PostgreSQL and Linux 2.6 kernel.