Sorry for the repost. Just wondering if anyone has a workaround for 7.4?
At 11:25 AM 1/27/2004 -0500, Tom Lane wrote:
>Martijn van Oosterhout <kleptog@svana.org> writes:
> > I'm afraid I'll have to defer to someone else (Tom?) as why the
> estimate was
> > out by three orders of magnitude.
>
>I'd like to know that, too.
>
> > I'd suggest playing around with statistics and seeing if you can work out
> > why they were so bad.
>
>Could we see the pg_stats row for the ip_saddr column? Also, does the
>estimate get better if you increase the stats target for ip_saddr and
>re-analyze?
>
> regards, tom lane
OK - the statistics and various queries follow. If n_distinct is the
estimate of unique values then it is still off, but the plan does change.
How does postgresql do the analysis?
select version();
version
-------------------------------------------------------------
PostgreSQL 7.4.1 on i686-pc-linux-gnu, compiled by GCC 2.96
(1 row)
VACUUM full cust_ulog ;
VACUUM
Time: 96916.323 ms
analyze cust_ulog;
Time: 25969.562 ms
explain select ip_saddr,count(*),sum(ip_totlen) from cust_ulog
where true
group by ip_saddr order by sum(ip_totlen) desc limit 10
QUERY PLAN
---------------------------------------------------------------------------------------
Limit (cost=331739.22..331739.24 rows=10 width=13)
-> Sort (cost=331739.22..331743.03 rows=1523 width=13)
Sort Key: sum(ip_totlen)
-> HashAggregate (cost=331651.09..331658.71 rows=1523 width=13)
-> Seq Scan on cust_ulog (cost=0.00..262156.91
rows=9265891 width=13)
(5 rows)
Time: 163.512 ms
select * from pg_stats where tablename ='cust_ulog' and attname='ip_saddr';
schemaname | tablename | attname | null_frac | avg_width | n_distinct |
most_common_vals
| most_common_freqs |
histogram_bounds | correlation
------------+-----------+----------+-----------+-----------+------------+---------------------------------------------------------------------------------------------------------------------------------------------------+------------------------------------------------------------------------------------------+----------------------------------------------------------------------------------------------------------------------------------------------------------+-------------
public | cust_ulog | ip_saddr | 0 | 11 | 1523 |
{202.185.236.30,202.185.236.6,202.185.236.11,202.185.236.25,202.185.236.7,203.121.96.19,202.185.236.23,202.75.129.102,192.228.128.20,202.188.1.5}
|
{0.243667,0.0926667,0.0553333,0.0433333,0.031,0.0183333,0.0143333,0.0116667,0.011,0.011}
|
{4.2.49.2,61.149.58.98,66.128.175.107,69.56.63.214,195.92.67.209,202.185.231.40,202.185.236.47,203.70.228.182,208.41.95.194,212.253.48.34,221.232.160.6}
| 0.00267716
(1 row)
alter table cust_ulog alter column ip_saddr set statistics 50;
ALTER TABLE
Time: 15.605 ms
analyze cust_ulog;
ANALYZE
Time: 51251.862 ms
select correlation,attname from pg_stats where tablename ='cust_ulog' and
attname='ip_saddr';
correlation | attname
-------------+----------
0.0286273 | ip_saddr
select n_distinct from pg_stats where tablename ='cust_ulog' and
attname='ip_saddr';
n_distinct
------------
6714
(1 row)
Time: 16.134 ms
alter table cust_ulog alter column ip_saddr set statistics 100;
ALTER TABLE
Time: 13.728 ms
analyze cust_ulog;
ANALYZE
Time: 55412.288 ms
select n_distinct from pg_stats where tablename ='cust_ulog' and
attname='ip_saddr';
n_distinct
------------
12939
(1 row)
Time: 26.338 ms
explain select ip_saddr,count(*),sum(ip_totlen) from cust_ulog
where true
group by ip_saddr order by sum(ip_totlen) desc limit 10
;
QUERY PLAN
---------------------------------------------------------------------------------------------
Limit (cost=2096080.34..2096080.37 rows=10 width=13)
-> Sort (cost=2096080.34..2096112.69 rows=12939 width=13)
Sort Key: sum(ip_totlen)
-> GroupAggregate (cost=2002684.66..2095196.65 rows=12939 width=13)
-> Sort (cost=2002684.66..2025796.48 rows=9244729 width=13)
Sort Key: ip_saddr
-> Seq Scan on cust_ulog (cost=0.00..261945.29
rows=9244729 width=13)
(7 rows)
Time: 29.819 ms
select * from pg_stats where tablename ='cust_ulog' and attname='ip_saddr';
schemaname | tablename | attname | null_frac | avg_width | n_distinct |
most_common_vals | most_common_freqs | histogram_bounds | correlation
------------+-----------+----------+-----------+-----------+------------+-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
-----
-
------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+-------------------
-----
-
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------+------
-----
-
--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
public | cust_ulog | ip_saddr | 0 | 11 | 12939 |
{202.185.236.30,202.185.236.6,202.185.236.11,202.185.236.25,202.185.236.7,203.121.96.19,202.185.236.23,202.185.236.47,192.228.128.20,202.75.129.102,202.188.1.5,202.185.236.3,202.75.129.101,203.121.96.24,203.121.96.16,202.185.236.34,202.185.236.31,202.188.0.132,202.188.0.133,202.185.236.12,202.12.28.131,202.188.1.23,202.188.1.25,202.188.1.4,202.12.29.25,202.185.231.40,202.185.236.45,202.185.236.26,192.228.128.11,193.0.0.193,202.129.169.40,202.146.72.66,210.186.75.154,219.93.92.82,202.185.236.35,192.228.128.1
2,202
.
185.90.118,202.185.236.5,202.185.236.8,202.187.125.12,211.97.56.133,103.221.179.184,202.185.66.41,202.186.47.155,202.186.140.50,202.188.172.163,216.39.48.40,216.218.158.87,216.218.201.91,216.218.224.26,216.219.254.60,202.185.254.231,202.186.49.133,202.187.31.12,202.187.58.165,202.188.0.164,203.121.15.140,203.193.96.68,207.218.165.207,61.6.38.139,61.175.132.228,62.62.139.253,63.163.210.227,64.61.25.155,66.48.78.31}
|
{0.248767,0.0997333,0.0566667,0.0382667,0.0345667,0.0147667,0.0138,0.0118,0.0107667,0.00983333,0.00946667,0.0093,0.00926667,0.00903333,0.00793333,0.00776667,0.0077,0.0074,0.00733333,0.0049,0.00443333,0.0044,0.0044,0.00426667,0.00406667,0.0038,0.0031,0.0028,0.00193333,0.0018,0.0016,0.00146667,0.0014,0.00123333,0.00113333,0.0011,0.00103333,0.0008,0.0008,0.000733333,0.0006,0.0006,0.000566667,0.000566667,0.000566667,0.000533333,0.000533333,0.000466667,0.000433333,0.000433333,0.0004,0.000366667,0.000366667,0.000
333,0
. 000133333} |
{4.3.7.155,4.40.42.46,12.8.192.29,12.148.209.196,18.206.0.76,24.81.163.171,24.145.188.188,24.194.25.199,24.228.82.49,61.6.98.138,61.6.154.62,61.48.54.46,61.88.0.6,61.149.232.192,61.222.96.205,62.12.112.134,63.13.136.190,63.185.112.170,64.40.227.22,64.146.63.170,64.218.231.6,64.231.230.239,65.43.172.73,65.92.251.30,65.124.135.2,65.220.24.19,66.50.11.33,66.92.93.137,66.142.230.36,66.190.201.9,66.250.5.182,67.39.64.151,67.72.200.129,67.119.178.208,67.227.12.5,68.47.247.99,68.76.186.67,68.97.15.56,68.146.69.171
.22,2
0
7.206.201.250,208.188.24.184,209.122.233.245,210.3.135.230,210.54.85.168,210.169.237.132,210.186.123.51,210.187.228.241,210.201.230.58,211.24.46.39,211.104.217.229,211.173.53.3,212.162.76.197,213.152.72.122,216.68.94.36,216.176.89.162,217.42.54.38,217.233.45.74,218.47.249.76,218.117.84.68,218.244.59.57,219.92.84.8,219.93.108.1,219.95.1.248,219.95.164.90,219.109.238.196,219.252.67.13,220.201.96.35,221.232.160.6}
| 0.0258759
(1 row)
Time: 4.600 ms
select count(*) from cust_ulog ;
count
---------
9461332
(1 row)
Time: 46468.211 ms
fwlogs=# explain select count(distinct(ip_saddr)) from cust_ulog;
QUERY PLAN
---------------------------------------------------------------------------
Aggregate (cost=285057.11..285057.11 rows=1 width=11)
-> Seq Scan on cust_ulog (cost=0.00..261945.29 rows=9244729 width=11)
(2 rows)
Time: 1.087 ms
fwlogs=# select count(distinct(ip_saddr)) from cust_ulog;
count
---------
1391110
(1 row)
Time: 226998.021 ms
***Version 7.3.4 statistics:
select * from pg_stats where tablename ='cust_ulog' and attname='ip_saddr'
schemaname | tablename | attname | null_frac | avg_width | n_distinct |
most_common_vals
| most_common_freqs |
histogram_bounds | correlation
------------+-----------+----------+-----------+-----------+------------+-----------------------------------------------------------------------------------------------------------------------------------------------------+-----------------------------------------------------------------------------------+------------------------------------------------------------------------------------------------------------------------------------------------------+-------------
public | hlbb_ulog | ip_saddr | 0 | 12 | 1643 |
{202.185.236.30,202.185.236.6,202.185.236.11,202.185.236.25,202.185.236.7,203.121.96.19,202.185.236.23,192.228.128.20,202.185.236.47,202.188.0.133}
|
{0.238333,0.091,0.0536667,0.037,0.035,0.0163333,0.015,0.012,0.0103333,0.00966667}
|
{4.2.49.3,61.6.158.105,66.44.2.9,74.16.241.113,192.228.128.11,202.108.249.21,202.185.236.34,203.69.19.3,209.112.0.54,216.87.206.244,221.199.145.131}
| 0.0245979
---------------------------(end of broadcast)---------------------------
TIP 5: Have you checked our extensive FAQ?
http://www.postgresql.org/docs/faqs/FAQ.html