Thread: stats are way off on 8.4 b1

stats are way off on 8.4 b1

From
Grzegorz Jaśkiewicz
Date:
crawler=# select *  from assigments;
 jobid | timeout | workerid
-------+---------+----------
(0 rows)

Time: 0.705 ms
crawler=# \d+ assigments
                                           Table "public.assigments"
  Column  |           Type           |                    Modifiers
                | Storage | Description
----------+--------------------------+-------------------------------------------------+---------+-------------
 jobid    | bigint                   | not null
                | plain   |
 timeout  | timestamp with time zone | not null default (now() +
'00:02:00'::interval) | plain   |
 workerid | bigint                   | not null
                | plain   |
Indexes:
    "assigments_pkey" PRIMARY KEY, btree (jobid)
Foreign-key constraints:
    "assigments_jobid_fkey" FOREIGN KEY (jobid) REFERENCES jobs(id)
MATCH FULL ON UPDATE CASCADE ON DELETE CASCADE
Has OIDs: no

crawler=# \d+ jobs
                                                Table "public.jobs"
   Column   |           Type           |                     Modifiers
                    | Storage | Description
------------+--------------------------+---------------------------------------------------+---------+-------------
 id         | bigint                   | not null default
nextval('jobs_id_seq'::regclass) | plain   |
 domainid   | bigint                   | not null
                    | plain   |
 priority   | smallint                 | not null default 1
                    | plain   |
 added      | timestamp with time zone | not null default now()
                    | plain   |
 notify_end | boolean                  | not null default false
                    | plain   |
Indexes:
    "jobs_pkey" PRIMARY KEY, btree (domainid)
    "job_id_uidx" UNIQUE, btree (id)
    "foo" btree (notify_end DESC, priority DESC, added)
    "foo_bar" btree (notify_end, priority, added)
    "jobs_worker_priority_on_jobs" btree (calc_prio(notify_end,
priority, added))
Foreign-key constraints:
    "jobs_domain_id_fkey" FOREIGN KEY (domainid) REFERENCES
domains(id) MATCH FULL ON UPDATE CASCADE ON DELETE CASCADE DEFERRABLE
Referenced by:
  "assigments_jobid_fkey" IN assigments FOREIGN KEY (jobid) REFERENCES
jobs(id) MATCH FULL ON UPDATE CASCADE ON DELETE CASCADE
Has OIDs: no

crawler=# explain analyze select * from full_assigments_view;
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=0.00..11040.77 rows=1510 width=31) (actual
time=0.003..0.003 rows=0 loops=1)
   ->  Nested Loop  (cost=0.00..10410.97 rows=1510 width=24) (actual
time=0.002..0.002 rows=0 loops=1)
         ->  Seq Scan on assigments a  (cost=0.00..25.10 rows=1510
width=16) (actual time=0.002..0.002 rows=0 loops=1)
         ->  Index Scan using job_id_uidx on jobs j  (cost=0.00..6.87
rows=1 width=16) (never executed)
               Index Cond: (j.id = a.jobid)
   ->  Index Scan using domains_id_idx on domains d  (cost=0.00..0.40
rows=1 width=19) (never executed)
         Index Cond: (d.id = j.domainid)
 Total runtime: 0.123 ms
(8 rows)

Time: 1.390 ms

             View "public.full_assigments_view"
   Column    |  Type   | Modifiers | Storage  | Description
-------------+---------+-----------+----------+-------------
 domain_name | text    |           | extended |
 job_id      | bigint  |           | plain    |
 timed_out   | boolean |           | plain    |
View definition:
 SELECT d.name AS domain_name, j.id AS job_id, (now() - a.timeout) >
'00:00:00'::interval AS timed_out
   FROM assigments a
   JOIN jobs j ON a.jobid = j.id
   JOIN domains d ON d.id = j.domainid;


default_statistics_target=100
all the other settings are pretty much default,

That expected 1510 rows in 'assigments' seems to be pretty off,
especially since I just vacuumed/analyze the db.
Any ideas ?


--
GJ

Re: stats are way off on 8.4 b1

From
Tom Lane
Date:
=?UTF-8?Q?Grzegorz_Ja=C5=9Bkiewicz?= <gryzman@gmail.com> writes:
> That expected 1510 rows in 'assigments' seems to be pretty off,

The planner does not trust an empty table to stay empty.  Every
Postgres version in living memory has acted like that; it's not
new to 8.4.

            regards, tom lane

Re: stats are way off on 8.4 b1

From
Grzegorz Jaśkiewicz
Date:
2009/4/18 Tom Lane <tgl@sss.pgh.pa.us>:
> =?UTF-8?Q?Grzegorz_Ja=C5=9Bkiewicz?= <gryzman@gmail.com> writes:
>> That expected 1510 rows in 'assigments' seems to be pretty off,
>
> The planner does not trust an empty table to stay empty.  Every
> Postgres version in living memory has acted like that; it's not
> new to 8.4.

ok, thanks
Quick question Tom. Can correlation be negative ?


--
GJ

Re: stats are way off on 8.4 b1

From
Heikki Linnakangas
Date:
Grzegorz Jaśkiewicz wrote:
> Can correlation be negative ?

Yes, if the data in the column are in descending order. For example:

postgres=# CREATE TABLE foo(id int4);
CREATE TABLE
postgres=# INSERT INTO foo SELECT 1000 - generate_series(1, 1000);
INSERT 0 1000
postgres=# ANALYZE foo;
ANALYZE
postgres=# SELECT attname, correlation FROM pg_stats  WHERE tablename='foo';
  attname | correlation
---------+-------------
  id      |          -1
(1 row)

--
   Heikki Linnakangas
   EnterpriseDB   http://www.enterprisedb.com

Re: stats are way off on 8.4 b1

From
Grzegorz Jaśkiewicz
Date:
2009/4/18 Heikki Linnakangas <heikki.linnakangas@enterprisedb.com>:
> Grzegorz Jaśkiewicz wrote:
>>
>> Can correlation be negative ?
>
> Yes, if the data in the column are in descending order. For example:
>
> postgres=# CREATE TABLE foo(id int4);
> CREATE TABLE
> postgres=# INSERT INTO foo SELECT 1000 - generate_series(1, 1000);
> INSERT 0 1000
> postgres=# ANALYZE foo;
> ANALYZE
> postgres=# SELECT attname, correlation FROM pg_stats  WHERE tablename='foo';
>  attname | correlation
> ---------+-------------
>  id      |          -1
> (1 row)

aye, thanks.


--
GJ