Thread: Stats on new tables

Stats on new tables

From
Michael Richards
Date:
I think I've found something of interest.
When I create a new table, the plan for it believes that the table
contains 2 rows, not 0. Also, since the engine knows when we run an
insert, and also knows how many rows a delete nukes, wouldn't it be better
to update the stats after every select and delete?

-Michael


equipment=> CREATE TABLE xuserid(
equipment->   oldid varchar(12),
equipment->   newid int4,
equipment->   PRIMARY KEY (oldid,newid)
equipment-> );
NOTICE:  CREATE TABLE/PRIMARY KEY will create implicit index
'xuserid_pkey' for table 'xuserid'
CREATE
equipment=> \d xuserid_pkey
Table    = xuserid_pkey
+----------------------------------+----------------------------------+-------+
|              Field               |              Type                |
Length|
+----------------------------------+----------------------------------+-------+
| oldid                            | varchar()                        |
12 |
| newid                            | int4                             |
4 |
+----------------------------------+----------------------------------+-------+
equipment=> explain select * from xuserid where newid=859;
NOTICE:  QUERY PLAN:

Seq Scan on xuserid  (cost=43.00 rows=2 width=16)

EXPLAIN
equipment=> vacuum analyze;
VACUUM
equipment=> explain select * from xuserid where newid=859;
NOTICE:  QUERY PLAN:

Seq Scan on xuserid  (cost=0.00 rows=0 width=16)

EXPLAIN



Re: [SQL] Stats on new tables

From
Tom Lane
Date:
Michael Richards <miker@scifair.acadiau.ca> writes:
> I think I've found something of interest.
> When I create a new table, the plan for it believes that the table
> contains 2 rows, not 0.

No, actually the initial default assumption under 6.5 & up is that
the table contains 1000 rows.  (It *used* to be that relntuples started
off zero, but that led the optimizer to pick plans that were only
suitable for very small tables, which led to horrible performance if
you loaded up a table without doing a vacuum.)

The particular example you're looking at involves a default assumption
about the selectivity of an '=' WHERE condition as well as a default
assumption about the total table size.  The 'rows' field of a plan
node shows the estimated number of *output* tuples, not the number of
tuples that need to be scanned.


> Also, since the engine knows when we run an
> insert, and also knows how many rows a delete nukes, wouldn't it be better
> to update the stats after every select and delete?

A lot easier said than done (consider concurrent transactions some
of which may abort).  We've discussed making it happen, but personally
I don't believe that the bookkeeping costs that would be incurred could
possibly be justified.  Certainly the optimizer's results wouldn't be
materially better with exact tuple counts than with approximate ones.
        regards, tom lane


Re: [SQL] Stats on new tables

From
Bruce Momjian
Date:
> I think I've found something of interest.
> When I create a new table, the plan for it believes that the table
> contains 2 rows, not 0. Also, since the engine knows when we run an
> insert, and also knows how many rows a delete nukes, wouldn't it be better
> to update the stats after every select and delete?

We believe the overhead of that would be high vs. the benefits.

> 
> -Michael
> 
> 
> equipment=> CREATE TABLE xuserid(
> equipment->   oldid varchar(12),
> equipment->   newid int4,
> equipment->   PRIMARY KEY (oldid,newid)
> equipment-> );
> NOTICE:  CREATE TABLE/PRIMARY KEY will create implicit index
> 'xuserid_pkey' for table 'xuserid'
> CREATE
> equipment=> \d xuserid_pkey
> Table    = xuserid_pkey
> +----------------------------------+----------------------------------+-------+
> |              Field               |              Type                |
> Length|
> +----------------------------------+----------------------------------+-------+
> | oldid                            | varchar()                        |
> 12 |
> | newid                            | int4                             |
> 4 |
> +----------------------------------+----------------------------------+-------+
> equipment=> explain select * from xuserid where newid=859;
> NOTICE:  QUERY PLAN:
> 
> Seq Scan on xuserid  (cost=43.00 rows=2 width=16)
> 
> EXPLAIN
> equipment=> vacuum analyze;
> VACUUM
> equipment=> explain select * from xuserid where newid=859;
> NOTICE:  QUERY PLAN:
> 
> Seq Scan on xuserid  (cost=0.00 rows=0 width=16)
> 
> EXPLAIN
> 
> 
> ************
> 
> 


--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026