Re: [HACKERS] Bug#48582: psql spends hours computing results it already knows (fwd) - Mailing list pgsql-hackers

From Brian Ristuccia
Subject Re: [HACKERS] Bug#48582: psql spends hours computing results it already knows (fwd)
Date
Msg-id 19991028163553.B3255@osiris.978.org
Whole thread Raw
In response to Re: [HACKERS] Bug#48582: psql spends hours computing results it already knows (fwd)  ("Ross J. Reedstrom" <reedstrm@wallace.ece.rice.edu>)
List pgsql-hackers
On Thu, Oct 28, 1999 at 03:32:17PM -0500, Ross J. Reedstrom wrote:
> Oliver (& Brian) - 
> Hmm, that happens to not be the case. The rows=XXXX number is drawn
> from the statistics for the table, which are only updated on VACUUM
> ANALYZE of that table. Easily tested: just INSERT a couple rows and do
> the EXPLAIN again. The rows=XXX won't change. Ah, here's an example:
> a table I've never vacuumed, until now (it's in a test copy of a db)

Aah.. Is there any other more efficient way of determining the number of
rows in a table? It seems a sequential scan takes forever, but the database
must already have some idea (somewhere) of how many records are in the table
otherwise how would it know where to start/stop the sequential scan? 

> 
> test=> select count(*) from "Personnel";
> count
> -----
>   177
> (1 row)
> 
> test=> explain select count(*) from "Personnel";
> NOTICE:  QUERY PLAN:
> 
> Aggregate  (cost=43.00 rows=1000 width=4)
>   ->  Seq Scan on Personnel  (cost=43.00 rows=1000 width=4)
> 
> EXPLAIN
> test=> vacuum analyze "Personnel";
> VACUUM
> test=> explain select count(*) from "Personnel";
> NOTICE:  QUERY PLAN:
> 
> Aggregate  (cost=7.84 rows=177 width=4)
>   ->  Seq Scan on Personnel  (cost=7.84 rows=177 width=4)
> 
> EXPLAIN
> test=> 
> 
> Ross
> 
> On Thu, Oct 28, 1999 at 08:53:31PM +0100, Oliver Elphick wrote:
> > This is a Debian bug report, which needs upstream attention.
> > 
> > ------- Forwarded Message
> > 
> > Date:    Thu, 28 Oct 1999 13:45:18 -0400
> > From:    Brian Ristuccia <brianr@osiris.978.org>
> > To:      submit@bugs.debian.org
> > Subject: Bug#48582: psql spends hours computing results it already knows
> > 
> > Package: postgresql
> > Version: 6.5.2-3
> > 
> > massive_db=> explain select count(*) from huge_table;
> > NOTICE:  QUERY PLAN:
> > 
> > Aggregate  (cost=511.46 rows=9923 width=12)
> >   ->  Seq Scan on huge_table  (cost=511.46 rows=9923 width=12)
> > 
> > EXPLAIN
> > 
> > If huge_table really is huge -- like 9,000,000 rows instead of 9923, after
> > postgresql already knows the number of rows (that's how it determines the
> > cost), it proceeds to do a very long and CPU/IO intensive seq scan to
> > determine the count().
> > 
> > - -- 
> > Brian Ristuccia
> > brianr@osiris.978.org
> > bristucc@nortelnetworks.com
> > bristucc@cs.uml.edu
> > 

-- 
Brian Ristuccia
brianr@osiris.978.org
bristucc@nortelnetworks.com
bristucc@cs.uml.edu


pgsql-hackers by date:

Previous
From: "Ross J. Reedstrom"
Date:
Subject: Re: [HACKERS] Bug#48582: psql spends hours computing results it already knows (fwd)
Next
From: Tom Lane
Date:
Subject: Re: [HACKERS] psql Week 4.142857