Thread: Bug#48582: psql spends hours computing results it already knows (fwd)

Bug#48582: psql spends hours computing results it already knows (fwd)

From
"Oliver Elphick"
Date:
This is a Debian bug report, which needs upstream attention.

------- Forwarded Message

Date:    Thu, 28 Oct 1999 13:45:18 -0400
From:    Brian Ristuccia <brianr@osiris.978.org>
To:      submit@bugs.debian.org
Subject: Bug#48582: psql spends hours computing results it already knows

Package: postgresql
Version: 6.5.2-3

massive_db=> explain select count(*) from huge_table;
NOTICE:  QUERY PLAN:

Aggregate  (cost=511.46 rows=9923 width=12)
  ->  Seq Scan on huge_table  (cost=511.46 rows=9923 width=12)

EXPLAIN

If huge_table really is huge -- like 9,000,000 rows instead of 9,923 -- then
even though PostgreSQL already knows the number of rows (that's how it
determines the cost), it proceeds to do a very long and CPU/IO-intensive
seq scan to determine the count().

- -- 
Brian Ristuccia
brianr@osiris.978.org
bristucc@nortelnetworks.com
bristucc@cs.uml.edu


------- End of Forwarded Message


--      Vote against SPAM: http://www.politik-digital.de/spam/
                 ========================================
Oliver Elphick                                Oliver.Elphick@lfix.co.uk
Isle of Wight                              http://www.lfix.co.uk/oliver
               PGP key from public servers; key ID 32B8FAA1
                 ========================================
     "Cast thy burden upon the LORD, and he shall sustain thee;
      he shall never allow the righteous to fall."
                                                     Psalms 55:22




Re: [HACKERS] Bug#48582: psql spends hours computing results it already knows (fwd)

From
"Ross J. Reedstrom"
Date:
Oliver (& Brian) - 
Hmm, that happens not to be the case. The rows=XXXX number is drawn
from the statistics for the table, which are only updated on VACUUM
ANALYZE of that table. Easily tested: just INSERT a couple of rows and do
the EXPLAIN again. The rows=XXXX won't change. Ah, here's an example:
a table I've never vacuumed until now (it's in a test copy of a db):

test=> select count(*) from "Personnel";
count
-----
  177
(1 row)

test=> explain select count(*) from "Personnel";
NOTICE:  QUERY PLAN:

Aggregate  (cost=43.00 rows=1000 width=4)
  ->  Seq Scan on Personnel  (cost=43.00 rows=1000 width=4)

EXPLAIN
test=> vacuum analyze "Personnel";
VACUUM
test=> explain select count(*) from "Personnel";
NOTICE:  QUERY PLAN:

Aggregate  (cost=7.84 rows=177 width=4)
  ->  Seq Scan on Personnel  (cost=7.84 rows=177 width=4)

EXPLAIN
test=> 

Ross

On Thu, Oct 28, 1999 at 08:53:31PM +0100, Oliver Elphick wrote:
> This is a Debian bug report, which needs upstream attention.
> 
> ------- Forwarded Message
> 
> Date:    Thu, 28 Oct 1999 13:45:18 -0400
> From:    Brian Ristuccia <brianr@osiris.978.org>
> To:      submit@bugs.debian.org
> Subject: Bug#48582: psql spends hours computing results it already knows
> 
> Package: postgresql
> Version: 6.5.2-3
> 
> massive_db=> explain select count(*) from huge_table;
> NOTICE:  QUERY PLAN:
> 
> Aggregate  (cost=511.46 rows=9923 width=12)
>   ->  Seq Scan on huge_table  (cost=511.46 rows=9923 width=12)
> 
> EXPLAIN
> 
> If huge_table really is huge -- like 9,000,000 rows instead of 9,923 -- then
> even though PostgreSQL already knows the number of rows (that's how it
> determines the cost), it proceeds to do a very long and CPU/IO-intensive
> seq scan to determine the count().
> 
> - -- 
> Brian Ristuccia
> brianr@osiris.978.org
> bristucc@nortelnetworks.com
> bristucc@cs.uml.edu
> 


Re: [HACKERS] Bug#48582: psql spends hours computing results it already knows (fwd)

From
Brian Ristuccia
Date:
On Thu, Oct 28, 1999 at 03:32:17PM -0500, Ross J. Reedstrom wrote:
> Oliver (& Brian) - 
> Hmm, that happens not to be the case. The rows=XXXX number is drawn
> from the statistics for the table, which are only updated on VACUUM
> ANALYZE of that table. Easily tested: just INSERT a couple of rows and do
> the EXPLAIN again. The rows=XXXX won't change. Ah, here's an example:
> a table I've never vacuumed until now (it's in a test copy of a db):

Aah. Is there any more efficient way of determining the number of rows in
a table? A sequential scan seems to take forever, but the database must
already have some idea (somewhere) of how many records are in the table;
otherwise, how would it know where to start/stop the sequential scan?
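
A minimal sketch of one approximate answer: the estimate the planner uses
is kept in the pg_class system catalog (the reltuples column) and can be
read directly, at the price of being only as fresh as the table's last
VACUUM:

-- Approximate row count, read from the planner's statistics instead
-- of scanning the table; stale until the next VACUUM [ANALYZE].
SELECT relname, reltuples
FROM pg_class
WHERE relname = 'huge_table';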

> 
> test=> select count(*) from "Personnel";
> count
> -----
>   177
> (1 row)
> 
> test=> explain select count(*) from "Personnel";
> NOTICE:  QUERY PLAN:
> 
> Aggregate  (cost=43.00 rows=1000 width=4)
>   ->  Seq Scan on Personnel  (cost=43.00 rows=1000 width=4)
> 
> EXPLAIN
> test=> vacuum analyze "Personnel";
> VACUUM
> test=> explain select count(*) from "Personnel";
> NOTICE:  QUERY PLAN:
> 
> Aggregate  (cost=7.84 rows=177 width=4)
>   ->  Seq Scan on Personnel  (cost=7.84 rows=177 width=4)
> 
> EXPLAIN
> test=> 
> 
> Ross
> 
> On Thu, Oct 28, 1999 at 08:53:31PM +0100, Oliver Elphick wrote:
> > This is a Debian bug report, which needs upstream attention.
> > 
> > ------- Forwarded Message
> > 
> > Date:    Thu, 28 Oct 1999 13:45:18 -0400
> > From:    Brian Ristuccia <brianr@osiris.978.org>
> > To:      submit@bugs.debian.org
> > Subject: Bug#48582: psql spends hours computing results it already knows
> > 
> > Package: postgresql
> > Version: 6.5.2-3
> > 
> > massive_db=> explain select count(*) from huge_table;
> > NOTICE:  QUERY PLAN:
> > 
> > Aggregate  (cost=511.46 rows=9923 width=12)
> >   ->  Seq Scan on huge_table  (cost=511.46 rows=9923 width=12)
> > 
> > EXPLAIN
> > 
> > If huge_table really is huge -- like 9,000,000 rows instead of 9,923 -- then
> > even though PostgreSQL already knows the number of rows (that's how it
> > determines the cost), it proceeds to do a very long and CPU/IO-intensive
> > seq scan to determine the count().
> > 
> > - -- 
> > Brian Ristuccia
> > brianr@osiris.978.org
> > bristucc@nortelnetworks.com
> > bristucc@cs.uml.edu
> > 

-- 
Brian Ristuccia
brianr@osiris.978.org
bristucc@nortelnetworks.com
bristucc@cs.uml.edu


Re: [HACKERS] Bug#48582: psql spends hours computing results it already knows (fwd)

From
Tom Lane
Date:
"Ross J. Reedstrom" <reedstrm@wallace.ece.rice.edu> writes:
> Hmm, that happens not to be the case. The rows=XXXX number is drawn
> from the statistics for the table, which are only updated on VACUUM
> ANALYZE of that table. Easily tested: just INSERT a couple of rows and do
> the EXPLAIN again. The rows=XXXX won't change.

The short answer to this is that maintaining a perfectly accurate tuple
count on-the-fly would almost certainly cost more, totalled over all
operations that modify a table, than we could ever hope to make back
by short-circuiting "select count(*)" operations.  (Consider
concurrent transactions running in multiple backends, some of which
may abort instead of committing, and others of which may already have
committed but your transaction is not supposed to be able to see their
effects...)
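
For concreteness, here is roughly what such on-the-fly maintenance would
look like if done at user level; a hedged sketch in present-day PostgreSQL
syntax (not what 6.5 offered), with hypothetical names, illustrating
exactly the cost described above: every modification pays for an extra
UPDATE, and concurrent writers serialize on the counter row's lock.

-- Hypothetical counter table: one row per tracked table.
CREATE TABLE row_counts (tbl name PRIMARY KEY, n bigint NOT NULL);
INSERT INTO row_counts VALUES ('huge_table', 0);

-- Each INSERT/DELETE on huge_table pays for one extra UPDATE here;
-- an aborted transaction rolls its adjustment back automatically.
CREATE FUNCTION track_count() RETURNS trigger AS $$
BEGIN
  IF TG_OP = 'INSERT' THEN
    UPDATE row_counts SET n = n + 1 WHERE tbl = TG_TABLE_NAME;
  ELSE  -- TG_OP = 'DELETE'
    UPDATE row_counts SET n = n - 1 WHERE tbl = TG_TABLE_NAME;
  END IF;
  RETURN NULL;  -- the return value of an AFTER ROW trigger is ignored
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER count_huge_table
AFTER INSERT OR DELETE ON huge_table
FOR EACH ROW EXECUTE PROCEDURE track_count();

Now "select count(*)" can be answered from row_counts, but any two
transactions that both modify huge_table block each other on the same
row_counts row until the first one commits.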

The optimizer is perfectly happy with approximate tuple counts, so it
makes do with stats recorded at the last VACUUM.

This has been discussed quite recently on pg-hackers; see the archives
for more info.
        regards, tom lane


Re: [HACKERS] Bug#48582: psql spends hours computing results it already knows (fwd)

From
Brian E Gallew
Date:
Then <tgl@sss.pgh.pa.us> spoke up and said:
> The short answer to this is that maintaining a perfectly accurate tuple
> count on-the-fly would almost certainly cost more, totalled over all
> operations that modify a table, than we could ever hope to make back
> by short-circuiting "select count(*)" operations.  (Consider
> concurrent transactions running in multiple backends, some of which
> may abort instead of committing, and others of which may already have
> committed but your transaction is not supposed to be able to see their
> effects...)

So, does the planner allow counting from a unique index (if one
exists)?  In general, an index scan on a unique index should be faster
than a table scan.  Of course, I'm sure someone already thought of this...

-- 
=====================================================================
| JAVA must have been developed in the wilds of West Virginia.      |
| After all, why else would it support only single inheritance??    |
=====================================================================
| Finger geek@cmu.edu for my public key.                            |
=====================================================================

Re: [HACKERS] Bug#48582: psql spends hours computing results it already knows (fwd)

From
Tom Lane
Date:
Brian E Gallew <geek+@cmu.edu> writes:
> So, does the planner allow counting from a unique index (if one
> exists)?  In general, an index scan on a unique index should be faster
> than a table scan.  Of course, I'm sure someone already thought of this...

Vadim will have to check me on this, but I believe that index entries
don't contain transaction information --- that is, you can determine
whether a tuple matches a specified search key by examining the index,
but in order to discover whether the tuple is actually *valid*
(according to your transaction's worldview) you must fetch the tuple
itself from the main table.  So scanning an index cannot be cheaper than
a sequential scan of the main table, except when the index allows you to
avoid visiting most of the tuples in the main table.

There has been some discussion of allowing scans of indexes without
fetching the underlying tuples, but AFAICS that would mean replicating
the tuple transaction status information into (each!) index, which'd
be a big hit in both disk space and number of disk writes implied by
committing a tuple.  I've got my doubts about it being a win...
        regards, tom lane
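
A hedged postscript on how this trade-off was eventually handled: much
later releases (PostgreSQL 9.2 and up) avoid replicating transaction
status into each index by keeping a separate per-page visibility map,
maintained by VACUUM, so an index scan can skip heap fetches for pages
already known to be visible to all transactions:

-- Sketch for modern PostgreSQL (9.2+); huge_table and its column "id"
-- are hypothetical, as in the original report.
CREATE INDEX huge_table_idx ON huge_table (id);
VACUUM huge_table;      -- also sets visibility-map bits for stable pages
EXPLAIN SELECT count(*) FROM huge_table;
-- May now report an "Index Only Scan", visiting the heap only for
-- pages not yet marked all-visible.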


Re: [HACKERS] Bug#48582: psql spends hours computing results it already knows (fwd)

From
Brian E Gallew
Date:
Then <tgl@sss.pgh.pa.us> spoke up and said:
> Vadim will have to check me on this, but I believe that index entries
> don't contain transaction information --- that is, you can determine
> whether a tuple matches a specified search key by examining the index,
> but in order to discover whether the tuple is actually *valid*
> (according to your transaction's worldview) you must fetch the tuple
> itself from the main table.  So scanning an index cannot be cheaper than
> a sequential scan of the main table, except when the index allows you to
> avoid visiting most of the tuples in the main table.

Right.  As usual, I've overlooked something obvious.  So, this really
wouldn't work unless we had an exclusive table lock ('cause then there
wouldn't be any other transactions to worry about, except for our
own).  Feh.

-- 
=====================================================================
| JAVA must have been developed in the wilds of West Virginia.      |
| After all, why else would it support only single inheritance??    |
=====================================================================
| Finger geek@cmu.edu for my public key.                            |
=====================================================================