Re: Various performance questions - Mailing list pgsql-performance

From Dror Matalon
Subject Re: Various performance questions
Date
Msg-id 20031027174019.GJ2979@rlx11.zapatec.com
Whole thread Raw
In response to Re: Various performance questions  (Christopher Browne <cbbrowne@acm.org>)
List pgsql-performance
On Mon, Oct 27, 2003 at 07:52:06AM -0500, Christopher Browne wrote:
> In the last exciting episode, dror@zapatec.com (Dror Matalon) wrote:
> > I was answering an earlier response that suggested that maybe the actual
> > counting took time so it would take quite a bit longer when there are
> > more rows to count.
>
> Well, if a "where clause" allows the system to use an index to search
> for the subset of elements, that would reduce the number of pages that
> have to be examined, thereby diminishing the amount of work.
>
> Why don't you report what EXPLAIN ANALYZE returns as output for the
> query with WHERE clause?  That would allow us to get more of an idea
> of what is going on...


Here it is once again, and I've added another data poing "channel <
1000" which takes even less time than channel < 5000. It almost seems
like the optimizer knows that it can skip certain rows "rows=4910762" vs
"rows=1505605" . But how can it do that without using an index or
actually looking at each row?

zp1936=> EXPLAIN ANALYZE select count(*) from items;
                                                        QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=245044.53..245044.53 rows=1 width=0) (actual time=55806.893..55806.897 rows=1 loops=1)
   ->  Seq Scan on items  (cost=0.00..232767.62 rows=4910762 width=0)
(actual time=0.058..30481.482 rows=4910762 loops=1)
 Total runtime: 55806.992 ms
(3 rows)

zp1936=> EXPLAIN ANALYZE select count(*) from items where channel < 5000;
                                                        QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=248808.54..248808.54 rows=1 width=0) (actual time=26071.264..26071.269 rows=1 loops=1)
   ->  Seq Scan on items  (cost=0.00..245044.52 rows=1505605 width=0)
(actual time=0.161..17623.033 rows=1632057 loops=1)
         Filter: (channel < 5000)
 Total runtime: 26071.361 ms
(4 rows)

zp1936=> EXPLAIN ANALYZE select count(*) from items where channel < 1000;
                                                      QUERY PLAN
-----------------------------------------------------------------------------------------------------------------------
 Aggregate  (cost=245429.74..245429.74 rows=1 width=0) (actual time=10225.272..10225.276 rows=1 loops=1)
   ->  Seq Scan on items  (cost=0.00..245044.52 rows=154085 width=0) (actual time=7.633..10083.246 rows=25687 loops=1)
         Filter: (channel < 1000)
 Total runtime: 10225.373 ms
(4 rows)


> --
> (format nil "~S@~S" "cbbrowne" "acm.org")
> http://www3.sympatico.ca/cbbrowne/spiritual.html
> When  replying, it  is  often possible  to cleverly edit  the original
> message in such a way  as to subtly alter  its meaning or tone to your
> advantage while  appearing that you are  taking pains  to preserve the
> author's intent.   As a   bonus,   it will   seem that your   superior
> intellect is cutting through all the excess verbiage to the very heart
> of the matter.  -- from the Symbolics Guidelines for Sending Mail
>
> ---------------------------(end of broadcast)---------------------------
> TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org

--
Dror Matalon
Zapatec Inc
1700 MLK Way
Berkeley, CA 94709
http://www.zapatec.com

pgsql-performance by date:

Previous
From: Dror Matalon
Date:
Subject: Re: Various performance questions
Next
From: Greg Stark
Date:
Subject: Re: vacuum locking