Thread: PERFORMANCE and SIZE

PERFORMANCE and SIZE

From
"Alfranio Junior"
Date:
 Hello,

I'm a new PostgreSQL user and I don't know much yet about the
 performance mechanisms currently implemented and available.

 So, as a novice user, I think that something strange is happening.
 When I run the following command:

 explain analyze select * from customer
       where c_last = 'ROUGHTATION' and
       c_w_id = 1 and
       c_d_id = 1
       order by c_w_id, c_d_id, c_last, c_first limit 1;

 I receive the following results:

 (Customer table with 60,000 rows) -
                                                              QUERY PLAN
 ----------------------------------------------------------------------------------------------------------------------------------------
   Limit  (cost=4.84..4.84 rows=1 width=283) (actual time=213.13..213.13 rows=0 loops=1)
     ->  Sort  (cost=4.84..4.84 rows=1 width=283) (actual time=213.13..213.13 rows=0 loops=1)
           Sort Key: c_w_id, c_d_id, c_last, c_first
           ->  Index Scan using pk_customer on customer  (cost=0.00..4.83 rows=1 width=283) (actual time=211.93..211.93 rows=0 loops=1)
                 Index Cond: ((c_w_id = 1) AND (c_d_id = 1))
                 Filter: (c_last = 'ROUGHTATION'::bpchar)
   Total runtime: 213.29 msec
 (7 rows)


 (Customer table with 360,000 rows) -
                                                              QUERY PLAN
 ----------------------------------------------------------------------------------------------------------------------------------------
   Limit  (cost=11100.99..11101.00 rows=1 width=638) (actual time=20.82..20.82 rows=0 loops=1)
     ->  Sort  (cost=11100.99..11101.00 rows=4 width=638) (actual time=20.81..20.81 rows=0 loops=1)
           Sort Key: c_w_id, c_d_id, c_last, c_first
           ->  Index Scan using pk_customer on customer  (cost=0.00..11100.95 rows=4 width=638) (actual time=20.40..20.40 rows=0 loops=1)
                 Index Cond: ((c_w_id = 1) AND (c_d_id = 1))
                 Filter: (c_last = 'ROUGHTATION'::bpchar)
   Total runtime: 21.11 msec
 (7 rows)

 As the number of rows increases, the total runtime decreases.
 The customer table has the following structure:
 CREATE TABLE customer
 (
      c_id int NOT NULL ,
      c_d_id int4 NOT NULL ,
      c_w_id int4 NOT NULL ,
      c_first char (16) NULL ,
      c_middle char (2) NULL ,
      c_last char (16) NULL ,
      c_street_1 char (20) NULL ,
      c_street_2 char (20) NULL ,
      c_city char (20) NULL ,
      c_state char (2) NULL ,
      c_zip char (9) NULL ,
      c_phone char (16) NULL ,
      c_since timestamp NULL ,
      c_credit char (2) NULL ,
      c_credit_lim numeric(12, 2) NULL ,
      c_discount numeric(4, 4) NULL ,
      c_balance numeric(12, 2) NULL ,
      c_ytd_payment numeric(12, 2) NULL ,
      c_payment_cnt int4 NULL ,
      c_delivery_cnt int4 NULL ,
      c_data text NULL
 );

 ALTER TABLE customer  ADD
      CONSTRAINT PK_customer PRIMARY KEY
      (
          c_w_id,
          c_d_id,
          c_id
      );

 Does anybody know what is happening?


 Thanks !!!!

 Alfranio Junior


Re: PERFORMANCE and SIZE

From
Josh Berkus
Date:
Alfranio,

> I'm a new PostgreSQL user and I don't know much yet about the
>  performance mechanisms currently implemented and available.
<snip>
>  Does anybody know what is happening ?

90% likely:  You haven't run VACUUM FULL ANALYZE in a while.
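
For example, from psql, something like the following (the whole-database
form, or just the one table):

    -- reclaim dead space and refresh the planner's statistics
    VACUUM FULL ANALYZE;

    -- or, for a single table
    VACUUM FULL ANALYZE customer;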

--
Josh Berkus
Aglio Database Solutions
San Francisco


Re: PERFORMANCE and SIZE

From
"Nikolaus Dilger"
Date:
Alfranio Junior,

99% likely:  You ran the second query after the first,
and the 4 result rows were already stored in memory.
The first execution took longer because the database
had to go to disk, after looking up in the index
which rows to get.  I further assume that the index was
already in memory for both queries, since you most
likely had just built it.
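
A quick way to see that caching effect is to run the identical statement
twice in the same session and compare the actual times:

    -- run this twice in a row; the second run is usually much faster
    -- because the index and heap pages it touches are still in memory
    explain analyze select * from customer
          where c_last = 'ROUGHTATION' and c_w_id = 1 and c_d_id = 1
          order by c_w_id, c_d_id, c_last, c_first limit 1;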

Of course you also need to vacuum on a regular basis
in order to have up-to-date statistics.

Regards,
Nikolaus Dilger


On Mon, 12 May 2003 12:35:24 -0700, "Alfranio Junior" wrote:

> I'm a new PostgreSQL user and I don't know much yet about the
>  performance mechanisms currently implemented and available.
<snip>
>  Does anybody know what is happening?


Re: PERFORMANCE and SIZE

From
"Alfranio Junior"
Date:
Josh,

I ran vacuumdb as follows:

vacuumdb -f -v -e -a

and after that:

vacuumdb -z -v -e -a

And now, the optimizer started to use a sequential scan, which gives me:

explain analyze select * from customer
      where c_last = 'ROUGHTATION' and
      c_w_id = 1 and
      c_d_id = 1
      order by c_w_id, c_d_id, c_last, c_first limit 1;

                                                            QUERY PLAN
 --------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=6302.03..6302.03 rows=1 width=639) (actual time=208.33..208.33 rows=0 loops=1)
    ->  Sort  (cost=6302.03..6302.04 rows=3 width=639) (actual time=208.32..208.32 rows=0 loops=1)
          Sort Key: c_w_id, c_d_id, c_last, c_first
          ->  Seq Scan on customer  (cost=0.00..6302.00 rows=3 width=639) (actual time=207.99..207.99 rows=0 loops=1)
                Filter: ((c_last = 'ROUGHTATION'::bpchar) AND (c_w_id = 1) AND (c_d_id = 1))
  Total runtime: 208.54 msec
 (6 rows)




When I force the index use, I receive a better result:

set enable_seqscan to off;

explain analyze select * from customer
      where c_last = 'ROUGHTATION' and
      c_w_id = 1 and
      c_d_id = 1
      order by c_w_id, c_d_id, c_last, c_first limit 1;

                                                            QUERY PLAN
 --------------------------------------------------------------------------------------------------------------------------------
  Limit  (cost=9860.03..9860.03 rows=1 width=639) (actual time=13.98..13.98 rows=0 loops=1)
    ->  Sort  (cost=9860.03..9860.04 rows=3 width=639) (actual time=13.98..13.98 rows=0 loops=1)
          Sort Key: c_w_id, c_d_id, c_last, c_first
          ->  Index Scan using pk_customer on customer  (cost=0.00..9860.00 rows=3 width=639) (actual time=13.86..13.86 rows=0 loops=1)
                Index Cond: ((c_w_id = 1) AND (c_d_id = 1))
                Filter: (c_last = 'ROUGHTATION'::bpchar)
  Total runtime: 14.11 msec
 (7 rows)

Is this the only way to force the index?
What are the reasons for the optimizer to choose the worse plan?

> Alfranio,
>
> > I'm a new PostgreSQL user and I don't know much yet about the
> >  performance mechanisms currently implemented and available.
> <snip>
> >  Does anybody know what is happening?
>
> 90% likely:  You haven't run VACUUM FULL ANALYZE in a while.


Re: PERFORMANCE and SIZE

From
Josh Berkus
Date:
Alfranio,

> And now, the optimizer started to use a sequential scan, which gives me:

What appears to me to be happening is that the planner has incorrect estimates
of the cost of an index lookup.   The base estimate is contained in the
postgresql.conf parameter:
cpu_index_tuple_cost = 0.001

From the look of things, your disk/array has much better random seek times
than the standard, or you have enough RAM to cache most of your tables.
Either way, I would experiment with lowering the index_tuple_cost to, say,
0.0003 and see if you get better use of indexes.
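
If you want to experiment without editing postgresql.conf right away, the
parameter can also be changed for just the current session, e.g.:

    -- try a cheaper index-tuple cost for this session only
    SET cpu_index_tuple_cost = 0.0003;
    -- then re-run the EXPLAIN ANALYZE above and see whether
    -- the index scan is chosen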

If that does work for you, make sure to check some other queries unrelated to
the "customers" table to make sure that the new setting doesn't mess them up
in some way.

--
Josh Berkus
Aglio Database Solutions
San Francisco


Re: PERFORMANCE and SIZE

From
"scott.marlowe"
Date:
On Tue, 13 May 2003, Josh Berkus wrote:

> What appears to me to be happening is that the planner has incorrect estimates
> of the cost of an index lookup.   The base estimate is contained in the
> postgresql.conf parameter:
> cpu_index_tuple_cost = 0.001
>
> From the look of things, your disk/array has much better random seek times
> than the standard, or you have enough RAM to cache most of your tables.
> Either way, I would experiment with lowering the index_tuple_cost to, say,
> 0.0003 and see if you get better use of indexes.
<snip>

Also, you can lower random_page_cost.  And make sure the query planner has
some idea how much effective cache you have, as it can take that into
account too.  I.e. a machine with 800 Meg of cache is far more likely to
have data in memory than one with 100 Meg.  This is the kernel cache I'm
talking about, by the way.  Note that effective_cache_size is set in 8k blocks.
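
For example (the numbers here are only a starting point and would need to
be tuned for your machine), in a session:

    SET random_page_cost = 2;            -- default is 4; lower it if random reads are mostly cached
    SET effective_cache_size = 102400;   -- in 8k blocks: 102400 * 8k is about 800 Meg of kernel cache

Once you find values that help, put the equivalent lines in postgresql.conf.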


Re: PERFORMANCE and SIZE

From
Bruce Momjian
Date:
I have gotten so much spam, this subject line struck me as spam until I
looked closer.  Did it catch anyone else?

---------------------------------------------------------------------------

Alfranio Junior wrote:
> I'm a new PostgreSQL user and I don't know much yet about the
>  performance mechanisms currently implemented and available.
<snip>
>  Does anybody know what is happening?

--
  Bruce Momjian                        |  http://candle.pha.pa.us
  pgman@candle.pha.pa.us               |  (610) 359-1001
  +  If your life is a hard drive,     |  13 Roberts Road
  +  Christ can be your backup.        |  Newtown Square, Pennsylvania 19073

Re: PERFORMANCE and SIZE

From
Rudi Starcevic
Date:
Ha!

No, it didn't catch me, but yes, my spam has been going through the roof lately.
Over here in Australia it's been in the media a lot of late: spam is on the increase.
Seems like everyone is suffering.

Cheers
RS.




Bruce Momjian wrote:
> I have gotten so much spam, this subject line struck me as spam until I
> looked closer.  Did it catch anyone else?
<snip>