Thread: Performance

Performance

From

"Diego Schvartzman"

Date:

15 May 2000, 14:38:16

Hi all!
I've upgraded to PGSQL 7.0.0 beta 2 from 6.5.3. Unloaded and loaded my db,
vacuumed it, everything ok. I'm not using foreign keys. But now I see that
performance is so much slower !. Is this supposed to be? Do I have to do
something else?

English is my second language. Thanks in advance !

Diego Schvartzman
Email: diego.schvartzman@usa.net
ICQ# 1779434

Re: Performance

From

Alfred Perlstein

Date:

15 May 2000, 15:47:16

* Diego Schvartzman <dschvar@yahoo.com> [000515 12:25] wrote:
> Hi all!
> I've upgraded to PGSQL 7.0.0 beta 2 from 6.5.3. Unloaded and loaded my db,
> vacuumed it, everything ok. I'm not using foreign keys. But now I see that
> performance is so much slower !. Is this supposed to be? Do I have to do
> something else?
>
> English is my second language. Thanks in advance !

A lot of people have said that performance has increased, if you want
any help you'll need to be more specific, give examples of what's
worse now than before.

--
-Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
"I have the heart of a child; I keep it in a jar on my desk."

Re: Performance

From

"Diego Schvartzman"

Date:

15 May 2000, 15:59:15

I have an application via PHP. For example, a SELECT query that must return
one and only one row, with a where clause with and index (I droped it and
created again) that took about 3 seconds (v6.5.3), now (v7.0.0) takes about
15 seconds. Really I don't lnow what is happening. Same hardware, same php
version, etc etc.

Thanks again!

Diego Schvartzman
Email: diego.schvartzman@usa.net
ICQ# 1779434
----- Original Message -----
From: Alfred Perlstein <bright@wintelcom.net>
To: Diego Schvartzman <dschvar@yahoo.com>
Cc: Lista PGSQL <pgsql-general@postgresql.org>
Sent: Monday, May 15, 2000 5:18 PM
Subject: Re: [GENERAL] Performance


> * Diego Schvartzman <dschvar@yahoo.com> [000515 12:25] wrote:
> > Hi all!
> > I've upgraded to PGSQL 7.0.0 beta 2 from 6.5.3. Unloaded and loaded my
db,
> > vacuumed it, everything ok. I'm not using foreign keys. But now I see
that
> > performance is so much slower !. Is this supposed to be? Do I have to do
> > something else?
> >
> > English is my second language. Thanks in advance !
>
> A lot of people have said that performance has increased, if you want
> any help you'll need to be more specific, give examples of what's
> worse now than before.
>
> --
> -Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
> "I have the heart of a child; I keep it in a jar on my desk."
>

Re: Performance

From

Tom Lane

Date:

15 May 2000, 17:20:16

"Diego Schvartzman" <dschvar@yahoo.com> writes:
> I have an application via PHP. For example, a SELECT query that must return
> one and only one row, with a where clause with and index (I droped it and
> created again) that took about 3 seconds (v6.5.3), now (v7.0.0) takes about
> 15 seconds.

Could be that 7.0 is less willing to use the index than 6.5 was.  See
thread "indexes ingnored on simple query in 7.0" over in pgsql-sql for
ways to investigate the problem and one possible solution.

            regards, tom lane

Re: Performance

From

Charles Tassell

Date:

15 May 2000, 17:41:16

Have you done a VACUUM ANALYZE on your database after recreating the index?

At 04:56 PM 5/15/00, Diego Schvartzman wrote:
>I have an application via PHP. For example, a SELECT query that must return
>one and only one row, with a where clause with and index (I droped it and
>created again) that took about 3 seconds (v6.5.3), now (v7.0.0) takes about
>15 seconds. Really I don't lnow what is happening. Same hardware, same php
>version, etc etc.
>
>Thanks again!
>
>Diego Schvartzman
>Email: diego.schvartzman@usa.net
>ICQ# 1779434
>----- Original Message -----
>From: Alfred Perlstein <bright@wintelcom.net>
>To: Diego Schvartzman <dschvar@yahoo.com>
>Cc: Lista PGSQL <pgsql-general@postgresql.org>
>Sent: Monday, May 15, 2000 5:18 PM
>Subject: Re: [GENERAL] Performance
>
>
> > * Diego Schvartzman <dschvar@yahoo.com> [000515 12:25] wrote:
> > > Hi all!
> > > I've upgraded to PGSQL 7.0.0 beta 2 from 6.5.3. Unloaded and loaded my
>db,
> > > vacuumed it, everything ok. I'm not using foreign keys. But now I see
>that
> > > performance is so much slower !. Is this supposed to be? Do I have to do
> > > something else?
> > >
> > > English is my second language. Thanks in advance !
> >
> > A lot of people have said that performance has increased, if you want
> > any help you'll need to be more specific, give examples of what's
> > worse now than before.
> >
> > --
> > -Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
> > "I have the heart of a child; I keep it in a jar on my desk."
> >

Re: Performance

From

Dustin Sallings

Date:

16 May 2000, 04:44:35

On Mon, 15 May 2000, Charles Tassell wrote:

    I ran into this exact problem, and it was *very* significant on a
15M row table I have.  :)  It didn't seem to want to use the index, even
freshly created, without a vacuum analyze.

# Have you done a VACUUM ANALYZE on your database after recreating the index?
#
# At 04:56 PM 5/15/00, Diego Schvartzman wrote:
# >I have an application via PHP. For example, a SELECT query that must return
# >one and only one row, with a where clause with and index (I droped it and
# >created again) that took about 3 seconds (v6.5.3), now (v7.0.0) takes about
# >15 seconds. Really I don't lnow what is happening. Same hardware, same php
# >version, etc etc.
# >
# >Thanks again!
# >
# >Diego Schvartzman
# >Email: diego.schvartzman@usa.net
# >ICQ# 1779434
# >----- Original Message -----
# >From: Alfred Perlstein <bright@wintelcom.net>
# >To: Diego Schvartzman <dschvar@yahoo.com>
# >Cc: Lista PGSQL <pgsql-general@postgresql.org>
# >Sent: Monday, May 15, 2000 5:18 PM
# >Subject: Re: [GENERAL] Performance
# >
# >
# > > * Diego Schvartzman <dschvar@yahoo.com> [000515 12:25] wrote:
# > > > Hi all!
# > > > I've upgraded to PGSQL 7.0.0 beta 2 from 6.5.3. Unloaded and loaded my
# >db,
# > > > vacuumed it, everything ok. I'm not using foreign keys. But now I see
# >that
# > > > performance is so much slower !. Is this supposed to be? Do I have to do
# > > > something else?
# > > >
# > > > English is my second language. Thanks in advance !
# > >
# > > A lot of people have said that performance has increased, if you want
# > > any help you'll need to be more specific, give examples of what's
# > > worse now than before.
# > >
# > > --
# > > -Alfred Perlstein - [bright@wintelcom.net|alfred@freebsd.org]
# > > "I have the heart of a child; I keep it in a jar on my desk."
# > >
#
#

--
dustin sallings                            The world is watching America,
http://2852210114/~dustin/                 and America is watching TV.

Re: Performance

From

"Ross J. Reedstrom"

Date:

16 May 2000, 10:55:34

On Tue, May 16, 2000 at 01:41:48AM -0700, Dustin Sallings wrote:
> On Mon, 15 May 2000, Charles Tassell wrote:
>
>     I ran into this exact problem, and it was *very* significant on a
> 15M row table I have.  :)  It didn't seem to want to use the index, even
> freshly created, without a vacuum analyze.
>

Hmm, if you drop the index, do a VACUUM ANALYZE, then create the index,
it doesn't want to use it? That's be odd, since the statistics are
only kept about the table relations, not the indices themselves. If
you mean it won't use an fresh index on a fresh table, that's the
expected behavior.

VACUUM ANALYZE [tablename] fills in the statistics in pg_statistic
that the optimizer uses when deciding between sequential and index
scans. VACUUM is currently functionally overloaded: a simple VACUUM
recovers storage space in the table files, VACUUM ANALYZE does that as
well as collect statistics. It sometimes feels quicker to do a simple
VACUUM, then a VACUUM ANALYZE.

However, vacuuming a large table with indices on it can take a _long_
time: I've seen the recommendation given to drop indices, vacuum,
then recreate the indices. This is mostly a problem for the space
recovery aspect of vacuum, since each updated or deleted tuple causes
a update/delete to the index, as space is compacted.

Ross
--
Ross J. Reedstrom, Ph.D., <reedstrm@rice.edu>
NSBRI Research Scientist/Programmer
Computer and Information Technology Institute
Rice University, 6100 S. Main St.,  Houston, TX 77005

Re: Performance

From

Tom Lane

Date:

16 May 2000, 11:41:36

"Ross J. Reedstrom" <reedstrm@wallace.ece.rice.edu> writes:
> On Tue, May 16, 2000 at 01:41:48AM -0700, Dustin Sallings wrote:
>> I ran into this exact problem, and it was *very* significant on a
>> 15M row table I have.  :)  It didn't seem to want to use the index, even
>> freshly created, without a vacuum analyze.

> Hmm, if you drop the index, do a VACUUM ANALYZE, then create the index,
> it doesn't want to use it? That's be odd, since the statistics are
> only kept about the table relations, not the indices themselves.

Right, it doesn't matter whether the index existed at the time of the
VACUUM.  But it does matter whether any VACUUM ANALYZE stats are
available or not...

> If you mean it won't use an fresh index on a fresh table, that's the
> expected behavior.

Just to clarify: it depends on the query, and 7.0's behavior is
different from prior versions.  For an equality-type probe, like
"WHERE x = 33", I'd expect 7.0 to select an indexscan even without
stats.  For an inequality like "WHERE x < 33", it will not select
an indexscan unless it has stats indicating that the inequality is
reasonably selective (less than about 10% of the table, I think).
For a range bound like "WHERE x > 22 AND x < 33", you will get an
indexscan without stats.  Beyond that I'm not going to guess...

Prior versions had a bogus cost formula for indexscans that would
*drastically* underestimate the cost of an indexscan, so they tended
to pick an indexscan even where it wasn't justified.  As it happened
they would pick an indexscan for the one-sided-inequality case even
with no stats available.  In some cases that was good, in others
it'd lose big.

            regards, tom lane

Re: Performance

From

Alfred Perlstein

Date:

16 May 2000, 11:42:37

> # At 04:56 PM 5/15/00, Diego Schvartzman wrote:
> # >I have an application via PHP. For example, a SELECT query that must return
> # >one and only one row, with a where clause with and index (I droped it and
> # >created again) that took about 3 seconds (v6.5.3), now (v7.0.0) takes about
> # >15 seconds. Really I don't lnow what is happening. Same hardware, same php
> # >version, etc etc.
> # >

* Dustin Sallings <dustin@spy.net> [000516 02:26] wrote:
> On Mon, 15 May 2000, Charles Tassell wrote:
>
>     I ran into this exact problem, and it was *very* significant on a
> 15M row table I have.  :)  It didn't seem to want to use the index, even
> freshly created, without a vacuum analyze.

grrrr....

FOR THE LAST TIME, THESE BUG REPORTS ARE PRETTY MUCH **USELESS**
TO THE DEVELOPERS UNLESS YOU GIVE:

THE TABLE STRUCTURE,
THE QUERY, AND
THE OUTPUT OF 'EXPLAIN'

Just because someone is a database guru doesn't mean they are also
clairvoyant. :)

thanks,
-Alfred

Re: Performance

From

"Diego Schvartzman"

Date:

16 May 2000, 12:18:37

Here are more info. Sorry, I thought taht because this case is very simple,
it was not necesary, but ....


50000 ROWS  aprox

QUERY:
SELECT * FROM d_cue WHERE d_cue.clave = '$cue'

INDEX:
create index d_cue_clave on d_cue (clave);

EXPLAIN:
ra1999=> explain select * from d_cue where clave='9400001';
NOTICE:  QUERY PLAN:

Seq Scan on d_cue  (cost=0.00..3738.62 rows=1 width=544)


TABLE STRUCTURE:
CREATE TABLE "d_cue" (
 "clave" character(7),
 "nombre" character varying(60),
 "calle" character varying(45),
 "referencia" character varying(60),
 "telefono" character varying(15),
 "cp" int4,
 "zona" int4,
 "fraccion" character(2),
 "radio" character(2),
 "cooperadora" int4,
 "d_cooperadora" character varying(50),
 "confesional" int4,
 "d_confesional" character varying(50),
 "arancel" int4,
 "d_arancel" character varying(50),
 "categoria" int4,
 "d_categoria" character varying(50),
 "permanencia" int4,
 "d_permanencia" character varying(50),
 "alternancia" int4,
 "d_alternancia" character varying(50),
 "cod_jur" character varying(15),
 "matric" character,
 "periodo_func" int4,
 "d_periodo_func" character varying(50),
 "var1" character varying(15),
 "var2" character varying(15),
 "var3" character varying(15),
 "var4" character varying(15),
 "var5" character varying(15),
 "var6" character varying(15),
 "vat1" character(6),
 "d_vat1" character varying(50),
 "vat2" character(6),
 "d_vat2" character varying(50),
 "vat3" character(6),
 "d_vat3" character varying(50),
 "vat4" character(6),
 "d_vat4" character varying(50),
 "vat5" character(6),
 "d_vat5" character varying(50),
 "vat6" character(6),
 "d_vat6" character varying(50),
 "direle" character varying(30),
 "barrio" character varying(40),
 "nuevo" character,
 "d_nuevo" character varying(50),
 "turnos" character varying(255),
 "tipos" character varying(255),
 "ciclos" character varying(255),
 "f_nf" character(10),
 "agregado" int4);

Diego Schvartzman
Email: diego.schvartzman@usa.net
ICQ# 1779434
----- Original Message -----
From: Alfred Perlstein <bright@wintelcom.net>
To: Dustin Sallings <dustin@spy.net>
Cc: Charles Tassell <ctassell@isn.net>; Diego Schvartzman
<dschvar@yahoo.com>; Lista PGSQL <pgsql-general@postgresql.org>
Sent: Tuesday, May 16, 2000 1:14 PM
Subject: Re: [GENERAL] Performance


> > # At 04:56 PM 5/15/00, Diego Schvartzman wrote:
> > # >I have an application via PHP. For example, a SELECT query that must
return
> > # >one and only one row, with a where clause with and index (I droped it
and
> > # >created again) that took about 3 seconds (v6.5.3), now (v7.0.0) takes
about
> > # >15 seconds. Really I don't lnow what is happening. Same hardware,
same php
> > # >version, etc etc.
> > # >
>
> * Dustin Sallings <dustin@spy.net> [000516 02:26] wrote:
> > On Mon, 15 May 2000, Charles Tassell wrote:
> >
> > I ran into this exact problem, and it was *very* significant on a
> > 15M row table I have.  :)  It didn't seem to want to use the index, even
> > freshly created, without a vacuum analyze.
>
> grrrr....
>
> FOR THE LAST TIME, THESE BUG REPORTS ARE PRETTY MUCH **USELESS**
> TO THE DEVELOPERS UNLESS YOU GIVE:
>
> THE TABLE STRUCTURE,
> THE QUERY, AND
> THE OUTPUT OF 'EXPLAIN'
>
> Just because someone is a database guru doesn't mean they are also
> clairvoyant. :)
>
> thanks,
> -Alfred
>

Re: Performance

From

Tom Lane

Date:

16 May 2000, 18:25:37

"Diego Schvartzman" <dschvar@yahoo.com> writes:
> 50000 ROWS  aprox
> INDEX:
> create index d_cue_clave on d_cue (clave);
> EXPLAIN:
> ra1999=> explain select * from d_cue where clave='9400001';
> NOTICE:  QUERY PLAN:
> Seq Scan on d_cue  (cost=0.00..3738.62 rows=1 width=544)

Wow, that's looking pretty peculiar.  The thing's estimating only one
row out, so it's not being fooled by bad statistics or anything like
that.  It surely ought to pick an indexscan here.  The only thing I can
think of is that somehow it's not realizing that the index can be
applied for this query --- but I don't see why not.  I think you've
stumbled across a very strange bug.  What EXPLAIN output do you get if
you first do
    SET enable_seqscan = OFF;
That should force it to pick an indexscan if it can figure out how...

If you still get a seqscan even in that case, I'd like to trouble you
for the result of EXPLAIN VERBOSE on the query.  You can reduce the
verbosity without (probably) changing the results if you just select
one column instead of all of 'em, ie
    explain verbose select clave from d_cue where clave='9400001';

            regards, tom lane

Re: Performance

From

Simeó

Date:

17 May 2000, 11:31:09

A mar, 16 may 2000, Diego Schvartzman va escriure:
> ....
> TABLE STRUCTURE:
> CREATE TABLE "d_cue" (
>  "clave" character(7),
>  "nombre" character varying(60),
>  "calle" character varying(45),
>  "referencia" character varying(60),
>  "telefono" character varying(15),
>  "cp" int4,
>  "zona" int4,
>  "fraccion" character(2),
>  "radio" character(2),
>  "cooperadora" int4,
>  "d_cooperadora" character varying(50),
>  "confesional" int4,
>  "d_confesional" character varying(50),
>  "arancel" int4,
>  "d_arancel" character varying(50),
>  "categoria" int4,
>  "d_categoria" character varying(50),
>  "permanencia" int4,
>  "d_permanencia" character varying(50),
>  "alternancia" int4,
>  "d_alternancia" character varying(50),
>  "cod_jur" character varying(15),
>  "matric" character,
>  "periodo_func" int4,
>  "d_periodo_func" character varying(50),
>  "var1" character varying(15),
>  "var2" character varying(15),
>  "var3" character varying(15),
>  "var4" character varying(15),
>  "var5" character varying(15),
>  "var6" character varying(15),
>  "vat1" character(6),
>  "d_vat1" character varying(50),
>  "vat2" character(6),
>  "d_vat2" character varying(50),
>  "vat3" character(6),
>  "d_vat3" character varying(50),
>  "vat4" character(6),
>  "d_vat4" character varying(50),
>  "vat5" character(6),
>  "d_vat5" character varying(50),
>  "vat6" character(6),
>  "d_vat6" character varying(50),
>  "direle" character varying(30),
>  "barrio" character varying(40),
>  "nuevo" character,
>  "d_nuevo" character varying(50),
>  "turnos" character varying(255),
>  "tipos" character varying(255),
>  "ciclos" character varying(255),
>  "f_nf" character(10),
>  "agregado" int4);
> ...


Argggggggggg, You should read a database's book.

          ----------------
              Sime� Reig
           simeo@tinet.org
          -----------------

Re: Performance

From

Michael Meskes

Date:

18 May 2000, 05:45:02

On Tue, May 16, 2000 at 10:52:33PM +0200, SimeX wrote:
> ...
>
> Argggggggggg, You should read a database's book.

I wonder what you saw on this table definition that warrants such a
response. Since I did not read the original mail maybe you just didn't quote
the relevant parts, so I'm curious.

Michael
--
Michael Meskes
Michael@Fam-Meskes.De
Go SF 49ers! Go Rhein Fire!
Use Debian GNU/Linux! Use PostgreSQL!

Re: Performance

From

"Diego Schvartzman"

Date:

18 May 2000, 17:43:10

Hi !
I destroyed my db and created again, 'vacuum' it and seems to be everything
ok.
Now I'm getting this (wich is much better than before). Thanks all !

explain select * from d_cue where clave = '9400001';
NOTICE:  QUERY PLAN:

Index Scan using d_cue_clave on d_cue  (cost=0.00..1.69 rows=1 width=544)

Re: Performance

From

Lincoln Yeoh

Date:

19 May 2000, 02:59:23

I think the "VACUUM ANALYZE" solution should be given great prominence in
the FAQ.

Possibly add to:
4.9) My queries are slow or don't make use of the indexes. Why?
http://www.postgresql.org/docs/faq-english.html#4.9

New first lines:
Make sure relevant indexes exist (see 4.8) then try VACUUM ANALYZE from psql.

If that doesn't work, read the rest. And if THAT doesn't clear things up,
then it's something which the mailing lists and developers probably want to
know.

Cheerio,

Link.

At 06:40 PM 18-05-2000 -0300, Diego Schvartzman wrote:
>Hi !
>I destroyed my db and created again, 'vacuum' it and seems to be everything
>ok.
>Now I'm getting this (wich is much better than before). Thanks all !
>
>explain select * from d_cue where clave = '9400001';
>NOTICE:  QUERY PLAN:
>
>Index Scan using d_cue_clave on d_cue  (cost=0.00..1.69 rows=1 width=544)
>
>
>
>

Re: Performance

From

Dustin Sallings

Date:

19 May 2000, 04:38:17

On Tue, 16 May 2000, Alfred Perlstein wrote:

    This was not a bug report.  I simply said that I had the same
problem where a large table should have been using an index and was not,
so I vacuumed the table, and it used the index.

    Are you a developer?  Did this really read as a bug report?  Is
anyone out there creating largish tables, adding an index to them, and
having the index used without a vacuum?  Is it really that much of a
problem?

# >     I ran into this exact problem, and it was *very* significant on a
# > 15M row table I have.  :)  It didn't seem to want to use the index, even
# > freshly created, without a vacuum analyze.
#
# grrrr....
#
# FOR THE LAST TIME, THESE BUG REPORTS ARE PRETTY MUCH **USELESS**
# TO THE DEVELOPERS UNLESS YOU GIVE:
#
# THE TABLE STRUCTURE,
# THE QUERY, AND
# THE OUTPUT OF 'EXPLAIN'
#
# Just because someone is a database guru doesn't mean they are also
# clairvoyant. :)
#
# thanks,
# -Alfred
#
#

--
dustin sallings                            The world is watching America,
http://2852210114/~dustin/                 and America is watching TV.

Re: Performance

From

Bruce Momjian

Date:

19 May 2000, 10:57:24

> I think the "VACUUM ANALYZE" solution should be given great prominence in
> the FAQ.
>
> Possibly add to:
> 4.9) My queries are slow or don't make use of the indexes. Why?
> http://www.postgresql.org/docs/faq-english.html#4.9
>
> New first lines:
> Make sure relevant indexes exist (see 4.8) then try VACUUM ANALYZE from psql.
>
> If that doesn't work, read the rest. And if THAT doesn't clear things up,
> then it's something which the mailing lists and developers probably want to
> know.

Good, I added one sentence to the top of the FAQ answer stating VACUUM
ANALYZE and try again.

--
  Bruce Momjian                        |  http://www.op.net/~candle
  pgman@candle.pha.pa.us               |  (610) 853-3000
  +  If your life is a hard drive,     |  830 Blythe Avenue
  +  Christ can be your backup.        |  Drexel Hill, Pennsylvania 19026

Re: Performance

From

Dustin Sallings

Date:

20 May 2000, 04:16:53

On Fri, 19 May 2000, Matthias Urlichs wrote:

# If a table has an index (let's say it is a btree on fields a,b), and
# if a SELECT/INSERT/UPDATE/DELETE is issued with field a being either
# inserted or in the where clause, then the database needs to use that
# index. Period.

    That's not the case.  What if I only have two rows in it?  It
would take more resources to use the index than it would to do a
sequential scan.

# Requiring the application to call VACUUM in order to get any kind of
# performance is not a solution. When exactly am I supposed to do that?
# Before inserting one million records into my temporary table it's of
# no use whatsoever, and afterwards it's next week already. Literally.

    I had the same conversation with some of my Sybase DBAs, they
explained to me why I was wrong, and why they needed to manually update
statistics for smarter index usage instead of having the hot point during
the inserts.  It can probably be designed in such a way that the
statistics can be updated constantly without slowing everything down too
much, but I'm not a postgres developer and don't have the time to find out
if that's true.

# Unfortunately, the observable behavior in this case is something like
# - create table
# - create index
# - call VACUUM or not, doesn't make a difference because the table is
#   empty anyway
# - do a whole lot of INSERTs during which PostgreSQL is slow as molasses.
#
# Ouch.

    You'll save a tremendous amount of time by loading the data before
you add an index.  This is probably a big part of the reason you spend a
week loading one million entries into a table.  I don't think it takes me
an hour to load my 15,627,696 row table from scratch, after which I create
the index in about half that time, and a vacuum takes me approximately
five minutes.

    Now, it's true, I don't remember having to vacuum before, but the
vacuum isn't very painful.

--
dustin sallings                            The world is watching America,
http://2852210114/~dustin/                 and America is watching TV.