Thread: important decrease of performance using the BETA version in one particular case

important decrease of performance using the BETA version in one particular case

From
Isabelle Therrien
Date:
(sorry for previous HTML)


Your name:       Isabelle Therrien
Your email address      :       therriei@cirano.qc.ca


System Configuration
---------------------
  Architecture          : Intel Pentium

  Operating System      : Linux

  PostgreSQL version    : PostgreSQL-7.1 Beta (but I use JDBC drivers
v7.0.2)

I'm using Weblogic Server 5.1

Please enter a FULL description of your problem:
------------------------------------------------
I observed an important decrease of performance using the BETA version of
PostgreSQL in one
particular case:

I have a big query, reported below, that is called several times in my
application.
At least 4 active connections call it at the same time.
Normally, this query is executed in about 30-50 milliseconds.
But after a while (depending on how many connections are used, and how
often the query is called),
the query is executed in 1000ms, then 2000ms, and it continues to grow
exponentially. I've already
seen it reaching 80 seconds.

While having these time reports, I tried to query the database directly
using "psql", and I got the
same result (around 80 sec)

I compared this to an older version (7.0.2) of PostgreSQL, in the same
circumstances, and the worst
I had was a 4000ms, without going any further. That's why I suppose the
problem comes from the BETA
version.


----------
The query: (the ? are replaced by 'GPK', GPK being the key we want to look
for)
----------
SELECT quotes.xmldocument, prodrefs.xmldocument, orders.filteredorder,
responses.xmldocument
     FROM quotes,prodrefs,responses,orders
     WHERE quotes.negotiationGPK = ?
        AND quotes.ctime = (SELECT MAX(ctime) FROM quotes WHERE
negotiationgpk = ?)
        AND prodrefs.negotiationGPK = ?
        AND orders.productreferenceGPK = prodrefs.gpk
        AND owner=u1 AND overridden=FALSE
        AND responses.orderGPK = orders.gpk
        AND responses.ctime = (SELECT MAX(ctime) FROM responses WHERE
ordergpk=orders.gpk)
UNION
SELECT quotes.xmldocument,prodrefs.xmldocument,(''),('')
    FROM quotes,prodrefs
    WHERE quotes.negotiationGPK = ?
        AND quotes.ctime = (SELECT MAX(ctime) FROM quotes WHERE
negotiationgpk = ?)
        AND prodrefs.negotiationGPK = ?

-----------
The tables:
-----------
            Table "quotes"
   Attribute    |   Type    | Modifier
----------------+-----------+----------
 gpk            | bigint    | not null
 xmldocument    | text      | not null
 negotiationgpk | bigint    | not null
 ctime          | timestamp | not null
Index: quotes_pkey

           Table "prodrefs"
   Attribute    |   Type    | Modifier
----------------+-----------+----------
 gpk            | bigint    | not null
 ctime          | timestamp | not null
 xmldocument    | text      | not null
 negotiationgpk | bigint    | not null
Index: prodrefs_pkey

             Table "orders"
      Attribute      |   Type    | Modifier
---------------------+-----------+----------
 gpk                 | bigint    | not null
 ctime               | timestamp | not null
 productreferencegpk | bigint    |
 owner               | text      | not null
 overridden          | boolean   |
 overridingordergpk  | bigint    |
 rawdata             | text      | not null
 filteredorder       | text      | not null
 previousquotegpk    | bigint    |
Index: orders_pkey

         Table "responses"
  Attribute  |   Type    | Modifier
-------------+-----------+----------
 gpk         | bigint    | not null
 ctime       | timestamp | not null
 ordergpk    | bigint    |
 xmldocument | text      | not null
 quotegpk    | bigint    |
Index: responses_pkey
Isabelle Therrien <therriei@LUB.UMontreal.CA> writes:
> I have a big query, reported below, that is called several times in my
> application.
> At least 4 active connections call it at the same time.
> Normally, this query is executed in about 30-50 milliseconds.
> But after a while (depending on how many connections are used, and how
> often the query is called),
> the query is executed in 1000ms, then 2000ms, and it continues to grow
> exponentially. I've already seen it reaching 80 seconds.

Hmm, that's odd.  What causes the time to drop back down to milliseconds
--- do you have to restart the whole database, or just run it in a new
backend?  Does the amount of memory being used by the backend increase
as the time goes up?  What does EXPLAIN show as the query plan for the
query?  How large are the tables, and how many tuples are actually
retrieved?

Also, which beta release exactly, and how did you build it (what
configure options)?

Finally, it would be nice to see the full schemas for these tables, to
be sure we're not missing something.  You can generate those via

    pg_dump -s -t tablename databasename

            regards, tom lane

Re: important decrease of performance using the BETA version in one particular case

From
Isabelle Therrien
Date:
Tom Lane wrote:
>
> Isabelle Therrien <therriei@LUB.UMontreal.CA> writes:
> > I have a big query, reported below, that is called several times in my
> > application.
> > At least 4 active connections call it at the same time.
> > Normally, this query is executed in about 30-50 milliseconds.
> > But after a while (depending on how many connections are used, and how
> > often the query is called),
> > the query is executed in 1000ms, then 2000ms, and it continues to grow
> > exponentially. I've already seen it reaching 80 seconds.
>
> Hmm, that's odd.  What causes the time to drop back down to milliseconds
> --- do you have to restart the whole database, or just run it in a new
> backend?  Does The amount of memory being used by the backend increase
> as the time goes up?  What does EXPLAIN show as the query plan for the
> query?  How large are the tables, and how many tuples are actually
> retrieved?
>
> Also, which beta release exactly, and how did you build it (what
> configure options)?
>
> Finally, it would be nice to see the full schemas for these tables, to
> be sure we're not missing something.  You can generate those via
>
>         pg_dump -s -t tablename databasename
>
>                         regards, tom lane



The only way the time drops down to milliseconds is to wait a couple of hours or the restart the
whole database.
The amount of memory being used by the backend doesn't seem to increase significantly.
The tables are emptied often.  We don't keep these datas. So there's never more than 50 tuples per
table. And with this query, about 3-4 tuples are retrieved.
I have these results with BETA1 (no idea what configure options were used) AND BETA4(rpm).



EXPLAIN:

NOTICE:  QUERY PLAN:
------------------
(PostgreSQL 7.0.2)
------------------
Unique  (cost=10766.48..10766.50 rows=0 width=88)
  InitPlan
    ->  Aggregate  (cost=18.69..18.69 rows=1 width=8)
      ->  Seq Scan on quotes  (cost=0.00..18.69 rows=1 width=8)
    ->  Aggregate  (cost=18.69..18.69 rows=1 width=8)
      ->  Seq Scan on quotes  (cost=0.00..18.69 rows=1 width=8)
  ->  Sort  (cost=10766.48..10766.48 rows=2 width=88)
        ->  Append  (cost=0.00..10766.47 rows=2 width=88)
                ->  Nested Loop  (cost=0.00..10745.06 rows=1 width=88)
                      ->  Nested Loop  (cost=0.00..10726.03 rows=1 width=76)
                            ->  Nested Loop  (cost=0.00..18.01 rows=1 width=48)
                                  ->  Seq Scan on orders  (cost=0.00..15.98 rows=1 width=28)
                                  ->  Index Scan using prodrefs_pkey on prodrefs  (cost=0.00..2.02
rows=1 width=20)
                            ->  Seq Scan on responses  (cost=0.00..25.96 rows=396 width=28)
                            SubPlan
                              ->  Aggregate  (cost=26.96..26.96 rows=1 width=8)
                                    ->  Seq Scan on responses  (cost=0.00..26.95 rows=4 width=8)
                      ->  Seq Scan on quotes  (cost=0.00..19.02 rows=1 width=12)
                ->  Nested Loop  (cost=0.00..21.41 rows=1 width=24)
                      ->  Seq Scan on prodrefs  (cost=0.00..2.38 rows=1 width=12)
                      ->  Seq Scan on quotes  (cost=0.00..19.02 rows=1 width=12)

-----------------------
(PostgreSQL 7.1 BETA 1)
-----------------------
Unique  (cost=0.06..0.08 rows=1 width=88)
  ->  Sort  (cost=0.06..0.06 rows=2 width=88)
        ->  Append  (cost=0.00..0.05 rows=2 width=88)
              ->  Subquery Scan *SELECT* 1  (cost=0.00..0.04 rows=1 width=88)
                    ->  Nested Loop  (cost=0.00..0.04 rows=1 width=88)
                          InitPlan
                            ->  Aggregate  (cost=0.00..0.00 rows=1 width=8)
                                  ->  Seq Scan on quotes  (cost=0.00..0.00 rows=1 width=8)
                          ->  Nested Loop  (cost=0.00..0.03 rows=1 width=68)
                                ->  Nested Loop  (cost=0.00..0.01 rows=1 width=40)
                                      ->  Seq Scan on quotes  (cost=0.00..0.00 rows=1 width=12)
                                      ->  Seq Scan on responses  (cost=0.00..0.00 rows=1 width=28)
                                ->  Seq Scan on orders  (cost=0.00..0.00 rows=1 width=28)
                                SubPlan
                                  ->  Aggregate  (cost=0.00..0.00 rows=1 width=8)
                                        ->  Seq Scan on responses  (cost=0.00..0.00 rows=1 width=8)
                          ->  Seq Scan on prodrefs  (cost=0.00..0.00 rows=1 width=20)
              ->  Subquery Scan *SELECT* 2  (cost=0.00..0.01 rows=1 width=24)
                    ->  Nested Loop  (cost=0.00..0.01 rows=1 width=24)
                          InitPlan
                            ->  Aggregate  (cost=0.00..0.00 rows=1 width=8)
                                  ->  Seq Scan on quotes  (cost=0.00..0.00 rows=1 width=8)
                          ->  Seq Scan on quotes  (cost=0.00..0.00 rows=1 width=12)
                          ->  Seq Scan on prodrefs  (cost=0.00..0.00 rows=1 width=12)

FULL SCHÉMAS:

CREATE TABLE "quotes" (
    "gpk" int8 NOT NULL,
    "xmldocument" text NOT NULL,
    "negotiationgpk" int8 NOT NULL,
    "ctime" timestamp NOT NULL,
    PRIMARY KEY ("gpk")
);
CREATE CONSTRAINT TRIGGER "<unnamed>" AFTER INSERT OR UPDATE ON "quotes"  NOT DEFERRABLE INITIALLY
IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_check_ins" ('<unnamed>', 'quotes', 'negotiations',
'UNSPECIFIED', 'negotiationgpk', 'gpk');
CREATE CONSTRAINT TRIGGER "<unnamed>" AFTER DELETE ON "quotes"  NOT DEFERRABLE INITIALLY IMMEDIATE
FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_cascade_del" ('<unnamed>', 'orders', 'quotes',
'UNSPECIFIED', 'previousquotegpk', 'gpk');
CREATE CONSTRAINT TRIGGER "<unnamed>" AFTER UPDATE ON "quotes"  NOT DEFERRABLE INITIALLY IMMEDIATE
FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_noaction_upd" ('<unnamed>', 'orders', 'quotes',
'UNSPECIFIED', 'previousquotegpk', 'gpk');
CREATE CONSTRAINT TRIGGER "<unnamed>" AFTER DELETE ON "quotes"  NOT DEFERRABLE INITIALLY IMMEDIATE
FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_cascade_del" ('<unnamed>', 'responses', 'quotes',
'UNSPECIFIED', 'quotegpk', 'gpk');
CREATE CONSTRAINT TRIGGER "<unnamed>" AFTER UPDATE ON "quotes"  NOT DEFERRABLE INITIALLY IMMEDIATE
FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_noaction_upd" ('<unnamed>', 'responses', 'quotes',
'UNSPECIFIED', 'quotegpk', 'gpk');

CREATE TABLE "prodrefs" (
    "gpk" int8 NOT NULL,
    "ctime" timestamp NOT NULL,
    "xmldocument" text NOT NULL,
    "negotiationgpk" int8 NOT NULL,
    PRIMARY KEY ("gpk")
);
CREATE CONSTRAINT TRIGGER "<unnamed>" AFTER INSERT OR UPDATE ON "prodrefs"  NOT DEFERRABLE INITIALLY
IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_check_ins" ('<unnamed>', 'prodrefs',
'negotiations', 'UNSPECIFIED', 'negotiationgpk', 'gpk');
CREATE CONSTRAINT TRIGGER "<unnamed>" AFTER DELETE ON "prodrefs"  NOT DEFERRABLE INITIALLY IMMEDIATE
FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_cascade_del" ('<unnamed>', 'orders', 'prodrefs',
'UNSPECIFIED', 'productreferencegpk', 'gpk');
CREATE CONSTRAINT TRIGGER "<unnamed>" AFTER UPDATE ON "prodrefs"  NOT DEFERRABLE INITIALLY IMMEDIATE
FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_noaction_upd" ('<unnamed>', 'orders', 'prodrefs',
'UNSPECIFIED', 'productreferencegpk', 'gpk');
CREATE CONSTRAINT TRIGGER "<unnamed>" AFTER DELETE ON "prodrefs"  NOT DEFERRABLE INITIALLY IMMEDIATE
FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_cascade_del" ('<unnamed>', 'adjudications', 'prodrefs',
'UNSPECIFIED', 'productreferencegpk', 'gpk');
CREATE CONSTRAINT TRIGGER "<unnamed>" AFTER UPDATE ON "prodrefs"  NOT DEFERRABLE INITIALLY IMMEDIATE
FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_noaction_upd" ('<unnamed>', 'adjudications', 'prodrefs',
'UNSPECIFIED', 'productreferencegpk', 'gpk');

CREATE TABLE "orders" (
    "gpk" int8 NOT NULL,
    "ctime" timestamp NOT NULL,
    "productreferencegpk" int8,
    "owner" text NOT NULL,
    "overridden" bool,
    "overridingordergpk" int8,
    "rawdata" text NOT NULL,
    "filteredorder" text NOT NULL,
    "previousquotegpk" int8,
    PRIMARY KEY ("gpk")
);
CREATE CONSTRAINT TRIGGER "<unnamed>" AFTER INSERT OR UPDATE ON "orders"  NOT DEFERRABLE INITIALLY
IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_check_ins" ('<unnamed>', 'orders', 'prodrefs',
'UNSPECIFIED', 'productreferencegpk', 'gpk');
CREATE CONSTRAINT TRIGGER "<unnamed>" AFTER INSERT OR UPDATE ON "orders"  NOT DEFERRABLE INITIALLY
IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_check_ins" ('<unnamed>', 'orders', 'orders',
'UNSPECIFIED', 'overridingordergpk', 'gpk');
CREATE CONSTRAINT TRIGGER "<unnamed>" AFTER DELETE ON "orders"  NOT DEFERRABLE INITIALLY IMMEDIATE
FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_cascade_del" ('<unnamed>', 'orders', 'orders',
'UNSPECIFIED', 'overridingordergpk', 'gpk');
CREATE CONSTRAINT TRIGGER "<unnamed>" AFTER UPDATE ON "orders"  NOT DEFERRABLE INITIALLY IMMEDIATE
FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_noaction_upd" ('<unnamed>', 'orders', 'orders',
'UNSPECIFIED', 'overridingordergpk', 'gpk');
CREATE CONSTRAINT TRIGGER "<unnamed>" AFTER INSERT OR UPDATE ON "orders"  NOT DEFERRABLE INITIALLY
IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_check_ins" ('<unnamed>', 'orders', 'quotes',
'UNSPECIFIED', 'previousquotegpk', 'gpk');
CREATE CONSTRAINT TRIGGER "<unnamed>" AFTER DELETE ON "orders"  NOT DEFERRABLE INITIALLY IMMEDIATE
FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_cascade_del" ('<unnamed>', 'adjudications', 'orders',
'UNSPECIFIED', 'sellordergpk', 'gpk');
CREATE CONSTRAINT TRIGGER "<unnamed>" AFTER UPDATE ON "orders"  NOT DEFERRABLE INITIALLY IMMEDIATE
FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_noaction_upd" ('<unnamed>', 'adjudications', 'orders',
'UNSPECIFIED', 'sellordergpk', 'gpk');
CREATE CONSTRAINT TRIGGER "<unnamed>" AFTER DELETE ON "orders"  NOT DEFERRABLE INITIALLY IMMEDIATE
FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_cascade_del" ('<unnamed>', 'adjudications', 'orders',
'UNSPECIFIED', 'buyordergpk', 'gpk');
CREATE CONSTRAINT TRIGGER "<unnamed>" AFTER UPDATE ON "orders"  NOT DEFERRABLE INITIALLY IMMEDIATE
FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_noaction_upd" ('<unnamed>', 'adjudications', 'orders',
'UNSPECIFIED', 'buyordergpk', 'gpk');
CREATE CONSTRAINT TRIGGER "<unnamed>" AFTER DELETE ON "orders"  NOT DEFERRABLE INITIALLY IMMEDIATE
FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_cascade_del" ('<unnamed>', 'responses', 'orders',
'UNSPECIFIED', 'ordergpk', 'gpk');
CREATE CONSTRAINT TRIGGER "<unnamed>" AFTER UPDATE ON "orders"  NOT DEFERRABLE INITIALLY IMMEDIATE
FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_noaction_upd" ('<unnamed>', 'responses', 'orders',
'UNSPECIFIED', 'ordergpk', 'gpk');


CREATE TABLE "responses" (
    "gpk" int8 NOT NULL,
    "ctime" timestamp NOT NULL,
    "ordergpk" int8,
    "xmldocument" text NOT NULL,
    "quotegpk" int8,
    PRIMARY KEY ("gpk")
);
CREATE CONSTRAINT TRIGGER "<unnamed>" AFTER INSERT OR UPDATE ON "responses"  NOT DEFERRABLE
INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_check_ins" ('<unnamed>', 'responses',
'orders', 'UNSPECIFIED', 'ordergpk', 'gpk');
CREATE CONSTRAINT TRIGGER "<unnamed>" AFTER INSERT OR UPDATE ON "responses"  NOT DEFERRABLE
INITIALLY IMMEDIATE FOR EACH ROW EXECUTE PROCEDURE "RI_FKey_check_ins" ('<unnamed>', 'responses',
'quotes', 'UNSPECIFIED', 'quotegpk', 'gpk');
Isabelle Therrien <therriei@LUB.UMontreal.CA> writes:
> The tables are emptied often.  We don't keep these datas. So there's
> never more than 50 tuples per table. And with this query, about 3-4
> tuples are retrieved.

Well, it would appear that in the 7.1 installation, you last vacuumed
the tables just after emptying them --- notice how all the cost
estimates are nearly zero.  The 7.0 optimizer on the other hand is
working with more reasonable cost values, and is presumably able to
select a smarter plan because of that.

I'd suggest making a practice of vacuum analyzing the tables just before
you empty them, not just after.  This may seem weird but it will leave
the optimizer with appropriate statistics.

If you see a decrease in performance even when 7.1 and 7.0 are being
given equivalent vacuum statistics, then I'd like to know more.

BTW, this query could be rewritten to be much more efficient by using
outer joins and SELECT DISTINCT ON ... but that's not really relevant
to the question of why 7.1 is slower than 7.0 for you ...

            regards, tom lane