Re: Major performance problem after upgrade from 8.3 to 8.4 - Mailing list pgsql-performance

From Gerhard Wiesinger
Subject Re: Major performance problem after upgrade from 8.3 to 8.4
Date
Msg-id alpine.LFD.2.01.1009142139290.22995@bbs.intern
Whole thread Raw
In response to Re: Major performance problem after upgrade from 8.3 to 8.4  (Merlin Moncure <mmoncure@gmail.com>)
Responses Re: Major performance problem after upgrade from 8.3 to 8.4  (Merlin Moncure <mmoncure@gmail.com>)
List pgsql-performance
On Tue, 14 Sep 2010, Merlin Moncure wrote:

> On Tue, Sep 14, 2010 at 2:07 AM, Gerhard Wiesinger <lists@wiesinger.com> wrote:
>> Hello Merlin,
>>
>> Seems to be a feasible approach. On problem which might be that when
>> multiple rows are returned that they are not ordered in each subselect
>> correctly. Any idea to solve that?
>>
>> e.g.
>> Raumsolltemperatur | Raumisttemperatur
>> Value from time 1  | Value from time 2
>> Value from time 2  | Value from time 1
>>
>> but should be
>> Raumsolltemperatur | Raumisttemperatur
>> Value from time 1  | Value from time 1
>> Value from time 2  | Value from time 2
>>
>> But that might be solveable by first selecting keys from the log_details
>> table and then join again.
>>
>> I will try it in the evening and I have to think about in detail.
>>
>> But thank you for the new approach and opening the mind :-)
>
> Using subquery in that style select (<subquery>), ... is limited to
> results that return 1 row, 1 column.  I assumed that was the case...if
> it isn't in your view, you can always attempt arrays:
>
> CREATE OR REPLACE VIEW log_entries AS
> SELECT
> l.id AS id,
> l.datetime AS datetime,
> l.tdate AS tdate,
> l.ttime AS ttime,
> array(select value from log_details ld join key_description kd on
> ld.fk_keyid = kd.keyid where ld.fk_id = l.id and  description =
> 'Raumsolltemperatur' order by XYZ) AS Raumsolltemperatur,
> [...]
>
> arrays might raise the bar somewhat in terms of dealing with the
> returned data, or they might work great.  some experimentation is in
> order.
>
> XYZ being the ordering condition you want.  If that isn't available
> inside the join then we need to think about this some more.  We could
> probably help more if you could describe the schema in a little more
> detail.  This is solvable.

Of course, subquery is limited to a result set returning 1 row and 1
column. Also order is of course preserved because of the join.

Further, I think I found a perfect query plan for the EAV pattern.

First I tried your suggestion but there were some limitation with O(n^2)
efforts (e.g. nested loops=12586 and also index scans with loop 12586):

CREATE OR REPLACE VIEW log_entries_test AS
SELECT
   l.id AS id,
   l.datetime AS datetime,
   l.tdate AS tdate,
   l.ttime AS ttime,
   (SELECT value FROM log_details d JOIN key_description kd ON d.fk_keyid = kd.keyid WHERE l.id = d.fk_id AND
kd.description= 'Raumsolltemperatur') AS Raumsolltemperatur, 
   (SELECT value FROM log_details d JOIN key_description kd ON d.fk_keyid = kd.keyid WHERE l.id = d.fk_id AND
kd.description= 'Raumtemperatur') AS Raumtemperatur, 
   (SELECT value FROM log_details d JOIN key_description kd ON d.fk_keyid = kd.keyid WHERE l.id = d.fk_id AND
kd.description= 'Kesselsolltemperatur') AS Kesselsolltemperatur, 
   (SELECT value FROM log_details d JOIN key_description kd ON d.fk_keyid = kd.keyid WHERE l.id = d.fk_id AND
kd.description= 'Kesseltemperatur') AS Kesseltemperatur, 
....
FROM
   log l
;


------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
EXPLAIN ANALYZE SELECT * FROM log_entries_test WHERE datetime > now() - INTERVAL '10 days' ORDER BY datetime DESC;

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Index Scan Backward using i_log_unique on log l  (cost=0.00..140820.12 rows=69 width=32) (actual time=2.848..22812.331
rows=12586loops=1) 
   Index Cond: (datetime > (now() - '10 days'::interval))
   SubPlan 1
     ->  Nested Loop  (cost=0.00..19.99 rows=1 width=8) (actual time=0.007..0.018 rows=1 loops=12586)
           ->  Seq Scan on key_description kd  (cost=0.00..2.38 rows=1 width=8) (actual time=0.003..0.013 rows=1
loops=12586)
                 Filter: ((description)::text = 'Raumsolltemperatur'::text)
           ->  Index Scan using unique_key_and_id on log_details d  (cost=0.00..17.60 rows=1 width=16) (actual
time=0.004..0.004rows=1 loops=12586) 
                 Index Cond: (($0 = d.fk_id) AND (d.fk_keyid = kd.keyid))
   SubPlan 2
     ->  Nested Loop  (cost=0.00..19.99 rows=1 width=8) (actual time=0.006..0.017 rows=1 loops=12586)
           ->  Seq Scan on key_description kd  (cost=0.00..2.38 rows=1 width=8) (actual time=0.003..0.013 rows=1
loops=12586)
                 Filter: ((description)::text = 'Raumtemperatur'::text)
           ->  Index Scan using unique_key_and_id on log_details d  (cost=0.00..17.60 rows=1 width=16) (actual
time=0.002..0.003rows=1 loops=12586) 
                 Index Cond: (($0 = d.fk_id) AND (d.fk_keyid = kd.keyid))
   SubPlan 3
     ->  Nested Loop  (cost=0.00..19.99 rows=1 width=8) (actual time=0.005..0.017 rows=1 loops=12586)
           ->  Seq Scan on key_description kd  (cost=0.00..2.38 rows=1 width=8) (actual time=0.002..0.013 rows=1
loops=12586)
                 Filter: ((description)::text = 'Kesselsolltemperatur'::text)
           ->  Index Scan using unique_key_and_id on log_details d  (cost=0.00..17.60 rows=1 width=16) (actual
time=0.003..0.003rows=1 loops=12586) 
                 Index Cond: (($0 = d.fk_id) AND (d.fk_keyid = kd.keyid))
   SubPlan 4
     ->  Nested Loop  (cost=0.00..19.99 rows=1 width=8) (actual time=0.006..0.017 rows=1 loops=12586)
           ->  Seq Scan on key_description kd  (cost=0.00..2.38 rows=1 width=8) (actual time=0.002..0.013 rows=1
loops=12586)
                 Filter: ((description)::text = 'Kesseltemperatur'::text)
           ->  Index Scan using unique_key_and_id on log_details d  (cost=0.00..17.60 rows=1 width=16) (actual
time=0.002..0.003rows=1 loops=12586) 
                 Index Cond: (($0 = d.fk_id) AND (d.fk_keyid = kd.keyid))
   SubPlan 5
     ->  Nested Loop  (cost=0.00..19.99 rows=1 width=8) (actual time=0.005..0.017 rows=1 loops=12586)
           ->  Seq Scan on key_description kd  (cost=0.00..2.38 rows=1 width=8) (actual time=0.002..0.014 rows=1
loops=12586)
                 Filter: ((description)::text = 'Speichersolltemperatur'::text)
           ->  Index Scan using unique_key_and_id on log_details d  (cost=0.00..17.60 rows=1 width=16) (actual
time=0.002..0.003rows=1 loops=12586) 
                 Index Cond: (($0 = d.fk_id) AND (d.fk_keyid = kd.keyid))
   SubPlan 6
     ->  Nested Loop  (cost=0.00..19.99 rows=1 width=8) (actual time=0.006..0.017 rows=1 loops=12586)
           ->  Seq Scan on key_description kd  (cost=0.00..2.38 rows=1 width=8) (actual time=0.003..0.013 rows=1
loops=12586)
                 Filter: ((description)::text = 'Speichertemperatur'::text)
           ->  Index Scan using unique_key_and_id on log_details d  (cost=0.00..17.60 rows=1 width=16) (actual
time=0.002..0.003rows=1 loops=12586) 
                 Index Cond: (($0 = d.fk_id) AND (d.fk_keyid = kd.keyid))

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

Therefore I optimized the query further which can be done in the
following way with another subquery and IHMO a perfect query plan. Also
the subselect avoid multiple iterations for each of the result rows:

CREATE OR REPLACE VIEW log_entries_test AS
SELECT
   l.id AS id,
   l.datetime AS datetime,
   l.tdate AS tdate,
   l.ttime AS ttime,
   (SELECT value FROM log_details d WHERE l.id = d.fk_id AND d.fk_keyid = (SELECT keyid FROM key_description WHERE
description= 'Raumsolltemperatur')) AS Raumsolltemperatur, 
   (SELECT value FROM log_details d WHERE l.id = d.fk_id AND d.fk_keyid = (SELECT keyid FROM key_description WHERE
description= 'Raumtemperatur')) AS Raumtemperatur, 
   (SELECT value FROM log_details d WHERE l.id = d.fk_id AND d.fk_keyid = (SELECT keyid FROM key_description WHERE
description= 'Kesselsolltemperatur')) AS Kesselsolltemperatur, 
   (SELECT value FROM log_details d WHERE l.id = d.fk_id AND d.fk_keyid = (SELECT keyid FROM key_description WHERE
description= 'Kesseltemperatur')) AS Kesseltemperatur, 
...
FROM
   log l
;

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
EXPLAIN ANALYZE SELECT * FROM log_entries_test WHERE datetime > now() - INTERVAL '10 days' ORDER BY datetime DESC;

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Index Scan Backward using i_log_unique on log l  (cost=0.00..140603.99 rows=69 width=32) (actual time=2.588..5602.899
rows=12586loops=1) 
   Index Cond: (datetime > (now() - '10 days'::interval))
   SubPlan 2
     ->  Index Scan using unique_key_and_id on log_details d  (cost=2.38..19.97 rows=1 width=8) (actual
time=0.010..0.011rows=1 loops=12586) 
           Index Cond: (($1 = fk_id) AND (fk_keyid = $0))
           InitPlan 1 (returns $0)
             ->  Seq Scan on key_description  (cost=0.00..2.38 rows=1 width=8) (actual time=0.015..0.066 rows=1
loops=1)
                   Filter: ((description)::text = 'Raumsolltemperatur'::text)
   SubPlan 4
     ->  Index Scan using unique_key_and_id on log_details d  (cost=2.38..19.97 rows=1 width=8) (actual
time=0.003..0.003rows=1 loops=12586) 
           Index Cond: (($1 = fk_id) AND (fk_keyid = $2))
           InitPlan 3 (returns $2)
             ->  Seq Scan on key_description  (cost=0.00..2.38 rows=1 width=8) (actual time=0.009..0.020 rows=1
loops=1)
                   Filter: ((description)::text = 'Raumtemperatur'::text)
   SubPlan 6
     ->  Index Scan using unique_key_and_id on log_details d  (cost=2.38..19.97 rows=1 width=8) (actual
time=0.002..0.003rows=1 loops=12586) 
           Index Cond: (($1 = fk_id) AND (fk_keyid = $3))
           InitPlan 5 (returns $3)
             ->  Seq Scan on key_description  (cost=0.00..2.38 rows=1 width=8) (actual time=0.005..0.017 rows=1
loops=1)
                   Filter: ((description)::text = 'Kesselsolltemperatur'::text)

------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------

BTW: Schemadata is in the links discussed in the thread

Thnx to all for helping me.

Ciao,
Gerhard

--
http://www.wiesinger.com/

pgsql-performance by date:

Previous
From: Craig James
Date:
Subject: Re: Held idle connections vs use of a Pooler
Next
From: Merlin Moncure
Date:
Subject: Re: Major performance problem after upgrade from 8.3 to 8.4