slow query - Mailing list pgsql-sql

From Markus Bertheau
Subject slow query
Date
Msg-id 1058893618.1857.94.camel@fluor
Whole thread Raw
Responses Re: slow query  (Markus Bertheau <twanger@bluetwanger.de>)
List pgsql-sql
Hi,

we have this large query about which we want to know if it can be made
faster. The query is:

select * from (
    select
    ressourcen.*, gebaeude.bezeichnung as
    "gebaeude.bezeichnung", gebaeude.gebaeude_id as "gebaeude.gebaeude_id",
    gebaeude.kurzbezeichnung as "gebaeude.kurzbezeichnung", gebaeude.nummer
    as "gebaeude.nummer", raeume.bemerkung as "raeume.bemerkung",
    raeume.flaeche as "raeume.flaeche", raeume.nummer as "raeume.nummer",
    raeume.raum_id as "raeume.raum_id", raumtypen.bezeichnung as
    "raumtypen.bezeichnung", raumtypen.raumtyp_id as "raumtypen.raumtyp_id",
    standorte.kurzbezeichnung as "standorte.kurzbezeichnung",
    standorte.standort_id as "standorte.standort_id",
    exists (
        select *
        from auftragsressourcen where auftrag_id = '46' and ressource_id
        = ressourcen.ressource_id
    ) as schon_in_auftrag_verwendet,
    case when (ressourcen.menge = 1) then (
        case when exists(
            select *
            from (
        -- does a row in belegungen exist for the given time intervals
                select bGetNumOfBookedSingleRes(ressourcen.ressource_id, turnus.von, turnus.bis) as da from (
                    select timestamp '2003-07-22 08:00'  AS von, timestamp '2003-07-22 20:00' AS bis
                    union select timestamp '2003-07-28 08:00'  AS von, timestamp '2003-07-28 20:00' AS bis
                    union select timestamp '2003-07-29 08:00'  AS von, timestamp '2003-07-29 20:00' AS bis
                ) as turnus
            ) as belegte where da
        ) then 0 else 1 end
    ) else ressourcen.menge - (
        select max(anzahl) from (
        -- sum(auftragsressourcen.menge) for all rows in belegungen that intersect the given time interval
        -- i.e. the number of used items of a ressource in the given time interval
            select iGetNumOfBookedRes(ressourcen.ressource_id, turnus.von, turnus.bis)
            as anzahl from (
                select timestamp '2003-07-22 08:00' AS von, timestamp '2003-07-22 20:00' AS bis
                union select timestamp '2003-07-28 08:00' AS von, timestamp '2003-07-28 20:00' AS bis
                union select timestamp '2003-07-29 08:00'  AS von, timestamp '2003-07-29 20:00' AS bis
            ) as turnus
        ) as belegte
    ) end as verfuegbar
    from ressourcen join raeume using
    (ressource_id) join gebaeude using (gebaeude_id) join standorte using
    (standort_id) join raumtypen using (raumtyp_id)
) as verfuegbare_ressourcen
where verfuegbare_ressourcen.verfuegbar > 0;

CREATE FUNCTION iGetNumOfBookedRes(integer, timestamp, timestamp) RETURNS numeric AS '
SELECT
    CASE WHEN (MAX(kumulierte) IS NULL) THEN 0 ELSE MAX(kumulierte) END
    FROM
    (SELECT
        (SELECT
            SUM(dynmenge)
            FROM
            (SELECT
                -- Ressource wird ausgeliehen
                von, menge AS dynmenge
                FROM auftragsressourcen JOIN belegungen USING (auftragsressource_id)
                WHERE ressource_id = $1 AND von > $2 AND von < $3
            UNION SELECT
                -- Ressource wird zurückgegeben
                bis AS von, -menge AS dynmenge
                FROM auftragsressourcen JOIN belegungen USING (auftragsressource_id)
                WHERE ressource_id = $1 AND bis > $2 AND bis < $3
            UNION SELECT
                -- Anfangsstand
                $2, SUM(auftragsressourcen.menge) AS dynmenge
                FROM auftragsressourcen JOIN belegungen USING (auftragsressource_id)
                WHERE ressource_id = $1 AND von <= $2 and bis > $2
            ) AS bel1
            WHERE bel1.von <= bel2.von
        ) AS kumulierte
        FROM
        (SELECT
            -- Ressource wird ausgeliehen
            von, menge AS dynmenge
            FROM auftragsressourcen JOIN belegungen USING (auftragsressource_id)
            WHERE ressource_id = $1 AND von > $2 AND von < $3
        UNION SELECT
            -- Ressource wird zurückgegeben
            bis AS von, -menge AS dynmenge
            FROM auftragsressourcen JOIN belegungen USING (auftragsressource_id)
            WHERE ressource_id = $1 AND bis > $2 AND bis < $3
        UNION SELECT
            -- Anfangsstand
            $2, SUM(auftragsressourcen.menge) AS dynmenge
            FROM auftragsressourcen JOIN belegungen USING (auftragsressource_id)
            WHERE ressource_id = $1 AND von <= $2 and bis > $2
        ) AS bel2
    ) AS belegte
' LANGUAGE SQL;

CREATE FUNCTION bGetNumOfBookedSingleRes(integer, timestamp, timestamp) RETURNS boolean AS '
select exists (select * from auftragsressourcen JOIN belegungen USING
(auftragsressource_id) WHERE ressource_id = $1 AND bis > $2 and von < $3)
' LANGUAGE SQL;

-- explain analyze of the above query

                                                                                                            QUERY PLAN
                                                                                                           

-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Subquery Scan verfuegbare_ressourcen  (cost=56.37..69.97 rows=88 width=142) (actual time=346.60..677.44 rows=265
loops=1)
   ->  Hash Join  (cost=56.37..69.97 rows=88 width=142) (actual time=346.52..668.46 rows=265 loops=1)
         Hash Cond: ("outer".raumtyp_id = "inner".raumtyp_id)
         ->  Hash Join  (cost=55.05..66.88 rows=88 width=125) (actual time=318.00..346.16 rows=265 loops=1)
               Hash Cond: ("outer".standort_id = "inner".standort_id)
               ->  Hash Join  (cost=53.96..64.25 rows=88 width=115) (actual time=317.80..336.77 rows=265 loops=1)
                     Hash Cond: ("outer".gebaeude_id = "inner".gebaeude_id)
                     ->  Hash Join  (cost=52.66..61.40 rows=88 width=68) (actual time=317.41..328.62 rows=265 loops=1)
                           Hash Cond: ("outer".ressource_id = "inner".ressource_id)
                           ->  Seq Scan on raeume  (cost=0.00..5.65 rows=265 width=56) (actual time=0.03..5.01 rows=265
loops=1)
                           ->  Hash  (cost=52.42..52.42 rows=95 width=12) (actual time=317.28..317.28 rows=0 loops=1)
                                 ->  Seq Scan on ressourcen  (cost=0.00..52.42 rows=95 width=12) (actual
time=2.88..315.62rows=284 loops=1) 
                                       Filter: (CASE WHEN (menge = 1) THEN (CASE WHEN (subplan) THEN 0 ELSE 1
END)::numericELSE ((menge)::numeric - (subplan)) END > 0::numeric) 
                                       SubPlan
                                         ->  Subquery Scan turnus  (cost=0.05..0.08 rows=1 width=0) (actual
time=0.96..0.96rows=0 loops=282) 
                                               ->  Unique  (cost=0.05..0.08 rows=1 width=0) (actual time=0.96..0.96
rows=0loops=282) 
                                                     ->  Sort  (cost=0.05..0.06 rows=3 width=0) (actual time=0.95..0.95
rows=0loops=282) 
                                                           Sort Key: von, bis
                                                           ->  Append  (cost=0.00..0.03 rows=3 width=0) (actual
time=0.80..0.80rows=0 loops=282) 
                                                                 ->  Subquery Scan "*SELECT* 1"  (cost=0.00..0.01
rows=1width=0) (actual time=0.28..0.28 rows=0 loops=282) 
                                                                       ->  Result  (cost=0.00..0.01 rows=1 width=0)
(actualtime=0.28..0.28 rows=0 loops=282) 
                                                                             One-Time Filter:
bgetnumofbookedsingleres($0,'2003-07-22 08:00:00'::timestamp without time zone, '2003-07-22 20:00:00'::timestamp
withouttime zone) 
                                                                 ->  Subquery Scan "*SELECT* 2"  (cost=0.00..0.01
rows=1width=0) (actual time=0.25..0.25 rows=0 loops=282) 
                                                                       ->  Result  (cost=0.00..0.01 rows=1 width=0)
(actualtime=0.24..0.24 rows=0 loops=282) 
                                                                             One-Time Filter:
bgetnumofbookedsingleres($0,'2003-07-28 08:00:00'::timestamp without time zone, '2003-07-28 20:00:00'::timestamp
withouttime zone) 
                                                                 ->  Subquery Scan "*SELECT* 3"  (cost=0.00..0.01
rows=1width=0) (actual time=0.25..0.25 rows=0 loops=282) 
                                                                       ->  Result  (cost=0.00..0.01 rows=1 width=0)
(actualtime=0.24..0.24 rows=0 loops=282) 
                                                                             One-Time Filter:
bgetnumofbookedsingleres($0,'2003-07-29 08:00:00'::timestamp without time zone, '2003-07-29 20:00:00'::timestamp
withouttime zone) 
                                         ->  Aggregate  (cost=0.08..0.08 rows=1 width=0) (actual time=15.06..15.07
rows=1loops=2) 
                                               ->  Subquery Scan turnus  (cost=0.05..0.08 rows=1 width=0) (actual
time=0.14..0.20rows=3 loops=2) 
                                                     ->  Unique  (cost=0.05..0.08 rows=1 width=0) (actual
time=0.13..0.16rows=3 loops=2) 
                                                           ->  Sort  (cost=0.05..0.06 rows=3 width=0) (actual
time=0.13..0.14rows=3 loops=2) 
                                                                 Sort Key: von, bis
                                                                 ->  Append  (cost=0.00..0.03 rows=3 width=0) (actual
time=0.03..0.08rows=3 loops=1) 
                                                                       ->  Subquery Scan "*SELECT* 1"  (cost=0.00..0.01
rows=1width=0) (actual time=0.02..0.03 rows=1 loops=1) 
                                                                             ->  Result  (cost=0.00..0.01 rows=1
width=0)(actual time=0.01..0.01 rows=1 loops=1) 
                                                                       ->  Subquery Scan "*SELECT* 2"  (cost=0.00..0.01
rows=1width=0) (actual time=0.02..0.02 rows=1 loops=1) 
                                                                             ->  Result  (cost=0.00..0.01 rows=1
width=0)(actual time=0.01..0.01 rows=1 loops=1) 
                                                                       ->  Subquery Scan "*SELECT* 3"  (cost=0.00..0.01
rows=1width=0) (actual time=0.02..0.02 rows=1 loops=1) 
                                                                             ->  Result  (cost=0.00..0.01 rows=1
width=0)(actual time=0.01..0.01 rows=1 loops=1) 
                     ->  Hash  (cost=1.24..1.24 rows=24 width=47) (actual time=0.32..0.32 rows=0 loops=1)
                           ->  Seq Scan on gebaeude  (cost=0.00..1.24 rows=24 width=47) (actual time=0.02..0.21 rows=24
loops=1)
               ->  Hash  (cost=1.07..1.07 rows=7 width=10) (actual time=0.09..0.09 rows=0 loops=1)
                     ->  Seq Scan on standorte  (cost=0.00..1.07 rows=7 width=10) (actual time=0.02..0.06 rows=7
loops=1)
         ->  Hash  (cost=1.26..1.26 rows=26 width=17) (actual time=0.28..0.28 rows=0 loops=1)
               ->  Seq Scan on raumtypen  (cost=0.00..1.26 rows=26 width=17) (actual time=0.04..0.19 rows=26 loops=1)
         SubPlan
           ->  Seq Scan on auftragsressourcen  (cost=0.00..1.17 rows=1 width=20) (actual time=0.04..0.04 rows=0
loops=265)
                 Filter: ((auftrag_id = 46) AND (ressource_id = $0))
           ->  Subquery Scan turnus  (cost=0.05..0.08 rows=1 width=0) (actual time=0.96..0.96 rows=0 loops=263)
                 ->  Unique  (cost=0.05..0.08 rows=1 width=0) (actual time=0.95..0.95 rows=0 loops=263)
                       ->  Sort  (cost=0.05..0.06 rows=3 width=0) (actual time=0.95..0.95 rows=0 loops=263)
                             Sort Key: von, bis
                             ->  Append  (cost=0.00..0.03 rows=3 width=0) (actual time=0.79..0.79 rows=0 loops=263)
                                   ->  Subquery Scan "*SELECT* 1"  (cost=0.00..0.01 rows=1 width=0) (actual
time=0.26..0.26rows=0 loops=263) 
                                         ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.26..0.26 rows=0
loops=263)
                                               One-Time Filter: bgetnumofbookedsingleres($0, '2003-07-22
08:00:00'::timestampwithout time zone, '2003-07-22 20:00:00'::timestamp without time zone) 
                                   ->  Subquery Scan "*SELECT* 2"  (cost=0.00..0.01 rows=1 width=0) (actual
time=0.26..0.26rows=0 loops=263) 
                                         ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.25..0.25 rows=0
loops=263)
                                               One-Time Filter: bgetnumofbookedsingleres($0, '2003-07-28
08:00:00'::timestampwithout time zone, '2003-07-28 20:00:00'::timestamp without time zone) 
                                   ->  Subquery Scan "*SELECT* 3"  (cost=0.00..0.01 rows=1 width=0) (actual
time=0.25..0.25rows=0 loops=263) 
                                         ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.25..0.25 rows=0
loops=263)
                                               One-Time Filter: bgetnumofbookedsingleres($0, '2003-07-29
08:00:00'::timestampwithout time zone, '2003-07-29 20:00:00'::timestamp without time zone) 
           ->  Aggregate  (cost=0.08..0.08 rows=1 width=0) (actual time=18.43..18.43 rows=1 loops=2)
                 ->  Subquery Scan turnus  (cost=0.05..0.08 rows=1 width=0) (actual time=0.14..0.20 rows=3 loops=2)
                       ->  Unique  (cost=0.05..0.08 rows=1 width=0) (actual time=0.13..0.16 rows=3 loops=2)
                             ->  Sort  (cost=0.05..0.06 rows=3 width=0) (actual time=0.13..0.14 rows=3 loops=2)
                                   Sort Key: von, bis
                                   ->  Append  (cost=0.00..0.03 rows=3 width=0) (actual time=0.03..0.08 rows=3 loops=1)
                                         ->  Subquery Scan "*SELECT* 1"  (cost=0.00..0.01 rows=1 width=0) (actual
time=0.02..0.03rows=1 loops=1) 
                                               ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.01..0.01
rows=1loops=1) 
                                         ->  Subquery Scan "*SELECT* 2"  (cost=0.00..0.01 rows=1 width=0) (actual
time=0.02..0.02rows=1 loops=1) 
                                               ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.01..0.01
rows=1loops=1) 
                                         ->  Subquery Scan "*SELECT* 3"  (cost=0.00..0.01 rows=1 width=0) (actual
time=0.01..0.02rows=1 loops=1) 
                                               ->  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=0.01..0.01
rows=1loops=1) 
 Total runtime: 680.23 msec
(rows: 76)

At the moment we have ~300 rows in ressourcen, there are going to be
1000-10000 rows in it in production. The turnus union also normally has
around 100 rows.

With 300 rows in ressourcen and ~50 turnus rows the query takes 15
seconds here. With 3000 rows in ressourcen I expect around 150 seconds
which is too long. iGetNumOfBookedRes is pretty expensive, and because
almost all rows in ressource have menge = 1 we wrote an less expensive
function for that case that only checks for the existance of a row in
belegungen for the given time.

Attached are the table schemas, function definitions and the same query
with a larger turnus union. The large query takes ~13 seconds here. Test
data can be downloaded from

http://www.bab24.de/media/testdata.sql

If it helps to explain what this query actually does, I'll gladly follow
up with that. We're using PostgreSQL 7.3.3.

I hope I didn't forget anyting important.

Thanks

--
Markus Bertheau
Cenes Data GmbH

Attachment

pgsql-sql by date:

Previous
From: Bruce Momjian
Date:
Subject: Re: min() and NaN
Next
From: Markus Bertheau
Date:
Subject: Re: slow query