Re: PostgreSQL performance problem -> tuning - Mailing list pgsql-performance

From Yaroslav Mazurak
Subject Re: PostgreSQL performance problem -> tuning
Date
Msg-id 3F335715.109@lviv.bank.gov.ua
Whole thread Raw
In response to Re: PostgreSQL performance problem -> tuning  (Richard Huxton <dev@archonet.com>)
List pgsql-performance
Hi, All!


Richard Huxton wrote:

> On Thursday 07 August 2003 17:30, Yaroslav Mazurak wrote:

>>Richard Huxton wrote:

>>>On Thursday 07 August 2003 09:24, Yaroslav Mazurak wrote:

> PG's memory use can be split into four areas (note - I'm not a developer so
> this could be wrong).
> 1. Shared memory - vital so that different connections can communicate with
> each other. Shouldn't be too large, otherwise PG spends too long managing its
> shared memory rather than working on your queries.
> 2. Sort memory - If you have to sort results during a query it will use up to
> the amount you define in sort_mem and then use disk if it needs any more.
> This is for each sort.
> 3. Results memory - If you're returning 8000 rows then PG will assemble these
> and send them to the client which also needs space to store the 8000 rows.
> 4. Working memory - to actually run the queries - stack and heap space to keep
> track of its calculations etc.

    Hence, total free RAM - shared_buffers - k * sort_mem -
effective_cache_size == (results memory + working memory)?

> For the moment, I'd leave the settings roughly where they are while we look at
> the query, then once that's out of the way we can fine-tune the settings.

    OK.

>>    Additionally, with user-defined function beginchar (SUBSTR(..., 1, 1)),
>>used for indexing, showcalc executes about 16 seconds. With function
>>SUBSTR the same showcalc executes 6 seconds.

> Fair enough - substr should be fairly efficient.

    Cost of user-defined SQL function call in PostgreSQL is high?

> OK - bear in mind that these suggestions are made without the benefit of the
> explain analyse:

> 1. You could try splitting out the various tags of your mask into different
> fields - that will instantly eliminate all the substr() calls and might make
> a difference. If you want to keep the mask for display purposes, we could
> build a trigger to keep it in sync with the separate flags.

    This will be next step. :)

> 2. Use a "calculations" table and build your results step by step. So -
> calculate all the simple accounts, then calculate the ones that contain the
> simple accounts.

    I give to SQL to user and few helper functions. Therefore single step
is required for building results.

> 3. You could keep a separate "account_contains" table that might look like:
>   acc_id | contains
>   A001   | A001
>   A002   | A002
>   A003   | A003
>   A003   | A001
>   A004   | A004
>   A004   | A003
>   A004   | A001

> So here A001/A002 are simple accounts but A003 contains A001 too. A004
> contains A003 and A001. The table can be kept up to date automatically using
> some triggers.
> This should make it simple to pick up all the accounts contained within the
> target account and might mean you can eliminate the recursion.

    Thanks, sounds not so bad, but I suspect that this method don't improve
performance essentially.
    I think about another secondary table for showcomp (compshow :)) with
showings "compiled" into account numbers and characteritics. After
inserting or updating new or old showing this showing will be
"recompiled" by explicit function call or trigger into atomary account
numbers and characteristics.

> Post the EXPLAIN ANALYSE first - maybe someone smarter than me will have an
> idea.

    First result - simple showing 'B00202' (without recursion).
    Second result - complex showing 'B00204' with recursion (1 level depth).
    Showing 'B00202' contains 85 accounts, 'B00203' - 108 accounts, and
'B00204' = 'B00202' - 'B00203'.
    Query text:

EXPLAIN ANALYZE SELECT COALESCE(
    (SELECT sc.koef * 100
        FROM showing AS s NATURAL JOIN showcomp AS sc
        WHERE s.kod = 'B00202'
            AND NOT SUBSTR(acc_mask, 1, 1) = '['
            AND SUBSTR(acc_mask, 1, 4) = '6010'
            AND SUBSTR(acc_mask, 5, 1) = SUBSTR('20', 1, 1)),
    (SELECT SUM(sc.koef * COALESCE(showcalc(SUBSTR(acc_mask, 2,
LENGTH(acc_mask) - 2), '20', '6010', 100), 0))
        FROM showing AS s NATURAL JOIN showcomp AS sc
        WHERE s.kod = 'B00202'
            AND SUBSTR(acc_mask, 1, 1) = '['),
    0) AS showing;

EXPLAIN ANALYZE SELECT COALESCE(
    (SELECT sc.koef * 100
        FROM showing AS s NATURAL JOIN showcomp AS sc
        WHERE s.kod = 'B00204'
            AND NOT SUBSTR(acc_mask, 1, 1) = '['
            AND SUBSTR(acc_mask, 1, 4) = '6010'
            AND SUBSTR(acc_mask, 5, 1) = SUBSTR('20', 1, 1)),
    (SELECT SUM(sc.koef * COALESCE(showcalc(SUBSTR(acc_mask, 2,
LENGTH(acc_mask) - 2), '20', '6010', 100), 0))
        FROM showing AS s NATURAL JOIN showcomp AS sc
        WHERE s.kod = 'B00204'
            AND SUBSTR(acc_mask, 1, 1) = '['),
    0) AS showing;

            QUERY PLAN


------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=704.39..704.39
rows=1 loops=1)
    InitPlan
      ->  Hash Join  (cost=5.22..449.63 rows=1 width=19) (actual
time=167.28..352.90 rows=1 loops=1)
            Hash Cond: ("outer".id_show = "inner".id_show)
            ->  Seq Scan on showcomp sc  (cost=0.00..444.40 rows=1
width=15) (actual time=23.29..350.17 rows=32 loops=1)
                  Filter: ((substr((acc_mask)::text, 1, 1) <> '['::text)
AND (substr((acc_mask)::text, 1, 4) = '6010'::text) AND
(substr((acc_mask)::text, 5, 1) = '2'::text))
            ->  Hash  (cost=5.22..5.22 rows=1 width=4) (actual
time=0.67..0.67 rows=0 loops=1)
                  ->  Index Scan using index_showing_kod on showing s
(cost=0.00..5.22 rows=1 width=4) (actual time=0.61..0.64 rows=1 loops=1)
                        Index Cond: (kod = 'B00202'::character varying)
      ->  Hash Join  (cost=5.22..449.63 rows=1 width=19) (actual
time=166.20..351.28 rows=1 loops=1)
            Hash Cond: ("outer".id_show = "inner".id_show)
            ->  Seq Scan on showcomp sc  (cost=0.00..444.40 rows=1
width=15) (actual time=23.36..349.24 rows=32 loops=1)
                  Filter: ((substr((acc_mask)::text, 1, 1) <> '['::text)
AND (substr((acc_mask)::text, 1, 4) = '6010'::text) AND
(substr((acc_mask)::text, 5, 1) = '2'::text))
            ->  Hash  (cost=5.22..5.22 rows=1 width=4) (actual
time=0.17..0.17 rows=0 loops=1)
                  ->  Index Scan using index_showing_kod on showing s
(cost=0.00..5.22 rows=1 width=4) (actual time=0.12..0.14 rows=1 loops=1)
                        Index Cond: (kod = 'B00202'::character varying)
      ->  Aggregate  (cost=312.61..312.61 rows=1 width=28) (never executed)
            ->  Hash Join  (cost=5.22..312.61 rows=1 width=28) (never
executed)
                  Hash Cond: ("outer".id_show = "inner".id_show)
                  ->  Seq Scan on showcomp sc  (cost=0.00..307.04
rows=69 width=24) (never executed)
                        Filter: (substr((acc_mask)::text, 1, 1) = '['::text)
                  ->  Hash  (cost=5.22..5.22 rows=1 width=4) (never
executed)
                        ->  Index Scan using index_showing_kod on
showing s  (cost=0.00..5.22 rows=1 width=4) (never executed)
                              Index Cond: (kod = 'B00202'::character
varying)
      ->  Aggregate  (cost=312.61..312.61 rows=1 width=28) (never executed)
            ->  Hash Join  (cost=5.22..312.61 rows=1 width=28) (never
executed)
                  Hash Cond: ("outer".id_show = "inner".id_show)
                  ->  Seq Scan on showcomp sc  (cost=0.00..307.04
rows=69 width=24) (never executed)
                        Filter: (substr((acc_mask)::text, 1, 1) = '['::text)
                  ->  Hash  (cost=5.22..5.22 rows=1 width=4) (never
executed)
                        ->  Index Scan using index_showing_kod on
showing s  (cost=0.00..5.22 rows=1 width=4) (never executed)
                              Index Cond: (kod = 'B00202'::character
varying)
  Total runtime: 706.82 msec
(33 rows)


            QUERY PLAN


------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
  Result  (cost=0.00..0.01 rows=1 width=0) (actual time=6256.20..6256.21
rows=1 loops=1)
    InitPlan
      ->  Hash Join  (cost=5.22..449.63 rows=1 width=19) (actual
time=357.43..357.43 rows=0 loops=1)
            Hash Cond: ("outer".id_show = "inner".id_show)
            ->  Seq Scan on showcomp sc  (cost=0.00..444.40 rows=1
width=15) (actual time=23.29..355.41 rows=32 loops=1)
                  Filter: ((substr((acc_mask)::text, 1, 1) <> '['::text)
AND (substr((acc_mask)::text, 1, 4) = '6010'::text) AND
(substr((acc_mask)::text, 5, 1) = '2'::text))
            ->  Hash  (cost=5.22..5.22 rows=1 width=4) (actual
time=0.22..0.22 rows=0 loops=1)
                  ->  Index Scan using index_showing_kod on showing s
(cost=0.00..5.22 rows=1 width=4) (actual time=0.16..0.19 rows=1 loops=1)
                        Index Cond: (kod = 'B00204'::character varying)
      ->  Hash Join  (cost=5.22..449.63 rows=1 width=19) (never executed)
            Hash Cond: ("outer".id_show = "inner".id_show)
            ->  Seq Scan on showcomp sc  (cost=0.00..444.40 rows=1
width=15) (never executed)
                  Filter: ((substr((acc_mask)::text, 1, 1) <> '['::text)
AND (substr((acc_mask)::text, 1, 4) = '6010'::text) AND
(substr((acc_mask)::text, 5, 1) = '2'::text))
            ->  Hash  (cost=5.22..5.22 rows=1 width=4) (never executed)
                  ->  Index Scan using index_showing_kod on showing s
(cost=0.00..5.22 rows=1 width=4) (never executed)
                        Index Cond: (kod = 'B00204'::character varying)
      ->  Aggregate  (cost=312.61..312.61 rows=1 width=28) (actual
time=2952.69..2952.69 rows=1 loops=1)
            ->  Hash Join  (cost=5.22..312.61 rows=1 width=28) (actual
time=12.59..264.69 rows=2 loops=1)
                  Hash Cond: ("outer".id_show = "inner".id_show)
                  ->  Seq Scan on showcomp sc  (cost=0.00..307.04
rows=69 width=24) (actual time=0.09..251.52 rows=1035 loops=1)
                        Filter: (substr((acc_mask)::text, 1, 1) = '['::text)
                  ->  Hash  (cost=5.22..5.22 rows=1 width=4) (actual
time=0.17..0.17 rows=0 loops=1)
                        ->  Index Scan using index_showing_kod on
showing s  (cost=0.00..5.22 rows=1 width=4) (actual time=0.12..0.14
rows=1 loops=1)
                              Index Cond: (kod = 'B00204'::character
varying)
      ->  Aggregate  (cost=312.61..312.61 rows=1 width=28) (actual
time=2945.79..2945.80 rows=1 loops=1)
            ->  Hash Join  (cost=5.22..312.61 rows=1 width=28) (actual
time=12.02..263.63 rows=2 loops=1)
                  Hash Cond: ("outer".id_show = "inner".id_show)
                  ->  Seq Scan on showcomp sc  (cost=0.00..307.04
rows=69 width=24) (actual time=0.09..251.09 rows=1035 loops=1)
                        Filter: (substr((acc_mask)::text, 1, 1) = '['::text)
                  ->  Hash  (cost=5.22..5.22 rows=1 width=4) (actual
time=0.17..0.17 rows=0 loops=1)
                        ->  Index Scan using index_showing_kod on
showing s  (cost=0.00..5.22 rows=1 width=4) (actual time=0.12..0.14
rows=1 loops=1)
                              Index Cond: (kod = 'B00204'::character
varying)
  Total runtime: 6257.35 msec
(33 rows)

>>    Anyway, 600Mb is too low for PostgreSQL for executing my query - DBMS
>>raise error after 11.5 hours (of estimated 13?). :(

> I think the problem is the 13 hours, not the 600MB. Once we've got the query
> running in a reasonable length of time (seconds) then the memory requirements
> will go down, I'm sure.

    OK, that's right.


With best regards
    Yaroslav Mazurak.


pgsql-performance by date:

Previous
From: Tom Lane
Date:
Subject: Re: Moving postgresql.conf tunables into 2003...
Next
From: mixo
Date:
Subject: Perfomance Tuning