Re: Join Query Perfomance Issue - Mailing list pgsql-performance

From Thomas Zaksek
Subject Re: Join Query Perfomance Issue
Date
Msg-id 47B170BF.5000206@ptt.uni-due.de
Whole thread Raw
In response to Re: Join Query Perfomance Issue  ("Scott Marlowe" <scott.marlowe@gmail.com>)
Responses Re: Join Query Perfomance Issue
List pgsql-performance
Scott Marlowe schrieb:
> On Feb 11, 2008 12:08 PM, Thomas Zaksek <zaksek@ptt.uni-due.de> wrote:
>
>> I have serious performance problems with the following type of queries:
>> /
>> /explain analyse SELECT '12.11.2007 18:04:00 UTC' AS zeit,
>>                        'M' AS datatyp,
>>                        p.zs_nr AS zs_de,
>>                    j_ges,
>>                        de_mw_abh_j_lkw(mw_abh) AS j_lkw,
>>                        de_mw_abh_v_pkw(mw_abh) AS v_pkw,
>>                        de_mw_abh_v_lkw(mw_abh) AS v_lkw,
>>                    de_mw_abh_p_bel(mw_abh) AS p_bel
>>                    FROM  messpunkt p, messungen_v_dat_2007_11_12 m, de_mw w
>>                        WHERE  m.ganglinientyp = 'M'
>>                AND 381 = m.minute_tag
>>                        AND (p.nr, p.mw_nr) = (m.messpunkt, w.nr);
>>
>> Explain analze returns
>>
>>  Nested Loop  (cost=0.00..50389.39 rows=3009 width=10) (actual
>> time=0.503..320.872 rows=2189 loops=1)
>>    ->  Nested Loop  (cost=0.00..30668.61 rows=3009 width=8) (actual
>> time=0.254..94.116 rows=2189 loops=1)
>>
>
> This nested loop is using us most of your time.  Try increasing
> work_mem and see if it chooses a better join plan, and / or turn off
> nested loops for a moment and see if that helps.
>
> set enable_nestloop = off
>
> Note that set enable_xxx = off
>
> Is kind of a hammer to the forebrain setting.  It's not subtle, and
> the planner can't work around it.  So use them with caution.  That
> said, I had one reporting query that simply wouldn't run fast without
> turning off nested loops for that one.  But don't turn off nested
> queries universally, they are still a good choice for smaller amounts
> of data.
>
I tried turning off nestloop, but with terrible results:

Hash Join  (cost=208328.61..228555.14 rows=3050 width=10) (actual
time=33421.071..40362.136 rows=2920 loops=1)
   Hash Cond: (w.nr = p.mw_nr)
   ->  Seq Scan on de_mw w  (cost=0.00..14593.79 rows=891479 width=10)
(actual time=0.012..3379.971 rows=891479 loops=1)
   ->  Hash  (cost=208290.49..208290.49 rows=3050 width=8) (actual
time=33420.877..33420.877 rows=2920 loops=1)
         ->  Merge Join  (cost=5303.71..208290.49 rows=3050 width=8)
(actual time=31.550..33407.688 rows=2920 loops=1)
               Merge Cond: (p.nr = m.messpunkt)
               ->  Index Scan using messpunkt_nr_idx on messpunkt p
(cost=0.00..238879.39 rows=6306026 width=12) (actual
time=0.056..17209.317 rows=4339470 loops=1)
               ->  Sort  (cost=5303.71..5311.34 rows=3050 width=4)
(actual time=25.973..36.858 rows=2920 loops=1)
                     Sort Key: m.messpunkt
                     ->  Index Scan using
messungen_v_dat_2007_11_12_messpunkt_minute_tag_idx on
messungen_v_dat_2007_11_12 m  (cost=0.00..5127.20 rows=3050 width=4)
(actual time=0.124..12.822 rows=2920 loops=1)
                           Index Cond: ((ganglinientyp = 'M'::bpchar)
AND (651 = minute_tag))
 Total runtime: 40373.512 ms
(12 rows)
Looks crappy, isn't it?

I also tried to increase work_men, now the config is
work_mem = 4MB
maintenance_work_mem = 128MB,
in regard to performance, it wasnt effective at all.

The postgresql runs  on a HP Server with dual Opteron, 3GB of Ram, what
are good settings here? The database will have to work with tables of
several 10Millions of Lines, but only a few columns each. No more than
maybe ~5 clients accessing the database at the same time.


pgsql-performance by date:

Previous
From: "Linux Guru"
Date:
Subject: Re: Update with Subquery Performance
Next
From: "Chris Kratz"
Date:
Subject: Re: mis-estimate in nested query causes slow runtimes