Re: Join Query Perfomance Issue - Mailing list pgsql-performance

From Thomas Zaksek
Subject Re: Join Query Perfomance Issue
Date
Msg-id 47B41E83.6060509@ptt.uni-due.de
Whole thread Raw
In response to Re: Join Query Perfomance Issue  ("Scott Marlowe" <scott.marlowe@gmail.com>)
Responses Re: Join Query Perfomance Issue
List pgsql-performance
Scott Marlowe schrieb:
>
> Yeah, it didn't help.  I was expecting the query planner to switch to
> a more efficient join plan.
>
>
> Try setting it higher for JUST THIS query.  i.e.
>
> set work_mem=128M;
> explain analyze select ....
>
> and see how that runs.  Then play with it til you've got it down to
> what helps.  Note that work_mem in postgresql.conf being too large can
> be dangerous, so it might be something you set for just this query for
> safety reasons.
>
>
Tried some values for work_mem like 32M, 128M, 256M, not much of a
difference to 4M, so i think work_mem is high enough here in basic
configuration.

I have now kind of optimized the query to a join of to tables(using
materialized views), basically like this:

SELECT  foo
        FROM messungen_v_dat_2007_11_12 m
            INNER JOIN messwerte_mv p ON p.nr = m.messpunkt
                    WHERE  m.ganglinientyp = 'M'
            AND xxx = m.minute_tag;


Are there any major flaws in this construction? Is there a better way to
join two tables this way?
Best i get here is a runtime of about 100ms, what seems ok to me.
The plan is like

nested loop
        index scan
        index scan

Nested Loop  (cost=0.00..31157.91 rows=3054 width=14) (actual
time=0.252..149.557 rows=2769 loops=1)
   ->  Index Scan using
messungen_v_dat_2007_11_12_messpunkt_minute_tag_idx on
messungen_v_dat_2007_11_12 m  (cost=0.00..5134.28 rows=3054 width=4)
(actual time=0.085..11.562 rows=2769 loops=1)
         Index Cond: ((ganglinientyp = 'M'::bpchar) AND (799 = minute_tag))
   ->  Index Scan using messwerte_mv_nr_idx on messwerte_mv p
(cost=0.00..8.51 rows=1 width=18) (actual time=0.031..0.035 rows=1
loops=2769)
         Index Cond: (p.nr = m.messpunkt)
 Total runtime: 159.703 ms
(6 rows)

Nested Loop is not the best regarding to performance, but there isn't a
way to avoid it here?

Another strange problem occurs when i retry the query after about 12
hours break without akivity on the database (starting work in the
morning) :
The query runs incredible slow (~3sec), analyse on the tables doesn't
change much. But when i switch enable_netloop to false, retry the query
(very bad result, > 30sec), then set enable_nestloop back to true, the
query works amazingly fast again (100ms). Note that explain analyse
provides the exactly similar plan for the 3sec at the beginning and the
fast 100ms later. I have absolutly no idea what causes this behavior.

pgsql-performance by date:

Previous
From: "Greg Stark"
Date:
Subject: Re: Anyone using a SAN?
Next
From: "Linux Guru"
Date:
Subject: Re: Creating and updating table using function parameter reference