Re: Performance issues - Mailing list pgsql-performance

From Gavin Flower
Subject Re: Performance issues
Date
Msg-id 5504A2CE.30007@archidevsys.co.nz
Whole thread Raw
In response to Re: Performance issues  (Tomas Vondra <tomas.vondra@2ndquadrant.com>)
Responses Re: Performance issues
List pgsql-performance
On 14/03/15 13:12, Tomas Vondra wrote:
> On 14.3.2015 00:28, Vivekanand Joshi wrote:
>> Hi Guys,
>>
>> So here is the full information attached as well as in the link
>> provided below:
>>
>> http://pgsql.privatepaste.com/41207bea45
>>
>> I can provide new information as well.
> Thanks.
>
> We still don't have EXPLAIN ANALYZE - how long was the query running (I
> assume it got killed at some point)? It's really difficult to give you
> any advices because we don't know where the problem is.
>
> If EXPLAIN ANALYZE really takes too long (say, it does not complete
> after an hour / over night), you'll have to break the query into parts
> and first tweak those independently.
>
> For example in the first message you mentioned that select from the
> S_V_D_CAMPAIGN_HIERARCHY view takes ~9 minutes, so start with that. Give
> us EXPLAIN ANALYZE for that query.
>
> Few more comments:
>
> (1) You're using CTEs - be aware that CTEs are not just aliases, but
>      impact planning / optimization, and in some cases may prevent
>      proper optimization. Try replacing them with plain views.
>
> (2) Varadharajan Mukundan already recommended you to create index on
>      s_f_promotion_history.send_dt. Have you tried that? You may also
>      try creating an index on all the columns needed by the query, so
>      that "Index Only Scan" is possible.
>
> (3) There are probably additional indexes that might be useful here.
>      What I'd try is adding indexes on all columns that are either a
>      foreign key or used in a WHERE condition. This might be an
>      overkill in some cases, but let's see.
>
> (4) I suspect many of the relations referenced in the views are not
>      actually needed in the query, i.e. the join is performed but
>      then it's just discarded because those columns are not used.
>      Try to simplify the views as much has possible - remove all the
>      tables that are not really necessary to run the query. If two
>      queries need different tables, maybe defining two views is
>      a better approach.
>
> (5) The vmstat / iostat data are pretty useless - what you provided are
>      averages since the machine was started, but we need a few samples
>      collected when the query is running. I.e. start the query, and then
>      give us a few samples from these commands:
>
>      iostat -x -k 1
>      vmstat 1
>
>> Would like to see if queries of these type can actually run in
>> postgres server?
> Why not? We're running DWH applications on tens/hundreds of GBs.
>
>> If yes, what would be the minimum requirements for hardware? We would
>> like to migrate our whole solution on PostgreSQL as we can spend on
>> hardware as much as we can but working on a proprietary appliance is
>> becoming very difficult for us.
> That's difficult to say, because we really don't know where the problem
> is and how much the queries can be optimized.
>
>
I notice that no one appears to have suggested the default setting in
postgresql.conf - these need changing as they are initially set up for
small machines, and to let PostgreSQL take anywhere near full advantage
of a box have large amounts of RAM, you need to change some of the
configuration settings!

For example 'temp_buffers' (default 8MB) and 'maintenance_work_mem'
(default 16MB) should be drastically increased,  and there are other
settings that need changing.  The precise values depend on many factors,
but the initial values set by default are definitely far too small for
your usage.

Am assuming that you are looking at PostgreSQL 9.4.



Cheers,
Gavin




pgsql-performance by date:

Previous
From: Varadharajan Mukundan
Date:
Subject: Re: Anyone have experience using PG on a NetApp All-Flash FAS8000?
Next
From: Varadharajan Mukundan
Date:
Subject: Re: Performance issues