Re: Hash join on int takes 8..114 seconds - Mailing list pgsql-performance

From Scott Carey
Subject Re: Hash join on int takes 8..114 seconds
Date
Msg-id BDFBB77C9E07BE4A984DAAE981D19F961ACA17D9BB@EXVMBX018-1.exch018.msoutlookonline.net
Whole thread Raw
In response to Re: Hash join on int takes 8..114 seconds  ("Andrus" <kobruleht2@hot.ee>)
Responses Re: Hash join on int takes 8..114 seconds
List pgsql-performance
> Appoaches which probably does not change perfomance:

> 6. Upgrade to 8.4 or to 8.3.5

Both of these will improve performance a little, even with the same query plan and same data.  I would expect about a
10%improvement for 8.3.x on most memory bound select queries. 8.4 won't be out for a few months. 

> 7. run server on standalone mode and recover 1 GB pg_shdepend bloated index.

> 8. tune some conf file parameters:
> > work_mem = 512
> I'd consider increasing this value a little - 0.5 MB seems too low to me
> (but not necessarily).

This is very easy to try.  You can change work_mem for just a single session, and this can in some cases help
performancequite a bit, and in others not at all. 
I would not recommend having it lower than at least 4MB on a server like that unless you have a lot of concurrently
activequeries / connections. 
To try it, simply use the SET command.  To try out 32MB, just do:
SET work_mem = '32MB';
and the value will be changed locally for that session only.  See if it affects your test query or not.
http://www.postgresql.org/docs/8.3/interactive/sql-set.html

> > effective_cache_size= 70000
> Well, your server has 2GB of RAM and usually it's recommended to set
> this value to about 60-70% of your RAM, so using 540MB (25%) seems quite
> low.

> Data size is nearly the same as RAM size. It is unpleasant surprise that
> queries take so long time.

> What should I do next?

First, demonstrate that it is all or mostly in memory -- use iostat or other tools to ensure that there is not much
diskactivity during the query.  If your system doesn't have iostat installed, it should be installed.  It is a very
usefultool. 
If it is all cached in memory, you may want to ensure that your shared_buffers is a reasonalbe size so that there is
lessshuffling of data from the kernel to postgres and back.  Generally, shared_buffers works best between 5% and 25% of
systemmemory. 
If it is completely CPU bound then the work done for the query has to be reduced by altering the plan to a more optimal
oneor making the work it has to do at each step easier.  Most of the ideas in this thread revolve around those things. 

Based on the time it took to do the vacuum, I suspect your disk subsystem is a bit slow.  If it can be determined that
thereis much disk I/O in your use cases, there are generally several things that can be done to tune Linux I/O.  The
mainones in my experience are the 'readahead' value for each disk which helps sequential reads significantly, and
tryingout the linux 'deadline' scheduler and comparing it to the more commonly used 'cfq' scheduler.  If the system is
configuredwith the anticipatory scheduler, absolutely switch to cfq or deadline as the anticipatory scheduler will
performhorribly poorly for a database. 

> Andrus.

pgsql-performance by date:

Previous
From: Tomas Vondra
Date:
Subject: Re: Hash join on int takes 8..114 seconds
Next
From: "Andrus"
Date:
Subject: limit clause produces wrong query plan