Thread: Postgresql performance on Solaris

Postgresql performance on Solaris

From
"Scott Buchan"
Date:

Hi,

 

I am getting poor performance from my postgresql (version 7.3.2 compiled with gcc 2.95.2) database when running load tests on my web application.  The database works great until I get above 200 concurrent users.  The following query runtime will vary from:

 

explain analyze select TIMESTAMP, SPM_CONVSRCADDR,

SPM_CONVDSTADDR, SPM_CONVSRCPORT, SPM_CONVDSTPORT,

SPM_CONVPROTO, ACTION from FirewallLogs where TimeStamp>=1044939600000

and TimeStamp<=1047391020000 and spm_subid='462' and fqdn ='bs2@pcp' order by

timestamp desc;

                                                                    QUERY PLAN                              

---------------------------------------------------------------------------------------------------------------------------------------------------

 Sort  (cost=3525.27..3527.49 rows=888 width=47) (actual time=55.65..56.16 rows=913 loops=1)

   Sort Key: "timestamp"

   ->  Index Scan using firewalllogsindex on firewalllogs  (cost=0.00..3481.77 rows=888 width=47) (actual time=0.40..50.33 rows=913 loops=1)

         Index Cond: ((fqdn = 'bs2@pcp'::character varying) AND ("timestamp" >= 1044939600000::bigint) AND ("timestamp" <= 1047391020000::bigint))

         Filter: (spm_subid = 462)

 Total runtime: 57.36 msec

(6 rows)

 

to

 

                                                                    QUERY PLAN                              

---------------------------------------------------------------------------------------------------------------------------------------------------

 Sort  (cost=3525.27..3527.49 rows=888 width=47) (actual time=2323.79..2324.31 rows=913 loops=1)

   Sort Key: "timestamp"

   ->  Index Scan using firewalllogsindex on firewalllogs  (cost=0.00..3481.77 rows=888 width=47) (actual time=0.26..2318.11 rows=913 loops=1)

         Index Cond: ((fqdn = 'bs2@pcp'::character varying) AND ("timestamp" >= 1044939600000::bigint) AND ("timestamp" <= 1047391020000::bigint))

         Filter: (spm_subid = 462)

 Total runtime: 2325.62 msec

(6 rows)

 

NOTE:  I am only performing select queries - no inserts, deletes or updates. 

I am running postgresql on a SunFire v880 with 4 750MHz sparcv9 processors with 8 Gig of RAM running solaris 8.  I have 2 tables with 500,000 records in each and both tables are indexed.  I am connecting to the database through JDBC using a pool of connections (tried pools of 50, 100, and 200 with similar results).  When running the load tests, the cpu of the box is always above 60% idle.  I have run iostat and I am not seeing any problems with io.

 

I have tried different size shared_buffers from 4100 to 64000, and I have added the following to the /etc/system file:

 

set shmsys:shminfo_shmmax=0xffffffff

set shmsys:shminfo_shmmin=1

set shmsys:shminfo_shmmni=256

set shmsys:shminfo_shmseg=256

set semsys:seminfo_semmap=256

set semsys:seminfo_semmni=512

set semsys:seminfo_semmns=512

set semsys:seminfo_semmsl=32

 

I understand that this could be a problem with the kernel and not postgresql but I am at a loss at what to change to get better performance out of the database or the kernel.

 

Any help would be appreciated.

 

 

Re: Postgresql performance on Solaris

From
Tom Lane
Date:
"Scott Buchan" <sbuchan@technicacorp.com> writes:
> I am getting poor performance from my postgresql (version 7.3.2 compiled
> with gcc 2.95.2) database when running load tests on my web application.
> The database works great until I get above 200 concurrent users.

Hmm ... that sounds kinda familiar; you might check the archives for
similar reports from Solaris users.  AFAIR we didn't figure out the
problem yet, but there's some raw data available.

> When running the load tests, the cpu of the box is always above 60%
> idle.  I have run iostat and I am not seeing any problems with io.

[ scratches head ... ]  If the bottleneck isn't CPU, and it isn't I/O,
then what could it be?  You sure about the above observations?  (Does
iostat include swap activity on that platform?)

The only other idea I can think of is that there's some weird effect in
the locking code (which only shows up with lots of concurrent backends)
such that would-be lockers repeatedly fail and sleep when they should
have gotten the lock.  If you can figure out how to tell the difference
between a backend waiting for disk I/O and one waiting for a semaphore
or sleeping, it'd be interesting to see what the majority of the
backends are doing.

Another way to try to gather some data is to attach to one of the
backend processes with a debugger, and just stop it to get a stack trace
every so often.  If the stack traces tend to point to the same place
that would give some info about the bottleneck.

            regards, tom lane


Re: Postgresql performance on Solaris

From
"scott.marlowe"
Date:
On Tue, 1 Apr 2003, Tom Lane wrote:

> The only other idea I can think of is that there's some weird effect in
> the locking code (which only shows up with lots of concurrent backends)
> such that would-be lockers repeatedly fail and sleep when they should
> have gotten the lock.  If you can figure out how to tell the difference
> between a backend waiting for disk I/O and one waiting for a semaphore
> or sleeping, it'd be interesting to see what the majority of the
> backends are doing.

I was thinking along the lines of it being something like the old Linux
kernel had with apache and other programs with waking all the processes.
It could be that something about Solaris is meaning that every backend
process, no matter how idle they are, get "touched" every time something
is done.  Just guessing.