Re: Performance query about large tables, lots of concurrent access - Mailing list pgsql-performance

From Karl Wright
Subject Re: Performance query about large tables, lots of concurrent access
Date
Msg-id 4676E422.8090406@metacarta.com
Whole thread Raw
In response to Performance query about large tables, lots of concurrent access  (Karl Wright <kwright@metacarta.com>)
List pgsql-performance
Karl Wright wrote:
> Hi,
>
> I have an application which really exercises the performance of
> postgresql in a major way, and I am running into a performance
> bottleneck with Postgresql 8.1 that I do not yet understand.
>
> Here are the details:
>
> - There is a primary table, with some secondary tables
> - The principle transaction consists of a "SELECT...FOR UPDATE",
> followed by either an INSERT or an UPDATE on the primary table
> - INSERTs, DELETEs, and UPDATEs may occur on the secondary table
> depending on what happens with the primary table, for any given
> transaction.  The secondary table has about 10x the number of rows as
> the primary.
> - All operations are carefully chosen so that highly discriminatory
> indexes are used to locate the record(s) in question.  The execution
> plans show INDEX SCAN operations being done in all cases.
> - At any given time, there are up to 100 of these operations going on at
> once against the same database.
>
> What I am seeing:
>
> - In postgresql 7.4, the table activity seems to be gated by locks, and
> runs rather slowly except when the sizes of the tables are small.
> - In postgresql 8.1, locks do not seem to be an issue, and the activity
> runs about 10x faster than for postgresql 7.4.
> - For EITHER database version, the scaling behavior is not the log(n)
> behavior I'd expect (where n is the number of rows in the table), but
> much more like linear performance.  That is, as the tables grow,
> performance drops off precipitously.  For a primary table size up to
> 100,000 rows or so, I get somewhere around 700 transactions per minute,
> on average.  Between 100,000 and 1,000,000 rows I got some 150
> transactions per minute.  At about 1,500,000 rows I get about 40
> transactions per minute.
> - Access to a row in the secondary table (which right now has 13,000,000
> rows in it) via an index that has extremely good discriminatory ability
> on a busy machine takes about 90 seconds elapsed time at the moment -
> which I feel is pretty high.
>
> I tried increasing the shared_buffers parameter to see if it had any
> impact on overall throughput.  It was moderately helpful going from the
> small default value up to 8192, but less helpful when I increased it
> beyond that.  Currently I have it set to 131072.
>
> Question:  Does anyone have any idea what bottleneck I am hitting?  An
> index's performance should in theory scale as the log of the number of
> rows - what am I missing here?
>
> Thanks very much!
> Karl
>

I suppose I should also have noted that the postgresql processes that
are dealing with the transactions seem to be CPU bound.  Here's a "top"
from the running system:

top - 15:58:50 up 4 days,  4:45,  1 user,  load average: 17.14, 21.05, 22.46
Tasks: 194 total,  15 running, 177 sleeping,   0 stopped,   2 zombie
Cpu(s): 98.4% us,  1.5% sy,  0.0% ni,  0.0% id,  0.1% wa,  0.0% hi,  0.0% si
Mem:  16634256k total, 16280244k used,   354012k free,   144560k buffers
Swap:  8008360k total,       56k used,  8008304k free, 15071968k cached

   PID USER      PR  NI  VIRT  RES  SHR S %CPU %MEM    TIME+  COMMAND
15966 postgres  18   0 1052m 1.0g 1.0g R 66.5  6.3   0:18.64 postmaster
14683 postgres  17   0 1053m 1.0g 1.0g R 54.9  6.3   0:17.90 postmaster
17050 postgres  15   0 1052m  93m  90m S 50.3  0.6   0:06.42 postmaster
16816 postgres  18   0 1052m 166m 162m R 46.3  1.0   0:04.80 postmaster
16697 postgres  18   0 1052m 992m 988m R 42.3  6.1   0:15.49 postmaster
17272 postgres  16   0 1053m 277m 273m S 30.8  1.7   0:09.91 postmaster
16659 postgres  16   0 1052m 217m 213m R 29.8  1.3   0:06.60 postmaster
15509 postgres  18   0 1052m 1.0g 1.0g R 23.2  6.4   0:26.72 postmaster
16329 postgres  18   0 1052m 195m 191m R 16.9  1.2   0:05.54 postmaster
14019 postgres  20   0 1052m 986m 983m R 16.5  6.1   0:16.50 postmaster
17002 postgres  18   0 1052m  38m  35m R 12.6  0.2   0:02.98 postmaster
16960 postgres  15   0 1053m 453m 449m S  3.3  2.8   0:10.39 postmaster
16421 postgres  15   0 1053m 1.0g 1.0g S  2.3  6.2   0:23.59 postmaster
13588 postgres  15   0 1052m 1.0g 1.0g D  0.3  6.4   0:47.89 postmaster
24708 root      15   0  2268 1136  836 R  0.3  0.0   0:05.92 top
     1 root      15   0  1584  520  452 S  0.0  0.0   0:02.08 init

Karl




pgsql-performance by date:

Previous
From: James Neethling
Date:
Subject: Re: Volunteer to build a configuration tool
Next
From: "Guillaume Smet"
Date:
Subject: Re: Parsing VACUUM VERBOSE