Thread: FWD: fastlock+lazyvzid patch performance

FWD: fastlock+lazyvzid patch performance

From
karavelov@mail.bg
Date:
Hello,

I have seen the discussions about the fastlock patch and the lazy-vxid performance degradation, so I decided to test it myself.

The setup:

- hardware
  Supermicro blade
  6x SAS @ 15k on LSI RAID:
    1 disk for system + pg_xlog
    4 disks in RAID 10 for data
    1 disk as spare
  2 x Xeon E5405 @ 2GHz (no HT), 8 cores total
  8G RAM

- software
  Debian Sid, linux-2.6.39.1
  PostgreSQL 9.1 beta2, compiled from the Debian sources,
  with the fastlock v3 and lazy-vxid v1 patches applied incrementally. I had to
  resolve a conflict in src/backend/storage/lmgr/proc.c manually.
  Configuration: increased shared_mem to 2G, max_connections to 500

- pgbench
  initialized the dataset with scaling factor 100
  example command invocation: ./pgbench -h 127.0.0.1 -n -S -T 30 -c 8 -j 8 -M prepared pgtest

Results (tps):

clients   beta2   +fastlock   +lazyvzid   local socket
8         76064   92430       92198       106734
16        64254   90788       90698       105097
32        56629   88189       88269       101202
64        51124   84354       84639       96362
128       45455   79361       79724       90625
256       40370   71904       72737       82434

All runs were executed on a warm cache; I made some 300s runs with the same results (tps). I have also done some runs with -M simple, with an identical distribution across clients.

I post these results because they somehow contradict previous results posted on the list. In my case the patches not only improve peak performance but also improve performance under load: without the patches, throughput with 256 clients is 53% of the peak obtained with 8 clients, while with the patches, throughput with 256 clients is 79% of the 8-client peak.

Best regards
Luben Karavelov

P.S. Excuse me for starting a new thread - I am new on the list.
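For anyone who wants to reproduce the runs, a minimal sketch of the initialization and the client sweep is below. It assumes a database named pgtest and the pgbench shipped with 9.1; the message above does not say whether -j was scaled along with -c in the larger runs, so the loop simply mirrors the 8-client example.

  # Create and populate the test database at scaling factor 100
  # (10 million pgbench_accounts rows, small enough to stay cached in 8G RAM)
  createdb pgtest
  ./pgbench -i -s 100 pgtest

  # Read-only (-S) prepared-protocol runs over local TCP,
  # sweeping the client counts from the table above
  for c in 8 16 32 64 128 256; do
      ./pgbench -h 127.0.0.1 -n -S -T 30 -c $c -j $c -M prepared pgtest
  done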

Re: FWD: fastlock+lazyvzid patch performance

From
Robert Haas
Date:
On Fri, Jun 24, 2011 at 3:31 PM,  <karavelov@mail.bg> wrote:
> I post these results because they somehow contradict previous results
> posted on the list. In my case the patches not only improve peak
> performance but also improve performance under load: without the
> patches, throughput with 256 clients is 53% of the peak obtained with
> 8 clients, while with the patches, throughput with 256 clients is 79%
> of the 8-client peak.

I think this is strongly related to core count.  The spinlock
contention problems don't become really bad until you get up above 32
CPUs... at least from what I can tell so far.

So I'm not surprised it was just a straight win on your machine... but
thanks for verifying.  It's helpful to have more data points.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: FWD: fastlock+lazyvzid patch performance

From
Robert Haas
Date:
On Fri, Jun 24, 2011 at 3:31 PM,  <karavelov@mail.bg> wrote:
> clients beta2 +fastlock +lazyvzid local socket
> 8 76064 92430 92198 106734
> 16 64254 90788 90698 105097
> 32 56629 88189 88269 101202
> 64 51124 84354 84639 96362
> 128 45455 79361 79724 90625
> 256 40370 71904 72737 82434

I'm having trouble interpreting this table.

Column 1: # of clients
Column 2: TPS using 9.1beta2 unpatched
Column 3: TPS using 9.1beta2 + fastlock patch
Column 4: TPS using 9.1beta2 + fastlock patch + vxid patch
Column 5: ???

At any rate, that is a big improvement on a system with only 8 cores.
I would have thought you would have needed ~16 cores to get that much
speedup.  I wonder if the -M prepared makes a difference ... I wasn't
using that option.

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


Re: FWD: fastlock+lazyvzid patch performance

From
karavelov@mail.bg
Date:
----- Quote from Robert Haas (robertmhaas@gmail.com), on 25.06.2011 at 00:16 -----

> On Fri, Jun 24, 2011 at 3:31 PM, wrote:
>> clients beta2 +fastlock +lazyvzid local socket
>> 8 76064 92430 92198 106734
>> 16 64254 90788 90698 105097
>> 32 56629 88189 88269 101202
>> 64 51124 84354 84639 96362
>> 128 45455 79361 79724 90625
>> 256 40370 71904 72737 82434
>
> I'm having trouble interpreting this table.
>
> Column 1: # of clients
> Column 2: TPS using 9.1beta2 unpatched
> Column 3: TPS using 9.1beta2 + fastlock patch
> Column 4: TPS using 9.1beta2 + fastlock patch + vxid patch
> Column 5: ???

9.1beta2 + fastlock patch + vxid patch, with pgbench run over a unix domain
socket; the other tests use a local TCP connection.

> At any rate, that is a big improvement on a system with only 8 cores.
> I would have thought you would have needed ~16 cores to get that much
> speedup. I wonder if the -M prepared makes a difference ... I wasn't
> using that option.

Yes, it does make some difference. Using unpatched beta2, with 8 clients and
the simple protocol I get 57059 tps; with all the patches and the simple
protocol I get 60707 tps, so the difference between patched and stock is not
as big. I suppose the system becomes CPU bound on parsing and planning every
submitted request. With -M extended I get even slower results.

Luben

--
"Perhaps, there is no greater love than that of a
 revolutionary couple where each of the two lovers is
 ready to abandon the other at any moment if revolution
 demands it."
   Zizek
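For reference, a rough sketch of the three kinds of invocations being compared in this exchange. The database name pgtest and the run options are the ones from the first message; running without -h so that pgbench connects over the local unix domain socket is standard libpq behaviour, assuming the default socket directory.

  # simple protocol over local TCP: each query is parsed and planned on every execution
  ./pgbench -h 127.0.0.1 -n -S -T 30 -c 8 -j 8 -M simple pgtest

  # prepared protocol over local TCP: statements are parsed and planned once, then re-executed
  ./pgbench -h 127.0.0.1 -n -S -T 30 -c 8 -j 8 -M prepared pgtest

  # prepared protocol over a unix domain socket (no -h), as in the "local socket" column
  ./pgbench -n -S -T 30 -c 8 -j 8 -M prepared pgtest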