Subject: Re: Update on the spinlock->pthread_mutex patch experimental: replace s_lock spinlock code with pthread_mutex on linux
From: Nils Goroll
Msg-id: 4FF0C0FE.3010608@schokola.de
In response to: Re: Update on the spinlock->pthread_mutex patch experimental: replace s_lock spinlock code with pthread_mutex on linux (Jeff Janes <jeff.janes@gmail.com>)
List: pgsql-hackers
Hi Jeff,

>>> It looks like the hacked code is slower than the original. That
>>> doesn't seem so good to me. Am I misreading this?
>>
>> No, you are right - in a way. This is not about maximizing tps, this
>> is about maximizing efficiency under load situations
>
> But why wouldn't this maximized efficiency present itself as
> increased TPS?

Because the latency of lock acquisition influences TPS, but this is only
marginally related to the cost in terms of CPU cycles to acquire the
locks. See my posting as of Sun, 01 Jul 2012 21:02:05 +0200 for an
overview of my understanding.
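To make that point concrete, here is a minimal sketch (illustrative names
only, not the actual patch - PostgreSQL's s_lock/TAS is per-platform
assembler with backoff, all of which this omits): a spinning waiter burns
CPU cycles for as long as the lock stays taken, while a contended
pthread_mutex on Linux/glibc sleeps in the kernel on a futex, trading some
single-acquisition latency for (almost) zero cycles consumed while blocked:

#include <pthread.h>

/* Spinning: every waiter keeps its CPU busy until the holder releases.
 * Sketch using a GCC atomic builtin; the real s_lock also backs off. */
static volatile int lock = 0;

static void spin_acquire(void)
{
    while (__sync_lock_test_and_set(&lock, 1))
        ;                       /* cycles spent here show up in rusage/load */
}

static void spin_release(void)
{
    __sync_lock_release(&lock);
}

/* Mutex: contended waiters are put to sleep by the kernel (futex), so
 * they consume essentially no cycles while blocked.  glibc also offers
 * PTHREAD_MUTEX_ADAPTIVE_NP, which spins briefly before sleeping. */
static pthread_mutex_t mtx = PTHREAD_MUTEX_INITIALIZER;

static void mutex_acquire(void)
{
    pthread_mutex_lock(&mtx);   /* may sleep instead of spinning */
}

static void mutex_release(void)
{
    pthread_mutex_unlock(&mtx);
}

Under heavy contention the two variants can produce similar TPS, yet the
spinning one charges every waiting backend for the wait - which is exactly
the rusage difference these tests are after.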
>>> Also, 20 transactions per connection is not enough of a run to make
>>> any evaluation on.
>>
>> As you can see, I've repeated the tests 10 times. I've tested slight
>> variations as mentioned above, so I was looking for quick results
>> with acceptable variation.
>
> Testing it 10 times doesn't necessarily improve things.

My intention was to average over the imperfections of rusage accounting,
because I was mainly interested in lowering rusage, not in maximizing
tps. Yes, in order to get reliable results I'd have to run longer tests,
but interestingly the results from my quick tests already approximated
those from the huge tests Robert has run with respect to the differences
between unpatched and patched.

> You should use at least -T30, rather than -t20.

Thanks for the advice - it is really appreciated, and I will take it
when I run more tests. But I don't understand yet how to best provoke
high spinlock concurrency with pgbench. Or are there any other test
tools out there for this case?
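For the record, the kind of invocation I would try (a sketch only - the
client/thread counts and duration are placeholders I have not validated
on this box) is a read-only run with about as many clients as cores, so
that fsync drops out of the picture and shared-memory locking is what
gets hammered:

-bash-4.1$ ./pgbench -S -M prepared -c 64 -j 64 -T 30 -p 55502 postgres

-S makes each transaction a single SELECT (no fsyncs at all), -M prepared
cuts parse/plan overhead, and -c/-j near the core count should maximize
concurrent lock acquisition.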
> Anyway, your current benchmark speed of around 600 TPS over such a
> short time period suggests you are limited by fsyncs.

Definitely. I described the setup in my initial posting ("why
roll-your-own s_lock? / improving scalability" - Tue, 26 Jun 2012
19:02:31 +0200).

> pgbench does as long as that is the case. You could turn --fsync=off,
> or just change your benchmark to a read-only one like -S, or better
> the -P option I've been trying to get into pgbench.

I don't like to make assumptions which I haven't validated. The system
showing the behavior is designed to write to persistent SSD storage in
order to reduce the risk of data loss by a (BBU) cache failure. Running
a test with fsync=off would diverge even further from reality.

> Does your production server have fast fsyncs (BBU) while your test
> server does not?

No, we're writing directly to SSDs (ref: initial posting).

> The users probably don't care about the load average. Presumably they
> are unhappy because of lowered throughput (TPS) or higher peak latency
> (-l switch in pgbench). So I think the only use of load average is to
> verify that your benchmark is nothing like your production workload.
> (But it doesn't go the other way around, just because the load
> averages are similar doesn't mean the actual workloads are.)

Fully agree.

>> Rank  Total duration  Times executed  Av. duration s  Query
>> 1     3m39s           83,667          0.00            COMMIT;
>
> So fsyncs probably are not totally free on production, but I still
> think they must be much cheaper than on your test box.

Oh, the two are the same. I ran the tests on the prod machine during
quiet periods.

>> 2     54.4s           2               27.18           SELECT ...
>
> That is interesting. Maybe those two queries are hammering everything
> else to death.

With 64 cores? I should have mentioned that these were simply the result
of a missing index when the data was collected.

> But how does the 9th rank through the final rank, cumulatively, stack
> up? In other words, how many query-seconds worth of time transpired
> during the 137 wall seconds? That would give an estimate of how many
> simultaneously active connections the production server has.

Sorry, I should have given you the stats from pgFouine:

Number of unique normalized queries: 507
Number of queries: 295,949
Total query duration: 8m38s
First query: 2012-06-23 14:51:01
Last query: 2012-06-23 14:53:17
Query peak: 6,532 queries/s at 2012-06-23 14:51:33

>> Sorry for having omitted that detail. I had initialized pgbench with
>> -i -s 100
>
> Are you sure? In an earlier email you reported the entire output of
> pgbench, and it said it was using 10. Maybe you've changed it since
> then...

Good catch - I was wrong in the email you quoted. Sorry.

-bash-4.1$ rsync -av --delete /tmp/test_template_data/ /tmp/data/
...
-bash-4.1$ ./postgres -D /tmp/data -p 55502 &
[1] 38303
-bash-4.1$ LOG:  database system was shut down at 2012-06-26 23:18:42 CEST
LOG:  database system is ready to accept connections
LOG:  autovacuum launcher started
-bash-4.1$ ./psql -p 55502
psql (9.1.3)
Type "help" for help.

postgres=# select count(*) from pgbench_branches;
 count
-------
    10
(1 row)

Thank you very much, Jeff!

The one question remains: Do we really have all we need to provoke very
high lock contention?

Nils