Re: A couple of proposed pgbench changes - Mailing list pgsql-patches
From | Tatsuo Ishii |
---|---|
Subject | Re: A couple of proposed pgbench changes |
Date | |
Msg-id | 20051201.161305.26273167.t-ishii@sraoss.co.jp Whole thread Raw |
In response to | A couple of proposed pgbench changes (Tom Lane <tgl@sss.pgh.pa.us>) |
List | pgsql-patches |
> Hi Tatsuo, > > Attached is a proposed patch that deals with a couple of pgbench issues. > > The change at line 490 updates doCustom's local variable "commands" > after selecting a new file (command sequence). I think that the > existing coding will cause the thing to use the first command of the > old sequence in the remainder of the routine, which would be a bug. > I have not tried to set up a test case to prove it, though. Thanks for pointing it out. It's definitely a bug. To reproduce the problem, you have two SQL files, say, a.sql: SELECT 1; SELECT 2; b.sql: SELECT 300; SELECT 400; then run, pgbench -f a.sql -f b.sql and see the SQL trace by enabling log_statement=all, I see: LOG: statement: SELECT 300; LOG: statement: SELECT 400; LOG: statement: SELECT 300; LOG: statement: SELECT 2; LOG: statement: SELECT 1; LOG: statement: SELECT 400; LOG: statement: SELECT 300; LOG: statement: SELECT 400; LOG: statement: SELECT 300; LOG: statement: SELECT 2; LOG: statement: SELECT 1; : : #1 and #2 are ok, but #3(SELECT 300) and #4(SELECT 2) combo is supposed to be impossible if pgbench works correctly. > The other two changes cause doCustom to loop after processing a > meta-command. This might be a bit controversial, but as the code > is currently written, each meta-command "costs" one cycle of the > outer select() loop. Thus, for example, with the default TPC-B script, > once a backend returns "COMMIT" it will not receive a new command > until four cycles of issuing commands to other backends have elapsed. > (You can see this very easily by strace'ing pgbench under load.) > ISTM it is bad to have backends sitting idle waiting for pgbench to > give them a new command. On the other hand you could argue that > finishing out several consecutive metacommands will delay issuance of > new commands to other backends. In the test case I was running, > making this change made for a small but noticeable improvement in > total CPU usage, but I'm not entirely convinced it'd always be a win. Agreed. > I get the impression that pgbench itself is a bit of a bottleneck when > running on multi-CPU machines. I can't get the total CPU usage to reach > 90% with ten client threads on a dual HT-enabled Xeon, and the only > explanation I can see is that there are too many backends waiting for > pgbench to give them new commands instead of doing useful work. strace > shows that each select() iteration usually finds *all ten* sockets > read-ready, which is definitely bad. (I tried nice'ing pgbench to > negative priority, but it did not improve matters. It could easily be > that this is a Heisenproblem, though, with strace itself slowing pgbench > enough to cause the behavior.) I wonder if it would help for pgbench to > fork off multiple sub-processes and have each sub-process tend just one > backend. I'm not sure multiple sub-processes version of pgbench shows superior performance than current implementation because of process context switching overhead. Maybe threading is better? Mr. Yasuo Ohgaki implemented pthead version of pgbench. http://wiki.ohgaki.net/index.php?plugin=attach&pcmd=open&file=ppgbench-20050906-3.tar.bz2&refer=PostgreSQL%2Fppgbench You might want to try. > Anyway, since I'm not sure of either of these changes, I'm passing them > to you for review. Your patches seem fine. -- Tatsuo Ishii SRA OSS, Inc. Japan
pgsql-patches by date: