Thread: multi-master pgbench?
Hi,

I am thinking about implementing a "multi-master" option for pgbench. Suppose we have multiple PostgreSQL servers running on host1 and host2. Something like "pgbench -c 10 -h host1,host2..." would create 5 connections each to host1 and host2 and send queries to both. The point of this functionality is to test cluster software that is capable of creating a multi-master configuration.

Comments?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
On Tue, Aug 21, 2012 at 6:04 PM, Tatsuo Ishii <ishii@postgresql.org> wrote:
> Hi,
> I am thinking about implementing a "multi-master" option for pgbench.
> Suppose we have multiple PostgreSQL servers running on host1 and host2.
> Something like "pgbench -c 10 -h host1,host2..." would create 5
> connections each to host1 and host2 and send queries to both.
> The point of this functionality is to test cluster software that
> is capable of creating a multi-master configuration.
A read-only mode could be of real interest for PostgreSQL itself, to put a simultaneous read load on several Postgres nodes. But I do not see any immediate use for write operations only. Have you thought about the possibility of defining a different set of transactions depending on the node targeted? For example, you could target a master with read-write transactions and slaves with read-only ones.
Btw, this could be useful not only for Postgres but also for other projects based on it, with which you could really run a multi-master write benchmark.
Do you have some thoughts about the possible option specifications? Configuration files would be too heavy for pgbench's purposes. So, would all the information be specified in a single command? That is of course possible, but the command would easily become unreadable, and that could cause many mistakes.
However, here are some ideas you might use:
1) pgbench -h host1:port1,host2:port2 ...
2) pgbench -h host1,host2 -p port1:port2
Regards,
--
Michael Paquier
http://michael.otacoo.com
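
Editor's sketch of option 1) above, combined with the even split of "-c 10" across hosts from the original proposal. This is not pgbench code: the function names parse_hosts and split_clients, and the fallback to port 5432 when no port is given, are assumptions made for illustration, and IPv6 literals are not handled.

```python
from typing import List, Tuple

DEFAULT_PORT = 5432  # PostgreSQL's default port; the fallback itself is an assumption


def parse_hosts(spec: str) -> List[Tuple[str, int]]:
    """Parse 'host1:port1,host2:port2' into a list of (host, port) pairs."""
    targets = []
    for item in spec.split(","):
        item = item.strip()
        if not item:
            raise ValueError(f"empty host entry in {spec!r}")
        host, sep, port = item.partition(":")
        targets.append((host, int(port) if sep else DEFAULT_PORT))
    return targets


def split_clients(total: int, n_hosts: int) -> List[int]:
    """Spread `total` client connections over `n_hosts` as evenly as possible."""
    base, extra = divmod(total, n_hosts)
    # The first `extra` hosts take one additional connection each.
    return [base + (1 if i < extra else 0) for i in range(n_hosts)]


if __name__ == "__main__":
    targets = parse_hosts("host1:5432,host2:5433")
    for (host, port), clients in zip(targets, split_clients(10, len(targets))):
        print(f"{host}:{port} -> {clients} connections")
```

With two hosts and 10 clients each host gets 5 connections, as in the proposal; an uneven count such as 10 clients over 3 hosts has to be resolved somehow, and giving the remainder to the first hosts is only one possible choice.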
Tatsuo Ishii <ishii@postgresql.org> writes:
> I am thinking about implementing a "multi-master" option for pgbench.
> Suppose we have multiple PostgreSQL servers running on host1 and host2.
> Something like "pgbench -c 10 -h host1,host2..." would create 5
> connections each to host1 and host2 and send queries to both.
> The point of this functionality is to test cluster software that
> is capable of creating a multi-master configuration.

Why wouldn't you just fire up several copies of pgbench, one per host?

The main reason I'm dubious about this is that it's demonstrable that pgbench itself is the bottleneck in many test scenarios. That problem gets worse the more backends you try to have it control. You can of course "solve" this with multiple threads in pgbench, but as soon as you do that there's no functional benefit over just running several copies.

regards, tom lane
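
Editor's sketch of the alternative suggested above, one pgbench copy per host. It only builds the command lines; the helper name pgbench_commands, the database name "bench" and the 60-second duration are assumptions. The flags -h, -c and -T are standard pgbench options.

```python
import shlex
from typing import List


def pgbench_commands(hosts: List[str], clients_per_host: int,
                     duration_s: int, dbname: str) -> List[List[str]]:
    """Build one pgbench argument vector per host."""
    return [
        ["pgbench", "-h", host, "-c", str(clients_per_host),
         "-T", str(duration_s), dbname]
        for host in hosts
    ]


if __name__ == "__main__":
    for argv in pgbench_commands(["host1", "host2"], 5, 60, "bench"):
        # Printed rather than executed; subprocess.Popen(argv) would start
        # each copy so that the copies run concurrently.
        print(shlex.join(argv))
```

Each copy reports its own TPS, so results from the separate runs would have to be combined by hand.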
>> I am thinking about implementing a "multi-master" option for pgbench.
>> Suppose we have multiple PostgreSQL servers running on host1 and host2.
>> Something like "pgbench -c 10 -h host1,host2..." would create 5
>> connections each to host1 and host2 and send queries to both.
>> The point of this functionality is to test cluster software that
>> is capable of creating a multi-master configuration.
>>
> A read-only mode could be of real interest for PostgreSQL itself, to put
> a simultaneous read load on several Postgres nodes. But I do not see any
> immediate use for write operations only. Have you thought about the
> possibility of defining a different set of transactions depending on the
> node targeted? For example, you could target a master with read-write
> transactions and slaves with read-only ones.

I think that kind of "intelligence" is beyond the scope of pgbench. I would prefer to leave such work to another tool.

> Btw, this could be useful not only for Postgres but also for other
> projects based on it, with which you could really run a multi-master
> write benchmark.

Right. If pgbench had such functionality, we could compare those projects using pgbench. Currently those projects use different benchmarking tools, which makes any comparison apples-to-oranges. With an enhanced pgbench we could make an apples-to-apples comparison.

> Do you have some thoughts about the possible option specifications?
> Configuration files would be too heavy for pgbench's purposes. So, would
> all the information be specified in a single command? That is of course
> possible, but the command would easily become unreadable, and that could
> cause many mistakes.

Agreed.

> However, here are some ideas you might use:
> 1) pgbench -h host1:port1,host2:port2 ...
> 2) pgbench -h host1,host2 -p port1:port2

Looks good.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
On Tue, Aug 21, 2012 at 06:04:42PM +0900, Tatsuo Ishii wrote:
> Hi,
>
> I am thinking about implementing a "multi-master" option for pgbench.
> Suppose we have multiple PostgreSQL servers running on host1 and host2.
> Something like "pgbench -c 10 -h host1,host2..." would create 5
> connections each to host1 and host2 and send queries to both.
> The point of this functionality is to test cluster software that
> is capable of creating a multi-master configuration.
>
> Comments?

To distinguish it from simply running separate pgbench tests for each host, would this somehow test propagation of the writes? Such a thing would be quite useful, but it seems at first glance like a large project.

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778 AIM: dfetter666 Yahoo!: dfetter
Skype: davidfetter XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate
>> I am thinking about implementing a "multi-master" option for pgbench.
>> Suppose we have multiple PostgreSQL servers running on host1 and host2.
>> Something like "pgbench -c 10 -h host1,host2..." would create 5
>> connections each to host1 and host2 and send queries to both.
>> The point of this functionality is to test cluster software that
>> is capable of creating a multi-master configuration.
>>
>> Comments?
>
> To distinguish it from simply running separate pgbench tests for each
> host, would this somehow test propagation of the writes? Such a thing
> would be quite useful, but it seems at first glance like a large
> project.

What does "propagation of the writes" mean?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
> Why wouldn't you just fire up several copies of pgbench, one per host?

Well, it is more convenient. Leaving aside the bottleneck discussion below, a simple tool to generate load is important IMO. It would help developers improve multi-master configurations by finding bugs and problems, if any. I have seen a similar relationship between pgbench and PostgreSQL itself.

> The main reason I'm dubious about this is that it's demonstrable that
> pgbench itself is the bottleneck in many test scenarios. That problem
> gets worse the more backends you try to have it control. You can of
> course "solve" this with multiple threads in pgbench, but as soon as you
> do that there's no functional benefit over just running several copies.

Are you sure that running several copies of pgbench could produce more TPS than a single pgbench? I thought the limit was just the resources of the machine pgbench runs on, so to avoid the pgbench bottleneck I would have to use several pgbench client machines simultaneously anyway.

Another point is what kind of transactions you want. "pgbench -S" transactions produce a trivial load and could easily reveal the pgbench bottleneck. However, this type of transaction is a pretty extreme one and very different from transactions in the real world. So even if your argument is true, I guess it only applies to the "pgbench -S" case.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
Tatsuo Ishii <ishii@postgresql.org> writes:
>> Why wouldn't you just fire up several copies of pgbench, one per host?

> Well, it is more convenient. Leaving aside the bottleneck discussion
> below, a simple tool to generate load is important IMO.

Well, my concern here is that it's *not* going to be simple. By the time we get done adding enough switches to control connection to N different hosts (possibly with different usernames, passwords, etc), then adding frammishes to control which scripts get sent to which hosts, and so on, I don't think it's really going to be simpler to use than launching N copies of pgbench.

It might be worth doing if we had features that allowed the different test scripts to interact, so that they could do things like check replication propagation from one host to another. But pgbench hasn't got that, and in multi-job mode really can't have that (at least not in the Unix separate-processes implementation). Anyway that's a whole nother level of complexity that would have to be added on before you got to a useful feature.

regards, tom lane
On Wed, Aug 22, 2012 at 06:26:00AM +0900, Tatsuo Ishii wrote:
> >> I am thinking about implementing a "multi-master" option for pgbench.
> >> Suppose we have multiple PostgreSQL servers running on host1 and host2.
> >> Something like "pgbench -c 10 -h host1,host2..." would create 5
> >> connections each to host1 and host2 and send queries to both.
> >> The point of this functionality is to test cluster software that
> >> is capable of creating a multi-master configuration.
> >>
> >> Comments?
> >
> > To distinguish it from simply running separate pgbench tests for each
> > host, would this somehow test propagation of the writes? Such a thing
> > would be quite useful, but it seems at first glance like a large
> > project.
>
> What does "propagation of the writes" mean?

I apologize for not being clear. In a multi-master system, people frequently wish to know how quickly a write operation has been duplicated to the other nodes. In some sense, those write operations are incomplete until they have happened on all nodes, even in the asynchronous case.

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
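
Editor's sketch of what measuring write propagation could look like. Nothing here exists in pgbench: propagation_delay is a hypothetical helper, and the two callables stand in for "perform the write on node A" and "check whether the row is visible on node B", which in practice would be database queries. The clock and sleep functions are parameters only so the logic can be exercised without a database.

```python
import time
from typing import Callable, Optional


def propagation_delay(write: Callable[[], None],
                      visible_on_other_node: Callable[[], bool],
                      timeout_s: float = 5.0,
                      poll_interval_s: float = 0.01,
                      clock: Callable[[], float] = time.monotonic,
                      sleep: Callable[[float], None] = time.sleep
                      ) -> Optional[float]:
    """Return the seconds between a write completing on one node and the
    first poll that sees it on another node, or None if the timeout expires."""
    write()
    start = clock()
    while True:
        if visible_on_other_node():
            return clock() - start
        if clock() - start >= timeout_s:
            return None
        sleep(poll_interval_s)
```

The result is only as precise as the polling interval, and it measures from the moment the write returns to the client, not from the commit on the server.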
> Well, my concern here is that it's *not* going to be simple. By the
> time we get done adding enough switches to control connection to N
> different hosts (possibly with different usernames, passwords, etc),
> then adding frammishes to control which scripts get sent to which hosts,
> and so on, I don't think it's really going to be simpler to use than
> launching N copies of pgbench.
>
> It might be worth doing if we had features that allowed the different
> test scripts to interact, so that they could do things like check
> replication propagation from one host to another. But pgbench hasn't
> got that, and in multi-job mode really can't have that (at least not
> in the Unix separate-processes implementation). Anyway that's a whole
> nother level of complexity that would have to be added on before you
> got to a useful feature.

I do not intend to implement such a feature. As I wrote in the subject line, I intend to enhance pgbench for "multi-master" configurations. IMO, any node in a multi-master configuration should accept *any* query, not only read queries but also write queries. So a bare PostgreSQL streaming replication configuration cannot be a multi-master configuration and will not be a target of the new pgbench.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
>> What does "propagation of the writes" mean?
>
> I apologize for not being clear. In a multi-master system, people
> frequently wish to know how quickly a write operation has been
> duplicated to the other nodes. In some sense, those write operations
> are incomplete until they have happened on all nodes, even in the
> asynchronous case.

IMO, that kind of functionality is beyond the scope of benchmark tools.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
Tatsuo Ishii <ishii@postgresql.org> writes:
>> Well, my concern here is that it's *not* going to be simple. By the
>> time we get done adding enough switches to control connection to N
>> different hosts (possibly with different usernames, passwords, etc),
>> then adding frammishes to control which scripts get sent to which hosts,
>> and so on, I don't think it's really going to be simpler to use than
>> launching N copies of pgbench.

> I do not intend to implement such a feature. As I wrote in the
> subject line, I intend to enhance pgbench for "multi-master"
> configurations. IMO, any node in a multi-master configuration should
> accept *any* query, not only read queries but also write queries. So a
> bare PostgreSQL streaming replication configuration cannot be a
> multi-master configuration and will not be a target of the new
> pgbench.

Well, you're being shortsighted then, because such a feature will barely have hit the git repository before somebody wants to use it differently. I can easily imagine wanting to stress a master plus some hot-standby slaves, for instance; and that would absolutely require being able to direct different subsets of the test scripts to different hosts.

regards, tom lane
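
Editor's sketch of the master-plus-hot-standby case described above, done with separate pgbench copies rather than a new pgbench feature. The role names, the ROLE_FLAGS table and the helper commands_by_role are hypothetical. -S is pgbench's existing select-only mode; with no script flag pgbench runs its default read-write transaction.

```python
from typing import Dict, List

# Hypothetical mapping from node role to extra pgbench flags.
ROLE_FLAGS: Dict[str, List[str]] = {
    "master": [],        # default read-write transaction
    "standby": ["-S"],   # select-only, safe for a read-only hot standby
}


def commands_by_role(hosts: Dict[str, str], clients: int,
                     duration_s: int, dbname: str) -> List[List[str]]:
    """Build one pgbench argument vector per host, chosen by the host's role."""
    commands = []
    for host, role in hosts.items():
        if role not in ROLE_FLAGS:
            raise ValueError(f"unknown role {role!r} for host {host!r}")
        commands.append(["pgbench", "-h", host, *ROLE_FLAGS[role],
                         "-c", str(clients), "-T", str(duration_s), dbname])
    return commands


if __name__ == "__main__":
    layout = {"host1": "master", "host2": "standby", "host3": "standby"}
    for argv in commands_by_role(layout, 5, 60, "bench"):
        print(" ".join(argv))
```

A custom script per role could be passed with -f instead of relying on the built-in transactions.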
>> I do not intend to implement such a feature. As I wrote in the
>> subject line, I intend to enhance pgbench for "multi-master"
>> configurations. IMO, any node in a multi-master configuration should
>> accept *any* query, not only read queries but also write queries. So a
>> bare PostgreSQL streaming replication configuration cannot be a
>> multi-master configuration and will not be a target of the new
>> pgbench.
>
> Well, you're being shortsighted then, because such a feature will barely
> have hit the git repository before somebody wants to use it differently.
> I can easily imagine wanting to stress a master plus some hot-standby
> slaves, for instance; and that would absolutely require being able to
> direct different subsets of the test scripts to different hosts.

I don't see any practical way to implement such a tool, because there is always a chance of trying to retrieve data that does not yet exist on the hot standby because of replication delay.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
> The point of this functionality is to test some cluster
> software which have a capability to create multi-master
> configuration.

As the maintainer of software that does multi-master, I'm a little confused as to why we would extend pgbench to do this. The software in question should be doing the testing itself, ideally via its test suite (i.e. "make test"). Having pgbench do any of this would be at best a very poor subset of the tests the software should be performing. I suppose if the software *uses* pgbench for its tests already, one could argue for a limited test case, but it seems difficult to design pgbench options generic and powerful enough to handle other cases outside of the one piece of software this change is aimed at.

--
Greg Sabino Mullane greg@turnstep.com
End Point Corporation http://www.endpoint.com/
> As the maintainer of software that does multi-master, I'm a little
> confused as to why we would extend pgbench to do this. The software
> in question should be doing the testing itself, ideally via
> its test suite (i.e. "make test"). Having pgbench do any of this
> would be at best a very poor subset of the tests the software
> should be performing. I suppose if the software *uses* pgbench for
> its tests already, one could argue for a limited test case, but it seems
> difficult to design pgbench options generic and powerful enough
> to handle other cases outside of the one piece of software this change
> is aimed at.

Well, my point was made upthread:

> Right. If pgbench had such functionality, we could compare those
> projects using pgbench. Currently those projects use different
> benchmarking tools, which makes any comparison apples-to-oranges.
> With an enhanced pgbench we could make an apples-to-apples comparison.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
On Wed, Aug 22, 2012 at 10:13:43AM +0900, Tatsuo Ishii wrote:
> >> What does "propagation of the writes" mean?
> >
> > I apologize for not being clear. In a multi-master system, people
> > frequently wish to know how quickly a write operation has been
> > duplicated to the other nodes. In some sense, those write
> > operations are incomplete until they have happened on all nodes,
> > even in the asynchronous case.
>
> IMO, that kind of functionality is beyond the scope of benchmark
> tools.

I was trying to come up with something that would distinguish pgbench for multi-master from pgbench run on independent nodes. Is there some other distinction to draw?

Cheers,
David.
--
David Fetter <david@fetter.org> http://fetter.org/
Tatsuo Ishii <ishii@postgresql.org> writes:
> I am thinking about implementing a "multi-master" option for pgbench.

Please consider using Tsung, which solves that problem and many others.

http://tsung.erlang-projects.org/

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support
> Tatsuo Ishii <ishii@postgresql.org> writes:
>> I am thinking about implementing a "multi-master" option for pgbench.
>
> Please consider using Tsung, which solves that problem and many others.
>
> http://tsung.erlang-projects.org/

Thank you for introducing Tsung. I have some questions about it. Does it support extended query? Does it support the V3 protocol?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
Tatsuo Ishii <ishii@postgresql.org> writes:
> Does it support extended query? Does it support V3 protocol?

Yes. It also has a proxy mode where it captures the queries sent by the client along with think times and outputs that in the session format it reads from its setup, which is very useful. Using that capability you can easily have Tsung replay your existing pgbench script.

Regards,
--
Dimitri Fontaine
http://2ndQuadrant.fr PostgreSQL : Expertise, Formation et Support