Thread: multi-master pgbench?

multi-master pgbench?

From
Tatsuo Ishii
Date:
Hi,

I am thinking about implementing a "multi-master" option for pgbench.
Suppose we have multiple PostgreSQL servers running on host1 and host2.
Something like "pgbench -c 10 -h host1,host2..." will create 5
connections each to host1 and host2 and send queries to both hosts.
The point of this functionality is to test cluster software that
can create a multi-master configuration.

Comments?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp
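
The even split in the proposal ("-c 10" over host1 and host2 giving 5 connections each) could be sketched roughly as follows. This is only an illustration in Python, not pgbench code: `split_connections` is a hypothetical helper, and handing any leftover connections to the first hosts in the list is an assumption that the proposal does not specify.

```python
def split_connections(total_clients, hosts):
    """Divide total_clients across hosts as evenly as possible.

    Returns a list of (host, connection_count) pairs in the given order.
    Assumption: when the division is uneven, the first hosts in the list
    each receive one extra connection.
    """
    if not hosts:
        raise ValueError("at least one host is required")
    if total_clients < 0:
        raise ValueError("total_clients must not be negative")
    base, extra = divmod(total_clients, len(hosts))
    return [(host, base + (1 if i < extra else 0)) for i, host in enumerate(hosts)]


if __name__ == "__main__":
    # Mirrors the example from the proposal: -c 10 -h host1,host2
    for host, count in split_connections(10, "host1,host2".split(",")):
        print(f"{host}: {count} connections")
```

With an odd client count or three hosts, the split is no longer obvious, which is one of the details such an option would have to document.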



Re: multi-master pgbench?

From
Michael Paquier
Date:


On Tue, Aug 21, 2012 at 6:04 PM, Tatsuo Ishii <ishii@postgresql.org> wrote:
> Hi,
>
> I am thinking about implementing a "multi-master" option for pgbench.
> Suppose we have multiple PostgreSQL servers running on host1 and host2.
> Something like "pgbench -c 10 -h host1,host2..." will create 5
> connections each to host1 and host2 and send queries to both hosts.
> The point of this functionality is to test cluster software that
> can create a multi-master configuration.

A read-only option could be of interest for PostgreSQL, to check simultaneous read load across a cluster of multiple Postgres servers. But I do not see any immediate use for write operations only. Have you thought about the possibility of defining a different set of transactions depending on the node targeted? For example, you could target a master with read-write transactions and slaves with read-only ones.

Btw, this could be useful not only for Postgres, but also for other projects based on it, with which you could really run multi-master write benchmarks.
Do you have some thoughts about the possible option specifications?
Configuration files would be too heavy for pgbench's purpose. So, specify all the info in a single command? It is of course possible, but the command would easily become unreadable, which could cause many mistakes.

However, here are some ideas you might use:
1) pgbench -h host1:port1,host2:port2 ...
2) pgbench -h host1,host2 -p port1:port2

Regards,
--
Michael Paquier
http://michael.otacoo.com
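
To make the two option syntaxes suggested above concrete, here is a minimal parsing sketch in Python. Neither syntax is claimed to exist in pgbench; `parse_combined` and `parse_separate` are hypothetical helpers, and the sketch assumes plain host names or IPv4 addresses (an IPv6 literal would need bracket handling that is not shown).

```python
def parse_combined(host_spec):
    """Idea 1: 'host1:port1,host2:port2' -> [(host, port), ...]."""
    targets = []
    for item in host_spec.split(","):
        host, sep, port = item.strip().rpartition(":")
        if not sep or not host or not port.isdigit():
            raise ValueError(f"expected host:port, got {item!r}")
        targets.append((host, int(port)))
    return targets


def parse_separate(host_spec, port_spec):
    """Idea 2: hosts 'host1,host2' plus ports 'port1:port2'."""
    hosts = [h.strip() for h in host_spec.split(",")]
    ports = [p.strip() for p in port_spec.split(":")]
    if len(hosts) != len(ports):
        # The second syntax lets the two lists drift apart, which is one
        # way the command line could become a source of mistakes.
        raise ValueError("number of hosts and ports differ")
    if not all(hosts) or not all(p.isdigit() for p in ports):
        raise ValueError("empty host or non-numeric port")
    return [(h, int(p)) for h, p in zip(hosts, ports)]


if __name__ == "__main__":
    print(parse_combined("host1:5432,host2:5433"))
    print(parse_separate("host1,host2", "5432:5433"))
```

Both calls in the usage example describe the same two targets; the first syntax keeps each host next to its port, while the second has to be checked for mismatched list lengths.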

Re: multi-master pgbench?

From
Tom Lane
Date:
Tatsuo Ishii <ishii@postgresql.org> writes:
> I am thinking about implementing a "multi-master" option for pgbench.
> Suppose we have multiple PostgreSQL servers running on host1 and host2.
> Something like "pgbench -c 10 -h host1,host2..." will create 5
> connections each to host1 and host2 and send queries to both hosts.
> The point of this functionality is to test cluster software that
> can create a multi-master configuration.

Why wouldn't you just fire up several copies of pgbench, one per host?

The main reason I'm dubious about this is that it's demonstrable that
pgbench itself is the bottleneck in many test scenarios.  That problem
gets worse the more backends you try to have it control.  You can of
course "solve" this with multiple threads in pgbench, but as soon as you
do that there's no functional benefit over just running several copies.
        regards, tom lane
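
The "several copies of pgbench, one per host" approach can be scripted in a few lines. The sketch below is Python and uses only long-standing pgbench switches (-h, -c, -T); the host names, the database name "bench", and the helper names `build_commands` and `run_all` are assumptions for illustration. It also assumes pgbench is on PATH and the databases are already initialized.

```python
import subprocess


def build_commands(hosts, clients_per_host, duration_s, dbname):
    """Return one pgbench argument list per host."""
    return [
        ["pgbench", "-h", host, "-c", str(clients_per_host),
         "-T", str(duration_s), dbname]
        for host in hosts
    ]


def run_all(commands):
    """Start every pgbench copy at once, then collect each report.

    Assumption: each report is short, so reading the pipes one after
    another does not stall a still-running copy.
    """
    procs = [subprocess.Popen(cmd, stdout=subprocess.PIPE, text=True)
             for cmd in commands]
    return [proc.communicate()[0] for proc in procs]


if __name__ == "__main__":
    for cmd in build_commands(["host1", "host2"], 5, 60, "bench"):
        print(" ".join(cmd))
    # To actually run the load:
    # reports = run_all(build_commands(["host1", "host2"], 5, 60, "bench"))
```

Each copy reports its own TPS, so any combined figure has to be added up by the caller.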



Re: multi-master pgbench?

From
Tatsuo Ishii
Date:
>> I am thinking about implementing a "multi-master" option for pgbench.
>> Suppose we have multiple PostgreSQL servers running on host1 and host2.
>> Something like "pgbench -c 10 -h host1,host2..." will create 5
>> connections each to host1 and host2 and send queries to both hosts.
>> The point of this functionality is to test cluster software that
>> can create a multi-master configuration.
>>
> A read-only option could be of interest for PostgreSQL, to check
> simultaneous read load across a cluster of multiple Postgres servers.
> But I do not see any immediate use for write operations only. Have you
> thought about the possibility of defining a different set of transactions
> depending on the node targeted? For example, you could target a master with
> read-write transactions and slaves with read-only ones.

I think that kind of "intelligence" is beyond the scope of pgbench. I
would prefer to leave such work to another tool.

> Btw, this could be useful not only for Postgres, but also for other
> projects based on it, with which you could really run multi-master
> write benchmarks.

Right. If pgbench had such functionality, we could compare
those projects by using pgbench. Currently those projects use
different benchmarking tools, which makes any comparison
apples-to-oranges. With an enhanced pgbench we could do an
apples-to-apples comparison.

> Do you have some thoughts about the possible option specifications?
> Configuration files would be too heavy for pgbench's purpose. So,
> specify all the info in a single command? It is of course possible, but
> the command would easily become unreadable, which could cause many
> mistakes.

Agreed.

> However, here are some ideas you might use:
> 1) pgbench -h host1:port1,host2:port2 ...
> 2) pgbench -h host1,host2 -p port1:port2

Looks good.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp



Re: multi-master pgbench?

From
David Fetter
Date:
On Tue, Aug 21, 2012 at 06:04:42PM +0900, Tatsuo Ishii wrote:
> Hi,
> 
> I am thinking about implementing a "multi-master" option for pgbench.
> Suppose we have multiple PostgreSQL servers running on host1 and host2.
> Something like "pgbench -c 10 -h host1,host2..." will create 5
> connections each to host1 and host2 and send queries to both hosts.
> The point of this functionality is to test cluster software that
> can create a multi-master configuration.
> 
> Comments?

To distinguish it from simply running separate pgbench tests for each
host, would this somehow test propagation of the writes?  Such a thing
would be quite useful, but it seems at first glance like a large
project.

Cheers,
David.
-- 
David Fetter <david@fetter.org> http://fetter.org/
Phone: +1 415 235 3778  AIM: dfetter666  Yahoo!: dfetter
Skype: davidfetter      XMPP: david.fetter@gmail.com
iCal: webcal://www.tripit.com/feed/ical/people/david74/tripit.ics

Remember to vote!
Consider donating to Postgres: http://www.postgresql.org/about/donate



Re: multi-master pgbench?

From
Tatsuo Ishii
Date:
>> I am thinking about implementing a "multi-master" option for pgbench.
>> Suppose we have multiple PostgreSQL servers running on host1 and host2.
>> Something like "pgbench -c 10 -h host1,host2..." will create 5
>> connections each to host1 and host2 and send queries to both hosts.
>> The point of this functionality is to test cluster software that
>> can create a multi-master configuration.
>> 
>> Comments?
> 
> To distinguish it from simply running separate pgbench tests for each
> host, would this somehow test propagation of the writes?  Such a thing
> would be quite useful, but it seems at first glance like a large
> project.

What does "propagation of the writes" mean?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp



Re: multi-master pgbench?

From
Tatsuo Ishii
Date:
> Why wouldn't you just fire up several copies of pgbench, one per host?

Well, it is more convenient. Aside from the bottleneck discussion below, a
simple tool to generate load is important IMO. It will help developers
improve multi-master configurations by finding bugs and problems, if
any. IMO pgbench has played a similar role for PostgreSQL itself.

> The main reason I'm dubious about this is that it's demonstrable that
> pgbench itself is the bottleneck in many test scenarios.  That problem
> gets worse the more backends you try to have it control.  You can of
> course "solve" this with multiple threads in pgbench, but as soon as you
> do that there's no functional benefit over just running several copies.

Are you sure that running several copies of pgbench could produce more
TPS than a single pgbench? I thought that was just a limitation of the
resources of the machine pgbench is running on. So I thought that, to
avoid the pgbench bottleneck, I would have to use several pgbench client
machines simultaneously anyway.

Another point is what kind of transactions you want. "pgbench -S"
type transactions produce trivial load and could easily reveal the
pgbench bottleneck. However, this type of transaction is a pretty
extreme one and very different from transactions in the real world. So
even if your argument is true, I guess it only applies to the
"pgbench -S" case.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp



Re: multi-master pgbench?

From
Tom Lane
Date:
Tatsuo Ishii <ishii@postgresql.org> writes:
>> Why wouldn't you just fire up several copies of pgbench, one per host?

> Well, it is more convenient. Aside from the bottleneck discussion below, a
> simple tool to generate load is important IMO.

Well, my concern here is that it's *not* going to be simple.  By the
time we get done adding enough switches to control connection to N
different hosts (possibly with different usernames, passwords, etc),
then adding frammishes to control which scripts get sent to which hosts,
and so on, I don't think it's really going to be simpler to use than
launching N copies of pgbench.

It might be worth doing if we had features that allowed the different
test scripts to interact, so that they could do things like check
replication propagation from one host to another.  But pgbench hasn't
got that, and in multi-job mode really can't have that (at least not
in the Unix separate-processes implementation).  Anyway that's a whole
nother level of complexity that would have to be added on before you
got to a useful feature.
        regards, tom lane



Re: multi-master pgbench?

From
David Fetter
Date:
On Wed, Aug 22, 2012 at 06:26:00AM +0900, Tatsuo Ishii wrote:
> >> I am thinking about implementing a "multi-master" option for pgbench.
> >> Suppose we have multiple PostgreSQL servers running on host1 and host2.
> >> Something like "pgbench -c 10 -h host1,host2..." will create 5
> >> connections each to host1 and host2 and send queries to both hosts.
> >> The point of this functionality is to test cluster software that
> >> can create a multi-master configuration.
> >> 
> >> Comments?
> > 
> > To distinguish it from simply running separate pgbench tests for each
> > host, would this somehow test propagation of the writes?  Such a thing
> > would be quite useful, but it seems at first glance like a large
> > project.
> 
> What does "propagation of the writes" mean?

I apologize for not being clear.  In a multi-master system, people
frequently wish to know how quickly a write operation has been
duplicated to the other nodes.  In some sense, those write operations
are incomplete until they have happened on all nodes, even in the
asynchronous case.

Cheers,
David.
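
A rough sketch of what measuring "propagation of the writes" might look like: write a marker on one node, then poll another node until the marker is visible, and report the elapsed time. This is Python and deliberately driver-agnostic; `measure_propagation`, `write`, and `is_visible` are hypothetical names, and in a real test the two callables would wrap queries against two different nodes. Nothing here is an existing pgbench feature.

```python
import time


def measure_propagation(write, is_visible, timeout_s=5.0, poll_interval_s=0.01):
    """Return seconds from just before write() until is_visible() is true.

    write:      callable that performs the write on the source node.
    is_visible: callable returning True once the write can be read on
                the target node.
    Returns None if the write is not visible within timeout_s.
    The measured value includes the time taken by write() itself and is
    only as precise as the polling interval.
    """
    start = time.monotonic()
    write()
    while True:
        if is_visible():
            return time.monotonic() - start
        if time.monotonic() - start >= timeout_s:
            return None
        time.sleep(poll_interval_s)


if __name__ == "__main__":
    # Stand-in for two nodes: the "replica" sees the row on the third poll.
    source, polls = [], {"count": 0}

    def fake_write():
        source.append("marker")

    def fake_is_visible():
        polls["count"] += 1
        return bool(source) and polls["count"] >= 3

    lag = measure_propagation(fake_write, fake_is_visible)
    print(f"visible after {lag:.3f}s and {polls['count']} polls")
```

Even this small version needs coordinated access to two nodes at once, which is the kind of interaction between test scripts the thread goes on to discuss.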



Re: multi-master pgbench?

From
Tatsuo Ishii
Date:
> Well, my concern here is that it's *not* going to be simple.  By the
> time we get done adding enough switches to control connection to N
> different hosts (possibly with different usernames, passwords, etc),
> then adding frammishes to control which scripts get sent to which hosts,
> and so on, I don't think it's really going to be simpler to use than
> launching N copies of pgbench.
>
> It might be worth doing if we had features that allowed the different
> test scripts to interact, so that they could do things like check
> replication propagation from one host to another.  But pgbench hasn't
> got that, and in multi-job mode really can't have that (at least not
> in the Unix separate-processes implementation).  Anyway that's a whole
> nother level of complexity that would have to be added on before you
> got to a useful feature.

I do not intend to implement such a feature. As I wrote in the
subject line, I intend to enhance pgbench for "multi-master"
configurations. IMO, any node in a multi-master configuration should
accept *any* query, not only read queries but also write queries. So a
bare PostgreSQL streaming replication setup cannot be a
multi-master configuration and will not be a target of the new
pgbench.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp



Re: multi-master pgbench?

From
Tatsuo Ishii
Date:
>> What does "propagation of the writes" mean?
> 
> I apologize for not being clear.  In a multi-master system, people
> frequently wish to know how quickly a write operation has been
> duplicated to the other nodes.  In some sense, those write operations
> are incomplete until they have happened on all nodes, even in the
> asynchronous case.

IMO, that kind of functionality is beyond the scope of benchmark tools.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp



Re: multi-master pgbench?

From
Tom Lane
Date:
Tatsuo Ishii <ishii@postgresql.org> writes:
>> Well, my concern here is that it's *not* going to be simple.  By the
>> time we get done adding enough switches to control connection to N
>> different hosts (possibly with different usernames, passwords, etc),
>> then adding frammishes to control which scripts get sent to which hosts,
>> and so on, I don't think it's really going to be simpler to use than
>> launching N copies of pgbench.

> I do not intend to implement such a feature. As I wrote in the
> subject line, I intend to enhance pgbench for "multi-master"
> configurations. IMO, any node in a multi-master configuration should
> accept *any* query, not only read queries but also write queries. So a
> bare PostgreSQL streaming replication setup cannot be a
> multi-master configuration and will not be a target of the new
> pgbench.

Well, you're being shortsighted then, because such a feature will barely
have hit the git repository before somebody wants to use it differently.
I can easily imagine wanting to stress a master plus some hot-standby
slaves, for instance; and that would absolutely require being able to
direct different subsets of the test scripts to different hosts.
        regards, tom lane
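
As an illustration of "directing different subsets of the test scripts to different hosts", the sketch below builds one pgbench command per node, with flags chosen by role. It is Python; the role names, host names, the database name "bench", and the helper `commands_for_cluster` are all hypothetical. The pgbench switches used are -S (select-only built-in script) and -n (skip the vacuum step); adding -n for standbys rests on the assumption that vacuuming would be rejected on a read-only hot standby.

```python
# Extra pgbench switches per node role (illustrative mapping only).
ROLE_FLAGS = {
    "master": [],             # default built-in read-write script
    "standby": ["-S", "-n"],  # select-only, and do not try to vacuum
}


def commands_for_cluster(nodes, clients, duration_s, dbname):
    """Build one pgbench argument list per (host, role) pair."""
    commands = []
    for host, role in nodes:
        if role not in ROLE_FLAGS:
            raise ValueError(f"unknown role {role!r} for host {host!r}")
        commands.append(
            ["pgbench", "-h", host, *ROLE_FLAGS[role],
             "-c", str(clients), "-T", str(duration_s), dbname]
        )
    return commands


if __name__ == "__main__":
    cluster = [("host1", "master"), ("host2", "standby"), ("host3", "standby")]
    for cmd in commands_for_cluster(cluster, 5, 60, "bench"):
        print(" ".join(cmd))
```

A custom script passed with -f could replace the built-in ones for either role; the mapping is where such per-host choices would accumulate.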



Re: multi-master pgbench?

From
Tatsuo Ishii
Date:
>> I do not intend to implement such a feature. As I wrote in the
>> subject line, I intend to enhance pgbench for "multi-master"
>> configurations. IMO, any node in a multi-master configuration should
>> accept *any* query, not only read queries but also write queries. So a
>> bare PostgreSQL streaming replication setup cannot be a
>> multi-master configuration and will not be a target of the new
>> pgbench.
> 
> Well, you're being shortsighted then, because such a feature will barely
> have hit the git repository before somebody wants to use it differently.
> I can easily imagine wanting to stress a master plus some hot-standby
> slaves, for instance; and that would absolutely require being able to
> direct different subsets of the test scripts to different hosts.

I don't see any practical way to implement such a tool, because there is
always a chance of trying to retrieve data that does not yet exist on
the hot standby because of replication delay.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp



Re: multi-master pgbench?

From
"Greg Sabino Mullane"
Date:
> The point of this functionality is to test cluster
> software that can create a multi-master
> configuration.

As the maintainer of software that does multi-master, I'm a little
confused as to why we would extend pgbench to do this. The software
in question should be doing the testing itself, ideally via
its test suite (i.e. "make test"). Having pgbench do any of this
would be at best a very poor subset of the tests the software
should be performing. I suppose if the software *uses* pgbench for
its tests already, one could argue a limited test case - but it seems
difficult to design some pgbench options generic and powerful enough
to handle other cases outside of the one software this change is aimed at.

--
Greg Sabino Mullane greg@turnstep.com
End Point Corporation http://www.endpoint.com/
PGP Key: 0x14964AC8 201208212330
http://biglumber.com/x/web?pk=2529DF6AB8F79407E94445B4BC9B906714964AC8





Re: multi-master pgbench?

From
Tatsuo Ishii
Date:
> As the maintainer of software that does multi-master, I'm a little
> confused as to why we would extend pgbench to do this. The software
> in question should be doing the testing itself, ideally via
> its test suite (i.e. "make test"). Having pgbench do any of this
> would be at best a very poor subset of the tests the software
> should be performing. I suppose if the software *uses* pgbench for
> its tests already, one could argue a limited test case - but it seems
> difficult to design some pgbench options generic and powerful enough
> to handle other cases outside of the one software this change is aimed at.

Well, my point was made upthread:
> Right. If pgbench had such functionality, we could compare
> those projects by using pgbench. Currently those projects use
> different benchmarking tools, which makes any comparison
> apples-to-oranges. With an enhanced pgbench we could do an
> apples-to-apples comparison.
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp



Re: multi-master pgbench?

From
David Fetter
Date:
On Wed, Aug 22, 2012 at 10:13:43AM +0900, Tatsuo Ishii wrote:
> >> What does "propagation of the writes" mean?
> > 
> > I apologize for not being clear.  In a multi-master system, people
> > frequently wish to know how quickly a write operation has been
> > duplicated to the other nodes.  In some sense, those write
> > operations are incomplete until they have happened on all nodes,
> > even in the asynchronous case.
> 
> IMO, that kind of functionality is beyond the scope of benchmark
> tools.

I was trying to come up with something that would distinguish pgbench
for multi-master from pgbench run on independent nodes.  Is there some
other distinction to draw?

Cheers,
David.



Re: multi-master pgbench?

From
Dimitri Fontaine
Date:
Tatsuo Ishii <ishii@postgresql.org> writes:
> I am thinking about implementing a "multi-master" option for pgbench.

Please consider using Tsung, which solves that problem and many others.
 http://tsung.erlang-projects.org/

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support



Re: multi-master pgbench?

From
Tatsuo Ishii
Date:
> Tatsuo Ishii <ishii@postgresql.org> writes:
>> I am thinking about implementing a "multi-master" option for pgbench.
> 
> Please consider using Tsung, which solves that problem and many others.
> 
>   http://tsung.erlang-projects.org/

Thank you for introducing Tsung. I have some questions regarding it.
Does it support the extended query protocol? Does it support the V3 protocol?
--
Tatsuo Ishii
SRA OSS, Inc. Japan
English: http://www.sraoss.co.jp/index_en.php
Japanese: http://www.sraoss.co.jp



Re: multi-master pgbench?

From
Dimitri Fontaine
Date:
Tatsuo Ishii <ishii@postgresql.org> writes:
> Does it support the extended query protocol? Does it support the V3 protocol?

Yes.

It also has a proxy mode where it captures the queries sent by the
client along with think times and outputs that in the session format it
reads from its setup, which is very useful. Using that capability you
can easily have Tsung replay your existing pgbench script.

Regards,
-- 
Dimitri Fontaine
http://2ndQuadrant.fr     PostgreSQL : Expertise, Formation et Support