On Thu, Jul 23, 2015 at 9:42 PM, Amit Kapila <amit.kapila16@gmail.com> wrote:
> On Wed, Jul 22, 2015 at 9:14 PM, Robert Haas <robertmhaas@gmail.com> wrote:
>>
>> One thing I noticed that is a bit dismaying is that we don't get a lot
>> of benefit from having more workers. Look at the 0.1 data. At 2
>> workers, if we scaled perfectly, we would be 3x faster (since the
>> master can do work too), but we are actually 2.4x faster. Each
>> process is on the average 80% efficient. That's respectable. At 4
>> workers, we would be 5x faster with perfect scaling; here we are 3.5x
>> faster. So the third and fourth worker were about 50% efficient.
>> Hmm, not as good. But then going up to 8 workers bought us basically
>> nothing.
>>
>
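For reference, the efficiency figures quoted above can be reproduced with a few lines of arithmetic. This is only a sketch: the function names are made up for illustration, and the only inputs are the speedups quoted for the 0.1 data (2.4x at 2 workers, 3.5x at 4 workers), with the master counted as one extra process.

```python
def average_efficiency(workers, speedup):
    """Speedup divided by the number of participating processes.

    The master also does work, so there are workers + 1 processes.
    """
    return speedup / (workers + 1)


def marginal_efficiency(prev, curr):
    """Extra speedup gained per worker added between two measurements.

    prev and curr are (workers, speedup) pairs.
    """
    (w0, s0), (w1, s1) = prev, curr
    return (s1 - s0) / (w1 - w0)


two_workers = (2, 2.4)    # quoted: 2.4x faster with 2 workers
four_workers = (4, 3.5)   # quoted: 3.5x faster with 4 workers

avg_two = average_efficiency(*two_workers)                     # about 0.8
avg_four = average_efficiency(*four_workers)                   # about 0.7
extra_two = marginal_efficiency(two_workers, four_workers)     # about 0.55

print(round(avg_two, 2), round(avg_four, 2), round(extra_two, 2))
```

The last value is the "about 50% efficient" figure for the third and fourth worker: they add 1.1x of speedup between them.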
> I think the improvement also depends on how costly is the qualification,
> if it is costly, even for same selectivity the gains will be shown till
> higher number of clients and for simple qualifications, we will see that
> cost of having more workers will start dominating (processing data over
> multiple tuple queues) over the benefit we can achieve by them.

Yes, that's correct. When the qualification is more costly, performance
keeps improving as the number of workers increases.

Instead of using all the configured workers for every query, how about
deciding the number of workers based on the cost of the qualification? I am
not sure whether we have any information available to estimate the
qualification cost. This way the workers would be distributed more evenly
across all backends.
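
A rough sketch of the idea, just to make it concrete. Everything here is hypothetical: the function name, the `cost_per_worker` threshold, and the assumption that some estimate of the total qualification cost is available at all. It hands out one worker per fixed amount of estimated cost, capped at the configured maximum.

```python
def choose_worker_count(total_qual_cost, cost_per_worker, max_workers):
    """Pick a worker count proportional to the estimated qualification cost.

    total_qual_cost: hypothetical estimate of the qualification cost
                     for the whole scan (arbitrary cost units).
    cost_per_worker: hypothetical amount of cost that justifies one worker.
    max_workers:     configured upper limit of workers per query.
    """
    if cost_per_worker <= 0:
        raise ValueError("cost_per_worker must be positive")
    wanted = int(total_qual_cost // cost_per_worker)
    # Never go below zero workers or above the configured limit.
    return max(0, min(max_workers, wanted))


cheap = choose_worker_count(500.0, 1000.0, 8)        # too cheap: no workers
medium = choose_worker_count(4500.0, 1000.0, 8)      # 4 workers
expensive = choose_worker_count(1e6, 1000.0, 8)      # capped at 8

print(cheap, medium, expensive)
```

With something like this, a query with a cheap qualification would run with few or no workers, leaving the remaining workers for other backends.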
Regards,
Hari Babu
Fujitsu Australia