Re: Parallel Seq Scan - Mailing list pgsql-hackers

From Robert Haas
Subject Re: Parallel Seq Scan
Date
Msg-id CA+TgmobdyE1UmS-oTXG1axc06NKELnYiVKPahvAcOhfMaRyZQA@mail.gmail.com
In response to Re: Parallel Seq Scan  (Haribabu Kommi <kommi.haribabu@gmail.com>)
List pgsql-hackers
On Tue, Sep 22, 2015 at 3:14 AM, Haribabu Kommi
<kommi.haribabu@gmail.com> wrote:
> The copy_user_generic_string calls come from file read operations.
> In my test, I set shared_buffers to 12GB with a table size of 18GB.

OK, cool.  So that's actually good: all that work would have to be
done either way, and parallelism lets several CPUs work on it at once.

> The _spin_lock calls come from the signals generated by the workers.
> Increasing the tuple queue size changes the kernel system call usage.

And this part is not so good: that's additional work created by
parallelism that wouldn't have to be done if we weren't in parallel
mode.  Of course, it's impossible to eliminate that, but we should try
to reduce it.

> - From the above performance readings, increasing the tuple queue size
>   helps more with a small number of workers than with a large number.

That makes sense to me, because there's a separate queue for each
worker.  If we have more workers, then the total amount of queue space
available rises in proportion to the number of workers, so enlarging
each individual queue matters less.
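
To make the arithmetic concrete, here is a small sketch.  The queue
sizes and worker counts are made-up numbers chosen for illustration,
not the values used in the tests above.

```python
KB = 1024

def total_queue_space(workers: int, queue_bytes: int) -> int:
    """Total tuple-queue space when every worker has its own queue."""
    return workers * queue_bytes

# Illustrative sizes only: a small and a large per-worker queue.
for workers in (2, 16):
    small = total_queue_space(workers, 64 * KB)
    large = total_queue_space(workers, 1024 * KB)
    print(f"{workers:2d} workers: {small // KB} kB total with small queues, "
          f"{large // KB} kB total with large queues")
```

With small queues, 16 workers already have eight times the total
buffer space that 2 workers have, so the 2-worker case has more to
gain from a larger queue.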

> Workers are started irrespective of the system load. Suppose the user
> configures 16 workers but, because of a sudden increase in system load,
> only 2 or 3 CPUs are idle. If a query eligible for parallel seq scan is
> executed in that case, the backend may still start 16 workers. Couldn't
> that increase overall system usage and degrade the performance of the
> other backend sessions?

Yep, that could happen.  It's something we should work on, but the
first version isn't going to try to be that smart.  It's similar to
the problem we already have with work_mem, and I want to work on it,
but we need to get this working first.
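
One purely hypothetical shape such a heuristic could take is sketched
below.  Nothing like this exists in the patch; the function name and
the use of a load average as the signal are assumptions made only for
illustration.

```python
import math

def workers_to_launch(requested: int, cpu_count: int, load_avg: float) -> int:
    """Cap the requested worker count by the number of CPUs that look idle.

    Hypothetical heuristic, not PostgreSQL behavior: it treats the load
    average as a rough count of busy CPUs.
    """
    idle_cpus = max(0, cpu_count - math.ceil(load_avg))
    return min(requested, idle_cpus)

# 16 workers requested on a 16-CPU box that is already heavily loaded.
print(workers_to_launch(16, 16, 13.5))
# Same request on a mostly idle 32-CPU box.
print(workers_to_launch(16, 32, 1.0))
```

A real solution would have to cope with load that changes after the
workers have started, which this sketch ignores.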

> If a query has two parallel seq scan plan nodes, how will the workers be
> distributed across the two nodes? Currently parallel_seqscan_degree is
> applied per plan node; even if we change that to per query, I think we
> need worker distribution logic, rather than letting a single plan node
> use all the workers.

Yes, we need that, too.  Again, at some point.
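
As a rough illustration of what "worker distribution logic" could
mean, the sketch below splits a per-query worker budget across plan
nodes in proportion to an estimated cost.  Both the policy and the
function name are hypothetical; nothing here reflects what the patch
does today.

```python
def distribute_workers(total_workers: int, node_costs: list[float]) -> list[int]:
    """Split a worker budget across plan nodes in proportion to cost.

    Hypothetical policy for illustration only.  Fractional shares are
    rounded down, and leftover workers go to the nodes with the largest
    fractional remainder.
    """
    total_cost = sum(node_costs)
    if total_workers <= 0 or total_cost <= 0:
        return [0] * len(node_costs)

    shares = [total_workers * cost / total_cost for cost in node_costs]
    alloc = [int(share) for share in shares]
    leftover = total_workers - sum(alloc)

    # Hand out the remaining workers, largest fractional part first.
    by_remainder = sorted(range(len(shares)),
                          key=lambda i: shares[i] - alloc[i],
                          reverse=True)
    for i in by_remainder[:leftover]:
        alloc[i] += 1
    return alloc

# Two parallel seq scan nodes, one estimated at three times the cost.
print(distribute_workers(8, [300.0, 100.0]))
```

Cost is only one possible weight; relation size or expected row count
would plug into the same shape.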

> A select with a limit clause has a performance drawback with parallel seq
> scan in some scenarios, because of the very low selectivity compared to
> seq scan; it would be better if we documented it. Users can then take the
> necessary action for queries with a limit clause.

This is something I want to think further about in the near future.
We don't have a great plan for shutting down workers when no further
tuples are needed because, for example, an upper node has filled a
limit.  That makes using parallel query in contexts like Limit and
InitPlan significantly more costly than you might expect.  Perhaps we
should avoid parallel plans altogether in those contexts, or maybe
there is some other approach that can work.  I haven't figured it out
yet.
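
A toy model shows why the cost can be surprising.  It assumes every
row qualifies and that each worker scans a fixed batch of rows per
round that cannot be cancelled once started; the batch size stands in
for however much a worker does before it notices nobody wants more
tuples.  These are simplifying assumptions, not a description of the
executor.

```python
def rows_scanned(table_rows: int, limit: int, workers: int, batch: int) -> int:
    """Rows scanned in total before the leader has `limit` rows.

    Toy model: all rows match, and in each round every worker scans up
    to `batch` rows that cannot be cancelled once started.
    """
    scanned = 0
    returned = 0
    while returned < limit and scanned < table_rows:
        for _ in range(workers):
            scanned += min(batch, table_rows - scanned)
        returned = min(limit, scanned)
    return scanned

# A serial scan stops as soon as the limit is filled.
print(rows_scanned(1_000_000, 10, workers=1, batch=1))
# Four workers each finish a 1000-row batch even though 10 rows suffice.
print(rows_scanned(1_000_000, 10, workers=4, batch=1000))
```

The model leaves out worker startup cost, which makes the parallel
case look even worse for small limits.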

-- 
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company


