Thread: Parallel (concurrent) inserts?

Parallel (concurrent) inserts?

From

Ivan Voras

Date:

25 May 2012, 22:05:43

Hello,

I'm wondering if there is ia document describing which guarantees (if
any) PostgreSQL makes about concurrency for various operations? Speaking
in general (i.e. IO can handle it, number of CPU cores and client
threads is optimal), are fully concurrent operations (independant and
non-blocking) possible for:

1) An unindexed table?

2) A table with 1+ ordinary (default btree) indexes? (are there
constraints on the structure and number of indexes?)

3) A table with 1+ unique indexes?

4) A table with other objects on it (foreign keys, check constraints, etc.)?

Re: Parallel (concurrent) inserts?

From

Josh Berkus

Date:

25 May 2012, 23:14:22

> I'm wondering if there is ia document describing which guarantees (if
> any) PostgreSQL makes about concurrency for various operations? Speaking
> in general (i.e. IO can handle it, number of CPU cores and client
> threads is optimal), are fully concurrent operations (independant and
> non-blocking) possible for:

Yes, there's quite a bit of documentation.  Start here:
http://www.postgresql.org/docs/9.1/static/mvcc.html

> 1) An unindexed table?

Yes.

> 2) A table with 1+ ordinary (default btree) indexes? (are there
> constraints on the structure and number of indexes?)

Yes.

> 3) A table with 1+ unique indexes?

Yes.  Note that locking on the unique index may degrade concurrent
throughput of the input stream though, since we have to make sure you're
not inserting two different rows with the same unique indexed value
simulatenously.  It's work as you expect, though.

> 4) A table with other objects on it (foreign keys, check constraints, etc.)?

Yes.  Also concurrent autonumber (sequence) allocations work fine.

In fact, PostgreSQL has no non-concurrent mode, unless you count temp
tables.

--
Josh Berkus
PostgreSQL Experts Inc.
http://pgexperts.com

Re: Parallel (concurrent) inserts?

From

Jeff Janes

Date:

25 May 2012, 23:37:12

On Fri, May 25, 2012 at 3:04 PM, Ivan Voras <ivoras@freebsd.org> wrote:
> Hello,
>
> I'm wondering if there is ia document describing which guarantees (if
> any) PostgreSQL makes about concurrency for various operations? Speaking
> in general (i.e. IO can handle it, number of CPU cores and client
> threads is optimal), are fully concurrent operations (independant and
> non-blocking) possible for:

By "fully concurrent" do you mean that there is no detectable
sub-linear scaling at all?  I'm pretty sure that no such guarantees
can be made.

> 1) An unindexed table?

For concurrent bulk loads, there was severe contention on generating
WAL between concurrent bulk loaders.  That is greatly improved in the
upcoming 9.2 release.

> 2) A table with 1+ ordinary (default btree) indexes? (are there
> constraints on the structure and number of indexes?)

This will be limited by contention for generating WAL.  (Unless
something else limits it first)

> 3) A table with 1+ unique indexes?

If more than one transaction attempts to insert the same value, one
has to block until the other either commits or rollbacks.

> 4) A table with other objects on it (foreign keys, check constraints, etc.)?

Same as above.  If they try to do conflicting things, they don't
continue operating concurrently.

Cheers,

Jeff

Re: Parallel (concurrent) inserts?

From

Ivan Voras

Date:

25 May 2012, 23:57:59

On 26 May 2012 01:36, Jeff Janes <jeff.janes@gmail.com> wrote:
> On Fri, May 25, 2012 at 3:04 PM, Ivan Voras <ivoras@freebsd.org> wrote:
>> Hello,
>>
>> I'm wondering if there is ia document describing which guarantees (if
>> any) PostgreSQL makes about concurrency for various operations? Speaking
>> in general (i.e. IO can handle it, number of CPU cores and client
>> threads is optimal), are fully concurrent operations (independant and
>> non-blocking) possible for:
>
> By "fully concurrent" do you mean that there is no detectable
> sub-linear scaling at all?  I'm pretty sure that no such guarantees
> can be made.

> For concurrent bulk loads, there was severe contention on generating
> WAL between concurrent bulk loaders.  That is greatly improved in the
> upcoming 9.2 release.

I was thinking about major exclusive locks in the code paths which
would block multiple clients operating on unrelated data (the same
questions go for update operations). For example: if the free space
map or whole index trees are exclusively locked, things like that. The
WAL issue you mention seems exactly like what I was asking about, but
are there any others?