Re: Postgres for a "data warehouse", 5-10 TB - Mailing list pgsql-performance

From	Marti Raudsepp
Subject	Re: Postgres for a "data warehouse", 5-10 TB
Date	September 13, 2011 15:19:58
Msg-id	CABRT9RAKjmoa91sOBySx97=0ZKv64sD7miY0ZgUDu9dK0rQygw@mail.gmail.com
In response to	Re: Postgres for a "data warehouse", 5-10 TB (Robert Klemme <shortcutter@googlemail.com>)
Responses	Re: Postgres for a "data warehouse", 5-10 TB Re: Postgres for a "data warehouse", 5-10 TB Re: Postgres for a "data warehouse", 5-10 TB
List	pgsql-performance

On Tue, Sep 13, 2011 at 19:34, Robert Klemme <shortcutter@googlemail.com> wrote:
> I don't think so.  You only need to catch the error (see attachment).
> Or does this create a sub transaction?

Yes, every BEGIN/EXCEPTION block creates a subtransaction -- like a
SAVEPOINT it can roll back to in case of an error.
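
A minimal sketch of the same pattern with an explicit savepoint. To keep it
self-contained and runnable it uses Python's built-in sqlite3 module rather
than PostgreSQL, so it illustrates only the savepoint mechanics, not the cost
of PL/pgSQL subtransactions. The table foobar, the column val, and the helper
names are assumptions made up for this example.

```python
import sqlite3


def make_demo_db():
    # isolation_level=None stops the sqlite3 module from opening transactions
    # implicitly, so BEGIN/SAVEPOINT/COMMIT are issued by hand.
    conn = sqlite3.connect(":memory:", isolation_level=None)
    conn.execute("CREATE TABLE foobar (foobar_id INTEGER PRIMARY KEY, val TEXT)")
    return conn


def insert_or_update(conn, foobar_id, val):
    # The savepoint plays the role of the implicit subtransaction that a
    # BEGIN/EXCEPTION block sets up.
    conn.execute("SAVEPOINT try_insert")
    try:
        conn.execute(
            "INSERT INTO foobar (foobar_id, val) VALUES (?, ?)", (foobar_id, val)
        )
    except sqlite3.IntegrityError:
        # Duplicate key: undo work done since the savepoint, then update.
        conn.execute("ROLLBACK TO SAVEPOINT try_insert")
        conn.execute(
            "UPDATE foobar SET val = ? WHERE foobar_id = ?", (val, foobar_id)
        )
    # Releasing the savepoint keeps the changes in the enclosing transaction.
    conn.execute("RELEASE SAVEPOINT try_insert")
```

Every call sets up and releases one savepoint whether or not a conflict
occurs, which is the per-row overhead in question.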

> Yes, I mentioned the speed issue.  But regardless of the solution for
> MySQL's "INSERT..ON DUPLICATE KEY UPDATE" which Igor mentioned you
> will have the locking problem anyhow if you plan to insert
> concurrently into the same table and be robust.

In a mass-loading application you can often divide the work between
threads in a manner that doesn't cause conflicts.

For example, if the unique key is foobar_id and you have 4 threads,
thread 0 handles rows where (foobar_id%4)=0, thread 1 takes
(foobar_id%4)=1, and so on. You could also hash foobar_id before
dividing the work.
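
A rough sketch of that split, assuming rows arrive as dicts with an integer
foobar_id. The function names, the 4-worker default, and the choice of
SHA-256 for the hashed variant are assumptions for illustration only; any
stable hash would do.

```python
import hashlib

NUM_WORKERS = 4  # assumed thread count, matching the example above


def worker_for(foobar_id, num_workers=NUM_WORKERS):
    # Plain modulo: a given key always lands on the same worker.
    return foobar_id % num_workers


def hashed_worker_for(foobar_id, num_workers=NUM_WORKERS):
    # Hash first, then take the modulo. Still deterministic per key.
    digest = hashlib.sha256(str(foobar_id).encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_workers


def partition(rows, assign=worker_for, num_workers=NUM_WORKERS):
    # Split rows into one bucket per worker, so no two workers ever
    # touch the same unique key.
    buckets = [[] for _ in range(num_workers)]
    for row in rows:
        buckets[assign(row["foobar_id"], num_workers)].append(row)
    return buckets
```

Each worker then loads only its own bucket, so inserts and updates for a
given foobar_id never race between threads.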

I already suggested this in my original post.

Regards,
Marti
