Home > mailing lists

Re: pg_bulkload ON_DUPLICATE_MERGE - Mailing list pgsql-hackers

From	Robert Haas
Subject	Re: pg_bulkload ON_DUPLICATE_MERGE
Date	January 26, 2012 13:50:04
Msg-id	CA+TgmoZFx_Ts6VK2FT7TZbNFAU9eUhffHYhmzz1=G03LMeySnQ@mail.gmail.com Whole thread
In response to	pg_bulkload ON_DUPLICATE_MERGE (Benjamin Johnson <benjamin.johnson@getcarbonblack.com>)
List	pgsql-hackers

Tree view

On Wed, Jan 25, 2012 at 2:49 PM, Benjamin Johnson
<benjamin.johnson@getcarbonblack.com> wrote:
> PG Gurus,
>
> I have a table like this:
>
> CREATE TABLE filemods (
>  guid                  BIGINT NOT NULL UNIQUE,
>  filepath_guid         BIGINT NOT NULL,
>  createtime            TIMESTAMP WITH TIME ZONE DEFAULT NULL,
>  writetime             TIMESTAMP WITH TIME ZONE DEFAULT NULL,
>  deletetime            TIMESTAMP WITH TIME ZONE DEFAULT NULL,
> );
>
> One "event" might have (1, '2012-01-25 11:00:00', NULL, NULL) and
> another event might have (1, NULL, '2012-01-25 11:05:00', NULL) and the
> end result should be:
> (1, '2012-01-25 11:00:00', '2012-01-25 11:05:00', NULL).
>
>
> I'm trying to modify pg_bulkload to "merge" two rows together.  The
> changes I have made seem to be working, although I would like input on
> what I am doing that is unsafe or terribly wrong.  You can be brutal.
>
> I've seen incredible write speed with using pg_bulkload.  If I can just
> get it to "consolidate" our rows based on the unique key it will remove
> a lot of complexity in our software.
>
> Also, I'm not entirely sure this mailing list is the correct one, but
> with all the internals you all know, I'm hoping you can help point out
> nasty flaws in my algorithm.  This is the first legitimate attempt I
> have made at modifying PG source, so I'm not real familiar with the
> proper way of loading pages and tuples and updating heaps and all that
> pg core stuff.
>
> Here's the modifications to pg_btree.c (from pg_bulkload HEAD):
>
> http://pastebin.com/U23CapvR
>
> I also attached the patch.

I am not sure who maintains pg_bulkload, but it's not part of the core
distribution, so this is the wrong mailing list....  you may want to
look here:

http://pgfoundry.org/mail/?group_id=1000261

--
Robert Haas
EnterpriseDB: http://www.enterprisedb.com
The Enterprise PostgreSQL Company

pgsql-hackers by date:

From: Robert Haas
Date: 26 January 2012, 13:48:13
Subject: Re: Different error messages executing CREATE TABLE or ALTER TABLE to create a column "xmin"

From: Peter Eisentraut
Date: 26 January 2012, 13:50:05
Subject: Re: Client Messages

Re: pg_bulkload ON_DUPLICATE_MERGE - Mailing list pgsql-hackers

Previous

Next