Home > mailing lists

Re: Avoiding duplicates (or at least marking them as such) in a "cumulative" transaction table. - Mailing list pgsql-general

From	Scott Marlowe
Subject	Re: Avoiding duplicates (or at least marking them as such) in a "cumulative" transaction table.
Date	March 8, 2010 03:16:53
Msg-id	dcc563d11003072316l78de3132y79a8b8a0801926b@mail.gmail.com Whole thread Raw
In response to	Re: Avoiding duplicates (or at least marking them as such) in a "cumulative" transaction table. (Allan Kamau <kamauallan@gmail.com>)
Responses	Re: Avoiding duplicates (or at least marking them as such) in a "cumulative" transaction table.
List	pgsql-general

Tree view

On Sun, Mar 7, 2010 at 11:31 PM, Allan Kamau <kamauallan@gmail.com> wrote:
> On Mon, Mar 8, 2010 at 5:49 AM, Scott Marlowe <scott.marlowe@gmail.com> wrote:
>> On Sun, Mar 7, 2010 at 1:45 AM, Allan Kamau <kamauallan@gmail.com> wrote:
>>> Hi,
>>> I am looking for an efficient and effective solution to eliminate
>>> duplicates in a continuously updated "cumulative" transaction table
>>> (no deletions are envisioned as all non-redundant records are
>>> important). Below is my situation.
>>
>> Is there a reason you can't use a unique index and detect failed
>> inserts and reject them?
>>
>
> I think it would have been possible make use of a unique index as you
> have suggested, and silently trap the uniqueness violation.
>
> But in my case (as pointed out in my previous lengthy mail) I am
> inserting multiple records at once, which implicitly means a single
> transaction. I think in this scenario a violation of uniqueness by
> even a single record will lead to all the other records (in this
> batch) being rejected either.

There are a whole bunch of approaches to this.  You're basically bulk
loading data into a table that already has data in it.  I'd put
everything into a temp table then insert into main where not exists in
temp table.  And I'd still have that unique index on my pk in the main
table.  First maintain uniqueness, then figure out how to load data
into it.  insert into where not exists is quite fast, and it's all one
big transaction that either goes or doesn't, so there's no cleaning up
broken / extra records later.

Are you looking at upserting these inputs?  Or is the original row good enough?

>
> Is there perhaps a way to only single out the unique constraint
> violating record(s) without having to perform individual record
> inserts, I am following the example found here
> "http://www.postgresql.org/docs/8.4/interactive/plpgsql-control-structures.html#PLPGSQL-ERROR-TRAPPING".
>
> Allan.
>



--
When fascism comes to America, it will be intolerance sold as diversity.

pgsql-general by date:

From: Ben Chobot
Date: 08 March 2010, 02:59:10
Subject: obsessive-compulsive vacuum behavior

From: Scott Marlowe
Date: 08 March 2010, 04:20:15
Subject: Re: obsessive-compulsive vacuum behavior

Re: Avoiding duplicates (or at least marking them as such) in a "cumulative" transaction table. - Mailing list pgsql-general

Previous

Next