Markus, Kaloyan, and all
Thanks for the help on this. I got the trigger to work properly,
although I discovered that a trigger function written in SQL is not
allowed, so I wrote it in PL/pgSQL. It was very, very slow. So I
finally decided to filter the input before attempting to COPY it into
the table, using a Perl hash to ensure uniqueness. As a side benefit,
I was able to count the frequency of each input string while filtering
and include that in the final table.
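In case it is useful to anyone else, the filter amounted to something
like the sketch below (the file names and the tab-delimited, two-column
output layout are only placeholders for illustration):

   # sketch only: file names and output layout are placeholders
   use strict;
   use warnings;

   my %count;
   open my $in, '<', 'raw_input.txt' or die "raw_input.txt: $!";
   while (my $line = <$in>) {
       chomp $line;
       $count{$line}++;   # hash key enforces uniqueness, value tracks frequency
   }
   close $in;

   # write tab-delimited rows (string, frequency) suitable for COPY ... FROM
   open my $out, '>', 'unique_input.txt' or die "unique_input.txt: $!";
   for my $string (sort keys %count) {
       print $out "$string\t$count{$string}\n";
   }
   close $out;

The filtered file then goes into the table, extra frequency column and
all, with a single COPY.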
TJ O'Donnell
Markus Schaber wrote:
> Hi, TJ,
>
> TJ O'Donnell wrote:
>
>> So, is there a way (a constraint, a check?) that will simply
>> REJECT a duplicate when I insert the raw data directly into x
>> rather than FAILING on an error as it does
>> with the unique constraint above?
>
> Failing on an error is exactly how PostgreSQL (and the SQL standard)
> REJECTS duplicates. :-)
>
> You seem to be thinking of silently dropping the duplicates. That could be
> achieved with a BEFORE INSERT trigger, or with a rule on a view, as
> both can silently drop the inserted row.
>
> The trigger body could even be in language SQL, along the lines of:
>
> SELECT CASE WHEN EXISTS (SELECT keycol FROM table WHERE
> table.keycol = NEW.keycol) THEN NULL ELSE NEW END;
>
> Nevertheless, expect the insert performance to drop a little, due to the
> trigger overhead.
>
> The alternative approaches (SELECT'ing from the application, using a
> stored procedure that checks and then inserts the data, and using
> subtransactions to roll back the failing inserts) all seem worse (uglier
> and slower) to me, but don't hesitate to ask if you're interested.
>
> HTH,
> Markus
>