Re: error-tolerant COPY FROM - Mailing list pgsql-general

From Sean Davis
Subject Re: error-tolerant COPY FROM
Date
Msg-id E74473BD-76AE-11D9-8DF0-000D933565E8@mail.nih.gov
Whole thread Raw
In response to error-tolerant COPY FROM  ("Joolz" <joolz@arbodienst-limburg.nl>)
Responses Re: error-tolerant COPY FROM
List pgsql-general
I use a trigger on tables with foreign key references to either ignore
the insert row or insert an appropriate matching row in the referenced
table, if it does not exist.  In the function, I just raise a notice
that I am doing this.  This is a simple example:
create or replace function tgf_insert_gene_id() returns trigger as $$
declare
     gene_id_chk integer;
begin
     select into gene_id_chk gene_id from g_main where
gene_id=NEW.gene_id;
     if (NOT FOUND) THEN
           BEGIN
              INSERT into g_main (gene_id,name) values
(NEW.gene_id,'INSERTED FOR FOREIGN KEY CHECK');
                RAISE NOTICE 'Inserting a gene ID to meet foreign key
requirements';
           END;
     END IF;
     return NEW;
end;$$ language plpgsql;

Just make a trigger for the tables of interest (I have several tables
that reference gene_id in g_main) so they can all use the same trigger.

This function simply looks up the proposed key to the foreign key table
(gene_id) in the foreign key table.  If it is (NOT FOUND) then it
inserts the key into the foreign table (g_main) the gene_id about to be
inserted and then raises a notice (which goes into the log if you have
logging set up to log notices).  Then, the function returns and,
because we did the insert, foreign key checks work fine.  You could
just as easily choose to NOT insert a record that doesn't meet foreign
key checks, or even do an insert into a separate table instead that
contains all of the rows that don't match a foreign key.

Hope this helps....
Sean

On Feb 4, 2005, at 7:32 AM, Joolz wrote:

> Hello everyone,
>
> I'm building a postgresql db which will have to get lots of data
> from "the outside" (customers, that is). The db has lots of
> constraints, and I'm sure that our customers will offer lots of
> invalid information. We receive the information in csv format. My
> first thought was to read them into the database with COPY, but
> "COPY stops operation at the first error."
>
> What I need is an import where all valid lines from the csv files
> are read into the db, and I also get a logfile for all invalid
> lines, stating the line number plus the pg error message so I can
> see which constraint was violated.
>
> I can't think of a direct, elegant solution for this, does anyone
> have any suggestions? Thanks a lot!
>
>
> ---------------------------(end of
> broadcast)---------------------------
> TIP 4: Don't 'kill -9' the postmaster


pgsql-general by date:

Previous
From: "Joolz"
Date:
Subject: Re: error-tolerant COPY FROM
Next
From: Csaba Nagy
Date:
Subject: Re: error-tolerant COPY FROM