Re: Separating data sets in a table - Mailing list pgsql-sql

From Mark Stosberg
Subject Re: Separating data sets in a table
Date
Msg-id Pine.BSF.4.44.0208251627420.78713-100000@nollie.summersault.com
Whole thread Raw
In response to Re: Separating data sets in a table  (Andreas Tille <tillea@rki.de>)
Responses Re: Separating data sets in a table
List pgsql-sql
On Sun, 25 Aug 2002, Andreas Tille wrote:

> On Sat, 24 Aug 2002, Mark Stosberg wrote:
>
> > On Thu, 22 Aug 2002, Andreas Tille wrote:
> > > Hello,
> > >
> > > I want to solve the following problem:
> > >
> > > CREATE TABLE Ref    ( Id int ) ;
> > > CREATE TABLE Import ( Id    int,
> > >                       Other varchar(42),
> > >                       Flag  int,
> > >                       Ts    timestamp ) ;
> > > CREATE TABLE Data   ( Id    int,
> > >                       Other varchar(42) ) ;
> > larger problem. I get the sense that you have data you importing on a
> > regular basis from outside Postgres, and you want to check it before
> > it get moves into production, but I'm not exactly sure what's happening.
>
> You are completely right.  I just do an import from an external database.
> The person I obtain the data from does an output of the table in a form
> to do a "COPY FROM".  The problem is that it might happen that there are
> some data rows which infringe referential integrity and I have to ask
> back the data provider for additional data which describe additional data
> which are referenced by the Id mentioned above.  So I have to sort out those
> data sets who have no known Id in my production data.

Andreas,

Thanks for the clarification. Here's an idea about how to solve your
problem. As you are importing your data, instead of doing it all at
once, try import it a row at a time into a table that has the RI turned
on. Check each insert to see if it's successful. It if it's not
successful, then insert that row into a table that /doesn't/ have RI
(maybe "import_failures"),
perhaps also including the error that Postgres returned. (This may be
stored in $DBH::errstr). Then when you are done, you can look in the
import_failures for a report of which rows need some assistance. If you
need every row to succeed that's imported into the production table, you
can do all this inside of a transaction, and roll it back if any of the
inserts fail. [ thinks for a moment. ] Of course, that would normally
rollback your inserts into import_failures too, so perhaps you can use a
second database connection to make sure those always happen.

I hope that helps. Perhaps thinking in terms of "row-at-a-time
processing" will help you solve your problem.
   -mark

http://mark.stosberg.com/



pgsql-sql by date:

Previous
From: Andreas Tille
Date:
Subject: Re: Separating data sets in a table
Next
From: Mark Stosberg
Date:
Subject: Re: how to refer to tables in another database( or schema)