Re: Importing normalised data by SQL script in remote DB - Mailing list pgsql-novice

From Andreas
Subject Re: Importing normalised data by SQL script in remote DB
Date
Msg-id 48056C16.3040205@gmx.net
Whole thread Raw
In response to Re: Importing normalised data by SQL script in remote DB  ("Sean Davis" <sdavis2@mail.nih.gov>)
Responses Re: Importing normalised data by SQL script in remote DB  ("Sean Davis" <sdavis2@mail.nih.gov>)
List pgsql-novice
Sean Davis schrieb:
> On Tue, Apr 15, 2008 at 9:54 PM, Andreas <maps.on@gmx.net> wrote:
>
>> Hi,
>>  I've got to import data into a remote database.
>>  I get some stuff usually as excel-files that doesn't fit the db-structure
>> in respect of normalisation so I import it into Access, brush up the data
>> and push the columns from there in the right tables via ODBC.
>>
>>  For bigger imports (~5000 lines in Excel that get spread over 4-6 tables in
>> my db) I fetch the remote DB, do the import locally and transfer a dump back
>> to the remote site when noone uses the server.
>>  This is getting unwieldy as the db grows and the connection is not really
>> that fast.
>>
>>  I can access the server by ssh so it might be way faster to run a prepared
>> SQL file that consists just of the new data against the DB on the console.
>>
>>  Could I build a SQL script that adds the new stuff and creates foreign keys
>> on the fly without the need to know the new IDs before?
>>
>>  Lets suppose I got a couple of foreign-linked tables:
>>  company (company_id  serial primary key,  name, ...)
>>  person (person_id  serial primary key, company_fk, name, ...)
>>  contact (contact_id  serial primary key, person_fk, contact_date,
>> contact_notes, ...)
>>
>>  The SQL script needed to:
>>  1)   INSERT INTO company ...
>>  2)   look up the last created company_id
>>  3)   INSERT INTO person    and use the company_id to set company_fk
>>  4)   look up the last created person_id
>>  5)   INSERT INTO contact    and use the person_id to set person_fk
>>  Probaply create some other persons with their contacts.
>>  Then another company and so on.
>>
>>  I do this right now with vba in Access but this is painfully slow via
>> InterNet as it constantly has to ask the server for the last ids and the
>> Net's latency spoils all the fun.
>>
>
> Why not avoid Access altogether.  Load the data into postgres "as-is"
> into a set of loader tables.  Then, you can use SQL to do the data
> munging and inserts, including the foreign key relationships.  You can
> use \copy in psql to load the data, so there is no need to have access
> to the server.
>
> Sean
>
>
Thanks for your suggestion.
As said usually I get the data as a xls with wrong column names and
often enough wrong column types, missing infos or cluttered up in
creative ways, too.
How could I easily import a certain tab within a xls into a temporary
pg-table?

Then often the data is a dump of JOINs so it is not normalised.
I can't help it. I just get the stuff dumped over me as our customers
are able or willing to provide.
Most of the time it's like this:
...
company_1, ..., person_1, ...
company_1, ..., person_2, ...
company_1, ..., person_3, ...
company_2, ..., person_4, ...
company_3, ..., person_5, ...
company_3, ..., person_6, ...
...
I've got to catch the repeating companies to avoid doubles.
Until now I try to find some columns that represent a key for the
company part of the data.
Then sort by this key columns, walk over the table and generate just a
company whenever a part of the key changes.

AFAIK I need to insert the person records right after I created their
company because if I first create all companies and then the people I
wouldn't know the company-id that belongs to a person.

I'd love to learn how to do this more cleverly w/o Access.


regards
Andreas

pgsql-novice by date:

Previous
From: "Sean Davis"
Date:
Subject: Re: Importing normalised data by SQL script in remote DB
Next
From: p9e883002@sneakemail.com
Date:
Subject: Inserting a record returned from a function