Re: Bulk inserts into two (related) tables - Mailing list pgsql-general

From Francisco Olarte
Subject Re: Bulk inserts into two (related) tables
Date
Msg-id CA+bJJbyAsgWaxHvMBz3kTxWcekwGPv3yLXbKO4Yixgj5td25KQ@mail.gmail.com
Whole thread Raw
In response to Bulk inserts into two (related) tables  (Rich Shepard <rshepard@appl-ecosys.com>)
Responses Re: Bulk inserts into two (related) tables
List pgsql-general
Rich:

On Tue, May 21, 2019 at 6:56 PM Rich Shepard <rshepard@appl-ecosys.com> wrote:
> I'm cleaning and formatting a 800-line data file to be inserted into a
> database. Some of the input file fields will be inserted into an
> 'organizations' table letting postgres assign sequential org_id numbers.
> Other fields will be inserted into a separate 'people' table associated with
> each organization. The people table insert allows postgres to assign the
> person_id number and this table's foreign key, people.org_id ==
> organizations.org_id.
> I could insert all new rows into the organizations table, then produce a
> list of the org_id and org_name to manually insert the foreign key in the
> related people table. Time and effort intense.
> Is there a way to insert the two tables sequentially without manually adding
> the organizations.org_id to the appropriate foreign key column (people.org_id)
> of the people table?

From how you say it, I assume you have some data in your original
dumps which can relate boths, lets assume it's org_name, but may be an
org-code. If you do not have it it means you cannot match people to
orgs in your data, all is lost.

You can do it in a couple steps, first do a copy (org_id, org_name,
org.others) into the organizations table,

then create a temporary table and copy (people.id, people.org_name,
people.*)  into it

and then insert into people "select people.id, org.id as
people_org_id, people.* from tmp_people, orgs where
tmp_people.org_name = org.org_name)

and drop the temp table.

If the matching data is some king of code you do not want in the final
organization table, create a temporary organizations table, copy into
it, join it with the temporary people table, insert into the final
organizations table slicing with a select, drop it.

If you are using a sequence or similar thing for generating
organization ids and use the second approach, remember to use the same
sequence to generate the default values into the temporary table, or
chaos will ensue ( not really, but you'll need to advance it manually
).

This is a classic problem, the classic way to solve is that, prepare
an script which loads some temporary tables and then insert joins into
the final ones. On small data sets like yours you can just edit
everything into a single sql script.

Francisco Olarte.



pgsql-general by date:

Previous
From: PegoraroF10
Date:
Subject: Re: Refresh Publication takes hours and doesn´t finish
Next
From: Rich Shepard
Date:
Subject: Re: Bulk inserts into two (related) tables