Re: Bulk inserts into two (related) tables - Mailing list pgsql-general

From Jeremy Finzel
Subject Re: Bulk inserts into two (related) tables
Date
Msg-id CAMa1XUg3mV8XijX_rVXGDqMyqe3gfbRrsE65xJqY9rS9VQFnYQ@mail.gmail.com
Whole thread Raw
In response to Re: Bulk inserts into two (related) tables  (Rich Shepard <rshepard@appl-ecosys.com>)
Responses Re: Bulk inserts into two (related) tables
List pgsql-general
On Wed, May 22, 2019 at 12:53 PM Rich Shepard <rshepard@appl-ecosys.com> wrote:
On Wed, 22 May 2019, Francisco Olarte wrote:

> Also, when I speak of "unique identifier" I'm not speaking of the one if
> your FINAL tables, I assume you would have at least the *_id field as
> PKEY, so nothing else needed, but the one in your SOURCE data set (it can
> be anything, like the row number in the original excel).

Francisco/Jeremy,

I'm grateful for you patient help. The 'unique identifier' in the source
file has been provided (just now) using nl <https://ss64.com/bash/nl.html>.
The syntax I used is:

nl -b a -n ln -s , -v 339 source.txt > out.txt

because the organizations table has 338 as the maximum org_id number.

I believe this fulfills the need for a known unique ID in the source file,
and when I parse each row using gawk to create the two files for table input
I can use it in both the organizations table (as the PK) and the people
table (as the FK referring to the organizations table). I can let postgres
assign the unique ID for the new rows in the people table.

Am I still missing something critical?

Are you saying your database already has an organizations table, and this data file is appending to it with all-brand-new organizations and people?

Say you have 800 people.  Are you honestly saying that there are also 800 organizations?  If so, I guess your solution would work.  However, I thought that it was the case that multiple people belong to the same organization.  Hence, the entire problem of mapping we have been discussing.

The data model doesn't make much sense if that's true.

Also, you said previously that some fields from your data file would be populated in *both tables*.  That is a fundamental violation of DB design.  Why are you duplicating that data in two places?

Thanks,
Jeremy

pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: Table partition with primary key in 11.3
Next
From: Rich Shepard
Date:
Subject: Re: Bulk inserts into two (related) tables