Re: Bulk inserts into two (related) tables - Mailing list pgsql-general

From Jeremy Finzel
Subject Re: Bulk inserts into two (related) tables
Date
Msg-id CAMa1XUgOeJ-TGX7DX-RrEzZUhdyLLykC6p+GFm70DeX6x9RavQ@mail.gmail.com
Whole thread Raw
In response to Re: Bulk inserts into two (related) tables  (Rich Shepard <rshepard@appl-ecosys.com>)
Responses Re: Bulk inserts into two (related) tables
List pgsql-general
On Wed, May 22, 2019 at 7:40 AM Rich Shepard <rshepard@appl-ecosys.com> wrote:
On Wed, 22 May 2019, Francisco Olarte wrote:

> I'm curious, what org_id do you put (manually) to the people? you must
> have some way to match it ( like, say, "I have an organization line,
> followed by lines for people in this organization"

Francisco,

The first data entered was in small chunks so I'd add a few rows to the
organization table, output a file of org_id and org_name, then use the
source data to associate that org_id to the people associated with it.

With this new data source I want to populate the two tables more quickly. I
think the time consuming part is associating people with their organization.
I'll do more thinking about this.

I don't do perl, though.

There's absolutely no need to use anything beyond SQL here, though you could if you want to.

I really wonder how much we are just talking past each other simply because we don't know what your data looks like, so we can't show you how our examples apply to your use case.  If you provided a sample scrubbed data file, this whole thread probably would have been much shorter :).  Can you do that?

You said here again the most time consuming part is associating people with their organization.  Well, that's the whole question I was trying to optimize on.  You told Francisco that the data file does not have a unique org name that could be used as a unique organization identifier.  However you seem to have contradicted that by responding favorably to this solution:

INSERT INTO PEOPLE (id, name, org_id) VALUES (1, ‘Bob’, (SELECT org_id FROM org WHERE org_name=‘Main Office’))

How can this solution work if you said org_name cannot be used to link a person in the data file?

So, the question again becomes: does your data file have a combination of org fields that allows you to assign a person to a unique organization?
  • If the answer is still no, how could you ever assign people to a unique organization?  In that case you need to massage your data file first before trying to build a database schema, as Francisco noted.  This is basically a non-starter for your database schema.
  • If the answer is yes, that means you can use the combination of those fields to uniquely identify an organization, and thus link people to it later.  That's the reason for the md5 - to easily use many fields in combination as a unique id
Thanks,
Jeremy

pgsql-general by date:

Previous
From: Pierre Couderc
Date:
Subject: how to write correctly this update ?
Next
From: Rob Sargent
Date:
Subject: Re: how to write correctly this update ?