On 11/22/19 2:05 PM, Rémi Cura wrote:
> Hello dear List,
> I'm currently wondering how to streamline the normalization of a
> new table.
>
> I often have to import messy CSV files into the database, and making
> clean, normalized versions of these takes me a lot of time (think
> dozens of columns and millions of rows).
To me, messy means the information needed to do what you describe
below is not available. Personally, I think your best bet is to get
the data into tables and then use visualization tools to help you
determine the keys and dependencies. My guess is there will be a lot
of data cleaning going on before you can get to a well-ordered table
layout.
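
As a concrete starting point, a minimal sketch of that staging
approach, assuming a file at /tmp/messy.csv with a header row (the
path, table, and column names are all placeholders):

    -- Stage every column as text; typing and cleaning come afterwards.
    CREATE TABLE staging (
        col_a text,
        col_b text,
        col_c text
    );

    -- Server-side COPY; use psql's \copy instead if the file lives on
    -- the client machine.
    COPY staging FROM '/tmp/messy.csv' WITH (FORMAT csv, HEADER true);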
>
> I wrote some code to automatically import a CSV file and infer the type
> of each column.
> Now I'd like to quickly get an idea of:
> - what would be the most likely primary key
> - what are the functional dependencies between the columns
>
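
For the primary-key question, a quick sketch of the usual counting
trick: a column (or set of columns) is a candidate key when its
distinct, non-NULL count equals the row count. Assuming the
placeholder staging table and column names from above:

    -- col_a is a candidate key when all three counts are equal.
    SELECT count(*)              AS total_rows,
           count(col_a)          AS non_null_col_a,
           count(DISTINCT col_a) AS distinct_col_a
    FROM staging;

A functional dependency col_a -> col_b can be tested the same way: it
holds exactly when no col_a value maps to more than one col_b value.

    -- Any rows returned here are violations of col_a -> col_b.
    SELECT col_a
    FROM staging
    GROUP BY col_a
    HAVING count(DISTINCT col_b) > 1;

Checking every ordered column pair this way is quadratic in the number
of columns, but for dozens of columns that is usually still tractable.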
> The goal is **not** to automate the modelling process,
> but rather to automate the tedious phase of information collection
> that is necessary for the DBA to make a good model.
>
> If this goes well, I'd like to automate further tedious stuff (like
> splitting a table into several tables with appropriate foreign keys
> / constraints).
>
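
Once a dependency like col_a -> col_b has been confirmed, the
splitting step usually reduces to extracting the dependent column(s)
into a lookup table and pointing the original table at it. A rough
sketch, again with the placeholder names from above (and assuming
col_a contains no NULLs):

    -- Materialize the dependent column in its own table.
    CREATE TABLE lookup AS
    SELECT DISTINCT col_a, col_b
    FROM staging;

    ALTER TABLE lookup ADD PRIMARY KEY (col_a);

    -- Keep only the determining column in the original table ...
    ALTER TABLE staging DROP COLUMN col_b;

    -- ... and reference the lookup table instead.
    ALTER TABLE staging
        ADD FOREIGN KEY (col_a) REFERENCES lookup (col_a);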
> I'd be glad to have some feedback / pointers to tools in plpgsql or even
> plpython.
>
> Thank you very much
> Remi
>
>
--
Adrian Klaver
adrian.klaver@aklaver.com