Re: "Ungroup" data for import into PostgreSQL - Mailing list pgsql-general

From George Weaver
Subject Re: "Ungroup" data for import into PostgreSQL
Date
Msg-id EC07DD3785D94587BF0C57C192F0DD38@D420
Whole thread Raw
In response to "Ungroup" data for import into PostgreSQL  ("George Weaver" <gweaver@shaw.ca>)
List pgsql-general
Hi Adrian,

From: "Adrian Klaver" <adrian.klaver@aklaver.com>
Subject: Re: [GENERAL] "Ungroup" data for import into PostgreSQL


> On 01/15/2015 04:56 PM, Jim Nasby wrote:
>> On 1/15/15 9:43 AM, George Weaver wrote:
>>> Hi List,
>>>
>>> I need to import data from a large Excel spreadsheet into a PostgreSQL
>>> table.  I have a program that uses ODBC to connect to Excel and
>>> extract data using SQL queries.  The program then inserts the data
>>> into a PostgreSQL table.
>>>
>>> The challenge with this particular spreadsheet is that it is arrayed
>>> thus:
>>>
>>> Model No 1     Product Code 15
>>>     Serial No 1       No on Hand
>>>     Serial No 2       No on Hand
>>>     Serial No 3       No on Hand
>>> Model No 4    Product Code 9
>>>     Serial No 12     No on Hand
>>> Model No 5    Product Code 27
>>>     Serial No 6       No on Hand
>>>     Serial No 14     No on Hand
>>>
>>> etc.
>>>
>>> I need the data in PostgreSQL arrayed thus
>>>
>>> Model No 1    Product Code 15    Serial No 1       No on Hand
>>> Model No 1    Product Code 15    Serial No 2       No on Hand
>>> Model No 1    Product Code 15    Serial No 3       No on Hand
>>> Model No 4    Product Code 9      Serial No 12     No on Hand
>>> Model No 5    Product Code 27    Serial No 6       No on Hand
>>> Model No 5    Product Code 27    Serial No 14     No on Hand
>>>
>>> I can import the data procedurely using plpgsql to match the
>>> individual rows to the master for each row (import the raw data into a
>>> buffer table in PostgreSQL and then looping through the rows in the
>>> buffer table and checking to see when the Model No changes).
>>
>> Note that if you're doing that you better be putting the rownumber from
>> excel into the table... result sets are NOT guaranteed to be in insert
>> order!
>>
>>> I'm wondering if there is a more elegant way to do this using straight
>>> sql from Excel?
>>
>> Well, that's really an excel question, not a Postgres question...
>>
>> If you load the whole spreadsheet into a single table and have a way to
>> differentiate between the different rows then you might be able to do
>> something with CTE's to relate a serial number to the product code. That
>> might be faster than plpgsql.
>>
>> You might also be able to do something creative with formulas in excel
>> to copy the product code data to the serial # rows. You could then
>> import the whole thing and re-normalize it.
>>
>> There's probably some stuff you could do with VBA too. If you care about
>> performance you don't want to execute SQL statements for each
>> spreadsheet row.
>
> Or if you really want to slice and dice and you use Python, then take a
> look at Pandas:
>
> http://pandas.pydata.org/
>
> In particular the IO functions:
>
> http://pandas.pydata.org/pandas-docs/stable/io.html

I don't use Python but I found the information in these links very
interesting.

Thanks!

> --
> Adrian Klaver
> adrian.klaver@aklaver.com


pgsql-general by date:

Previous
From: "George Weaver"
Date:
Subject: Re: "Ungroup" data for import into PostgreSQL
Next
From: Gordon Haverland
Date:
Subject: Ransomware article