Re: "Ungroup" data for import into PostgreSQL - Mailing list pgsql-general
From | George Weaver |
---|---|
Subject | Re: "Ungroup" data for import into PostgreSQL |
Date | |
Msg-id | B4AE2BBC3329435DBCB6061227EBDFDC@D420 Whole thread Raw |
In response to | "Ungroup" data for import into PostgreSQL ("George Weaver" <gweaver@shaw.ca>) |
List | pgsql-general |
Hi Adrian, From: "Adrian Klaver" <adrian.klaver@aklaver.com> Subject: Re: [GENERAL] "Ungroup" data for import into PostgreSQL > On 01/15/2015 04:56 PM, Jim Nasby wrote: >> On 1/15/15 9:43 AM, George Weaver wrote: >>> Hi List, >>> >>> I need to import data from a large Excel spreadsheet into a PostgreSQL >>> table. I have a program that uses ODBC to connect to Excel and >>> extract data using SQL queries. The program then inserts the data >>> into a PostgreSQL table. >>> >>> The challenge with this particular spreadsheet is that it is arrayed >>> thus: >>> >>> Model No 1 Product Code 15 >>> Serial No 1 No on Hand >>> Serial No 2 No on Hand >>> Serial No 3 No on Hand >>> Model No 4 Product Code 9 >>> Serial No 12 No on Hand >>> Model No 5 Product Code 27 >>> Serial No 6 No on Hand >>> Serial No 14 No on Hand >>> >>> etc. >>> >>> I need the data in PostgreSQL arrayed thus >>> >>> Model No 1 Product Code 15 Serial No 1 No on Hand >>> Model No 1 Product Code 15 Serial No 2 No on Hand >>> Model No 1 Product Code 15 Serial No 3 No on Hand >>> Model No 4 Product Code 9 Serial No 12 No on Hand >>> Model No 5 Product Code 27 Serial No 6 No on Hand >>> Model No 5 Product Code 27 Serial No 14 No on Hand >>> >>> I can import the data procedurely using plpgsql to match the >>> individual rows to the master for each row (import the raw data into a >>> buffer table in PostgreSQL and then looping through the rows in the >>> buffer table and checking to see when the Model No changes). >> >> Note that if you're doing that you better be putting the rownumber from >> excel into the table... result sets are NOT guaranteed to be in insert >> order! >> >>> I'm wondering if there is a more elegant way to do this using straight >>> sql from Excel? >> >> Well, that's really an excel question, not a Postgres question... >> >> If you load the whole spreadsheet into a single table and have a way to >> differentiate between the different rows then you might be able to do >> something with CTE's to relate a serial number to the product code. That >> might be faster than plpgsql. >> >> You might also be able to do something creative with formulas in excel >> to copy the product code data to the serial # rows. You could then >> import the whole thing and re-normalize it. >> >> There's probably some stuff you could do with VBA too. If you care about >> performance you don't want to execute SQL statements for each >> spreadsheet row. > > Or if you really want to slice and dice and you use Python, then take a > look at Pandas: > > http://pandas.pydata.org/ > > In particular the IO functions: > > http://pandas.pydata.org/pandas-docs/stable/io.html I don't use Python but I found the information in these links very interesting. Thanks! > -- > Adrian Klaver > adrian.klaver@aklaver.com
pgsql-general by date: