The first delimiter is ~^~ (tilde-caret-tilde).
The last field is itself delimited with just ^ (caret).
I would use text-parsing tools to do this myself, though various commands in PostgreSQL could be combined to get the
desired result. The last 4 numbers (second parse) should probably be stored in a numeric[] column.
Look at COPY and regexp_matches().
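
For what it's worth, here is a minimal sketch of the parse-it-yourself route in Python. It reads the format slightly differently from the two-pass split described above: it assumes ^ is the field delimiter and ~ is a quote character around text values, which lets the standard csv module do the work. The function names parse_rows and pg_array are made up for illustration, and treating the last four fields as the numeric[] values is an assumption based only on the sample rows.

```python
import csv
from decimal import Decimal


def parse_rows(lines):
    """Yield (leading_fields, factors) for each row.

    Assumes caret-delimited fields with tildes quoting text values,
    and that the last four fields are numeric, as in the sample rows.
    """
    reader = csv.reader(lines, delimiter="^", quotechar="~")
    for row in reader:
        head, tail = row[:-4], row[-4:]
        # Empty numeric fields become None instead of failing to convert.
        factors = [Decimal(v) if v else None for v in tail]
        yield head, factors


def pg_array(values):
    """Format a list as a PostgreSQL array literal such as {6.38,4.27}."""
    return "{" + ",".join("NULL" if v is None else str(v) for v in values) + "}"


# One row from the original post, with the mail line wrap undone.
sample = [
    "~01001~^~0100~^~Butter, salted~^~BUTTER,WITH SALT"
    "~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87",
]

for head, factors in parse_rows(sample):
    print(head, pg_array(factors))
```

The leading fields come back as plain strings (empty text fields as empty strings), so a script like this could write them out in whatever form COPY is set up to read, with the pg_array() text going into the numeric[] column.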
David J.
On Aug 22, 2012, at 20:23, Mike Christensen <mike@kitchenpc.com> wrote:
> I'd like to import this data into a Postgres database:
>
> http://www.ars.usda.gov/SP2UserFiles/Place/12354500/Data/SR24/dnload/sr24.zip
>
> However, I'm not quite sure what format this is. It's definitely not
> CSV. Here's an example of a few rows:
>
> ~01001~^~0100~^~Butter, salted~^~BUTTER,WITH SALT~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87
> ~01002~^~0100~^~Butter, whipped, with salt~^~BUTTER,WHIPPED,WITH SALT~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87
> ~01003~^~0100~^~Butter oil, anhydrous~^~BUTTER OIL,ANHYDROUS~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87
> ~01004~^~0100~^~Cheese, blue~^~CHEESE,BLUE~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87
> ~01005~^~0100~^~Cheese, brick~^~CHEESE,BRICK~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87
> ~01006~^~0100~^~Cheese, brie~^~CHEESE,BRIE~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87
> ~01007~^~0100~^~Cheese, camembert~^~CHEESE,CAMEMBERT~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87
> ~01008~^~0100~^~Cheese, caraway~^~CHEESE,CARAWAY~^~~^~~^~~^~~^0^~~^6.38^4.27^8.79^3.87
> ~01009~^~0100~^~Cheese, cheddar~^~CHEESE,CHEDDAR~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87
> ~01010~^~0100~^~Cheese, cheshire~^~CHEESE,CHESHIRE~^~~^~~^~~^~~^0^~~^6.38^4.27^8.79^3.87
> ~01011~^~0100~^~Cheese, colby~^~CHEESE,COLBY~^~~^~~^~Y~^~~^0^~~^6.38^4.27^8.79^3.87
>
> Is there an easy way to get this into PG, or a tool I can download for
> this, or do I need to parse it myself with a script or something?
> Thanks!
>
>
> --
> Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
> To make changes to your subscription:
> http://www.postgresql.org/mailpref/pgsql-general