Thread: pgloader CSV to table column formatting

pgloader CSV to table column formatting

From

Dan Webb

Date:

21 July 2015, 17:15:54

What is the required syntax to convert data fields and insert default column values when using pgloader to load a table from a csv file?

We are migrating some of our code from Oracle to Postgres and I am having difficulty finding documentation and examples for some the tasks I am trying to accomplish.

Currently, I am trying to import a csv files into a Postgres tables using pgloader and a config files. I have been able to import flat csv files into temporary tables containing all character data type columns then using SQL statements with data type conversions to load into production tables. I have had adequate success loading text fields into table columns with specific data types (other than character). Examples are text to date and the various numeric fields. I am also trying to locate documentation or examples to enter default values into columns for fields NOT in the incoming csv file.

NOTE: The fields NOT in the csv file are filled in by other fields or with sql updates. Example the sample_date is parsed into sample_date_year, sample_date_month, sample_date_day and sample_date_week. Only the year portion of a date is required for the sample_date field.

Below is a draft of what I am working on with examples of my test table, control file input file and system command.

TestTable:

(

submission_date date NOT NULL,

sample_date_year smallint NOT NULL,

sample_date_month smallint,

sample_date_day smallint,

sample_date_week smallint,

sample_key integer NOT NULL,

species character varying(2) NOT NULL,

sex character(1),

length smallint,

weight real,

record_origin character(1) NOT NULL

) ;

Control File:

LOAD CSV

FROM -

(submission_date

, sample_date

, species

, sex

, length

, weight)

INTO postgresql://dbname:xxxx543@localhost:NNNNNN/username?TestTable

( submission_date

, sample_date

, sample_date_year

, sample_date_month

, sample_date_day

, sample_date_week

, sample_key

, species

, sex

, length

, weight

, record_origin)

WITH

skip header = 1,

fields optionally enclosed by '"',

fields escaped by double-quote,

fields terminated by ','

SET client_encoding to 'latin1',

work_mem to '128MB',

standard_conforming_strings to 'on'

;

Input File

"submission_date", “sample_date”, ”species”, ”sex”, ”length”, ”weight”

"20150721",”20150101”,”01”,”F”,”99.99”,”9999”

"20150721",”201501”,”02”,”M”,”9.99”,”999”

"20150721",”2015”,”03”,,”.01”,”99”

"20150721",”2015”,”03”,”F”,,”99”

"20150721",”2015”,”03”,”M”,”.01”,

System Command:

system("pgloader --type csv --verbose --logfile $logfilepath/rc_load_err.log $bindirpath/data_load.pgl < $InfilePath/$Infilename");

In Oracle we used the following conversions however we are trying to migrate the routine to Postgres:

submission_date DATE “YYYYMMDD”,

recovery_date_month CONSTANT 0,

recovery_date_day CONSTANT 0,

recovery_date_week CONSTANT 0,

recovery_location_key CONSTANT 0.

record_origin CONSTANT ‘N’)

Thank you very much for you advice and assistance.

Dan Webb

Pacific States Marine Fisheries Commission

Analyst/Programmer – Regional Mark Processing Center

Re: pgloader CSV to table column formatting

From

DWebb

Date:

21 July 2015, 17:43:07

I noticed some errors in my post.

In my attempt to edit and change some of my sentences from singular to
plural, I did so, very poorly.

I also intended to state:

I have NOT had adequate success loading text fields from csv files into
postgres table columns defined with specific data types (other than columns
of character data types).  I assume data type conversion and inserting
default values in columns should be possible in the INTO section of my
control file, however I have not found examples or documentation on how it
is done.

Very sorry for these poorly written sentences.

Assistance, advice and/or examples would be greatly appreciated.

Dan Webb



--
View this message in context:
http://postgresql.nabble.com/pgloader-CSV-to-table-column-formatting-tp5858753p5858758.html
Sent from the PostgreSQL - novice mailing list archive at Nabble.com.