Thread: pgloader CSV to table column formatting
What is the required syntax to convert data fields and insert default column values when using pgloader to load a table from a csv file?
We are migrating some of our code from Oracle to Postgres and I am having difficulty finding documentation and examples for some the tasks I am trying to accomplish.
Currently, I am trying to import a csv files into a Postgres tables using pgloader and a config files. I have been able to import flat csv files into temporary tables containing all character data type columns then using SQL statements with data type conversions to load into production tables. I have had adequate success loading text fields into table columns with specific data types (other than character). Examples are text to date and the various numeric fields. I am also trying to locate documentation or examples to enter default values into columns for fields NOT in the incoming csv file.
NOTE: The fields NOT in the csv file are filled in by other fields or with sql updates. Example the sample_date is parsed into sample_date_year, sample_date_month, sample_date_day and sample_date_week. Only the year portion of a date is required for the sample_date field.
Below is a draft of what I am working on with examples of my test table, control file input file and system command.
TestTable:
(
submission_date date NOT NULL,
sample_date_year smallint NOT NULL,
sample_date_month smallint,
sample_date_day smallint,
sample_date_week smallint,
sample_key integer NOT NULL,
species character varying(2) NOT NULL,
sex character(1),
length smallint,
weight real,
record_origin character(1) NOT NULL
) ;
Control File:
LOAD CSV
FROM -
(submission_date
, sample_date
, species
, sex
, length
, weight)
INTO postgresql://dbname:xxxx543@localhost:NNNNNN/username?TestTable
( submission_date
, sample_date
, sample_date_year
, sample_date_month
, sample_date_day
, sample_date_week
, sample_key
, species
, sex
, length
, weight
, record_origin)
WITH
skip header = 1,
fields optionally enclosed by '"',
fields escaped by double-quote,
fields terminated by ','
SET client_encoding to 'latin1',
work_mem to '128MB',
standard_conforming_strings to 'on'
;
Input File
"submission_date", “sample_date”, ”species”, ”sex”, ”length”, ”weight”
"20150721",”20150101”,”01”,”F”,”99.99”,”9999”
"20150721",”201501”,”02”,”M”,”9.99”,”999”
"20150721",”2015”,”03”,,”.01”,”99”
"20150721",”2015”,”03”,”F”,,”99”
"20150721",”2015”,”03”,”M”,”.01”,
System Command:
system("pgloader --type csv --verbose --logfile $logfilepath/rc_load_err.log $bindirpath/data_load.pgl < $InfilePath/$Infilename");
In Oracle we used the following conversions however we are trying to migrate the routine to Postgres:
submission_date DATE “YYYYMMDD”,
recovery_date_month CONSTANT 0,
recovery_date_day CONSTANT 0,
recovery_date_week CONSTANT 0,
recovery_location_key CONSTANT 0.
record_origin CONSTANT ‘N’)
Thank you very much for you advice and assistance.
Dan Webb
Pacific States Marine Fisheries Commission
Analyst/Programmer – Regional Mark Processing Center
I noticed some errors in my post. In my attempt to edit and change some of my sentences from singular to plural, I did so, very poorly. I also intended to state: I have NOT had adequate success loading text fields from csv files into postgres table columns defined with specific data types (other than columns of character data types). I assume data type conversion and inserting default values in columns should be possible in the INTO section of my control file, however I have not found examples or documentation on how it is done. Very sorry for these poorly written sentences. Assistance, advice and/or examples would be greatly appreciated. Dan Webb -- View this message in context: http://postgresql.nabble.com/pgloader-CSV-to-table-column-formatting-tp5858753p5858758.html Sent from the PostgreSQL - novice mailing list archive at Nabble.com.