Re: [PERFORM] copy vs. C function - Mailing list pgsql-performance

From rajmhn
Subject Re: [PERFORM] copy vs. C function
Date
Msg-id 1483097752044-5936796.post@n3.nabble.com
Whole thread Raw
In response to copy vs. C function  (Jon Nelson <jnelson+pgsql@jamponi.net>)
List pgsql-performance
Hi Jon



This is exactly, what I was looking for. Need to read the data from
delimited file with no header, and do few transformation as described below
using Postgres C function and load it using pg_bulkload utility.



Transformation below, can be handled with query after loading all the data
as varchar and nullable. But we need to handle this before loading as like
we do in Oracle. I'm converting the code from Oracle to Postgres. Both
version of code(Oracle & Postgres) will be available for different users.



In Oracle, doing these kind of transformation in SQL loader. Need to follow
the same kind of approach in Postgres. SQL filter approach was very easy in
terms of coding. From documentation found, C filter was very much faster
than SQL.



I'm very much new to C. Looking for your options as you mentioned here in
post.Some standard syntax for writing these functionalities would be greatly
helpful.  Kindly help me.





    Sample Data:

    ABC|20170101|DEF ||GHIJ|KLM



    Target Table Definition:

    COLA numeric(5,0)

    COLB date

    COLC text

    COLD text

    COLE text



    First column should be mapped to COLA

    Second column should be mapped to COLB

    Third column should be mapped to COLD

    Fourth column should be mapped to COLC

    Fifth column should be mapped to Some default value(column is not
present in source)



    Transformation:

    a)First column should be mapped to COLA. It is numeric in target table.
If any alpha-characters were present, default this column with '0'.
Otherwise, source value should be moved to table.

    b)Second column should be mapped to COLB. TO_DATE function from text
format. File will have date format as YYYYMMDD. It should be converted to
date.

    c)Third column should be mapped to COLD.Need to Trim both leading and
trailing spaces.

    d)Fourth column should be mapped to COLC. If it NULL, some value should
be defaulted.

    e)Only few columns from source file should be loaded. In this case, only
first four columns should be loaded.

    f)Different ordering in source files & target columns.In this case,

            Third column should be mapped to COLD

            Fourth column should be mapped to COLC

    g)COLE should be loaded with default value. This column is not present
in source file.


Thanks



--
View this message in context: http://postgresql.nabble.com/copy-vs-C-function-tp5065298p5936796.html
Sent from the PostgreSQL - performance mailing list archive at Nabble.com.


pgsql-performance by date:

Previous
From: Daniel Blanch Bataller
Date:
Subject: Re: [PERFORM] Slow query after 9.3 to 9.6 migration
Next
From: Андрей Хозов
Date:
Subject: [PERFORM] Performance issue with castings args of the function