Re: Load data from a csv file without using COPY - Mailing list pgsql-general

From Adrian Klaver
Subject Re: Load data from a csv file without using COPY
Msg-id 551a5b45-15d0-f6e4-821a-f2416e09ab6a@aklaver.com
In response to RE: Load data from a csv file without using COPY  (Kevin Brannen <KBrannen@efji.com>)
List pgsql-general
On 06/25/2018 09:17 AM, Kevin Brannen wrote:
> From: Ravi Krishna [mailto:srkrishna@yahoo.com]
> Sent: Tuesday, June 19, 2018 4:15 PM
> To: Steve Atkins <steve@blighty.com>
> Cc: PG mailing List <pgsql-general@lists.postgresql.org>
> Subject: Re: Load data from a csv file without using COPY
> 
>>
>> If performance is relevant then your app should probably be using COPY
>> protocol, not line by line inserts. It's supported by most postgresql
>> access libraries. If your app does that then using "\copy" from psql would be an appropriate benchmark.
> 
> Actually the reluctance to use COPY is to keep the benchmark the same across two different RDBMSs in two different environments.

That is not clear from this post:

https://www.postgresql.org/message-id/em2345975f-0c51-42dd-a35f-ff88715e8bbb%40ravi-lenovo


"3. We now have a catch-22 situation.  Should we spend time porting the
app to PG without first verifying
      that PG can perform as well as DB2. In other words, if some sort of
testing rules out PG as a good
      replacement for DB2, why even bother to port.  Of course that does
not prove conclusively that if PG
      passes the test, then it would mean that the app will work just as
fine.  But at least basic test will tell
     that we are not on a wrong path.
4. What I am planning is:
      4.a Get a set of large tables exported as a pipe delimited text
file.
      4.b Load them in both DB2 and PG on a similar h/w
      4.c  Run OLAP queries.

4.b is to test i/o. Our app is sensitive to the load times and some of
the tables are really wide.
4.c is to test maturity of PG in handling complex OLAP SQLs. From what I
have read, while PG
       optimizer is very good in handling OLTP, it is not, as yet, as good
in OLAP queries.

"

I read that as: can Postgres replace DB2 and not lose performance? That 
would, to me, mean using whatever works best to get the job done.
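
For comparison, driving the COPY protocol from a client program is not
much more code than a line-by-line loader. Here is a minimal sketch
using Perl's DBD::Pg; the connect string, table name (mytable), and
file name (data.csv) are placeholders, not anything from the thread:

#!/usr/bin/perl
# Sketch only: stream a CSV file through COPY via DBD::Pg.
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('dbi:Pg:dbname=testdb', '', '',
                       { RaiseError => 1, AutoCommit => 1 });

# Start a COPY ... FROM STDIN and feed it the raw CSV lines.
$dbh->do('COPY mytable FROM STDIN (FORMAT csv)');

open my $fh, '<', 'data.csv' or die "data.csv: $!";
while (my $line = <$fh>) {
    $dbh->pg_putcopydata($line);   # pass each CSV line through as-is
}
close $fh;

$dbh->pg_putcopyend();             # end the COPY; raises on error
$dbh->disconnect;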

> 
> ---
> 
> Seems like a ~10 line Perl program could handle this very easily. Use
> the Text::CSV module to make handling of the input easier. Prepare your
> insert statement, then once you've pulled each line of input in, execute
> the insert. For slightly better performance, I'd probably use
> transactions, add a counter, and commit every 10,000 rows (or something
> like that).
> 
> Once you have that working with 1 DB, you can copy your program, change
> the DBD driver and the connect statement to the other DB and try the
> other one. Unless you want to be really clever and make the same program
> do both and pick the DB by a command line switch. :)
> 
> HTH,
> Kevin
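
A minimal sketch of the loader Kevin describes, assuming Perl's DBI
with the DBD::Pg driver and a hypothetical table mytable(col1, col2,
col3); the connect string and file name are likewise placeholders:

#!/usr/bin/perl
# Sketch only: batched CSV loader along the lines Kevin suggests.
use strict;
use warnings;
use DBI;
use Text::CSV;

# Parse CSV robustly (binary => 1 copes with embedded quotes/newlines).
my $csv = Text::CSV->new({ binary => 1, auto_diag => 1 });

# AutoCommit => 0 so we control the transaction boundaries ourselves.
my $dbh = DBI->connect('dbi:Pg:dbname=testdb', '', '',
                       { RaiseError => 1, AutoCommit => 0 });

# One prepared statement, executed once per row.
my $sth = $dbh->prepare(
    'INSERT INTO mytable (col1, col2, col3) VALUES (?, ?, ?)');

open my $fh, '<', 'data.csv' or die "data.csv: $!";
my $count = 0;
while (my $row = $csv->getline($fh)) {
    $sth->execute(@$row);
    $dbh->commit if ++$count % 10_000 == 0;   # commit in batches
}
close $fh;

$dbh->commit;      # flush the final partial batch
$dbh->disconnect;

Pointing the same script at the other database would, as Kevin says,
mostly be a matter of swapping the connect string over to the other
DBD driver (e.g. DBD::DB2).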


-- 
Adrian Klaver
adrian.klaver@aklaver.com

