Re: Load data from a csv file without using COPY - Mailing list pgsql-general

From Adrian Klaver
Subject Re: Load data from a csv file without using COPY
Msg-id 551a5b45-15d0-f6e4-821a-f2416e09ab6a@aklaver.com
In response to RE: Load data from a csv file without using COPY  (Kevin Brannen <KBrannen@efji.com>)
List pgsql-general
On 06/25/2018 09:17 AM, Kevin Brannen wrote:
> From: Ravi Krishna [mailto:srkrishna@yahoo.com]
> Sent: Tuesday, June 19, 2018 4:15 PM
> To: Steve Atkins <steve@blighty.com>
> Cc: PG mailing List <pgsql-general@lists.postgresql.org>
> Subject: Re: Load data from a csv file without using COPY
> 
>>
>> If performance is relevant then your app should probably be using COPY
>> protocol, not line by line inserts. It's supported by most postgresql
>> access libraries. If your app does that then using "\copy" from psql would be an appropriate benchmark.
> 
> Actually the reluctance to use COPY is to keep the benchmark the same across two different RDBMSs in two different environments.

That is not clear from this post:

https://www.postgresql.org/message-id/em2345975f-0c51-42dd-a35f-ff88715e8bbb%40ravi-lenovo


"3. We now have a catch-22 situation.  Should we spend time porting the
app to PG without first verifying
      that PG can perform as well as DB2. In other words, if some sort of
testing rules out PG as a good
      replacement for DB2, why even bother to port.  Of course that does
not prove conclusively that if PG
      passes the test, then it would mean that the app will work just as
fine.  But at least basic test will tell
     that we are not on a wrong path.
4. What I am planning is:
      4.a Get a set of large tables exported as a pipe delimited text
file.
      4.b Load them in both DB2 and PG on a similar h/w
      4.c  Run OLAP queries.

4.b is to test i/o. Our app is sensitive to the load times and some of
the tables are really wide.
4.c is to test maturity of PG in handling complex OLAP SQLs. From what I
have read, while PG
       optimizer is very good in handling OLTP, it is not, as yet, as good
in OLAP queries.

"

I read that as: can Postgres replace DB2 and not lose performance? That 
would, to me, mean using whatever works best to get the job done.
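
For comparison, driving the COPY protocol from a client program is not
much more code than a line-by-line loader. Here is a minimal sketch
using Perl's DBD::Pg; the connect string, table name (mytable), and
file name (data.csv) are placeholders, not anything from the thread:

#!/usr/bin/perl
# Sketch only: stream a CSV file through COPY via DBD::Pg.
use strict;
use warnings;
use DBI;

my $dbh = DBI->connect('dbi:Pg:dbname=testdb', '', '',
                       { RaiseError => 1, AutoCommit => 1 });

# Start a COPY ... FROM STDIN and feed it the raw CSV lines.
$dbh->do('COPY mytable FROM STDIN (FORMAT csv)');

open my $fh, '<', 'data.csv' or die "data.csv: $!";
while (my $line = <$fh>) {
    $dbh->pg_putcopydata($line);   # pass each CSV line through as-is
}
close $fh;

$dbh->pg_putcopyend();             # end the COPY; raises on error
$dbh->disconnect;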

> 
> ---
> 
> Seems like a ~10 line Perl program could handle this very easily. Use
> the Text::CSV module to make handling of the input easier. Prepare your
> insert statement, then once you've pulled each line of input in, execute
> the insert. For slightly better performance, I'd probably use
> transactions, add a counter, and commit every 10,000 rows (or something
> like that).
> 
> Once you have that working with 1 DB, you can copy your program, change
> the DBD driver and the connect statement to the other DB and try the
> other one. Unless you want to be really clever and make the same program
> do both and pick the DB by a command line switch. :)
> 
> HTH,
> Kevin
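
A minimal sketch of the loader Kevin describes, assuming Perl's DBI
with the DBD::Pg driver and a hypothetical table mytable(col1, col2,
col3); the connect string and file name are likewise placeholders:

#!/usr/bin/perl
# Sketch only: batched CSV loader along the lines Kevin suggests.
use strict;
use warnings;
use DBI;
use Text::CSV;

# Parse CSV robustly (binary => 1 copes with embedded quotes/newlines).
my $csv = Text::CSV->new({ binary => 1, auto_diag => 1 });

# AutoCommit => 0 so we control the transaction boundaries ourselves.
my $dbh = DBI->connect('dbi:Pg:dbname=testdb', '', '',
                       { RaiseError => 1, AutoCommit => 0 });

# One prepared statement, executed once per row.
my $sth = $dbh->prepare(
    'INSERT INTO mytable (col1, col2, col3) VALUES (?, ?, ?)');

open my $fh, '<', 'data.csv' or die "data.csv: $!";
my $count = 0;
while (my $row = $csv->getline($fh)) {
    $sth->execute(@$row);
    $dbh->commit if ++$count % 10_000 == 0;   # commit in batches
}
close $fh;

$dbh->commit;      # flush the final partial batch
$dbh->disconnect;

Pointing the same script at the other database would, as Kevin says,
mostly be a matter of swapping the connect string over to the other
DBD driver (e.g. DBD::DB2).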


-- 
Adrian Klaver
adrian.klaver@aklaver.com

