Re: COPY from STDIN vs file with large CSVs - Mailing list pgsql-admin

From Wells Oliver
Subject Re: COPY from STDIN vs file with large CSVs
Msg-id CAOC+FBVp=DA0++VT9VaF0azd1_w4GcT2WP6zFNsZ+X0tMzfRug@mail.gmail.com
In response to Re: COPY from STDIN vs file with large CSVs  (Ron <ronljohnsonjr@gmail.com>)
List pgsql-admin
Thanks, I had looked into pg_bulkload a bit, but it does not seem to be available for PG 12: it's not in the extension directory, it's not available through apt-cache search, and when I try to build it from source I have all kinds of issues with it finding pgcommon and pgport. This is on Ubuntu 18 LTS...
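For what it's worth, the usual cause of a source build failing to find pgcommon and pgport is a missing server development package, or a pg_config that points at the wrong installation. A rough sketch of the fix on Ubuntu 18.04 with the PGDG apt repository (package and path names assume that setup):

    # Assumes the PGDG apt repository is configured and PG 12 is installed
    sudo apt-get install postgresql-server-dev-12 build-essential
    export PATH=/usr/lib/postgresql/12/bin:$PATH  # so make finds the right pg_config
    make
    sudo make install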

On Wed, Jan 8, 2020 at 9:09 AM Ron <ronljohnsonjr@gmail.com> wrote:
On 1/8/20 10:54 AM, Wells Oliver wrote:
> I have a CSV that's ~30GB, some 400M rows. Would there be a meaningful
> performance difference between running COPY FROM STDIN using: cat f.csv |
> psql "COPY .. FROM STDIN WITH CSV" and just doing "COPY ... FROM 'f.csv'
> WITH CSV"?
>
> Thanks. It took about four hours to copy one and I felt that was a little
> much.

catting the file starts another process and opens a pipe. That can't be
faster than "COPY ... FROM ... WITH CSV".
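For reference, the invocations under discussion look roughly like this (mytable and the database name are placeholders). The difference that matters more than the extra process is where the file is read: the server-side form reads it directly in the backend, but on PG 12 that requires superuser or membership in pg_read_server_files, while the STDIN and \copy forms stream the data from the client over the connection:

    # Client-side: data streams over the connection
    cat f.csv | psql -d mydb -c "COPY mytable FROM STDIN WITH (FORMAT csv)"
    psql -d mydb -c "\copy mytable FROM 'f.csv' WITH (FORMAT csv)"

    # Server-side: the backend reads the file itself; f.csv must be on the
    # database server and readable by the postgres user
    psql -d mydb -c "COPY mytable FROM '/path/to/f.csv' WITH (FORMAT csv)"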

pg_bulkload (which might be in your repository) is probably what you really
want.
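If pg_bulkload does install, usage is roughly as follows. This is a sketch based on the project's sample control file; the table name, paths, and database are placeholders, and the exact keys should be checked against the pg_bulkload documentation:

    # load.ctl -- control file, keys as in pg_bulkload's sample_csv.ctl
    OUTPUT = mytable           # [schema.]table to load into
    INPUT = /path/to/f.csv     # absolute path to the input file
    TYPE = CSV                 # input format
    DELIMITER = ","            # field delimiter

    # Run the load; the default DIRECT writer writes data files directly,
    # bypassing shared buffers, which is where the speedup comes from
    pg_bulkload -d mydb load.ctl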

--
Angular momentum makes the world go 'round.



