Thread: BUG #15654: COPY command not working for 2gb CSV files

BUG #15654: COPY command not working for 2gb CSV files

From
PG Bug reporting form
Date:
The following bug has been logged on the website:

Bug reference:      15654
Logged by:          Sandeep Kumar
Email address:      sandeep.t.kumar@gmail.com
PostgreSQL version: 11.0
Operating system:   Windows
Description:

Hi Team,

When I try to import data from a 2 GB CSV file, I get the following
error. I have observed that files smaller than 2 GB import without any
issue. Please look into this and provide your inputs.

Command I am using
-----------------------------
Copy table From '<Filename>.csv' DELIMITER '~' null as 'null'  encoding
'windows-1251' CSV; select 1; 

Error I am getting
------------------------
ERROR:  could not stat file "<Filename>.csv": Unknown error
SQL state: XX000

Thanks
Sandeep


Re: BUG #15654: COPY command not working for 2gb CSV files

From
David Rowley
Date:
On Tue, 26 Feb 2019 at 00:35, PG Bug reporting form
<noreply@postgresql.org> wrote:
> Command I am using
> -----------------------------
> Copy table From '<Filename>.csv' DELIMITER '~' null as 'null'  encoding
> 'windows-1251' CSV; select 1;
>
> Error I am getting
> ------------------------
> ERROR:  could not stat file "<Filename>.csv": Unknown error
> SQL state: XX000

I can recreate that here.  The error comes from the call to fstat() in
BeginCopyFrom().

Going by the Microsoft documentation, fstat() uses only a 32-bit type
for the file length.


https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/fstat-fstat32-fstat64-fstati64-fstat32i64-fstat64i32?view=vs-2017
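
To illustrate why a file of 2 GB or more would not fit in a 32-bit length field, here is a minimal sketch in Python. It is not PostgreSQL code, and it assumes the length is stored as a signed 32-bit integer; the helper name fits_in_int32 is hypothetical and exists only for this illustration.

```python
import struct

INT32_MAX = 2**31 - 1  # largest value a signed 32-bit integer can hold


def fits_in_int32(size_in_bytes: int) -> bool:
    """Return True if the size can be stored in a signed 32-bit field."""
    try:
        struct.pack("<i", size_in_bytes)  # "<i" is a 4-byte signed integer
        return True
    except struct.error:
        return False


two_gib = 2 * 1024**3  # 2147483648 bytes, one more than INT32_MAX
print(fits_in_int32(two_gib - 1))  # True
print(fits_in_int32(two_gib))      # False
```

Under that assumption, a file of exactly 2 GiB is already one byte past what the field can represent, which matches the report that files under 2 GB load fine.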

Seems to work if I change the fstat() call to _fstati64() and change
the type of st to struct _stat64. Perhaps we need to wrap some macros
around these in port and have Windows use the 64-bit versions.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: BUG #15654: COPY command not working for 2gb CSV files

From
Michael Paquier
Date:
On Tue, Feb 26, 2019 at 03:42:40AM +1300, David Rowley wrote:
> Seems to work if I change the fstat() call to _fstati64() and change
> the type of st to struct _stat64. Perhaps we need to wrap some macros
> around these in port and have windows use the 64-bit versions.

It is a bit more complicated than it sounds as stat() is already a
macro in the Windows port.  Please see here:
https://www.postgresql.org/message-id/flat/df939c6f-2866-48b8-b3fe-5cbb54576a53%40manitou-mail.org
https://www.postgresql.org/message-id/1803D792815FC24D871C00D17AE95905CF5099@g01jpexmbkw24
--
Michael

Re: BUG #15654: COPY command not working for 2gb CSV files

From
David Rowley
Date:
On Tue, 26 Feb 2019 at 12:43, Michael Paquier <michael@paquier.xyz> wrote:
>
> On Tue, Feb 26, 2019 at 03:42:40AM +1300, David Rowley wrote:
> > Seems to work if I change the fstat() call to _fstati64() and change
> > the type of st to struct _stat64. Perhaps we need to wrap some macros
> > around these in port and have windows use the 64-bit versions.
>
> It is a bit more complicated than it sounds as stat() is already a
> macro in the Windows port.  Please see here:
> https://www.postgresql.org/message-id/flat/df939c6f-2866-48b8-b3fe-5cbb54576a53%40manitou-mail.org
> https://www.postgresql.org/message-id/1803D792815FC24D871C00D17AE95905CF5099@g01jpexmbkw24

Hmm, but we're talking about fstat(), not stat().  Perhaps it suffers
from the same issue, but there does not appear to be a macro for
fstat() in win32_port.h, so it likely involves a less complex fix.

-- 
 David Rowley                   http://www.2ndQuadrant.com/
 PostgreSQL Development, 24x7 Support, Training & Services


Re: BUG #15654: COPY command not working for 2gb CSV files

From
Michael Paquier
Date:
On Tue, Feb 26, 2019 at 12:52:58PM +1300, David Rowley wrote:
> hmm, but we're talking about fstat() not stat().  Perhaps it suffers
> from the same issue, but there does not appear to be a macro for
> fstat() in win32_port.h therefore likely involves a less complex fix.

I thought that was the case, and after double-checking,
pgwin32_safestat() only maps to stat().

Windows has the bad idea of declaring _stat and then putting the
results of the other stat() and fstat() variants into different
structures.

Anyway, if I recall correctly, you are still going to run into issues
if you try to map _stat64 to "struct stat".  I have played with this
problem for a couple of hours, and it did not end well because stat is
defined as pgwin32_safestat in port.h.  And we likely don't want to
have a dedicated pg_stat struct in the full code tree, as that's
spread to a lot of places.
--
Michael

Re: BUG #15654: COPY command not working for 2gb CSV files

From
sandy kumar
Date:
Thanks, Michael and David, for the information. Is there any workaround for this issue?

Thanks
Sandeep

Re: BUG #15654: COPY command not working for 2gb CSV files

From
Michael Paquier
Date:
On Tue, Feb 26, 2019 at 09:48:11AM +0530, sandy kumar wrote:
> Thanks Michael and David for the information, is there any workaround for
> this issue?

Splitting the file into multiple pieces is the first thing I can think
of.  COPY does not really offer an option to bypass the code involved.
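
A minimal sketch of that splitting approach, in Python, might look like the following. It is only an illustration under stated assumptions: the function name split_file_by_lines, the ".partNNNN" naming, and the size limit are all hypothetical, and it assumes no CSV field contains an embedded line break (a quoted multi-line field would be cut in the middle of a record) and that the file has no header row.

```python
import os


def split_file_by_lines(path, max_bytes, out_dir):
    """Split `path` into numbered pieces of at most about `max_bytes` each.

    Pieces are cut only at line ends. A single line longer than
    `max_bytes` is still written whole, into a piece of its own.
    Returns the list of piece paths in order.
    """
    os.makedirs(out_dir, exist_ok=True)
    base = os.path.basename(path)
    pieces = []
    out = None
    written = 0
    with open(path, "rb") as src:  # binary mode: bytes are copied unchanged
        for line in src:
            # Start a new piece when this line would push us past the limit.
            if out is None or (written > 0 and written + len(line) > max_bytes):
                if out is not None:
                    out.close()
                piece = os.path.join(out_dir, f"{base}.part{len(pieces):04d}")
                pieces.append(piece)
                out = open(piece, "wb")
                written = 0
            out.write(line)
            written += len(line)
    if out is not None:
        out.close()
    return pieces
```

With max_bytes set comfortably below 2 GB (for example 1024**3), each resulting piece can then be loaded with the same COPY command, one piece at a time. Because the file is read and written as raw bytes, the windows-1251 encoding and the '~' delimiter are left untouched.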
--
Michael
