BUG #15654: COPY command not working for 2gb CSV files

Started by PG Bug reporting form, about 7 years ago, 7 messages, bugs
#1 PG Bug reporting form
noreply@postgresql.org

The following bug has been logged on the website:

Bug reference: 15654
Logged by: Sandeep Kumar
Email address: sandeep.t.kumar@gmail.com
PostgreSQL version: 11.0
Operating system: Windows
Description:

Hi Team,

When I try to import data from a 2 GB CSV file, I get the following
error. Files smaller than 2 GB import without any issue. Please look
into this and share your thoughts.

Command I am using
-----------------------------
Copy table From '<Filename>.csv' DELIMITER '~' null as 'null' encoding
'windows-1251' CSV; select 1;

Error I am getting
------------------------
ERROR: could not stat file "<Filename>.csv": Unknown error
SQL state: XX000

Thanks
Sandeep

#2 David Rowley
dgrowleyml@gmail.com
In reply to: PG Bug reporting form (#1)
Re: BUG #15654: COPY command not working for 2gb CSV files

On Tue, 26 Feb 2019 at 00:35, PG Bug reporting form
<noreply@postgresql.org> wrote:

Command I am using
-----------------------------
Copy table From '<Filename>.csv' DELIMITER '~' null as 'null' encoding
'windows-1251' CSV; select 1;

Error I am getting
------------------------
ERROR: could not stat file "<Filename>.csv": Unknown error
SQL state: XX000

I can recreate that here. The error comes from the call to fstat() in
BeginCopyFrom().

Going by the Microsoft documentation, fstat() only has a 32-bit
file length type.

https://docs.microsoft.com/en-us/cpp/c-runtime-library/reference/fstat-fstat32-fstat64-fstati64-fstat32i64-fstat64i32?view=vs-2017
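
To illustrate why the failure would start at exactly 2 GB, here is a minimal sketch, assuming the length is held in a signed 32-bit field. The largest value such a field can hold is 2,147,483,647 bytes, one byte short of 2 GB. The helper name fits_in_32bit_length is hypothetical and exists only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Returns true if a file of `size` bytes can be represented in a signed
 * 32-bit length field. Hypothetical helper, for illustration only.
 */
static bool
fits_in_32bit_length(int64_t size)
{
    return size >= 0 && size <= INT32_MAX;
}
```

A file of 2,147,483,647 bytes passes this check; one of 2,147,483,648 bytes (2 GB) does not, which matches the report that smaller files load fine.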

Seems to work if I change the fstat() call to _fstati64() and change
the type of st to struct _stat64. Perhaps we need to wrap some macros
around these in port and have windows use the 64-bit versions.
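
A rough sketch of what such a wrapper might look like, not taken from any actual patch. The names file_stat64, file_fstat64, file_fileno and file_size64 are hypothetical and do not exist in the PostgreSQL tree. The sketch pairs _fstat64() with struct _stat64 so that the function and the structure always agree, rather than the _fstati64() call mentioned above. The Windows branch is an assumption and has not been verified here.

```c
#include <stdint.h>
#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

#ifdef _WIN32
#include <io.h>
/* Windows: use the variant whose st_size is 64 bits wide. */
typedef struct _stat64 file_stat64;
#define file_fstat64(fd, buf) _fstat64((fd), (buf))
#define file_fileno(fp) _fileno(fp)
#else
/* Elsewhere the plain call is assumed to report large sizes already. */
typedef struct stat file_stat64;
#define file_fstat64(fd, buf) fstat((fd), (buf))
#define file_fileno(fp) fileno(fp)
#endif

/* Returns the size in bytes of the open stream, or -1 on failure. */
static int64_t
file_size64(FILE *fp)
{
    file_stat64 st;

    if (file_fstat64(file_fileno(fp), &st) != 0)
        return -1;
    return (int64_t) st.st_size;
}
```

Callers would then declare a file_stat64 instead of struct stat wherever they need the size of a possibly large file.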

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#3 Michael Paquier
michael@paquier.xyz
In reply to: David Rowley (#2)
Re: BUG #15654: COPY command not working for 2gb CSV files

On Tue, Feb 26, 2019 at 03:42:40AM +1300, David Rowley wrote:

Seems to work if I change the fstat() call to _fstati64() and change
the type of st to struct _stat64. Perhaps we need to wrap some macros
around these in port and have windows use the 64-bit versions.

It is a bit more complicated than it sounds as stat() is already a
macro in the Windows port. Please see here:
/messages/by-id/df939c6f-2866-48b8-b3fe-5cbb54576a53@manitou-mail.org
/messages/by-id/1803D792815FC24D871C00D17AE95905CF5099@g01jpexmbkw24
--
Michael

#4 David Rowley
dgrowleyml@gmail.com
In reply to: Michael Paquier (#3)
Re: BUG #15654: COPY command not working for 2gb CSV files

On Tue, 26 Feb 2019 at 12:43, Michael Paquier <michael@paquier.xyz> wrote:

On Tue, Feb 26, 2019 at 03:42:40AM +1300, David Rowley wrote:

Seems to work if I change the fstat() call to _fstati64() and change
the type of st to struct _stat64. Perhaps we need to wrap some macros
around these in port and have windows use the 64-bit versions.

It is a bit more complicated than it sounds as stat() is already a
macro in the Windows port. Please see here:
/messages/by-id/df939c6f-2866-48b8-b3fe-5cbb54576a53@manitou-mail.org
/messages/by-id/1803D792815FC24D871C00D17AE95905CF5099@g01jpexmbkw24

hmm, but we're talking about fstat() not stat(). Perhaps it suffers
from the same issue, but there does not appear to be a macro for
fstat() in win32_port.h therefore likely involves a less complex fix.

--
David Rowley http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services

#5 Michael Paquier
michael@paquier.xyz
In reply to: David Rowley (#4)
Re: BUG #15654: COPY command not working for 2gb CSV files

On Tue, Feb 26, 2019 at 12:52:58PM +1300, David Rowley wrote:

hmm, but we're talking about fstat() not stat(). Perhaps it suffers
from the same issue, but there does not appear to be a macro for
fstat() in win32_port.h therefore likely involves a less complex fix.

I thought that was the case, and double-checking pgwin32_safestat()
only maps to stat().

Windows has the bad idea to declare _stat, and put the rest of the
return results of the different calls of stat() and fstat() into
different structures.

Anyway, if I recall correctly, you are still going to run into issues
if trying to map _stat64 to "struct stat". I have played with this
problem for a couple of hours, and this did not finish well because of
the define of stat to pgwin32_safestat in port.h. And we likely don't
want to have a dedicated pg_stat struct in the full code tree as
that's spread to a lot of places.
--
Michael

#6 sandy kumar
sandeep.t.kumar@gmail.com
In reply to: Michael Paquier (#5)
Re: BUG #15654: COPY command not working for 2gb CSV files

Thanks Michael and David for the information, is there any workaround for
this issue?

Thanks
Sandeep


#7 Michael Paquier
michael@paquier.xyz
In reply to: sandy kumar (#6)
Re: BUG #15654: COPY command not working for 2gb CSV files

On Tue, Feb 26, 2019 at 09:48:11AM +0530, sandy kumar wrote:

Thanks Michael and David for the information, is there any workaround for
this issue?

Splitting the file into multiple pieces is the first thing I can think
of. COPY does not really offer an option to bypass the code involved.
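
As a minimal sketch of that splitting approach, something along these lines could work. It assumes the CSV uses the default double-quote character, so that a newline inside a quoted field is not treated as the end of a row. The function name split_csv and the "<prefix>.<n>" naming of the pieces are hypothetical. It only reads the input sequentially and never asks for the file size.

```c
#include <stdio.h>

/*
 * Split the CSV file at `path` into pieces named "<prefix>.<n>"
 * (n = 0, 1, ...), each holding at most `rows_per_piece` rows.
 * Returns the number of pieces written, or -1 on error.
 */
static int
split_csv(const char *path, const char *prefix, long rows_per_piece)
{
    FILE *in = fopen(path, "rb");
    FILE *out = NULL;
    char name[1024];
    int pieces = 0;
    int in_quotes = 0;
    int failed;
    long rows = 0;
    int c;

    if (in == NULL || rows_per_piece <= 0)
    {
        if (in != NULL)
            fclose(in);
        return -1;
    }

    while ((c = getc(in)) != EOF)
    {
        if (out == NULL)
        {
            /* Open the next piece lazily, only once there is data for it. */
            snprintf(name, sizeof(name), "%s.%d", prefix, pieces);
            out = fopen(name, "wb");
            if (out == NULL)
            {
                fclose(in);
                return -1;
            }
            pieces++;
        }
        if (putc(c, out) == EOF)
        {
            fclose(out);
            fclose(in);
            return -1;
        }

        if (c == '"')
            in_quotes = !in_quotes;     /* doubled quotes toggle twice */
        else if (c == '\n' && !in_quotes && ++rows == rows_per_piece)
        {
            /* Row limit reached: finish this piece. */
            if (fclose(out) != 0)
            {
                fclose(in);
                return -1;
            }
            out = NULL;
            rows = 0;
        }
    }

    failed = ferror(in);
    if (out != NULL && fclose(out) != 0)
        failed = 1;
    fclose(in);
    return failed ? -1 : pieces;
}
```

Each piece can then be loaded with the same COPY options as before, provided rows_per_piece is chosen so that every piece stays below 2 GB.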
--
Michael