COPY: row is too big

Started by vod vos · 9 years ago · 34 messages · general
#1 vod vos
vodvos@zoho.com

Hi everyone,

My postgresql is 9.6.1.

When I copy data from a csv file with very long values for many columns (about 1100 columns), this error appears:

ERROR: row is too big: size 11808, maximum size 8160
CONTEXT: COPY rius, line 2

rius is the table.

I have searched the mailing list, but it seems no solution has been found.

Thanks.

#2 John McKown
john.archie.mckown@gmail.com
In reply to: vod vos (#1)
Re: COPY: row is too big

On Mon, Jan 2, 2017 at 5:11 AM, vod vos <vodvos@zoho.com> wrote:

> My postgresql is 9.6.1. When I copy data from a csv file with very long
> values for many columns (about 1100 columns), this error appears:
>
> ERROR: row is too big: size 11808, maximum size 8160
> CONTEXT: COPY rius, line 2

​I looked in the source code. That message _seems_ to be coming from the
file src/backend/access/heap/hio.c and relates to MaxHeapTupleSize. This is
set, indirectly, from the BLCKSZ value chosen by "configure" when
PostgreSQL was originally compiled (8160 is 8192 minus 32 bytes of aligned
page header and line pointer). That is, this is a "hard coded" limit which
can only be overridden by recompiling PostgreSQL yourself from source.
Apparently whoever did the PostgreSQL compilation setup took the default
BLCKSZ of 8192. So there is no solution other than "do it yourself" by
getting the PostgreSQL source code and configuring it yourself. I can give
you the first step. You can get the PostgreSQL source one of two ways. You
can go here: https://www.postgresql.org/ftp/source/v9.6.1/ and download the
proper file. Or, if you have and know "git", you can enter the command:
git clone git://git.postgresql.org/git/postgresql.git
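A rebuild with a larger block size might look roughly like this on Linux (a sketch only: the --with-blocksize value is in kilobytes, 16 is shown purely as an example, the install prefix is made up, and a freshly built server needs its own initdb, since data directories are tied to the block size they were built with):

```shell
# Fetch and unpack the 9.6.1 source
wget https://ftp.postgresql.org/pub/source/v9.6.1/postgresql-9.6.1.tar.bz2
tar xjf postgresql-9.6.1.tar.bz2
cd postgresql-9.6.1

# BLCKSZ is set indirectly via --with-blocksize (in kB, default 8)
./configure --with-blocksize=16 --prefix=/usr/local/pgsql16k
make
make install
```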

Oh, I assumed (bad me!) that you're running on Linux. I know _nothing_
about how to do the above on Windows.

I am not a PostgreSQL guru. Perhaps I made a stupid mistake in my analysis
and the truly knowledgeable will have a better answer for you.

--
There’s no obfuscated Perl contest because it’s pointless.

—Jeff Polk

Maranatha! <><
John McKown

#3 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: vod vos (#1)
Re: COPY: row is too big

On 01/02/2017 03:11 AM, vod vos wrote:

Hi everyone,

My postgresql is 9.61.

When I copy data from a csv file with very long values for many columns
(about 1100 columns), this error appears:

My guess is you are tripping this:

https://www.postgresql.org/about/
Maximum Columns per Table 250 - 1600 depending on column types

So what are you storing in table rius, and can you give a general idea of
its schema? Not all 1100 columns, just a sampling of the data types involved.

Also what is the COPY command you are using?


--
Adrian Klaver
adrian.klaver@aklaver.com

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#4 vod vos
vodvos@zoho.com
In reply to: vod vos (#1)
Re: COPY: row is too big

Most of the data types are text or varchar, and I use:

COPY rius FROM "/var/www/test/aa.csv" WITH DELIMITER ';' ;

And some of the values in the csv file are null; do these null values matter?

Thanks.


#5 Tom Lane
tgl@sss.pgh.pa.us
In reply to: vod vos (#1)
Re: COPY: row is too big

vod vos <vodvos@zoho.com> writes:

When I copy data from a csv file with very long values for many columns (about 1100 columns), this error appears:
ERROR: row is too big: size 11808, maximum size 8160

You need to rethink your table schema so you have fewer columns.
Perhaps you can combine some of them into arrays, for example.
JSON might be a useful option, too.

regards, tom lane
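A minimal sketch of what that suggestion could look like (hypothetical column names, not the OP's actual schema): a single array or jsonb value can be compressed and moved out-of-line by TOAST, which 1100 separate short varchar columns cannot.

```sql
-- Hypothetical: collapse groups of related columns into one value each
CREATE TABLE rius (
    id       serial PRIMARY KEY,
    label    text,
    readings float8[],  -- e.g. hundreds of measurements in one array
    attrs    jsonb      -- or one JSON document keyed by the old column names
);

INSERT INTO rius (label, readings, attrs)
VALUES ('line 2',
        ARRAY[1.5, 2.7, 3.1],
        '{"col_a": "x", "col_b": "2017-01-02"}');

-- Individual values are still reachable:
SELECT readings[2], attrs->>'col_a' FROM rius;
```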


#6 vod vos
vodvos@zoho.com
In reply to: Tom Lane (#5)
Re: COPY: row is too big

You know, the csv file was exported from another database on another machine, so I really don't want to restructure it; that would be hard work. Every csv file contains headers and values. If I redesign the table, then I have to cut all the csv files into pieces one by one.


#7 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: vod vos (#6)
Re: COPY: row is too big

On 01/02/2017 09:03 AM, vod vos wrote:

You know, the csv file was exported from another database on another
machine, so I really don't want to restructure it; that would be hard work.
Every csv file contains headers and values. If I redesign the table, then I
have to cut all the csv files into pieces one by one.

If it helps:

http://csvkit.readthedocs.io/en/latest/tutorial/1_getting_started.html#csvcut-data-scalpel


--
Adrian Klaver
adrian.klaver@aklaver.com


#8 Rob Sargent
robjsargent@gmail.com
In reply to: Adrian Klaver (#7)
Re: COPY: row is too big


Perhaps this is your opportunity to correct someone else's mistake. You need to show the table definition to convince us that it cannot be improved. That it may be hard work really doesn't mean it's not the right path.


#9 John McKown
john.archie.mckown@gmail.com
In reply to: Rob Sargent (#8)
Re: COPY: row is too big

On Mon, Jan 2, 2017 at 2:57 PM, Rob Sargent <robjsargent@gmail.com> wrote:

Perhaps this is your opportunity to correct someone else's mistake. You
need to show the table definition to convince us that it cannot be
improved. That it may be hard work really doesn't mean it's not the right
path.

​This may not be possible. The data might be coming in from an external
source. I imagine you've run into the old "well, _we_ don't have any
problems, so it must be on your end!" scenario.

Example: we receive CSV files from an external source. These files are
_supposed_ to be validated. But we have often received files where NOT NULL
fields have "nothing" in them. E.g. a customer bill which has
_everything_ in it _except_ the customer number (or an invalid one such as
"123{"); or missing some other vital piece of information.

In this particular case, the OP might want to do what we did in a similar
case. We had way too many columns in a table. The performance was horrible.
We did an analysis and, as usual, the majority of the selects were for a
subset of the columns, about 15% of the total. We "split" the table into
the "high use" columns table & the "low use" columns table. We then used
triggers to make sure that if we added a new / deleted an old row from one
table, the corresponding row in the other was created / deleted.
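A sketch of that kind of split (hypothetical table and column names; only the delete/insert bookkeeping John describes is shown, with one trigger on the high-use table):

```sql
-- Hypothetical: the ~15% "hot" columns in one table, the rest in another
CREATE TABLE t_high (id integer PRIMARY KEY, a text, b text);
CREATE TABLE t_low  (id integer PRIMARY KEY, c text, d text);

-- Trigger function that keeps the low-use table in step with the high-use one
CREATE FUNCTION sync_low() RETURNS trigger AS $$
BEGIN
    IF TG_OP = 'INSERT' THEN
        INSERT INTO t_low (id) VALUES (NEW.id);   -- create the matching row
        RETURN NEW;
    ELSE  -- DELETE
        DELETE FROM t_low WHERE id = OLD.id;      -- drop the matching row
        RETURN OLD;
    END IF;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER t_high_sync
    AFTER INSERT OR DELETE ON t_high
    FOR EACH ROW EXECUTE PROCEDURE sync_low();
```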


--
There’s no obfuscated Perl contest because it’s pointless.

—Jeff Polk

Maranatha! <><
John McKown

#10 vod vos
vodvos@zoho.com
In reply to: John McKown (#9)
Re: COPY: row is too big

Now I am confused: I can create 1100 columns in a table in postgresql, but I can't copy 1100 values into the table. And I really don't want to split the csv file into pieces, to avoid mistakes after this action.

I created a table with 1100 columns with data type varchar, hoping the COPY command would automatically convert the csv data, which contains some character and date values but is mostly numeric.

I use the command: COPY rius FROM "/var/www/test/test.csv" WITH DELIMITER ';' ;

Then it shows:

ERROR: row is too big: size 11808, maximum size 8160


#11 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: vod vos (#10)
Re: COPY: row is too big

On 01/04/2017 05:00 AM, vod vos wrote:

Now I am confused: I can create 1100 columns in a table in
postgresql, but I can't copy 1100 values into the table. And I really

As pointed out previously:

https://www.postgresql.org/about/
Maximum Columns per Table 250 - 1600 depending on column types

That limit depends on both the number of columns and the actual data
in the columns. Empty columns are not the problem; it is when you start
filling them that you get the error.

dont want to split the csv file to pieces to avoid mistakes after this
action.

I created a table with 1100 columns with data type varchar, hoping
the COPY command would automatically convert the csv data that contains some

I am afraid the solution is going to require more than hope. You are
going to need to break the data up. I suspect that just splitting it
in half would do the trick. So:

Table 1
column 1 for a primary key (assuming the first column of your present data)
columns 2-550

Table 2
column 1 for a primary key (assuming the first column of your present data)
columns 551-1100

Using the program I mentioned previously:

http://csvkit.readthedocs.io/en/1.0.1/scripts/csvcut.html

That translates into:

csvcut -c 1,2-550 your_big.csv > table_1.csv

csvcut -c 1,551-1100 your_big.csv > table_2.csv
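If csvkit is not installed, plain cut can make the same column split for a simple ';'-delimited file (a sketch with a made-up five-column file; unlike csvcut, cut does not understand quoted fields that contain the delimiter):

```shell
# A tiny 5-column stand-in for the OP's file, using the ';' delimiter
printf 'id;a;b;c;d\n1;w;x;y;z\n' > your_big.csv

# Column 1 (the key) goes into both halves
cut -d';' -f1-3  your_big.csv > table_1.csv
cut -d';' -f1,4-5 your_big.csv > table_2.csv

cat table_1.csv   # header id;a;b, then 1;w;x
cat table_2.csv   # header id;c;d, then 1;y;z
```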

character and date, most of which are numeric.

Is this a different data set?
Previously you said:
"Most of the data types are text or varchar, ..."

I use the command: COPY rius FROM "/var/www/test/test.csv" WITH
DELIMITER ';' ;

Then it shows:

ERROR: row is too big: size 11808, maximum size 8160


--
Adrian Klaver
adrian.klaver@aklaver.com


#12 Pavel Stehule
pavel.stehule@gmail.com
In reply to: vod vos (#10)
Re: COPY: row is too big

Hi

2017-01-04 14:00 GMT+01:00 vod vos <vodvos@zoho.com>:

Now I am confused about I can create 1100 columns in a table in
postgresql, but I can't copy 1100 values into the table. And I really dont
want to split the csv file to pieces to avoid mistakes after this action.

The PostgreSQL limit is "Maximum Columns per Table: 250 - 1600 depending on
column types" - this limit comes from having to fit the values, or pointers
to values, for one row on a single page (8 kB).

You can hit this limit not at CREATE TABLE time, but at INSERT time.

I create a table with 1100 columns with data type of varchar, and hope the
COPY command will auto transfer the csv data that contains some character
and date, most of which are numeric.

Numeric is an expensive type - try to use float instead, maybe double precision.

Regards

Pavel



#13 Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Pavel Stehule (#12)
Re: COPY: row is too big

On 01/04/2017 06:54 AM, Pavel Stehule wrote:

Hi

2017-01-04 14:00 GMT+01:00 vod vos <vodvos@zoho.com
<mailto:vodvos@zoho.com>>:

Now I am confused about I can create 1100 columns in a table in
postgresql, but I can't copy 1100 values into the table. And I
really dont want to split the csv file to pieces to avoid mistakes
after this action.

The PostgreSQL limit is "Maximum Columns per Table: 250 - 1600 depending
on column types" - this limit comes from having to fit the values, or
pointers to values, for one row on a single page (8 kB).

You can hit this limit not at CREATE TABLE time, but at INSERT time.

I create a table with 1100 columns with data type of varchar, and
hope the COPY command will auto transfer the csv data that contains
some character and date, most of which are numeric.

Numeric is expensive type - try to use float instead, maybe double.

If I am following the OP correctly, the table itself has all the columns
declared as varchar. The data in the CSV file is a mix of text, date and
numeric, presumably cast to text on entry into the table.


--
Adrian Klaver
adrian.klaver@aklaver.com


#14 Pavel Stehule
pavel.stehule@gmail.com
In reply to: Adrian Klaver (#13)
Re: COPY: row is too big

2017-01-04 16:11 GMT+01:00 Adrian Klaver <adrian.klaver@aklaver.com>:


If I am following the OP correctly, the table itself has all the columns
declared as varchar. The data in the CSV file is a mix of text, date and
numeric, presumably cast to text on entry into the table.

Table column types are important - Postgres enforces the necessary
transformations.

Regards

Pavel


#15 vod vos
vodvos@zoho.com
In reply to: Adrian Klaver (#11)
Re: COPY: row is too big

OK, maybe the final solution is to split it in half.


#16 Peter J. Holzer
hjp-pgsql@hjp.at
In reply to: Adrian Klaver (#11)
Re: COPY: row is too big

On 2017-01-04 06:53:31 -0800, Adrian Klaver wrote:

On 01/04/2017 05:00 AM, vod vos wrote:

Now I am confused about I can create 1100 columns in a table in
postgresql, but I can't copy 1100 values into the table. And I really

As pointed out previously:

https://www.postgresql.org/about/
Maximum Columns per Table 250 - 1600 depending on column types

That being dependent on both the number of columns and the actual data in
the columns.

I think this is confusingly phrased. In my mind "column type" is static
- the type is the same, independent of the values which are stored. So
"250 - 1600 depending on column types" implies to me that there is
some type A of which I can have only 250 columns and another type B of
which I can have 1600 columns. But it doesn't imply to me that the
number of columns depends on the values which are put into those columns.

May I suggest these improvements?

In https://www.postgresql.org/about/:
Instead of
| 250 - 1600 depending on column types
write
| 250 - 1600 depending on column types and data

In https://www.postgresql.org/docs/9.6/static/ddl-basics.html:
Replace the sentence:
| Depending on the column types, it is between 250 and 1600.
with:
| For all columns in a row, some information (either the data itself or
| a pointer to the data) must be stored in a single block (8 kB).
| Because for some types this data is itself of variable length, the
| maximum number of columns depends not only on the types of the columns
| but also on the data (e.g., a NULL uses less space than a non-NULL
| value). Therefore there is no simple way to compute the maximum number
| of columns, and it is possible to declare a table with more columns
| than can be filled. Keeping all this in mind, the limit is between 250
| and 1600.

hp

--
   _  | Peter J. Holzer    | A coding theorist is someone who doesn't
|_|_) |                    | think Alice is crazy.
| |   | hjp@hjp.at         | -- John Gordon
__/   | http://www.hjp.at/ | http://downlode.org/Etext/alicebob.html

#17rob stone
floriparob@gmail.com
In reply to: Adrian Klaver (#13)
Re: COPY: row is too big

Hello,
On Wed, 2017-01-04 at 07:11 -0800, Adrian Klaver wrote:

On 01/04/2017 06:54 AM, Pavel Stehule wrote:

Hi

2017-01-04 14:00 GMT+01:00 vod vos <vodvos@zoho.com>:

    Now I am confused about I can create 1100 columns in a table in
    postgresql, but I can't copy 1100 values into the table. And I
    really dont want to split the csv file to pieces to avoid mistakes
    after this action.

The PostgreSQL limit is "Maximum Columns per Table 250 - 1600 depending
on column types" - this limit is related to placing values or pointers
to values to one page (8KB).

You can hit this limit not in CREATE TABLE time, but in INSERT time.

    I create a table with 1100 columns with data type of varchar, and
    hope the COPY command will auto transfer the csv data that contains
    some character and date, most of which are numeric.

Numeric is expensive type - try to use float instead, maybe double.

If I am following the OP correctly the table itself has all the
columns 
declared as varchar. The data in the CSV file is a mix of text, date
and 
numeric, presumably cast to text on entry into the table.

Regards

Pavel

    I use the command: COPY rius FROM "/var/www/test/test.csv" WITH
    DELIMITER ';' ;

    Then it shows:

    ERROR:  row is too big: size 11808, maximum size 8160

Assuming this is a brand new database instance and not an existing
application, could the OP not compile from source and specify the
--with-blocksize=16384 so as to overcome the 8k default page size
limit?
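[Editorial note: configure's --with-blocksize option takes the size in
kB (so --with-blocksize=16 for 16 kB pages, powers of two up to 32). A
rough sketch of how the per-row ceiling scales with block size; the
32-byte overhead approximates the page header plus one line pointer and
matches the 8160 in the error message, but is not taken verbatim from
the server source.]

```python
def max_tuple_size(blcksz):
    # per-page overhead: page header + one line pointer, MAXALIGN'ed
    return blcksz - 32

for kb in (8, 16, 32):
    print(kb, max_tuple_size(kb * 1024))
# 8 8160    <- the limit the OP hit
# 16 16352  <- would hold the 11808-byte row
# 32 32736
```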

My 2 cents.
Rob

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#18Adrian Klaver
adrian.klaver@aklaver.com
In reply to: rob stone (#17)
Re: COPY: row is too big

On 01/04/2017 08:00 AM, rob stone wrote:

Hello,
On Wed, 2017-01-04 at 07:11 -0800, Adrian Klaver wrote:

On 01/04/2017 06:54 AM, Pavel Stehule wrote:

Hi

2017-01-04 14:00 GMT+01:00 vod vos <vodvos@zoho.com>:

Now I am confused about I can create 1100 columns in a table in
postgresql, but I can't copy 1100 values into the table. And I really
dont want to split the csv file to pieces to avoid mistakes after this
action.

The PostgreSQL limit is "Maximum Columns per Table 250 - 1600 depending
on column types" - this limit is related to placing values or pointers
to values to one page (8KB).

You can hit this limit not in CREATE TABLE time, but in INSERT time.

I create a table with 1100 columns with data type of varchar, and
hope the COPY command will auto transfer the csv data that contains
some character and date, most of which are numeric.

Numeric is expensive type - try to use float instead, maybe double.

If I am following the OP correctly the table itself has all the
columns
declared as varchar. The data in the CSV file is a mix of text, date
and
numeric, presumably cast to text on entry into the table.

Regards

Pavel

I use the command: COPY rius FROM "/var/www/test/test.csv" WITH
DELIMITER ';' ;

Then it shows:

ERROR: row is too big: size 11808, maximum size 8160

Assuming this is a brand new database instance and not an existing
application, could the OP not compile from source and specify the
--with-blocksize=16384 so as to overcome the 8k default page size
limit?

Well I was thinking along those lines also, then I did a search on
BLCKSZ in the docs and saw all the configuration parameters that are
keyed off it. I know I would have to do a lot more homework to
understand the implications to the database instance as a whole and
whether it was worth it to accommodate a single table.

My 2 cents.
Rob

--
Adrian Klaver
adrian.klaver@aklaver.com


#19Steve Crawford
scrawford@pinpointresearch.com
In reply to: Adrian Klaver (#13)
Re: COPY: row is too big

...

Numeric is expensive type - try to use float instead, maybe double.

If I am following the OP correctly the table itself has all the columns
declared as varchar. The data in the CSV file is a mix of text, date and
numeric, presumably cast to text on entry into the table.

But a CSV *is* purely text - no casting to text is needed. Conversion is
only needed when the strings in the CSV are text representations of
*non*-text data.

I'm guessing that the OP is using all text fields to deal with possibly
flawed input data and then validating and migrating the data in subsequent
steps. In that case, an ETL solution may be a better approach. Many
options, both open-, closed- and hybrid-source, exist.
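[Editorial note: the validate-then-migrate step described above can be
sketched in a few lines of Python. The converter list and sample data
here are hypothetical, not from the OP's table, which would need ~1100
converters.]

```python
import csv
import io
from datetime import date

# Hypothetical per-column converters mixing numeric, date and text.
converters = [int, float, date.fromisoformat, str]

def validate(csv_text, delimiter=";"):
    """Split rows into clean/bad before loading into typed tables."""
    clean, bad = [], []
    reader = csv.reader(io.StringIO(csv_text), delimiter=delimiter)
    for lineno, row in enumerate(reader, start=1):
        try:
            clean.append([conv(cell) for conv, cell in zip(converters, row)])
        except ValueError:
            bad.append((lineno, row))   # keep bad rows for inspection
    return clean, bad

sample = "1;2.5;2017-01-04;ok\nx;3.5;2017-01-05;oops\n"
clean, bad = validate(sample)
print(len(clean), len(bad))   # 1 1
```

Rows that fail conversion are set aside with their line numbers instead
of aborting the whole COPY, which is the usual appeal of an ETL pass.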

Cheers,
Steve

#20Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Steve Crawford (#19)
Re: COPY: row is too big

On 01/04/2017 08:32 AM, Steve Crawford wrote:

...

Numeric is expensive type - try to use float instead, maybe double.

If I am following the OP correctly the table itself has all the
columns declared as varchar. The data in the CSV file is a mix of
text, date and numeric, presumably cast to text on entry into the table.

But a CSV *is* purely text - no casting to text is needed. Conversion is
only needed when the strings in the CSV are text representations of
*non*-text data.

Yeah, muddled thinking.

I'm guessing that the OP is using all text fields to deal with possibly
flawed input data and then validating and migrating the data in
subsequent steps. In that case, an ETL solution may be a better
approach. Many options, both open-, closed- and hybrid-source, exist.

Cheers,
Steve

--
Adrian Klaver
adrian.klaver@aklaver.com


#21vod vos
vodvos@zoho.com
In reply to: Adrian Klaver (#20)
#22Pavel Stehule
pavel.stehule@gmail.com
In reply to: vod vos (#21)
#23Adrian Klaver
adrian.klaver@aklaver.com
In reply to: vod vos (#21)
#24Rob Sargent
robjsargent@gmail.com
In reply to: vod vos (#21)
#25Adrian Klaver
adrian.klaver@aklaver.com
In reply to: Rob Sargent (#24)
#26Rob Sargent
robjsargent@gmail.com
In reply to: Adrian Klaver (#25)
#27doganmeh
mehmet@edgle.com
In reply to: vod vos (#21)
#28doganmeh
mehmet@edgle.com
In reply to: doganmeh (#27)
#29Adrian Klaver
adrian.klaver@aklaver.com
In reply to: doganmeh (#27)
#30Charles Clavadetscher
clavadetscher@swisspug.org
In reply to: doganmeh (#27)
#31Andreas Kretschmer
andreas@a-kretschmer.de
In reply to: doganmeh (#27)
#32Tom Lane
tgl@sss.pgh.pa.us
In reply to: doganmeh (#27)
#33doganmeh
mehmet@edgle.com
In reply to: Charles Clavadetscher (#30)
#34doganmeh
mehmet@edgle.com
In reply to: Adrian Klaver (#29)