More than 1600 columns?
Greetings, PostgreSQL gurus.
From my googling I know this has been discussed before, and the consensus seems to be that you can recompile Postgres with a larger block size (16k or 32k instead of 8k) to double or quadruple this limit.
I recompiled the latest 9.0.1 source RPM on CentOS 5 x64 with --with-blocksize=32, but when I tested it I still got the error "ERROR: tables can have at most 1600 columns".
pg_controldata confirms the new block size:
"Database block size: 32768"
I also tried increasing the WAL block size to see if that had any effect, but it does not. So is what I have read wrong? Is there a hard limit of 1600 that you cannot get around?
- Mark
"Mark Mitchell" <mmitchell@riccagroup.com> writes:
Is there a hard limit of 1600 that you cannot get around?
Yes.
Generally, wanting more than a few dozen columns is a good sign that you
need to rethink your schema design. What are you trying to accomplish
exactly?
regards, tom lane
On 11/11/10 9:24 PM, Tom Lane wrote:
"Mark Mitchell"<mmitchell@riccagroup.com> writes:
Is there a hard limit of 1600 that you cannot get around?
Yes.
Generally, wanting more than a few dozen columns is a good sign that you
need to rethink your schema design. What are you trying to accomplish
exactly?
Indeed. I'd say a good read on 'data normalization' and the Third
Normal Form would be in order.
Relational databases are *not* spreadsheets (and, for that matter,
spreadsheets make lousy relational databases).
If these 1600+ elements come from an ORM, you probably need to rethink
your object model, as no sane object class should have that many members.
Hey Mark,
Yeah, I can't imagine an entity in a real project with more than 100
columns; that's a rare case.
But if entities (rows/tuples) of some class (table) can contain a
variable set of columns (properties), have a look at the hstore contrib
module.
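A minimal sketch of what that could look like (all table and column names here are hypothetical, and it assumes the hstore contrib module is installed):

```sql
-- One row per survey response; answers holds a variable set of
-- field-name => value pairs instead of 1600+ fixed columns.
CREATE TABLE survey_response (
    id      serial PRIMARY KEY,
    answers hstore
);

INSERT INTO survey_response (answers)
VALUES ('q1 => "yes", q2 => "42"');

-- GiST index for key-existence and containment queries
CREATE INDEX survey_answers_idx ON survey_response USING gist (answers);

-- Which responses answered field q1?
SELECT id FROM survey_response WHERE answers ? 'q1';

-- Pull one field out as text
SELECT answers -> 'q2' FROM survey_response WHERE id = 1;
```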
--
// Dmitriy.
On Fri, Nov 12, 2010 at 10:17:50AM +0300, Dmitriy Igrishin wrote:
Hey Mark,
Yeah, I can't imagine an entity in a real project with more than 100
columns; that's a rare case.
But if entities (rows/tuples) of some class (table) can contain a
variable set of columns (properties), have a look at the hstore contrib
module.
What can also work extremely well is storing the data in an array.
If you need to access the array by more meaningful keys, you can
store key/index pairs in another table.
This approach only works well if you have multiple arrays with the same
layout. You probably also need to build your queries dynamically if
you access variable numbers of data points.
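For example, a hypothetical sketch of the array-plus-lookup-table idea (names made up for illustration):

```sql
-- Maps meaningful field names to a fixed position in the answers array.
CREATE TABLE survey_field (
    field_name text PRIMARY KEY,
    array_pos  integer NOT NULL
);

-- Every row's array has the same layout.
CREATE TABLE survey_response (
    id      serial PRIMARY KEY,
    answers text[]
);

-- Fetch the answer to a named field by looking up its position.
SELECT r.answers[f.array_pos]
FROM survey_response r, survey_field f
WHERE f.field_name = 'q42' AND r.id = 1;
```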
Cheers,
Peter
--
http://sjamaan.ath.cx
--
"The process of preparing programs for a digital computer
is especially attractive, not only because it can be economically
and scientifically rewarding, but also because it can be an aesthetic
experience much like composing poetry or music."
-- Donald Knuth
Hey Peter,
Unfortunately, there are no indexes on arrays (only on expressions).
With hstore we can easily create a GiST index for efficient access.
--
// Dmitriy.
On Fri, Nov 12, 2010 at 10:43:14AM +0300, Dmitriy Igrishin wrote:
Hey Peter,
Unfortunately, there are no indexes on arrays (only on expressions).
With hstore we can easily create a GiST index for efficient access.
True. In my project I only ever needed to search on a particular key,
and I made sure that that key always had a fixed position in the array.
You can then create an index on the expression that extracts that
element from the array.
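For instance (hypothetical table and column names, with the searchable key always at position 1):

```sql
-- Expression index over the fixed key position; note the double
-- parentheses required for an expression index.
CREATE INDEX survey_key_idx ON survey_response ((answers[1]));

-- This lookup can now use the index.
SELECT id FROM survey_response WHERE answers[1] = 'some-key';
```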
Cheers,
Peter
2010/11/12 Dmitriy Igrishin <dmitigr@gmail.com>:
Hey Peter,
Unfortunately, there are no indexes on arrays (only on expressions).
There are indexes on arrays - you can use GiST or GIN indexes.
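For example (hypothetical table; GIN has a default operator class for arrays of built-in types, so no extra module is needed):

```sql
-- GIN index over a text[] column, usable out of the box.
CREATE INDEX survey_answers_gin ON survey_response USING gin (answers);

-- Supports the array containment/overlap operators, e.g. which rows
-- contain the value 'yes' anywhere in the array:
SELECT id FROM survey_response WHERE answers @> ARRAY['yes'];
```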
regards
Pavel
Hey Pavel,
Really? I'm sorry, I'll try!
--
// Dmitriy.
Well, it's possible to create GIN indexes on arrays of built-in data types
without extra effort, because the GIN access method has default
operator classes for them. But that's not true for GiST:
SELECT am.amname AS index_method, opc.opcname AS opclass_name
FROM pg_am am, pg_opclass opc
WHERE opc.opcmethod = am.oid AND am.amname = 'gist'
ORDER BY index_method, opclass_name;
index_method | opclass_name
-------------+--------------
gist         | box_ops
gist         | circle_ops
gist         | point_ops
gist         | poly_ops
gist         | tsquery_ops
gist         | tsvector_ops
With hstore I don't need to care about this at all.
But thank you very much for pointing me to indexes on arrays!
--
// Dmitriy.
2010/11/12 Dmitriy Igrishin <dmitigr@gmail.com>:
Well, it's possible to create GIN indexes on arrays of built-in data types
without extra effort, because the GIN access method has default
operator classes for them. But that's not true for GiST:
Now that's true :)
You need to install the intarray contrib module for GiST.
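For example (hypothetical table; assumes the intarray contrib module is installed, which provides GiST operator classes for integer arrays):

```sql
CREATE TABLE survey_codes (
    id    serial PRIMARY KEY,
    codes int[]
);

-- gist__int_ops comes from the intarray module.
CREATE INDEX survey_codes_gist
    ON survey_codes USING gist (codes gist__int_ops);

-- Which responses contain code 7?
SELECT id FROM survey_codes WHERE codes @> ARRAY[7];
```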
Pavel
I looked up the OP's domain. They develop medical research data
analysis software.
That sort of software and the kinds of data analysis they do torture
SQL databases. These 1600+ element rows are likely very sparse.
Peter Bex, 12.11.2010 08:36:
What can also work extremely well is storing the data in an array.
If you need to access the array based on more meaningful keys you could
store key/index pairs in another table.
The hstore module would also be a viable alternative - and it's indexable as well.
Thomas
On Fri, Nov 12, 2010 at 10:18:31AM +0100, Thomas Kellerer wrote:
The hstore module would also be a viable alternative - and it's indexable
as well.
That's what the post I replied to mentioned :)
Also, when I looked at hstore, it had a length limitation which
made it a little dangerous for me to use:
"In the current implementation, neither the key nor the value string
can exceed 65535 bytes in length"
[http://www.postgresql.org/docs/8.4/interactive/hstore.html]
It looks like this limitation has been lifted in 9.0. Good news!
Cheers,
Peter
Sounds like semi-structured data handling. For tasks like this, hstore
would be ideal, IMO.
--
// Dmitriy.
Yes, I understand that this is "bad design", but what we are doing is storing each form field in a survey in its own column. For very long surveys we end up with thousands of elements.
I know storing the data in an array is possible, but it is so much easier to query the data set when each element is in its own field. I had lots of comments on why I should not do this and on possible alternatives, and I thank everyone for their input, but no one answered the question about compiling with a larger block size to get more columns. Can anyone answer that?
hstore does look like it could work out better than the mother lode of columns we have now, as long as the index support is good and there are no limits on value length.
It seems that the limit was lifted in 9.0? Is that true?
I was not aware of hstore when we started using Postgres; thanks for the info!
And yes, we do data analysis that tortures SQL, but SQL lets us do so many things quickly and less painfully. Better to torture the machines than to torture ourselves...
- Mark
On 11/12/2010 08:38 AM, Mark Mitchell wrote:
You can answer this yourself. Save chunks of the survey each in their
own table, all keyed with a single id. I'm betting you don't write all
1600 fields at once (or you're willing to seriously piss off the data
entry staff when stuff happens while trying to save the last "page"). select *
from table1, table2 ... where table1.id = table2.id and table2.id =
table3.id ... Then you don't have to ensure that your custom Postgres
is "everywhere you want to be".
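A sketch of that chunking idea (table and column names are made up; each "page" of the survey gets its own table sharing the response id):

```sql
CREATE TABLE page1 (
    id  integer PRIMARY KEY,
    q1  text,
    q2  text
    -- ... up to a few hundred columns per page
);

CREATE TABLE page2 (
    id   integer PRIMARY KEY REFERENCES page1,
    q101 text
    -- ...
);

-- Reassemble a full response by joining on the shared id.
SELECT *
FROM page1 p1
JOIN page2 p2 ON p2.id = p1.id
WHERE p1.id = 12345;
```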
"Mark Mitchell" <mmitchell@riccagroup.com> writes:
I know storing the data in an array is possible, but it is so much easier to query the data set when each element is in its own field. I had lots of comments on why I should not do this and on possible alternatives, and I thank everyone for their input, but no one answered the question about compiling with a larger block size to get more columns. Can anyone answer that?
Yes, I did answer it: there is no such compilation option.
If you were willing to run a very nonstandard version of Postgres, you
could try widening t_hoff (see src/include/access/htup.h) but there is
nobody who can tell you what the fallout from that might be. One big
concern that I would have is the likelihood of O(N^2) behavior on very
long query targetlists.
On the whole I think you'd be a lot better off looking into hstore,
especially the improved 9.0 version.
regards, tom lane
Apologies Tom, I did not see that you had answered "yes" to my question about the hard limit.
You have all been very helpful; I will give up on the 1600+ columns and look into using hstore.
Cheers
- Mark
--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general