B-tree index on a VARCHAR(4000) column

Started by Ron Johnsonover 8 years ago8 messagesgeneral
Jump to latest
#1Ron Johnson
ron.l.johnson@cox.net

Hi,

v 9.2.7

Based on LENGTH(offending_column), none of the values are more than 144
bytes in this 44.2M row table. Even though VARCHAR is, by definition,
variable length, are there any internal design issues which would make
things more efficient if it were dropped to, for example, VARCHAR(256)?

(I don't have access to the source code or to development boxes, so can't
just test this on my own.)

Thanks,

--
World Peace Through Nuclear Pacification

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: Ron Johnson (#1)
Re: B-tree index on a VARCHAR(4000) column

Ron Johnson <ron.l.johnson@cox.net> writes:

Based on LENGTH(offending_column), none of the values are more than 144
bytes in this 44.2M row table. Even though VARCHAR is, by definition,
variable length, are there any internal design issues which would make
things more efficient if it were dropped to, for example, VARCHAR(256)?

No.

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#3Merlin Moncure
mmoncure@gmail.com
In reply to: Ron Johnson (#1)
Re: B-tree index on a VARCHAR(4000) column

On Thu, Sep 7, 2017 at 10:48 PM, Ron Johnson <ron.l.johnson@cox.net> wrote:

Hi,

v 9.2.7

Based on LENGTH(offending_column), none of the values are more than 144
bytes in this 44.2M row table. Even though VARCHAR is, by definition,
variable length, are there any internal design issues which would make
things more efficient if it were dropped to, for example, VARCHAR(256)?

(I don't have access to the source code or to development boxes, so can't
just test this on my own.)

Just use TEXT :-). Realizing that obsessing about column lengths was
a giant waste of time and energy for zero useful benefit that I've
ever observed was a transformational moment for me. Also, please
apply bugfix upgrades :-).

merlin

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#4John Turner
fenwayriffs@gmail.com
In reply to: Tom Lane (#2)
Re: B-tree index on a VARCHAR(4000) column

On Fri, Sep 8, 2017 at 6:57 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Ron Johnson <ron.l.johnson@cox.net> writes:

Based on LENGTH(offending_column), none of the values are more than 144
bytes in this 44.2M row table. Even though VARCHAR is, by definition,
variable length, are there any internal design issues which would make
things more efficient if it were dropped to, for example, VARCHAR(256)?

No.

So the declarative column length has no bearing on memory grants during

plan generation/execution?

#5Merlin Moncure
mmoncure@gmail.com
In reply to: John Turner (#4)
Re: B-tree index on a VARCHAR(4000) column

On Friday, September 8, 2017, John Turner <fenwayriffs@gmail.com> wrote:

On Fri, Sep 8, 2017 at 6:57 AM Tom Lane <tgl@sss.pgh.pa.us
<javascript:_e(%7B%7D,'cvml','tgl@sss.pgh.pa.us');>> wrote:

Ron Johnson <ron.l.johnson@cox.net
<javascript:_e(%7B%7D,'cvml','ron.l.johnson@cox.net');>> writes:

Based on LENGTH(offending_column), none of the values are more than 144
bytes in this 44.2M row table. Even though VARCHAR is, by definition,
variable length, are there any internal design issues which would make
things more efficient if it were dropped to, for example, VARCHAR(256)?

No.

So the declarative column length has no bearing on memory grants during

plan generation/execution?

Nope. Memory usage is proportional to the size of the string, not the
maximum length for varchar. Maximum length is a constraint.

merlin

#6John Turner
fenwayriffs@gmail.com
In reply to: Merlin Moncure (#5)
Re: B-tree index on a VARCHAR(4000) column

On Sun, Sep 10, 2017 at 10:42 AM Merlin Moncure <mmoncure@gmail.com> wrote:

On Friday, September 8, 2017, John Turner <fenwayriffs@gmail.com> wrote:

On Fri, Sep 8, 2017 at 6:57 AM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Ron Johnson <ron.l.johnson@cox.net> writes:

Based on LENGTH(offending_column), none of the values are more than 144
bytes in this 44.2M row table. Even though VARCHAR is, by definition,
variable length, are there any internal design issues which would make
things more efficient if it were dropped to, for example, VARCHAR(256)?

No.

So the declarative column length has no bearing on memory grants during

plan generation/execution?

Nope. Memory usage is proportional to the size of the string, not the
maximum length for varchar. Maximum length is a constraint.

Ok, thanks for verifying. I was curious since other platforms seem to

handle this aspect of memory allocation differently (more crudely, perhaps)
based on estimation of how fully populated the column _might_ be given a
size constraint:
https://sqlperformance.com/2017/06/sql-plan/performance-myths-oversizing-strings

John

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: John Turner (#6)
Re: B-tree index on a VARCHAR(4000) column

John Turner <fenwayriffs@gmail.com> writes:

On Sun, Sep 10, 2017 at 10:42 AM Merlin Moncure <mmoncure@gmail.com> wrote:

Nope. Memory usage is proportional to the size of the string, not the
maximum length for varchar. Maximum length is a constraint.

Ok, thanks for verifying. I was curious since other platforms seem to
handle this aspect of memory allocation differently (more crudely, perhaps)
based on estimation of how fully populated the column _might_ be given a
size constraint:
https://sqlperformance.com/2017/06/sql-plan/performance-myths-oversizing-strings

AFAIR, the only way in which a different declared max column length would
affect Postgres' behavior like that is that, depending on what other
columns are in the table, it might be able to prove that it doesn't need
to create a "toast table" for the table, because no row in the table could
ever be wide enough to require toasting. That would save a few
microseconds during table creation and drop ... but an unused toast table
that's just sitting there is surely not much overhead.

For every other purpose, PG just pays attention to the actual column
values' lengths.

regards, tom lane

--
Sent via pgsql-general mailing list (pgsql-general@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-general

#8John Turner
fenwayriffs@gmail.com
In reply to: Tom Lane (#7)
Re: B-tree index on a VARCHAR(4000) column

On Sun, Sep 10, 2017 at 1:24 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

For every other purpose, PG just pays attention to the actual column
values' lengths.

Thanks for elaborating, Tom. This would appear to be a(nother) case where

PG represents the voice of sanity as compared with 'the other guys' : )

John