attndims, typndims still not enforced, but make the value within a sane threshold

Started by jian heover 1 year ago36 messageshackers
Jump to latest
#1jian he
jian.universality@gmail.com

hi.

while looking at tablecmd.c, BuildDescForRelation
attdim = list_length(entry->typeName->arrayBounds);
if (attdim > PG_INT16_MAX)
ereport(ERROR,
errcode(ERRCODE_PROGRAM_LIMIT_EXCEEDED),
errmsg("too many array dimensions"))

makes me related to array_in refactor previously we did.
at first, i thought it should be "if (attdim > MAXDIM)"

pg_attribute attndims description in [1]https://www.postgresql.org/docs/current/catalog-pg-attribute.html
attndims int2
Number of dimensions, if the column is an array type; otherwise 0.
(Presently, the number of dimensions of an array is not enforced, so
any nonzero value effectively means “it's an array”.)

pg_type typndims description in [2]https://www.postgresql.org/docs/current/catalog-pg-type.html
typndims int4
typndims is the number of array dimensions for a domain over an array
(that is, typbasetype is an array type). Zero for types other than
domains over array types.

since array_in is the only source of the real array data.
MAXDIM (6) ensure the max dimension is 6.

Can we error out at the stage "create table", "create domain"
time if the attndims or typndims is larger than MAXDIM (6) ?

for example, error out the following queries immediately
create table t112(a int[][] [][] [][] [][][]);
create domain d_text_arr text [1]https://www.postgresql.org/docs/current/catalog-pg-attribute.html[][][][][][][];

in the doc, we can still say "the number of dimensions of an array is
not enforced",
but attndims, typndims value would be within a sane threshold.

We can change typndims from int4 to int2,
so array type's dimension is consistent with domain type's dimension.
but it seems with the change, pg_type occupies the same amount of
storage as int4.

[1]: https://www.postgresql.org/docs/current/catalog-pg-attribute.html
[2]: https://www.postgresql.org/docs/current/catalog-pg-type.html

#2Tom Lane
tgl@sss.pgh.pa.us
In reply to: jian he (#1)
Re: attndims, typndims still not enforced, but make the value within a sane threshold

jian he <jian.universality@gmail.com> writes:

Can we error out at the stage "create table", "create domain"
time if the attndims or typndims is larger than MAXDIM (6) ?

The last time this was discussed, I think the conclusion was
we should remove attndims and typndims entirely on the grounds
that they're useless. I certainly don't see a point in adding
more logic that could give the misleading impression that they
mean something.

regards, tom lane

#3jian he
jian.universality@gmail.com
In reply to: Tom Lane (#2)
Re: attndims, typndims still not enforced, but make the value within a sane threshold

On Wed, Sep 18, 2024 at 10:10 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

jian he <jian.universality@gmail.com> writes:

Can we error out at the stage "create table", "create domain"
time if the attndims or typndims is larger than MAXDIM (6) ?

The last time this was discussed, I think the conclusion was
we should remove attndims and typndims entirely on the grounds
that they're useless. I certainly don't see a point in adding
more logic that could give the misleading impression that they
mean something.

https://commitfest.postgresql.org/43/
search "dim" or "pg_attribute", no relevant result,
i am assuming, nobody doing work to remove attndims and typndims entirely?
If so, I will try to make one.

#4jian he
jian.universality@gmail.com
In reply to: jian he (#3)
Re: attndims, typndims still not enforced, but make the value within a sane threshold

On Wed, Sep 18, 2024 at 10:35 PM jian he <jian.universality@gmail.com> wrote:

The last time this was discussed, I think the conclusion was
we should remove attndims and typndims entirely on the grounds
that they're useless. I certainly don't see a point in adding
more logic that could give the misleading impression that they
mean something.

attached patch removes attndims and typndims entirely.
some tests skipped in my local my machine, not skipped are all OK.

Attachments:

v1-0001-remove-pg_attribute-attndims-and-pg_type-typndims.patchtext/x-patch; charset=US-ASCII; name=v1-0001-remove-pg_attribute-attndims-and-pg_type-typndims.patchDownload+198-319
#5Junwang Zhao
zhjwpku@gmail.com
In reply to: jian he (#4)
Re: attndims, typndims still not enforced, but make the value within a sane threshold

On Fri, Sep 20, 2024 at 10:11 AM jian he <jian.universality@gmail.com> wrote:

On Wed, Sep 18, 2024 at 10:35 PM jian he <jian.universality@gmail.com> wrote:

The last time this was discussed, I think the conclusion was
we should remove attndims and typndims entirely on the grounds
that they're useless. I certainly don't see a point in adding
more logic that could give the misleading impression that they
mean something.

attached patch removes attndims and typndims entirely.
some tests skipped in my local my machine, not skipped are all OK.

Should you also bump the catalog version?

--
Regards
Junwang Zhao

#6Michael Paquier
michael@paquier.xyz
In reply to: Junwang Zhao (#5)
Re: attndims, typndims still not enforced, but make the value within a sane threshold

On Fri, Sep 20, 2024 at 11:51:49AM +0800, Junwang Zhao wrote:

Should you also bump the catalog version?

No need to worry about that when sending a patch because committers
take care of that when merging a patch into the tree. Doing that in
each patch submitted just creates more conflicts and work for patch
authors because they'd need to recolve conflicts each time a
catversion bump happens. And that can happen on a daily basis
sometimes depending on what is committed.
--
Michael

#7Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#6)
Re: attndims, typndims still not enforced, but make the value within a sane threshold

Michael Paquier <michael@paquier.xyz> writes:

On Fri, Sep 20, 2024 at 11:51:49AM +0800, Junwang Zhao wrote:

Should you also bump the catalog version?

No need to worry about that when sending a patch because committers
take care of that when merging a patch into the tree. Doing that in
each patch submitted just creates more conflicts and work for patch
authors because they'd need to recolve conflicts each time a
catversion bump happens. And that can happen on a daily basis
sometimes depending on what is committed.

Right. Sometimes the committer forgets to do that :-(, which is
not great but it's not normally a big problem either. We've concluded
it's better to err in that direction than impose additional work
on patch submitters.

If you feel concerned about the point, best practice is to include a
mention that catversion bump is needed in your draft commit message.

regards, tom lane

#8Junwang Zhao
zhjwpku@gmail.com
In reply to: Tom Lane (#7)
Re: attndims, typndims still not enforced, but make the value within a sane threshold

Hi Tom and Michael,

On Fri, Sep 20, 2024 at 12:38 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Michael Paquier <michael@paquier.xyz> writes:

On Fri, Sep 20, 2024 at 11:51:49AM +0800, Junwang Zhao wrote:

Should you also bump the catalog version?

No need to worry about that when sending a patch because committers
take care of that when merging a patch into the tree. Doing that in
each patch submitted just creates more conflicts and work for patch
authors because they'd need to recolve conflicts each time a
catversion bump happens. And that can happen on a daily basis
sometimes depending on what is committed.

Right. Sometimes the committer forgets to do that :-(, which is
not great but it's not normally a big problem either. We've concluded
it's better to err in that direction than impose additional work
on patch submitters.

If you feel concerned about the point, best practice is to include a
mention that catversion bump is needed in your draft commit message.

regards, tom lane

Got it, thanks for both of your explanations.

--
Regards
Junwang Zhao

#9Bruce Momjian
bruce@momjian.us
In reply to: jian he (#4)
Re: attndims, typndims still not enforced, but make the value within a sane threshold

On Fri, Sep 20, 2024 at 10:11:00AM +0800, jian he wrote:

On Wed, Sep 18, 2024 at 10:35 PM jian he <jian.universality@gmail.com> wrote:

The last time this was discussed, I think the conclusion was
we should remove attndims and typndims entirely on the grounds
that they're useless. I certainly don't see a point in adding
more logic that could give the misleading impression that they
mean something.

attached patch removes attndims and typndims entirely.
some tests skipped in my local my machine, not skipped are all OK.

I have been hoping for a patch links this because I feel the existence
of these system columns is deceptive since we don't honor them properly.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

When a patient asks the doctor, "Am I going to die?", he means
"Am I going to die soon?"

#10Andrew Dunstan
andrew@dunslane.net
In reply to: Tom Lane (#7)
Re: attndims, typndims still not enforced, but make the value within a sane threshold

On 2024-09-20 Fr 12:38 AM, Tom Lane wrote:

Michael Paquier <michael@paquier.xyz> writes:

On Fri, Sep 20, 2024 at 11:51:49AM +0800, Junwang Zhao wrote:

Should you also bump the catalog version?

No need to worry about that when sending a patch because committers
take care of that when merging a patch into the tree. Doing that in
each patch submitted just creates more conflicts and work for patch
authors because they'd need to recolve conflicts each time a
catversion bump happens. And that can happen on a daily basis
sometimes depending on what is committed.

Right. Sometimes the committer forgets to do that :-(, which is
not great but it's not normally a big problem either. We've concluded
it's better to err in that direction than impose additional work
on patch submitters.

FWIW, I have a git pre-commit hook that helps avoid that. Essentially it
checks to see if there are changes in src/include/catalog but not in
catversion.h. That's not a 100% check, but it probably catches the vast
majority of changes that would require a catversion bump.

cheers

andrew

--
Andrew Dunstan
EDB: https://www.enterprisedb.com

#11Kirill Reshke
reshkekirill@gmail.com
In reply to: jian he (#4)
Re: attndims, typndims still not enforced, but make the value within a sane threshold

On Fri, 20 Sept 2024 at 07:11, jian he <jian.universality@gmail.com> wrote:

attached patch removes attndims and typndims entirely.

LGTM?
I see no open items in this thread. Are there any issues about v1?

--
Best regards,
Kirill Reshke

#12Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Kirill Reshke (#11)
Re: attndims, typndims still not enforced, but make the value within a sane threshold

On 2024-Dec-05, Kirill Reshke wrote:

On Fri, 20 Sept 2024 at 07:11, jian he <jian.universality@gmail.com> wrote:

attached patch removes attndims and typndims entirely.

LGTM?
I see no open items in this thread. Are there any issues about v1?

Should we leave TupleDescInitEntry()'s API alone, to avoid breaking the
compilation of every extension in the world? Maybe we could rename the
function (say to TupleDescInitializeEntry()) and use a define?

#define TupleDescInitEntry(a,b,c,d,e,f) TupleDescInitializeEntry(a,b,c,d,e)

Then the amount of churn is reduced, no extensions are broken, and we
can remove the define in a few years.

--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"Debido a que la velocidad de la luz es mucho mayor que la del sonido,
algunas personas nos parecen brillantes un minuto antes
de escuchar las pelotudeces que dicen." (Roberto Fontanarrosa)

#13Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alvaro Herrera (#12)
Re: attndims, typndims still not enforced, but make the value within a sane threshold

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:

On 2024-Dec-05, Kirill Reshke wrote:

I see no open items in this thread. Are there any issues about v1?

Should we leave TupleDescInitEntry()'s API alone, to avoid breaking the
compilation of every extension in the world?

I fear the howls of pain from extension authors will be entirely
drowned out by the howls of pain from application authors whose
queries are broken by the disappearance of two catalog columns.
A quick search at codesearch.debian.net shows that while there's
not that many references to attndims, some of them are in such
never-heard-of-it applications as PHP. typndims has some outside
queries fetching it as well.

I don't think we can do this as presented. Maybe we could do
something like constrain the stored value to be only 0 or 1,
but how much does that improve matters?

regards, tom lane

#14Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#13)
Re: attndims, typndims still not enforced, but make the value within a sane threshold

On Thu, Dec 5, 2024 at 11:33:22AM -0500, Tom Lane wrote:

Alvaro Herrera <alvherre@alvh.no-ip.org> writes:

On 2024-Dec-05, Kirill Reshke wrote:

I see no open items in this thread. Are there any issues about v1?

Should we leave TupleDescInitEntry()'s API alone, to avoid breaking the
compilation of every extension in the world?

I fear the howls of pain from extension authors will be entirely
drowned out by the howls of pain from application authors whose
queries are broken by the disappearance of two catalog columns.
A quick search at codesearch.debian.net shows that while there's
not that many references to attndims, some of them are in such
never-heard-of-it applications as PHP. typndims has some outside
queries fetching it as well.

I don't think we can do this as presented. Maybe we could do
something like constrain the stored value to be only 0 or 1,
but how much does that improve matters?

Well, then the question becomes will we retain useless system columns
forever?

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Do not let urgent matters crowd out time for investment in the future.

#15Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#14)
Re: attndims, typndims still not enforced, but make the value within a sane threshold

Bruce Momjian <bruce@momjian.us> writes:

On Thu, Dec 5, 2024 at 11:33:22AM -0500, Tom Lane wrote:

I don't think we can do this as presented. Maybe we could do
something like constrain the stored value to be only 0 or 1,
but how much does that improve matters?

Well, then the question becomes will we retain useless system columns
forever?

If the choice is "not break PHP" versus "get rid of a vestigial column",
I know which one is the saner choice. There might be things that are
worth breaking PHP for, but this isn't one.

regards, tom lane

#16Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#15)
Re: attndims, typndims still not enforced, but make the value within a sane threshold

On Thu, Dec 5, 2024 at 12:00:30PM -0500, Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

On Thu, Dec 5, 2024 at 11:33:22AM -0500, Tom Lane wrote:

I don't think we can do this as presented. Maybe we could do
something like constrain the stored value to be only 0 or 1,
but how much does that improve matters?

Well, then the question becomes will we retain useless system columns
forever?

If the choice is "not break PHP" versus "get rid of a vestigial column",
I know which one is the saner choice. There might be things that are
worth breaking PHP for, but this isn't one.

I guess I just don't want to get into a situation where we have so many
useless backward-compatible system columns it is distracting.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Do not let urgent matters crowd out time for investment in the future.

#17Bruce Momjian
bruce@momjian.us
In reply to: Bruce Momjian (#16)
Re: attndims, typndims still not enforced, but make the value within a sane threshold

On Thu, Dec 5, 2024 at 12:03:30PM -0500, Bruce Momjian wrote:

On Thu, Dec 5, 2024 at 12:00:30PM -0500, Tom Lane wrote:

Bruce Momjian <bruce@momjian.us> writes:

On Thu, Dec 5, 2024 at 11:33:22AM -0500, Tom Lane wrote:

I don't think we can do this as presented. Maybe we could do
something like constrain the stored value to be only 0 or 1,
but how much does that improve matters?

Well, then the question becomes will we retain useless system columns
forever?

If the choice is "not break PHP" versus "get rid of a vestigial column",
I know which one is the saner choice. There might be things that are
worth breaking PHP for, but this isn't one.

I guess I just don't want to get into a situation where we have so many
useless backward-compatible system columns it is distracting.

How about if we set attndims & typndims to zero in PG 18, and mention
that these fields are deprecated in the release notes. Then, in a few
years, we can remove the columns since interfaces hopefully would have
removed the useless references to the columns.

--
Bruce Momjian <bruce@momjian.us> https://momjian.us
EDB https://enterprisedb.com

Do not let urgent matters crowd out time for investment in the future.

#18Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#17)
Re: attndims, typndims still not enforced, but make the value within a sane threshold

Bruce Momjian <bruce@momjian.us> writes:

How about if we set attndims & typndims to zero in PG 18, and mention
that these fields are deprecated in the release notes. Then, in a few
years, we can remove the columns since interfaces hopefully would have
removed the useless references to the columns.

You have a mighty optimistic view of what will happen. I predict
that if we do step (1), exactly nothing will happen in applications,
and step (2) will remain just as painful for them. (Assuming that
we remember to do step (2), which is no sure thing given our track
record for following through "in a few years".)

regards, tom lane

#19jian he
jian.universality@gmail.com
In reply to: Tom Lane (#18)
Re: attndims, typndims still not enforced, but make the value within a sane threshold

per above discussion:
for TupleDescInitEntry, TupleDescInitBuiltinEntry, I didn't change the
signature of it.
but i added:

if (attdim != 0)
elog(ERROR, "attdim should be zero");

otherwise it may have "unused parameter" warnings?

for the same reason in TypeCreate, I added:
if (typNDims != 0)
elog(ERROR, "typNDims should be zero");

i aslo changed TypeCreate typNDims related comments.
TupleDescInitEntry, TupleDescInitBuiltinEntry comments didn't say much
about attdim.

so overall the change is less pervasive, it would be easier to review.

Attachments:

v2-0001-remove-pg_attribute-attndims-and-pg_type-typndims.patchtext/x-patch; charset=US-ASCII; name=v2-0001-remove-pg_attribute-attndims-and-pg_type-typndims.patchDownload+22-90
#20Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#18)
Re: attndims, typndims still not enforced, but make the value within a sane threshold

On Thu, Dec 05, 2024 at 06:34:31PM -0500, Tom Lane wrote:

You have a mighty optimistic view of what will happen. I predict
that if we do step (1), exactly nothing will happen in applications,
and step (2) will remain just as painful for them. (Assuming that
we remember to do step (2), which is no sure thing given our track
record for following through "in a few years".)

FWIW, my first thought after reading this paragraph is that you sound
too dramatic here, especially after looking at codesearch to note that
the PHP core code stores attndims but does not actually use it.

Now, I've looked at github and noted that there's a large number of
repositories (in the hundreds) that rely on a zero or non-zero value
for *attndims* to check if they're dealing with an array, saving in
what looks like catalog lookups.

The same lookups done for typndims offer an opposite conclusion:
there's no real code in the open that uses this value for similar
checks, most likely because domains are less used in the wild. So I
get the drama for the removal of attndims and I'd agree to keep it
around, but I also see a very good point in removing typndims.
--
Michael

#21Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#20)
#22Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#21)
#23jian he
jian.universality@gmail.com
In reply to: Michael Paquier (#22)
#24Michael Paquier
michael@paquier.xyz
In reply to: jian he (#23)
#25Bruce Momjian
bruce@momjian.us
In reply to: Michael Paquier (#24)
#26Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#25)
#27Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#26)
#28Michael Paquier
michael@paquier.xyz
In reply to: Bruce Momjian (#27)
#29Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#27)
#30Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#29)
#31Tom Lane
tgl@sss.pgh.pa.us
In reply to: Michael Paquier (#30)
#32Michael Paquier
michael@paquier.xyz
In reply to: Tom Lane (#31)
#33Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#29)
#34Tom Lane
tgl@sss.pgh.pa.us
In reply to: Bruce Momjian (#33)
#35Tom Lane
tgl@sss.pgh.pa.us
In reply to: Tom Lane (#34)
#36Bruce Momjian
bruce@momjian.us
In reply to: Tom Lane (#34)