Dimension limit in contrib/cube (dump/restore hazard?)

Started by Andrew Gierthover 7 years ago10 messageshackers
Jump to latest
#1Andrew Gierth
andrew@tao11.riddles.org.uk

contrib/cube has an arbitrary limit of 100 on the number of dimensions
in a cube, but it actually enforces that only in cube_in and
cube_enlarge, with the other cube creation functions happy to create
cubes of more dimensions.

I haven't actually tested, but this implies that one can create cubes
that will break dump/restore.

Should this limit be kept, and if so what should it be? (There's
obviously a limit on the size of indexable cubes)

(Noticed because an irc user was trying to use cubes with 512
dimensions with partial success)

--
Andrew (irc:RhodiumToad)

#2Andrey Borodin
amborodin@acm.org
In reply to: Andrew Gierth (#1)
Re: Dimension limit in contrib/cube (dump/restore hazard?)

Hi!

28 авг. 2018 г., в 8:29, Andrew Gierth <andrew@tao11.riddles.org.uk> написал(а):

contrib/cube has an arbitrary limit of 100 on the number of dimensions
in a cube, but it actually enforces that only in cube_in and
cube_enlarge, with the other cube creation functions happy to create
cubes of more dimensions.

I haven't actually tested, but this implies that one can create cubes
that will break dump/restore.

Should this limit be kept, and if so what should it be? (There's
obviously a limit on the size of indexable cubes)

(Noticed because an irc user was trying to use cubes with 512
dimensions with partial success)

+1
This can cause very unpleasant fails like

postgres=# create table y as select cube(array(SELECT random() as a FROM generate_series(1,1000))) from generate_series(1,1e3,1);
SELECT 1000
postgres=# create index on y using gist(cube );
ERROR: index row size 8016 exceeds maximum 8152 for index "y_cube_idx"

postgres=# create table y as select cube(array(SELECT random() as a FROM generate_series(1,800))) from generate_series(1,1e3,1);
SELECT 1000
postgres=# create index on y using gist(cube );
ERROR: failed to add item to index page in "y_cube_idx"

I belive cube construction from array\arrays should check size of arrays.

Also there are some unexpected cube dimensionality reduction like in cube_enlarge
if (n > CUBE_MAX_DIM)
n = CUBE_MAX_DIM;
You wanted larger cube, but got cube of another dimension.

I think we should something like this

diff --git a/contrib/cube/cube.c b/contrib/cube/cube.c
index dfa8465d74..38739b1df2 100644
--- a/contrib/cube/cube.c
+++ b/contrib/cube/cube.c
@@ -151,6 +151,12 @@ cube_a_f8_f8(PG_FUNCTION_ARGS)
                                 errmsg("cannot work with arrays containing NULLs")));
        dim = ARRNELEMS(ur);
+       if (dim > CUBE_MAX_DIM)
+               ereport(ERROR,
+                               (errcode(ERRCODE_ARRAY_ELEMENT_ERROR),
+                                errmsg("A cube cannot have more than %d dimensions.",
+                                                          CUBE_MAX_DIM)));
+
        if (ARRNELEMS(ll) != dim)
                ereport(ERROR,
                                (errcode(ERRCODE_ARRAY_ELEMENT_ERROR),
@@ -208,6 +214,11 @@ cube_a_f8(PG_FUNCTION_ARGS)
                                 errmsg("cannot work with arrays containing NULLs")));
        dim = ARRNELEMS(ur);
+       if (dim > CUBE_MAX_DIM)
+               ereport(ERROR,
+                               (errcode(ERRCODE_ARRAY_ELEMENT_ERROR),
+                                errmsg("A cube cannot have more than %d dimensions.",
+                                                          CUBE_MAX_DIM)));

dur = ARRPTR(ur);

Best regards, Andrey Borodin.

#3Alexander Korotkov
aekorotkov@gmail.com
In reply to: Andrey Borodin (#2)
Re: Dimension limit in contrib/cube (dump/restore hazard?)

Hi!

On Tue, Aug 28, 2018 at 6:21 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote:

I belive cube construction from array\arrays should check size of arrays.

Makes sense to me.

Also there are some unexpected cube dimensionality reduction like in cube_enlarge
if (n > CUBE_MAX_DIM)
n = CUBE_MAX_DIM;
You wanted larger cube, but got cube of another dimension.

I think we should something like this

OK, but I think cube_c_f8() and cube_c_f8_f8() also need to be
revised. Also, I think this behavior should be covered by regression
tests.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

#4Andrey Borodin
amborodin@acm.org
In reply to: Alexander Korotkov (#3)
Re: Dimension limit in contrib/cube (dump/restore hazard?)

28 авг. 2018 г., в 14:18, Alexander Korotkov <a.korotkov@postgrespro.ru> написал(а):

OK, but I think cube_c_f8() and cube_c_f8_f8() also need to be
revised. Also, I think this behavior should be covered by regression
tests.

True. Also there's one case in cube_subset.

Best regards, Andrey Borodin.

Attachments:

cube_check_dim.diffapplication/octet-stream; name=cube_check_dim.diff; x-unix-mode=0644Download+92-1
#5Alexander Korotkov
aekorotkov@gmail.com
In reply to: Andrey Borodin (#4)
Re: Dimension limit in contrib/cube (dump/restore hazard?)

Hi!

On Tue, Aug 28, 2018 at 10:30 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote:

28 авг. 2018 г., в 14:18, Alexander Korotkov <a.korotkov@postgrespro.ru> написал(а):

OK, but I think cube_c_f8() and cube_c_f8_f8() also need to be
revised. Also, I think this behavior should be covered by regression
tests.

True. Also there's one case in cube_subset.

In general looks good for me. Personally I get tired with cube.out
and cube_2.out. They are different with only few checks involving
scientific notation. But all the patches touching cube regression
tests should update both cube.out and cube_2.out. I propose to split
scientific notation checks into separate test. I've also add check
for sube_subset().

I'm going to check this patchset on Windows and commit if no objections.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

0001-cube-split-scientific-notation-test-v1.patchapplication/octet-stream; name=0001-cube-split-scientific-notation-test-v1.patchDownload+235-1785
0002-cube-enforce-dimension-checks-v1.patchapplication/octet-stream; name=0002-cube-enforce-dimension-checks-v1.patchDownload+83-1
#6Tom Lane
tgl@sss.pgh.pa.us
In reply to: Alexander Korotkov (#5)
Re: Dimension limit in contrib/cube (dump/restore hazard?)

Alexander Korotkov <a.korotkov@postgrespro.ru> writes:

I'm going to check this patchset on Windows and commit if no objections.

These error messages do not conform to our message style guidelines:
you've copied an errdetail message as primary error message, but the
rules are different for that (no complete sentences, no initial cap,
no period).

Using ERRCODE_ARRAY_ELEMENT_ERROR seems pretty random as well --- so far
as I can see, that's generally used for cases like "this array has the
wrong type of data elements". Perhaps ERRCODE_PROGRAM_LIMIT_EXCEEDED
would be the best choice.

regards, tom lane

#7Alexander Korotkov
aekorotkov@gmail.com
In reply to: Tom Lane (#6)
Re: Dimension limit in contrib/cube (dump/restore hazard?)

On Thu, Aug 30, 2018 at 4:59 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:

Alexander Korotkov <a.korotkov@postgrespro.ru> writes:

I'm going to check this patchset on Windows and commit if no objections.

These error messages do not conform to our message style guidelines:
you've copied an errdetail message as primary error message, but the
rules are different for that (no complete sentences, no initial cap,
no period).

Using ERRCODE_ARRAY_ELEMENT_ERROR seems pretty random as well --- so far
as I can see, that's generally used for cases like "this array has the
wrong type of data elements". Perhaps ERRCODE_PROGRAM_LIMIT_EXCEEDED
would be the best choice.

Thank you for catching this! I'll be more careful about error messages.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company

Attachments:

0001-cube-split-scientific-notation-test-v2.patchapplication/x-patch; name=0001-cube-split-scientific-notation-test-v2.patchDownload+235-1785
0002-cube-enforce-dimension-checks-v2.patchapplication/octet-stream; name=0002-cube-enforce-dimension-checks-v2.patchDownload+88-1
#8Michael Paquier
michael@paquier.xyz
In reply to: Alexander Korotkov (#5)
Re: Dimension limit in contrib/cube (dump/restore hazard?)

On Thu, Aug 30, 2018 at 02:28:20PM +0300, Alexander Korotkov wrote:

In general looks good for me. Personally I get tired with cube.out
and cube_2.out. They are different with only few checks involving
scientific notation. But all the patches touching cube regression
tests should update both cube.out and cube_2.out. I propose to split
scientific notation checks into separate test.

+1.
--
Michael
#9Alvaro Herrera
alvherre@2ndquadrant.com
In reply to: Alexander Korotkov (#5)
Re: Dimension limit in contrib/cube (dump/restore hazard?)

On 2018-Aug-30, Alexander Korotkov wrote:

Personally I get tired with cube.out
and cube_2.out. They are different with only few checks involving
scientific notation. But all the patches touching cube regression
tests should update both cube.out and cube_2.out. I propose to split
scientific notation checks into separate test.

Good idea.

--
�lvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

#10Alexander Korotkov
aekorotkov@gmail.com
In reply to: Alvaro Herrera (#9)
Re: Dimension limit in contrib/cube (dump/restore hazard?)

On Fri, Aug 31, 2018 at 6:18 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote:

On 2018-Aug-30, Alexander Korotkov wrote:

Personally I get tired with cube.out
and cube_2.out. They are different with only few checks involving
scientific notation. But all the patches touching cube regression
tests should update both cube.out and cube_2.out. I propose to split
scientific notation checks into separate test.

Good idea.

Thank you for the feedback! Pushed.

------
Alexander Korotkov
Postgres Professional: http://www.postgrespro.com
The Russian Postgres Company