Thread: Dimension limit in contrib/cube (dump/restore hazard?)
contrib/cube has an arbitrary limit of 100 on the number of dimensions in a cube, but it actually enforces that only in cube_in and cube_enlarge, with the other cube creation functions happy to create cubes of more dimensions. I haven't actually tested, but this implies that one can create cubes that will break dump/restore. Should this limit be kept, and if so what should it be? (There's obviously a limit on the size of indexable cubes) (Noticed because an irc user was trying to use cubes with 512 dimensions with partial success) -- Andrew (irc:RhodiumToad)
Hi! > 28 авг. 2018 г., в 8:29, Andrew Gierth <andrew@tao11.riddles.org.uk> написал(а): > > contrib/cube has an arbitrary limit of 100 on the number of dimensions > in a cube, but it actually enforces that only in cube_in and > cube_enlarge, with the other cube creation functions happy to create > cubes of more dimensions. > > I haven't actually tested, but this implies that one can create cubes > that will break dump/restore. > > Should this limit be kept, and if so what should it be? (There's > obviously a limit on the size of indexable cubes) > > (Noticed because an irc user was trying to use cubes with 512 > dimensions with partial success) +1 This can cause very unpleasant fails like postgres=# create table y as select cube(array(SELECT random() as a FROM generate_series(1,1000))) from generate_series(1,1e3,1); SELECT 1000 postgres=# create index on y using gist(cube ); ERROR: index row size 8016 exceeds maximum 8152 for index "y_cube_idx" postgres=# create table y as select cube(array(SELECT random() as a FROM generate_series(1,800))) from generate_series(1,1e3,1); SELECT 1000 postgres=# create index on y using gist(cube ); ERROR: failed to add item to index page in "y_cube_idx" I belive cube construction from array\arrays should check size of arrays. Also there are some unexpected cube dimensionality reduction like in cube_enlarge if (n > CUBE_MAX_DIM) n = CUBE_MAX_DIM; You wanted larger cube, but got cube of another dimension. I think we should something like this diff --git a/contrib/cube/cube.c b/contrib/cube/cube.c index dfa8465d74..38739b1df2 100644 --- a/contrib/cube/cube.c +++ b/contrib/cube/cube.c @@ -151,6 +151,12 @@ cube_a_f8_f8(PG_FUNCTION_ARGS) errmsg("cannot work with arrays containing NULLs"))); dim = ARRNELEMS(ur); + if (dim > CUBE_MAX_DIM) + ereport(ERROR, + (errcode(ERRCODE_ARRAY_ELEMENT_ERROR), + errmsg("A cube cannot have more than %d dimensions.", + CUBE_MAX_DIM))); + if (ARRNELEMS(ll) != dim) ereport(ERROR, (errcode(ERRCODE_ARRAY_ELEMENT_ERROR), @@ -208,6 +214,11 @@ cube_a_f8(PG_FUNCTION_ARGS) errmsg("cannot work with arrays containing NULLs"))); dim = ARRNELEMS(ur); + if (dim > CUBE_MAX_DIM) + ereport(ERROR, + (errcode(ERRCODE_ARRAY_ELEMENT_ERROR), + errmsg("A cube cannot have more than %d dimensions.", + CUBE_MAX_DIM))); dur = ARRPTR(ur); Best regards, Andrey Borodin.
Hi! On Tue, Aug 28, 2018 at 6:21 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote: > I belive cube construction from array\arrays should check size of arrays. Makes sense to me. > Also there are some unexpected cube dimensionality reduction like in cube_enlarge > if (n > CUBE_MAX_DIM) > n = CUBE_MAX_DIM; > You wanted larger cube, but got cube of another dimension. > > I think we should something like this OK, but I think cube_c_f8() and cube_c_f8_f8() also need to be revised. Also, I think this behavior should be covered by regression tests. ------ Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
> 28 авг. 2018 г., в 14:18, Alexander Korotkov <a.korotkov@postgrespro.ru> написал(а): > > OK, but I think cube_c_f8() and cube_c_f8_f8() also need to be > revised. Also, I think this behavior should be covered by regression > tests. True. Also there's one case in cube_subset. Best regards, Andrey Borodin.
Attachment
Hi! On Tue, Aug 28, 2018 at 10:30 PM Andrey Borodin <x4mmm@yandex-team.ru> wrote: > > 28 авг. 2018 г., в 14:18, Alexander Korotkov <a.korotkov@postgrespro.ru> написал(а): > > > > OK, but I think cube_c_f8() and cube_c_f8_f8() also need to be > > revised. Also, I think this behavior should be covered by regression > > tests. > True. Also there's one case in cube_subset. In general looks good for me. Personally I get tired with cube.out and cube_2.out. They are different with only few checks involving scientific notation. But all the patches touching cube regression tests should update both cube.out and cube_2.out. I propose to split scientific notation checks into separate test. I've also add check for sube_subset(). I'm going to check this patchset on Windows and commit if no objections. ------ Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
Alexander Korotkov <a.korotkov@postgrespro.ru> writes: > I'm going to check this patchset on Windows and commit if no objections. These error messages do not conform to our message style guidelines: you've copied an errdetail message as primary error message, but the rules are different for that (no complete sentences, no initial cap, no period). Using ERRCODE_ARRAY_ELEMENT_ERROR seems pretty random as well --- so far as I can see, that's generally used for cases like "this array has the wrong type of data elements". Perhaps ERRCODE_PROGRAM_LIMIT_EXCEEDED would be the best choice. regards, tom lane
On Thu, Aug 30, 2018 at 4:59 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > Alexander Korotkov <a.korotkov@postgrespro.ru> writes: > > I'm going to check this patchset on Windows and commit if no objections. > > These error messages do not conform to our message style guidelines: > you've copied an errdetail message as primary error message, but the > rules are different for that (no complete sentences, no initial cap, > no period). > > Using ERRCODE_ARRAY_ELEMENT_ERROR seems pretty random as well --- so far > as I can see, that's generally used for cases like "this array has the > wrong type of data elements". Perhaps ERRCODE_PROGRAM_LIMIT_EXCEEDED > would be the best choice. Thank you for catching this! I'll be more careful about error messages. ------ Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company
Attachment
On Thu, Aug 30, 2018 at 02:28:20PM +0300, Alexander Korotkov wrote: > In general looks good for me. Personally I get tired with cube.out > and cube_2.out. They are different with only few checks involving > scientific notation. But all the patches touching cube regression > tests should update both cube.out and cube_2.out. I propose to split > scientific notation checks into separate test. +1. -- Michael
Attachment
On 2018-Aug-30, Alexander Korotkov wrote: > Personally I get tired with cube.out > and cube_2.out. They are different with only few checks involving > scientific notation. But all the patches touching cube regression > tests should update both cube.out and cube_2.out. I propose to split > scientific notation checks into separate test. Good idea. -- Álvaro Herrera https://www.2ndQuadrant.com/ PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
On Fri, Aug 31, 2018 at 6:18 PM Alvaro Herrera <alvherre@2ndquadrant.com> wrote: > On 2018-Aug-30, Alexander Korotkov wrote: > > > Personally I get tired with cube.out > > and cube_2.out. They are different with only few checks involving > > scientific notation. But all the patches touching cube regression > > tests should update both cube.out and cube_2.out. I propose to split > > scientific notation checks into separate test. > > Good idea. Thank you for the feedback! Pushed. ------ Alexander Korotkov Postgres Professional: http://www.postgrespro.com The Russian Postgres Company