Re: Cube extension improvement, GSoC - Mailing list pgsql-hackers

From: Stas Kelvich
Subject: Re: Cube extension improvement, GSoC
Date:
Msg-id: 3AF67843-F8C3-4666-A234-39DCA33A6715@gmail.com
In response to: Re: Cube extension improvement, GSoC (Alexander Korotkov <aekorotkov@gmail.com>)
Responses: Re: Cube extension improvement, GSoC (Alexander Korotkov <aekorotkov@gmail.com>)
List: pgsql-hackers
Hi.

Thanks, Heikki, for the answer on google-melange. For some reason I didn't receive an email notification, so I only saw this answer today.

> Do you have access to a server you can use to perform those tests? (...)

Yes, I do. I maintain an MPI cluster at my university, so that is not a problem. Actually, the tests for this proposal were made on servers from that cluster. But anyway, thanks for offering help.

There is an open question about supporting different data types in the cube extension. As I understand it, we have the following ideas:

* Add cube-like operators for arrays. We already have support for arrays of any data type and any number of dimensions. If we want to use tree-like data structures for these operators, we will run into the same problems with trees and types. And we can always cast an array to a cube and use the existing cube operators (a short SQL sketch of this is after the list). Or perhaps I understand this wrongly.

* Create support for storing cube coordinates with different data types (2-, 4- and 8-byte integers; 4- and 8-byte floats). The main goal of this is reducing the index size (some rough size numbers are after the list), so in order not to break a large amount of code we can store data on disk according to the data type size (i.e. | smallint | real | real | double | double |) and, when we load it from disk or cache, cast it to float8 so the existing code keeps working. To achieve this behavior two steps should be performed:
1) Store information about the coordinate types when the index is created. A good question is where to store this data structure, but I believe it can be done.
2) Change the functions that read and write data to disk, so they can cast to/from float8 using the information from the previous step.

* Don't do cube with type support. Eventually, there are different ways of reducing R-tree size. For example, we can store relative coordinates with a dynamically sized MBR (VRMBR) instead of absolute coordinates with a fixed-size MBR. There is some evidence that this can significantly reduce the size: http://link.springer.com/chapter/10.1007/11427865_13
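
To illustrate the casting idea from the first item (just a sketch of what is already possible today; the table and column names are made up):

    CREATE EXTENSION cube;

    CREATE TABLE points (coords float8[]);
    INSERT INTO points VALUES (ARRAY[1,2,3]), (ARRAY[4,5,6]);

    -- cube(float8[]) builds a zero-volume cube from an array, so the existing
    -- cube operators and functions become usable on array data:
    SELECT coords,
           cube(coords) <@ cube(ARRAY[0,0,0], ARRAY[5,5,5]) AS inside_box,
           cube_distance(cube(coords), cube(ARRAY[0,0,0]))  AS dist_from_origin
    FROM points;

    -- A functional GiST index over the cast keeps the R-tree search path:
    CREATE INDEX points_cube_idx ON points USING gist (cube(coords));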
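
And the rough numbers behind the second item, runnable as-is; plain array element widths are used here only as a proxy, this is not the proposed on-disk layout:

    -- The same four coordinates at different widths: the payload per
    -- coordinate is 2, 4 and 8 bytes respectively (plus a fixed array header).
    SELECT pg_column_size(ARRAY[1,2,3,4]::int2[])   AS int2_bytes,
           pg_column_size(ARRAY[1,2,3,4]::float4[]) AS float4_bytes,
           pg_column_size(ARRAY[1,2,3,4]::float8[]) AS float8_bytes;

    -- For comparison, the current cube representation always stores float8:
    SELECT pg_column_size(cube(ARRAY[1,2,3,4]));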

On May 8, 2013, at 2:35 PM, Alexander Korotkov wrote:

> On Sat, May 4, 2013 at 11:19 PM, Stas Kelvich <stanconn@gmail.com> wrote:
> > I think we have at least 3 data types more or less similar to cube.
> > 1) array of ranges
> > 2) range of arrays
> > 3) 2d arrays
> > Semantically cube is closest to an array of ranges. However, an array of ranges has huge storage overhead.
> > Also we can declare cube as a domain over 2d arrays and declare operations on that domain.
>
> But what should we do when arrays in different records have different numbers of elements?
>
> We can be faced with absolutely the same situation with cube.
>
> test=# create table cube_test (v cube);
> CREATE TABLE
>
> test=# insert into cube_test values (cube(array[1,2])), (cube(array[1,2,3]));
> INSERT 0 2
>
> In order to force all cubes to have the same number of dimensions, an explicit CHECK on the table is required.
> As I remember, cube treats absent dimensions as zeros.
>
> ------
> With best regards,
> Alexander Korotkov.
>
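
P.S. The explicit CHECK mentioned above could be written with cube_dim() from the extension, for example:

    CREATE TABLE cube_test (v cube CHECK (cube_dim(v) = 3));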



