Re: Aggregation question - Mailing list pgsql-hackers

From Bruno Wolff III
Subject Re: Aggregation question
Date
Msg-id 20040207180949.GA26459@wolff.to
Whole thread Raw
In response to Aggregation question  ("Dilip Angal" <dilip_angal@yahoo.com>)
List pgsql-hackers
On Tue, Feb 03, 2004 at 21:27:48 -0800,
  Dilip Angal <dilip_angal@yahoo.com> wrote:
> Hi
>
> I have a situation that I need flexible number columns to model the business requirements. It could go up to 1000
columnsand will be a sparse matrix. One option I was considering was to have a table with col1......col1000  

I don't think this is really a hackers' topic. Probably it belongs on general,
though with the sizes you are talking about performance might be appropiate.

I would think that the natural way to reprensent a sparse matrix would be
to use a three column table with row number, column number, and cell value.
Doing aggregates on a column number will be easy.

If the data isn't really a matrix and the different cells are different
data types, you may be able to store all the values as strings and convert
to the appropiate type as needed.

> Other option I can consider is store all of them as name values  in single column as  a string
>
> option 1
> col1  | col2  |col 3 |........|col56|.....|col77|
> 10    | 2     |      |        | 4   |     |     |
> 2     |       |  4   |        |     |.... |6    |
>
> option 2
>
> "col1=10,col2=2,col56=4"
> "col1=2,col3=4,col77=6"
>
> I will have about 50Million such records in a table.
>
> I need to aggregate the column values for a given day by taking sum of all the rows
>
> "col1=12,col2=2,col3=4,col56=4,col77=6"
>
> Second option looks very elegant but aggregation is hard
> Besides, second option may have performance penalty (Especially when you have 50 Million records)
>
> Any one can give any pointers or comments on how to model this an how to aggregate it?
>
> Thanks
> Dilip
>

pgsql-hackers by date:

Previous
From: "Alex J. Avriette"
Date:
Subject: RFC: Security documentation
Next
From: "Alex J. Avriette"
Date:
Subject: RFC: Very large scale postgres support