Re: Shall I use PostgreSQL Array Type in The Following Case - Mailing list pgsql-general
| From | Gauthier, Dave | 
|---|---|
| Subject | Re: Shall I use PostgreSQL Array Type in The Following Case | 
| Date | |
| Msg-id | 482E80323A35A54498B8B70FF2B87980043845F79E@azsmsx504.amr.corp.intel.com Whole thread Raw | 
| In response to | Shall I use PostgreSQL Array Type in The Following Case (Yan Cheng Cheok <yccheok@yahoo.com>) | 
| List | pgsql-general | 
I actually tackled a problem very much like this in the distant past with a different DB. I think one of the practical questionsyou have to ask is whether or not you really need all that detailed data, or would storing summarized data serve. If (for example) the components are fabricated on wafers and in "lots", you might just want your DB loading script to determine/calculatethe count, min, max, median, avg, standard deviation, 6-sigma, etc... for each parametric test (e.g."width") per wafer/lot and store that in your DB. If the end users are going to be looking at wafer/lot stats for thesemillions of pieces of data as opposed to individual chips, you'll actually be serving them better because the stat datawill be "pre-calculated" (if you will) and query out a lot faster. And you won't be pounding your DB so hard in theprocess. Something to think about ! -----Original Message----- From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of Yan Cheng Cheok Sent: Monday, January 04, 2010 8:13 PM To: pgsql-general@postgresql.org Subject: [GENERAL] Shall I use PostgreSQL Array Type in The Following Case I realize there is Array data type for PostgreSQL. http://www.postgresql.org/docs/8.1/interactive/arrays.html Currently, I need to use database to store measurement result of a semiconductor factory. They are producing semicondutor units. Every semicondutor units can have variable number of measurement parameters. I plan to design the table in the following way. SemicondutorComponent ===================== ID | Measurement ================= ID | Name | Value | SemicondutorComponent_ID Example of data : SemicondutorComponent ===================== 1 | 2 | Measurement ================= 1 | Width | 0.001 | 1 2 | Height | 0.021 | 1 3 | Thickness | 0.022 | 1 4 | Pad0_Length | 0.031 | 1 5 | Pad1_Width | 0.041 | 1 6 | Width | 0.001 | 2 7 | Height | 0.021 | 2 8 | Thickness | 0.022 | 2 9 | Pad0_Length | 0.031 | 2 10| Pad1_Width | 0.041 | 2 11| Pad2_Width | 0.041 | 2 12| Lead0_Width | 0.041 | 2 Assume a factory is producing 24 million units in 1 day SemicondutorComponent table will have 24 million rows in 1 day Assume one SemicondutorComponent unit is having 50 measurement parameters. (can be more or can be less, depending on SemicondutorComponenttype) Measurement table will have 24 * 50 million rows in 1 day Is it efficient to design that way? **I wish to have super fast write speed, and reasonable fast read speed from the database.** Or shall I make use of PostgreSQL Array facility? SemicondutorComponent ===================== ID | Array_of_measurement_name | Array_of_measurement_value -- Sent via pgsql-general mailing list (pgsql-general@postgresql.org) To make changes to your subscription: http://www.postgresql.org/mailpref/pgsql-general
pgsql-general by date: