Re: Flattening a kind of 'dynamic' table - Mailing list pgsql-performance
From | Alexandre Leclerc
---|---
Subject | Re: Flattening a kind of 'dynamic' table
Date |
Msg-id | 1dc7f0e30501270943c8ad16@mail.gmail.com
In response to | Re: Flattening a kind of 'dynamic' table ("Merlin Moncure" <merlin.moncure@rcsonline.com>)
List | pgsql-performance
On Thu, 27 Jan 2005 10:44:45 -0500, Merlin Moncure
<merlin.moncure@rcsonline.com> wrote:
> Alexandre wrote:
> > Here a normal listing of design.product_department_time:
> >  product_id | department_id | req_time
> > ------------+---------------+----------
> >         906 | A             |     3000
> >         906 | C             |     3000
> >         906 | D             |     1935
> >         907 | A             |     1500
> >         907 | C             |     1500
> >         907 | D             |     4575
> >         924 | A             |     6000
> >         924 | C             |     1575
> >
> >  product_id |  a   |  c   |  d
> > ------------+------+------+------
> >         924 | 6000 | 1575 |
> >         907 | 1500 | 1500 | 4575
> >         906 | 3000 | 3000 | 1935
>
> ok, you have a couple of different options here.  The first thing that
> jumps out at me is to use arrays to cheat using arrays.
> Let's start with the normalized result set.
>
> select product_id, department_id, sum(req_time) group by product_id,
> department_id
>
> product_id | department_id | sum
> 924          a               6000
> 924          c               1575
> 907          a               1500
> [...]

Hello Merlin,

First of all, thanks for your time. Yes, this is exactly what I'm doing
right now (if I understand your point correctly). All records in
design.product_department_time are unique for each (product_id,
department_id) combination, and zero or NULL values are not possible.
This is the first listing you have.

In my query I added the sum() and GROUP BY stuff to avoid having a
listing like this:

 product_id |  a   |  c   |  d
------------+------+------+------
        906 | 3000 |      |
        906 |      | 3000 |
        906 |      |      | 1935
        907 | 1500 |      |
        907 |      | 1500 |
        907 |      |      | 4575
        924 | 6000 |      |
        924 |      | 1575 |

so that for a given product_id I have all the times by department in a
single row (the second listing I posted).

> If you absolutely must have 1 record/product, you can cheat using
> arrays:
>
> select q.product_id,
>        array_accum(q.department_id) as depts,
>        array_accum(q.req_time) as times
> from
> (
>     select product_id, department_id, sum(req_time) as req_time
>     group by product_id, department_id
> ) q
> group by q.product_id;
>
> select product_id, array_accum(department_id) sum(req_time) group by
> product_id
>
> product_id | department_id | sum
> 924          {a, c}          {1500, 1575}
> [...]

I did not use arrays because I didn't think of them, and I don't know
whether that is the most efficient way. My software will have to work
the data out itself, unless the arrays can be expanded into proper
columns. But I'm not an expert at all; I try to do good DB design, but
sometimes that makes it more complicated to work with the data.

Here are the table definitions, if they can help:

design.products (product_id serial PRIMARY KEY, ...);

prod.departments (department_id varchar(3) PRIMARY KEY, ...);

design.product_department_time (
    product_id    integer    REFERENCES design.products
                             ON DELETE CASCADE ON UPDATE CASCADE,
    department_id varchar(3) REFERENCES prod.departments
                             ON DELETE CASCADE ON UPDATE CASCADE,
    req_time      integer    NOT NULL DEFAULT 0 CHECK (req_time >= 0),
    CONSTRAINT product_department_time_pkey
        PRIMARY KEY (product_id, department_id)
);

I also have a jobs table, which attaches one product_id to each job
along with the quantity to produce, so I must show the user how much
time each job will require per department. :) This is a nice report,
but I don't want to kill the database each time a user wants to see it.

Thanks for your help so far; it will help me look for other ways of
doing this. I'm always ready to hear more!

Regards.

--
Alexandre Leclerc
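For reference, the single-row-per-product listing discussed above can also
be produced without arrays by pivoting with conditional aggregates. This is
only a minimal sketch, assuming the three department codes 'A', 'C' and 'D'
are known in advance; it is not a query from the thread itself:

-- One row per product; each department's time lands in its own column.
-- Rows where the CASE yields NULL are ignored by sum(), so missing
-- departments simply come out as NULL (blank in psql output).
SELECT product_id,
       sum(CASE WHEN department_id = 'A' THEN req_time END) AS a,
       sum(CASE WHEN department_id = 'C' THEN req_time END) AS c,
       sum(CASE WHEN department_id = 'D' THEN req_time END) AS d
  FROM design.product_department_time
 GROUP BY product_id
 ORDER BY product_id;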
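Note that array_accum is not a built-in aggregate in the PostgreSQL releases
of this era; it is the user-defined example from the PostgreSQL
documentation and has to be created before the quoted queries will run:

-- Example aggregate from the PostgreSQL docs: accumulates input values
-- into an array, starting from an empty array.
CREATE AGGREGATE array_accum (
    basetype = anyelement,
    sfunc    = array_append,
    stype    = anyarray,
    initcond = '{}'
);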
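For the per-job report, a rough sketch of the kind of query involved,
assuming hypothetical column names job_id and quantity on the jobs table
(those names are not given in the message):

-- Per-job time by department, pivoted the same way as above.
-- req_time is multiplied by the job's quantity to produce.
SELECT j.job_id,
       sum(CASE WHEN pdt.department_id = 'A' THEN pdt.req_time * j.quantity END) AS a,
       sum(CASE WHEN pdt.department_id = 'C' THEN pdt.req_time * j.quantity END) AS c,
       sum(CASE WHEN pdt.department_id = 'D' THEN pdt.req_time * j.quantity END) AS d
  FROM jobs j
  JOIN design.product_department_time pdt USING (product_id)
 GROUP BY j.job_id;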