Re: Flattening a kind of 'dynamic' table - Mailing list pgsql-performance
| From | Merlin Moncure |
|---|---|
| Subject | Re: Flattening a kind of 'dynamic' table |
| Date | |
| Msg-id | 6EE64EF3AB31D5448D0007DD34EEB3412A75F0@Herge.rcsinc.local |
| In response to | Flattening a kind of 'dynamic' table (Alexandre Leclerc <alexandre.leclerc@gmail.com>) |
| Responses | Re: Flattening a kind of 'dynamic' table |
| List | pgsql-performance |
Alexandre wrote:
> On Thu, 27 Jan 2005 10:44:45 -0500, Merlin Moncure
> <merlin.moncure@rcsonline.com> wrote:
> > Alexandre wrote:
> > ok, you have a couple of different options here. The first thing
> > that jumps out at me is to cheat using arrays. Let's start with the
> > normalized result set.
> >
> > select product_id, department_id, sum(req_time)
> > from design.product_department_time
> > group by product_id, department_id;
> >
> > product_id | department_id | sum
> >        924 | a             | 6000
> >        924 | c             | 1575
> >        907 | a             | 1500
> > [...]
>
> Hello Merlin,
>
> First of all, thanks for your time. Yes, this is exactly what I'm
> doing right now (if I understand your point correctly). All records
> in design.product_department_time are unique for each (product_id,
> department_id) combo and zero/null values are not possible. This is
> the first listing you have.

Right. Expanding departments into columns is basically a dead end.
First of all, SQL is not really designed to do this, and second of all
(comments continued below the table):

> product_id |  a   |  c   |  d
> ------------+------+------+------
>         906 | 3000 |      |
>         906 |      | 3000 |
>         906 |      |      | 1935
>         907 | 1500 |      |
>         907 |      | 1500 |
>         907 |      |      | 4575
>         924 | 6000 |      |
>         924 |      | 1575 |

The above table is more expensive to group than the normalized version
because it is much, much longer. This will get worse and worse as you
add more departments. So, whatever you end up doing, I'd advise
against expanding rows from a table into columns of a result except
for very, very special cases. This is not one of those cases.

> I did not use arrays because I didn't think about it, but I don't
> know if this is still the most efficient way. My software will have
> to work out the data, unless the array expands into proper columns.
> But I'm not an expert at all. I try to do good DB design, but
> sometimes that makes it more complicated to work with the data.

Arrays are a quick'n'dirty way to de-normalize a result set. In my
opinion, de-normalization is OK for result sets *only*. Generally, it
is inappropriate to de-normalize any persistent object in the
database, such as a view or (especially) a table. De-normalizing sets
can sometimes simplify client-side coding or provide a performance
benefit at the query stage (or slow it down, so be careful!).

> And I also have a jobs table which has one product_id attached to
> each job with the required quantity to produce. So I must show the
> user how much time this will require per department for each job. :)
> This is a nice report, but I don't want to kill the database each
> time the user wants to see it.

You always have the option to do this in code. This basically means
ordering the result set and writing a nested loop to pass over the
data. If you happen to be using a report engine (and it sounds like
you are), some engines can simplify this via a grouping criterion,
some can't.

If parsing an array string is a pain, I happen to have a C++ class
handy that can compose/decompose a PostgreSQL array string, provided:
a. it has no more than one dimension, and
b. the array bounds are known.

Let me know if you need it and I'll send it over.

Merlin
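
For reference, the array trick described above can be sketched as
follows. This is a minimal sketch, assuming PostgreSQL 9.0 or later
(array_agg with ORDER BY inside the aggregate); the thread predates
array_agg, where a custom array_accum aggregate or an ARRAY(subselect)
filled the same role:

-- One row per product; department ids and times collapsed into
-- parallel arrays. This de-normalizes only the result set; the
-- underlying table stays normalized.
select product_id,
       array_agg(department_id order by department_id) as departments,
       array_agg(req_time      order by department_id) as req_times
from design.product_department_time
group by product_id;

Because each (product_id, department_id) pair is unique, no inner sum
is needed; the client walks the two arrays in parallel instead of
asking the server to pivot rows into columns.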
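
Likewise, the per-job report can stay normalized and be grouped
client-side with the nested loop Merlin suggests. A sketch, assuming
hypothetical job_id and quantity columns on the jobs table (the thread
names the table but not its columns):

-- Total time per job and department: the job's quantity times the
-- per-unit department time. Ordered so a client-side nested loop can
-- break on job_id to group the report rows.
select j.job_id,
       pdt.department_id,
       j.quantity * sum(pdt.req_time) as total_req_time
from jobs j
join design.product_department_time pdt using (product_id)
group by j.job_id, j.quantity, pdt.department_id
order by j.job_id, pdt.department_id;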