Re: Normalization or Performance - Mailing list pgsql-performance
| From | Richard Huxton |
|---|---|
| Subject | Re: Normalization or Performance |
| Date | |
| Msg-id | 41AF6050.9030302@archonet.com |
| In response to | Normalization or Performance (Alvaro Nunes Melo <al_nunes@atua.com.br>) |
| List | pgsql-performance |
Alvaro Nunes Melo wrote:
> Hi,
>
> Before writing this mail, I'd researched a little about this topic,
> and got some opinions from guys like Fabien Pascal, who argues that
> logical design should be separated from physical design, and other
> sources. As this is not settled fact, I'm writing to you guys, who make
> things work in the real world.

I believe he's right. Or at least that you should only compromise your
logical design once it becomes absolutely necessary due to physical
limitations.

> We started our first big (by our company's standards) project always
> thinking of normalization. But once we imported legacy data into the
> DB, things got harder.
>
> One example is the client status. A client might be active, inactive
> or pending (for many reasons). We store every status a client has had
> since it entered the system. To check the current status of a client,
> we get the last status from this historical status table. This takes
> considerable time, so our best results were achieved by building a
> function that checks the status and indexing this function. The
> problem is that indexed functions must be immutable, so as you can
> imagine, if the status changes after the creation of the index, the
> return of the function is still the same.
>
> What do you suggest for situations like this? Should I add a field to
> the clients table to store the current status, keep storing data in
> the historical table, and control its changes with a trigger?

Trigger + history table is a common solution, it's easy to implement and
there's nothing non-relational about it as a solution.

> There are other situations that are making things difficult for us.
> For example, one query must return the total amount a client bought
> in the last 90 days. It's taking too long when we must compute it for
> many clients, many times. So should I create summary tables to store
> this kind of data, update them with a trigger on a daily basis (for
> example), and solve this problem with one join?
One solution I use for this sort of thing is a summary table grouped by
date, accurate up until the start of today. Then I check both the summary
table and the "live" table for today's information and sum the two.

> Our database is not that big. The largest table has about 7,000,000
> rows. About 50,000 clients, half of them active. Everything I pointed
> out above uses indexes for queries, etc. But even with this it's not
> been fast enough. We have a Dell server for this (I know, the Dell
> issue), a Dual Xeon 2.8, SCSI HD, 1 GB mem. Do we need better
> hardware for our system?

Swap one of your processors for more RAM and disks, perhaps.

--
  Richard Huxton
  Archonet Ltd
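[Editor's note] The trigger-plus-history-table suggestion above could be sketched as below. This is illustrative only: the table and column names (`clients`, `client_status_history`, `client_id`, `status`, `current_status`) are assumptions, not Alvaro's actual schema.

```sql
-- Sketch only; all object names are assumptions.
-- Denormalised copy of the latest status, maintained by a trigger.
ALTER TABLE clients ADD COLUMN current_status char(1);

CREATE OR REPLACE FUNCTION update_current_status() RETURNS trigger AS '
BEGIN
    -- Mirror the newly inserted history row onto the client record
    UPDATE clients
       SET current_status = NEW.status
     WHERE client_id = NEW.client_id;
    RETURN NEW;
END;
' LANGUAGE plpgsql;

CREATE TRIGGER status_history_ins
    AFTER INSERT ON client_status_history
    FOR EACH ROW EXECUTE PROCEDURE update_current_status();
```

Reads of the current status then become a plain indexed lookup on `clients`, and the mutable-state problem with the indexed function disappears, since the history table remains the authoritative record.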
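[Editor's note] The summary-table-plus-live-rows approach described above might look like the following sketch. The schema (`sales_summary`, `sales`, `amount`, `sale_time`) is hypothetical; assume a daily job or trigger keeps `sales_summary` accurate up to the start of today.

```sql
-- Sketch only; names and schema are assumptions.
CREATE TABLE sales_summary (
    client_id  integer NOT NULL,
    sale_date  date    NOT NULL,
    total      numeric NOT NULL,
    PRIMARY KEY (client_id, sale_date)
);

-- Total bought in the last 90 days for one client:
-- summarised history, plus today's rows from the live table.
SELECT coalesce(sum(t.total), 0) AS last_90_days
  FROM (
        SELECT total
          FROM sales_summary
         WHERE client_id = 123
           AND sale_date >= current_date - 90
           AND sale_date <  current_date
        UNION ALL
        SELECT amount
          FROM sales
         WHERE client_id = 123
           AND sale_time >= current_date
       ) AS t;
```

The point of the split is that the expensive aggregation over millions of historical rows is done once per day, while the query only scans today's small slice of the live table.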