Thread: How to improve performance in reporting database?
I have a daily job that pushes data from the production database into
the reporting database, which right now is an exact copy.

I have a webapp that builds lots of reports for users. Most of these
reports involve elaborate joins of lookup tables and lots of summations,
and they take too long to run, even after using everything I know to
tune the queries.

Since I know this is read-only data, it seems like I should be able to
speed everything up dramatically if I run the queries offline and then
save the results into new tables. The web app could then just grab the
cached results out of these new tables and spit them out quickly.

I've heard people talking about using "materialized views" for this, but
that was with Oracle. What's the PostgreSQL way here?

More generally, any advice on running reporting databases well is
welcome.

Matt
Matthew Wilson wrote:
> I've heard people talking about using "materialized views" for this,
> but that was with Oracle.

You can build those manually with PostgreSQL if you really want them:
http://wiki.postgresql.org/wiki/Materialized_Views

The fundamental architecture is sound for a lot of problems in this
area; you just have to figure out how to build them efficiently. In your
case, you might consider whether there's a way to update the MV in a
batch after each data import, rather than relying on triggers to keep
the data up to date.

--
Greg Smith  2ndQuadrant US  Baltimore, MD
PostgreSQL Training, Services and Support
greg@2ndQuadrant.com   www.2ndQuadrant.us
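To make the batch-refresh idea concrete, here is a minimal sketch of the pattern using Python's built-in sqlite3 module as a stand-in for PostgreSQL (all table and column names are invented for illustration): after each nightly import, the summary table is emptied and refilled in one transaction, instead of being maintained row-by-row with triggers.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (region TEXT, amount REAL);
CREATE TABLE sales_summary (region TEXT, total REAL, n INTEGER);
INSERT INTO sales VALUES ('east', 10), ('east', 20), ('west', 5);
""")

def refresh_sales_summary(conn):
    # One batch rebuild after each import: empty the summary table and
    # refill it from the base data, all inside a single transaction, so
    # report queries only ever see a complete summary.
    with conn:
        conn.execute("DELETE FROM sales_summary")
        conn.execute("""
            INSERT INTO sales_summary
            SELECT region, SUM(amount), COUNT(*)
            FROM sales GROUP BY region
        """)

refresh_sales_summary(conn)
print(conn.execute(
    "SELECT region, total FROM sales_summary ORDER BY region").fetchall())
# [('east', 30.0), ('west', 5.0)]
```

The web app then reads only from sales_summary, which is a plain, indexable table.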
On Thu, Jul 22, 2010 at 8:45 AM, Matthew Wilson <matt@tplus1.com> wrote:
> I have a daily job that pushes data from the production database into
> the reporting database, which right now is an exact copy.
>
> I have a webapp that builds lots of reports for users. Most of these
> reports involve elaborate joins of lookup tables and lots of
> summations, and they take too long to run, even after using everything
> I know to tune the queries.
>
> Since I know this is read-only data, it seems like I should be able to
> speed everything up dramatically if I run the queries offline and then
> save the results into new tables. Then the web app could just grab the
> cached results out of these new tables and spit them out quickly.
>
> I've heard people talking about using "materialized views" for this,
> but that was with Oracle.
>
> What's the PostgreSQL way here?

http://tech.jonathangardner.net/wiki/PostgreSQL/Materialized_Views

They're kinda roll-your-own, but they're not that hard to do.

> More generally, any advice on running reporting databases well is
> welcome.

Throw more drives and RAM at the problem, and use materialized views.
Also, for reporting servers you're often better off with fewer, faster
CPUs than with more, slower ones (the opposite of OLTP, where the number
of cores is far more important).
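A common refinement of the roll-your-own approach, sketched here with Python's sqlite3 module standing in for PostgreSQL (names invented for illustration), is to build the replacement summary off to the side and swap it in with a rename, so report queries never see a half-built table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE hits (url TEXT);
INSERT INTO hits VALUES ('/a'), ('/a'), ('/b');
CREATE TABLE hit_counts AS
    SELECT url, COUNT(*) AS n FROM hits GROUP BY url;
""")

def refresh_hit_counts(conn):
    # Build the new summary under a temporary name, then drop the old
    # one and rename the new one into place, all in one transaction.
    with conn:
        conn.execute("DROP TABLE IF EXISTS hit_counts_new")
        conn.execute("""
            CREATE TABLE hit_counts_new AS
            SELECT url, COUNT(*) AS n FROM hits GROUP BY url
        """)
        conn.execute("DROP TABLE hit_counts")
        conn.execute("ALTER TABLE hit_counts_new RENAME TO hit_counts")

conn.execute("INSERT INTO hits VALUES ('/b')")
refresh_hit_counts(conn)
print(conn.execute(
    "SELECT url, n FROM hit_counts ORDER BY url").fetchall())
# [('/a', 2), ('/b', 2)]
```

The same drop-and-rename trick works in PostgreSQL, with the usual caveat that any indexes on the summary table have to be recreated as part of the refresh.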
On Thu, Jul 22, 2010 at 2:31 PM, Greg Smith <greg@2ndquadrant.com> wrote:
> You can build those manually with PostgreSQL if you really want them:
> http://wiki.postgresql.org/wiki/Materialized_Views

Another thing to consider... In our case we use a materialized view to
keep track of counts of various things that are expensive to count, such
as the number of members of a specific account, or the number of times a
particular URL was viewed. What we end up with is many processes trying
to update the same counter row, and lots and lots of lock contention.

Which reminds me... Greg, are you done with your book yet, and when can
I get you back in here to help with my locking problems? ;-)
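For what it's worth, one standard way around that hot-counter contention is to stop having every process update the same row: each writer appends its own delta row, and a single periodic job folds the deltas into the counter table. A sketch of that accumulator pattern, again using Python's sqlite3 as a stand-in (the ON CONFLICT upsert is SQLite syntax, and all names are invented; in PostgreSQL you would fold the deltas with an equivalent UPDATE-then-INSERT):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE page_view_deltas (url TEXT, delta INTEGER);
CREATE TABLE page_view_counts (url TEXT PRIMARY KEY, views INTEGER);
""")

def record_view(conn, url):
    # Each writer appends its own row; no two writers ever touch the
    # same row, so there is nothing to serialize on.
    conn.execute("INSERT INTO page_view_deltas VALUES (?, 1)", (url,))

def rollup(conn):
    # One periodic job aggregates the deltas into the counter table
    # and clears them, all in a single transaction.
    with conn:
        conn.execute("""
            INSERT INTO page_view_counts (url, views)
            SELECT url, SUM(delta) FROM page_view_deltas GROUP BY url
            ON CONFLICT(url) DO UPDATE SET views = views + excluded.views
        """)
        conn.execute("DELETE FROM page_view_deltas")

for _ in range(3):
    record_view(conn, '/home')
record_view(conn, '/about')
rollup(conn)
print(conn.execute(
    "SELECT url, views FROM page_view_counts ORDER BY url").fetchall())
# [('/about', 1), ('/home', 3)]
```

The counter reads slightly stale between rollups, but the per-row lock contention disappears.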