Thread: Up-to-date reports database
Hi guys, I'm interested in a solution that will allow our customers to run reports - which may involve complicated queries - on datawhich is as up-to-date as possible. One thing I don't want to do is to let the reporting system connect to the production database. I want the indexes in productionto be limited to what production needs, and not add indexes that are required for reports, for instance. And basically,I don't want a customer to run a complicated report and degrade the performance of my production system. A replication solution is not very good, either, because of course I can't define indexes differently, I don't want *all*transactions in all tables to be sent, and also, because I may want to cross reference data from different systems.So ideally, I want to have a reporting database, where specific tables (or maybe even just specific columns) fromvarious databases are collected, and have a reporting tool connect to this database. But I want to push the data intointo that database as close to real time as possible. The most important data I am currently considering are two tables which have an average of 7,600 transactions per hour (standarddeviation 10,000, maximum in May is 62,000 transactions per hour). There may be similar pairs of tables collectedfrom more than one database. I assume this is not an uncommon scenario. What solutions would you recommend? Herouth
On 05/23/2012 07:11 AM, Herouth Maoz wrote: > Hi guys, > > I'm interested in a solution that will allow our customers to run reports - which may involve complicated queries - ondata which is as up-to-date as possible. > > One thing I don't want to do is to let the reporting system connect to the production database. I want the indexes in productionto be limited to what production needs, and not add indexes that are required for reports, for instance. And basically,I don't want a customer to run a complicated report and degrade the performance of my production system. > > A replication solution is not very good, either, because of course I can't define indexes differently, I don't want *all*transactions in all tables to be sent, and also, because I may want to cross reference data from different systems.So ideally, I want to have a reporting database, where specific tables (or maybe even just specific columns) fromvarious databases are collected, and have a reporting tool connect to this database. But I want to push the data intointo that database as close to real time as possible. > > The most important data I am currently considering are two tables which have an average of 7,600 transactions per hour(standard deviation 10,000, maximum in May is 62,000 transactions per hour). There may be similar pairs of tables collectedfrom more than one database. > > I assume this is not an uncommon scenario. What solutions would you recommend? Hi Herouth, I would have a look at slony. It is a trigger based replication system that allows you to replicate only the tables you define and you can have different indexing on the slave. The only requirement is that each table you want to replicate has the same primary key or unique constraint on the master and slave. Other than that, you can index the tables on the slave however you want. Cheers! - Chris
On 23/05/2012, at 17:20, Chris Ernst wrote: > I would have a look at slony. It is a trigger based replication system > that allows you to replicate only the tables you define and you can have > different indexing on the slave. The only requirement is that each > table you want to replicate has the same primary key or unique > constraint on the master and slave. Other than that, you can index the > tables on the slave however you want. Thanks for the pointer. I will read up about it. Herouth
hi,
my suggestion is to redesign reporting database to fit reporting specifics (e.g. brake normal form of database, in some cases this will speed up reports). Than you can use some ETL tool to sync production and reporting. Good thing is to use some OLAP software to use multidimensional analyze - this will make queries easier (with MDX language). I think this kind of discussion is huge one :)
there are some opensource ETL and BI suits available.
Regards,
Bartek
On 23/05/2012, at 18:54, Bartosz Dmytrak wrote: > hi, > my suggestion is to redesign reporting database to fit reporting specifics (e.g. brake normal form of database, in somecases this will speed up reports). Than you can use some ETL tool to sync production and reporting. Good thing is touse some OLAP software to use multidimensional analyze - this will make queries easier (with MDX language). I think thiskind of discussion is huge one :) > > there are some opensource ETL and BI suits available. Thanks, I'll take that into consideration.
On Wed, May 23, 2012 at 5:11 PM, Herouth Maoz <herouth@unicell.co.il> wrote: > A replication solution is not very good, either, because of course I can't define indexes differently, I don't want *all*transactions in all tables to be sent, and also, because I may want to cross reference data from different systems.So ideally, I want to have a reporting database, where specific tables (or maybe even just specific columns) fromvarious databases are collected, and have a reporting tool connect to this database. But I want to push the data intointo that database as close to real time as possible. Look at PgQ from Skytools. You can queue your OLTP data changes and restore specific columns only into your OLAP database. > > The most important data I am currently considering are two tables which have an average of 7,600 transactions per hour(standard deviation 10,000, maximum in May is 62,000 transactions per hour). There may be similar pairs of tables collectedfrom more than one database. > > I assume this is not an uncommon scenario. What solutions would you recommend? > > > Herouth > -- > Sent via pgsql-general mailing list (pgsql-general@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-general -- Sergey Konoplev a database and software architect http://www.linkedin.com/in/grayhemp Jabber: gray.ru@gmail.com Skype: gray-hemp Phone: +79160686204