Thread: PostgreSQL, OLAP, and Large Clusters
Hi:

The size of our database is growing rather rapidly. We're concerned about how well Postgres will scale for OLAP-style queries over terabytes of data. Googling around doesn't yield great results for vanilla Postgres in this application, but generally links to other software like Greenplum, Netezza, and Aster Data (some of which are based off of Postgres). Too, there are solutions like Stado. But I'm concerned about the amount of effort to use such solutions and what we would have to give up feature-wise.

We love a lot of the features that we currently have that (I think) would only be available with Postgres: arrays, hstore, gin/gist indexes, extensions. Not to mention a lot of the other great SQL standard features not available in other open-source databases, especially window functions and CTEs.

Of course, migration to another solution is feasible (with enough effort, of course), but given my experiences with Postgres and the support provided by the community that is second to none, I'd very much like to stay with PostgreSQL.

Thoughts?

-Ryan Kelly
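For context, a small sketch of the kind of features being leaned on here (an hstore column and an array column, each with a GIN index); the table and column names below are hypothetical, not taken from the poster's schema:

    -- hstore ships as a contrib extension
    CREATE EXTENSION IF NOT EXISTS hstore;

    CREATE TABLE events (
        id         bigserial PRIMARY KEY,
        tags       text[],     -- array column
        attributes hstore      -- key/value column
    );

    -- GIN indexes make containment queries on both columns fast
    CREATE INDEX events_attributes_idx ON events USING gin (attributes);
    CREATE INDEX events_tags_idx       ON events USING gin (tags);

    -- The kinds of queries these support:
    SELECT id FROM events WHERE attributes @> 'country=>US'::hstore;
    SELECT id FROM events WHERE tags @> ARRAY['signup'];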
On 26/09/12 23:50, Ryan Kelly wrote:
> Hi:
>
> The size of our database is growing rather rapidly. We're concerned
> about how well Postgres will scale for OLAP-style queries over terabytes
> of data. Googling around doesn't yield great results for vanilla
> Postgres in this application, but generally links to other software like
> Greenplum, Netezza, and Aster Data (some of which are based off of
> Postgres). Too, there are solutions like Stado. But I'm concerned about
> the amount of effort to use such solutions and what we would have to
> give up feature-wise.
>
> We love a lot of the features that we currently have that (I think)
> would only be available with Postgres: arrays, hstore, gin/gist indexes,
> extensions. Not to mention a lot of the other great SQL standard
> features not available in other open-source databases, especially window
> functions and CTEs.
>
> Of course, migration to another solution is feasible (with enough
> effort, of course), but given my experiences with Postgres and the
> support provided by the community that is second to none, I'd very much
> like to stay with PostgreSQL.
>
> Thoughts?
>
> -Ryan Kelly

You might find it worthwhile to consult a reputable vendor of Postgres, such as EnterpriseDB, who contribute changes back to the community, as you may have rather specialized needs beyond the capacity of these mailing lists. Also, I suspect paying them to help improve Postgres's capability to handle such large datasets would be cheaper than an Oracle licence and the costs of migration!

I don't have any specific knowledge of the current ability of Postgres to support such high loads, so I can't comment meaningfully on that. And even if I could, and it was favourable, I would still recommend getting a competent company to provide consultancy for your particular situation.

Cheers,
Gavin
On Wed, Sep 26, 2012 at 5:50 AM, Ryan Kelly <rpkelly22@gmail.com> wrote:
> Hi:
>
> The size of our database is growing rather rapidly. We're concerned
> about how well Postgres will scale for OLAP-style queries over terabytes
> of data. Googling around doesn't yield great results for vanilla
> Postgres in this application, but generally links to other software like
> Greenplum, Netezza, and Aster Data (some of which are based off of
> Postgres). Too, there are solutions like Stado. But I'm concerned about
> the amount of effort to use such solutions and what we would have to
> give up feature-wise.

If you want fastish OLAP on postgres you need to do several things.

1: Throw very fast disk arrays at it. Lots of spinners in a linux SW RAID-10, or RAID-0 if your data is easily replaceable, work wonders here.
2: Throw lots of memory at it. Memory is pretty cheap. 256G is not unusual for OLAP machines.
3: Throw fast CPUs at it. Faster CPUs, especially fewer faster cores, are often helpful.

Applied in that order you can get some pretty impressive results.

A lot of OLAP stuff needs to read hundreds of gigs at a time from the drive array. An array of 32 15kRPM drives, each reading at ~100MB/s or better, can flood your PCI bus at 3200MB/s for reads. Note that most RAID controllers aren't as fast for sequential reads on large drive arrays. Although a battery-backed cache can GREATLY improve parallel write speed, it doesn't really make a big difference for big sequential stuff and usually gets in the way here.

Use memory to cache as much as possible and to allow all your queries to do hash joins etc. in memory (crank up work_mem as needed, but be careful not to use all your memory up).

Lastly, once you've thrown lots of IO and memory at it, a faster CPU can make a bit of a difference too. But honestly I'd rather have a dual core 2GHz CPU on top of 32 spinners with 256G than a 3.6GHz 8 core CPU on top of 4 drives and 32G of RAM.
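To make the work_mem / hash-join point above concrete, here is a minimal sketch of per-session tuning for one large reporting connection; the values are assumptions to be sized against the machine's actual RAM, and the orders table is hypothetical:

    -- Per-session settings for a big OLAP connection (not server-wide
    -- defaults; work_mem applies per sort/hash node, per connection).
    SET work_mem = '512MB';
    SET maintenance_work_mem = '2GB';
    SET effective_cache_size = '200GB';   -- planner hint about OS page cache

    -- Check whether the big aggregate actually gets in-memory hash joins:
    EXPLAIN (ANALYZE, BUFFERS)
    SELECT customer_id, count(*), sum(amount)
    FROM orders
    GROUP BY customer_id;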
Hi,

On 26 September 2012 21:50, Ryan Kelly <rpkelly22@gmail.com> wrote:
> The size of our database is growing rather rapidly. We're concerned
> about how well Postgres will scale for OLAP-style queries over terabytes
> of data. Googling around doesn't yield great results for vanilla
> Postgres in this application, but generally links to other software like
> Greenplum, Netezza, and Aster Data (some of which are based off of
> Postgres). Too, there are solutions like Stado. But I'm concerned about
> the amount of effort to use such solutions and what we would have to
> give up feature-wise.

We were able to handle 2TB of data (16 cores, 96GB RAM, FC-connected SAN). Performance was good:
- up to 5sec for a simple select with multiple where conditions (2 - 25), ordered by any column (of 400 columns), returning the top 10k
- up to 15sec for executing 5 queries (simultaneously) which return the top 20 combinations of any two columns
- up to 25sec for executing 56 queries (using materialised aggregate tables)
- the aggregation job ran every 15 minutes and completed in under 2 minutes: 5mil rows -> aggregation -> 56 tables
- all queries can be executed over date ranges up to several months (monthly partitioned tables, 6 months of history)

but it was very hard to scale this solution. We have tried:
- FusionIO cards: 10 to 100 times better performance, but very hard to expand storage capacity; cooling/power issues
- AsterData: nice SQL-MR feature and analytics (decision trees, frequent items, clustering, ...); no libpq support, and you have to use JDBC or a selected ODBC driver manager
- Greenplum (winner): performance comparable to FusionIO (10 to 50 times); we were able to remove the aggregation job (because of the columnar store model); easy to port from Postgres, but could be complicated if you are a heavy PL/pgSQL user

At this time I would try:
- Postgres-XC
- Stado
- Cassandra + Solr + Hadoop (for example DataStax Enterprise)
- Birst (http://www.birst.com/)

PS. For any commercial solution it is better to buy their appliance (if you can afford it...).

--
Ondrej Ivanic
(ondrej.ivanic@gmail.com)
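The monthly partitioning Ondrej mentions would, at the time, have looked roughly like the inheritance-based sketch below; all table and column names here are made up for illustration:

    -- Parent table plus one child per month, with CHECK constraints so
    -- constraint_exclusion can skip partitions outside the queried range.
    CREATE TABLE measurements (
        recorded_at timestamptz NOT NULL,
        device_id   integer     NOT NULL,
        value       numeric
    );

    CREATE TABLE measurements_2012_09 (
        CHECK (recorded_at >= '2012-09-01' AND recorded_at < '2012-10-01')
    ) INHERITS (measurements);

    CREATE TABLE measurements_2012_10 (
        CHECK (recorded_at >= '2012-10-01' AND recorded_at < '2012-11-01')
    ) INHERITS (measurements);

    -- (An insert trigger or the ETL job routes rows into the right child.)

    -- A date-range query against the parent only scans matching partitions:
    SELECT device_id, avg(value)
    FROM measurements
    WHERE recorded_at >= '2012-09-15' AND recorded_at < '2012-10-15'
    GROUP BY device_id;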
All,

* Scott Marlowe (scott.marlowe@gmail.com) wrote:
> If you want fastish OLAP on postgres you need to do several things.
[...]

All good suggestions.

I'd recommend looking at ROLAP approaches and doing aggregations and materialized views first. It will depend on exactly what you need/are looking to do with the data, of course.

Thanks,

Stephen
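PostgreSQL itself had no built-in materialized views at this point (CREATE MATERIALIZED VIEW came later, in 9.3), so the approach Stephen describes usually meant a plain summary table refreshed by a scheduled job. A rough sketch, with hypothetical table and column names:

    -- Pre-aggregated rollup kept alongside the raw data.
    CREATE TABLE daily_sales_summary (
        sale_day    date    NOT NULL,
        product_id  integer NOT NULL,
        order_count bigint  NOT NULL,
        revenue     numeric NOT NULL,
        PRIMARY KEY (sale_day, product_id)
    );

    -- Refresh job (e.g. run from cron), rebuilding only the current day:
    BEGIN;
    DELETE FROM daily_sales_summary WHERE sale_day = current_date;
    INSERT INTO daily_sales_summary (sale_day, product_id, order_count, revenue)
    SELECT current_date, product_id, count(*), sum(amount)
    FROM sales
    WHERE sold_at >= current_date
      AND sold_at <  current_date + 1
    GROUP BY product_id;
    COMMIT;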
Obviously OLAP performance depends to some extent on what you want to do with it. My recommendation is to follow Scott's advice first (keep things simple as long as you can), and then look at supplemental approaches when you reach the limit there.
The big limit IMO is the lack of intraquery parallelism. This has implications when trying to run OLAP queries on terabytes worth of data, and so at some point you may reach the limits of what a single instance can do. At that point you can go with a proprietary vendor like Greenplum, or you can look at Postgres-XC (which is open source). Keep in mind that solving this issue has significant complexity costs, so you are usually better off tuning your existing instance first.
Best Wishes,
Chris Travers
On Thu, Sep 27, 2012 at 08:51:31AM +1000, Ondrej Ivanič wrote:
> Hi,
>
> On 26 September 2012 21:50, Ryan Kelly <rpkelly22@gmail.com> wrote:
> > The size of our database is growing rather rapidly. We're concerned
> > about how well Postgres will scale for OLAP-style queries over terabytes
> > of data. Googling around doesn't yield great results for vanilla
> > Postgres in this application, but generally links to other software like
> > Greenplum, Netezza, and Aster Data (some of which are based off of
> > Postgres). Too, there are solutions like Stado. But I'm concerned about
> > the amount of effort to use such solutions and what we would have to
> > give up feature-wise.
>
> We were able to handle 2TB of data (16 cores, 96GB RAM, FC-connected
> SAN). Performance was good:
> - up to 5sec for a simple select with multiple where conditions (2 - 25),
> ordered by any column (of 400 columns), returning the top 10k
> - up to 15sec for executing 5 queries (simultaneously) which return the
> top 20 combinations of any two columns
> - up to 25sec for executing 56 queries (using materialised aggregate tables)
> - the aggregation job ran every 15 minutes and completed in under 2 minutes:
> 5mil rows -> aggregation -> 56 tables
5mil overall, or matching your aggregation query? And is that the 2TB mentioned above? We have more than 100 times that many rows, but less data.

> - all queries can be executed over date ranges up to several months
> (monthly partitioned tables, 6 months of history)
Yeah, we have to be able to query over various date ranges.

> but it was very hard to scale this solution. We have tried:
> - FusionIO cards: 10 to 100 times better performance, but very hard to
> expand storage capacity; cooling/power issues
> - AsterData: nice SQL-MR feature and analytics (decision trees,
> frequent items, clustering, ...); no libpq support, and you have to use
> JDBC or a selected ODBC driver manager
I don't think no libpq support is a deal-breaker, but other missing features could be problematic.

> - Greenplum (winner): performance comparable to FusionIO (10 to 50
> times); we were able to remove the aggregation job (because of the columnar
> store model); easy to port from Postgres, but could be complicated if
> you are a heavy PL/pgSQL user
Not using any PL/pgSQL, but a number of other features: arrays and hstore, which I doubt (hopefully wrongly) that Greenplum supports.

> At this time I would try:
> - Postgres-XC
From what I understand, that's more of a write-scalability-oriented solution. We mostly will need read scalability. I also don't think it really handles redundancy.

> - Stado
Looks promising, sounded very promising, but it doesn't seem to be particularly active or well-documented. It also doesn't support window functions (which I could probably get by without) or CTEs (which will be trickier, but doable). I'm also not sure of how easy it is to handle node failure or adding more nodes, as it appears the number of nodes is essentially fixed.

> - Cassandra + Solr + Hadoop (for example DataStax Enterprise)
> - Birst (http://www.birst.com/)
>
> PS. For any commercial solution it is better to buy their appliance (if
> you can afford it...).
Thanks for the advice. Is it just better supported, or more performant, or...?

> --
> Ondrej Ivanic
> (ondrej.ivanic@gmail.com)

-Ryan Kelly
On Thu, Sep 27, 2012 at 08:58:05AM +1200, Gavin Flower wrote:
> On 26/09/12 23:50, Ryan Kelly wrote:
> >Hi:
> >
> >The size of our database is growing rather rapidly. We're concerned
> >about how well Postgres will scale for OLAP-style queries over terabytes
> >of data. Googling around doesn't yield great results for vanilla
> >Postgres in this application, but generally links to other software like
> >Greenplum, Netezza, and Aster Data (some of which are based off of
> >Postgres). Too, there are solutions like Stado. But I'm concerned about
> >the amount of effort to use such solutions and what we would have to
> >give up feature-wise.
> >
> >We love a lot of the features that we currently have that (I think)
> >would only be available with Postgres: arrays, hstore, gin/gist indexes,
> >extensions. Not to mention a lot of the other great SQL standard
> >features not available in other open-source databases, especially window
> >functions and CTEs.
> >
> >Of course, migration to another solution is feasible (with enough
> >effort, of course), but given my experiences with Postgres and the
> >support provided by the community that is second to none, I'd very much
> >like to stay with PostgreSQL.
> >
> >Thoughts?
> >
> >-Ryan Kelly
> >
> You might find it worthwhile to consult a reputable vendor of
> Postgres, such as EnterpriseDB, who contribute changes back to the
> community, as you may have rather specialized needs beyond the
> capacity of these mailing lists. Also, I suspect paying them to
> help improve Postgres's capability to handle such large datasets
> would be cheaper than an Oracle licence and the costs of migration!
Contacting EnterpriseDB has been on my list of things to do for a while now. I should probably get around to doing that :)

> I don't have any specific knowledge of the current ability of
> Postgres to support such high loads, so I can't comment meaningfully
> on that. And even if I could, and it was favourable, I would still
> recommend getting a competent company to provide consultancy for
> your particular situation.
>
> Cheers,
> Gavin

Thanks,

-Ryan Kelly
On Wed, Sep 26, 2012 at 03:18:16PM -0600, Scott Marlowe wrote:
> On Wed, Sep 26, 2012 at 5:50 AM, Ryan Kelly <rpkelly22@gmail.com> wrote:
> > Hi:
> >
> > The size of our database is growing rather rapidly. We're concerned
> > about how well Postgres will scale for OLAP-style queries over terabytes
> > of data. Googling around doesn't yield great results for vanilla
> > Postgres in this application, but generally links to other software like
> > Greenplum, Netezza, and Aster Data (some of which are based off of
> > Postgres). Too, there are solutions like Stado. But I'm concerned about
> > the amount of effort to use such solutions and what we would have to
> > give up feature-wise.
>
> If you want fastish OLAP on postgres you need to do several things.
>
> 1: Throw very fast disk arrays at it. Lots of spinners in a linux SW
> RAID-10, or RAID-0 if your data is easily replaceable, work wonders
> here.
> 2: Throw lots of memory at it. Memory is pretty cheap. 256G is not
> unusual for OLAP machines.
> 3: Throw fast CPUs at it. Faster CPUs, especially fewer faster cores,
> are often helpful.
What do you mean by "fewer faster cores"? Wouldn't "more faster cores" be better?

> Applied in that order you can get some pretty impressive results.
>
> A lot of OLAP stuff needs to read hundreds of gigs at a time from the
> drive array. An array of 32 15kRPM drives, each reading at ~100MB/s
> or better, can flood your PCI bus at 3200MB/s for reads. Note that
> most RAID controllers aren't as fast for sequential reads on large
> drive arrays. Although a battery-backed cache can GREATLY improve
> parallel write speed, it doesn't really make a big difference for big
> sequential stuff and usually gets in the way here.
>
> Use memory to cache as much as possible and to allow all your queries to do
> hash joins etc. in memory (crank up work_mem as needed, but be careful
> not to use all your memory up).
>
> Lastly, once you've thrown lots of IO and memory at it, a faster CPU
> can make a bit of a difference too. But honestly I'd rather have a
> dual core 2GHz CPU on top of 32 spinners with 256G than a 3.6GHz 8
> core CPU on top of 4 drives and 32G of RAM.
All of this seems like great advice.

Thanks,

-Ryan Kelly
On Wed, Sep 26, 2012 at 09:15:35PM -0400, Stephen Frost wrote:
> All,
>
> * Scott Marlowe (scott.marlowe@gmail.com) wrote:
> > If you want fastish OLAP on postgres you need to do several things.
> [...]
>
> All good suggestions.
>
> I'd recommend looking at ROLAP approaches and doing aggregations and
> materialized views first. It will depend on exactly what you need/are
> looking to do with the data, of course.
We currently do a lot of materialized views; unfortunately even that is starting to get somewhat slow. Some queries continue to be ad-hoc and probably can't be solved with a materialized approach.

> Thanks,
>
> Stephen

Thanks,

-Ryan Kelly
On 09/27/2012 12:50 PM, Ryan Kelly wrote:
> What do you mean by "fewer faster cores"? Wouldn't "more faster cores"
> be better?
>
I believe his point is that a core does not equal a CPU. Too often I've watched CPU performance meters with one core pegged and the other(s) idle, because really only one core is being used. I suspect it comes down to: if you're CPU bound, you're CPU bound, and the other core(s) can't help.
Hi,

On 28 September 2012 04:34, Ryan Kelly <rpkelly22@gmail.com> wrote:
> On Thu, Sep 27, 2012 at 08:51:31AM +1000, Ondrej Ivanič wrote:
>> - the aggregation job ran every 15 minutes and completed in under 2 minutes:
>> 5mil rows -> aggregation -> 56 tables
> 5mil overall, or matching your aggregation query? And is that the 2TB
> mentioned above? We have more than 100 times that many rows, but less
> data.

Let me explain. The ETL process imports several thousand rows every 5 minutes or so. The aggregation job runs every 15 minutes and grabs everything new since the last run, which could be up to 5 mil rows. The next step is to compute the aggregates -- 56 queries like:

    insert into mat_table1
    select attr1, attr2, count(*) from tmp_table group by attr1, attr2;

2TB was the size of the live dataset - 6 months, 30-40mil rows per month.

>> - all queries can be executed over date ranges up to several months
>> (monthly partitioned tables, 6 months of history)
> Yeah, we have to be able to query over various date ranges.

Partitioning works nicely in this case.

>> - AsterData: nice SQL-MR feature and analytics (decision trees,
>> frequent items, clustering, ...); no libpq support, and you have to use
>> JDBC or a selected ODBC driver manager
> I don't think no libpq support is a deal-breaker, but other missing
> features could be problematic.

It was for us -- we ended up with one option, which was ODBC, and unixODBC segfaulted on simple queries like 'select 1::numeric'. Aster removed many PG features (I think arrays and composite types are not supported) but they added several cool things.

>> - Greenplum (winner): performance comparable to FusionIO (10 to 50
>> times); we were able to remove the aggregation job (because of the columnar
>> store model); easy to port from Postgres, but could be complicated if
>> you are a heavy PL/pgSQL user
> Not using any PL/pgSQL, but a number of other features: arrays and
> hstore, which I doubt (hopefully wrongly) that Greenplum supports.

Anything which you can compile against 8.2 might work...

>> At this time I would try:
>> - Postgres-XC
> From what I understand, that's more of a write-scalability-oriented solution. We
> mostly will need read scalability. I also don't think it really handles
> redundancy.

Read scalability is there as well: it can use multiple nodes for select queries and push down certain operations (execute them on the nodes). Check this talk: http://www.pgcon.org/2012/schedule/events/424.en.html

Redundancy is up to you -- you can deploy as many coordinator nodes as you need. Data distribution is quite flexible, see the DISTRIBUTE BY and TO GROUP / NODE clauses (http://postgres-xc.sourceforge.net/docs/1_0/sql-createtable.html).

>> - Stado
> Looks promising, sounded very promising, but it doesn't seem to be
> particularly active or well-documented. It also doesn't support window
> functions (which I could probably get by without) or CTEs (which will be
> trickier, but doable). I'm also not sure of how easy it is to handle node
> failure or adding more nodes, as it appears the number of nodes is
> essentially fixed.

Yup, the documentation is not the best. You might have a look at pgpool-II parallel query mode (the docs are skimpy; not sure about window function and CTE support): http://pgpool.projects.postgresql.org/pgpool-II/doc/pgpool-en.html

>> PS. For any commercial solution it is better to buy their appliance (if
>> you can afford it...).
> Thanks for the advice. Is it just better supported, or more performant,
> or...?

Usually both. You get support, monitoring, performance. Some appliances do dial-home calls, so you get a support callback with concrete advice / a solution. The hardware is fine-tuned and proven: good RAID controllers, disks, 10GbE interconnects, redundant network / storage paths. You can build something like that yourself, but you are not going to save in the long run.

--
Ondrej Ivanic
(ondrej.ivanic@gmail.com)
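A sketch of the kind of incremental aggregation job Ondrej describes: grab everything newer than the last run's high-water mark and fold it into one of the aggregate tables. The watermark table, raw_events, and the id column are assumptions for illustration; only mat_table1 and the attr1/attr2/count(*) shape come from the message above:

    -- Remember how far the previous run got.
    CREATE TABLE agg_watermark (last_id bigint NOT NULL);
    INSERT INTO agg_watermark VALUES (0);

    -- One run of the 15-minute job:
    BEGIN;
    CREATE TEMP TABLE tmp_table AS
        SELECT *
        FROM raw_events
        WHERE id > (SELECT last_id FROM agg_watermark);

    INSERT INTO mat_table1
    SELECT attr1, attr2, count(*)
    FROM tmp_table
    GROUP BY attr1, attr2;
    -- ...repeated for the other aggregate tables...

    -- Advance the watermark (left unchanged if nothing new arrived).
    UPDATE agg_watermark
    SET last_id = (SELECT coalesce(max(id), agg_watermark.last_id) FROM tmp_table);
    COMMIT;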
On Thu, Sep 27, 2012 at 12:50 PM, Ryan Kelly <rpkelly22@gmail.com> wrote:
> On Wed, Sep 26, 2012 at 03:18:16PM -0600, Scott Marlowe wrote:
>> On Wed, Sep 26, 2012 at 5:50 AM, Ryan Kelly <rpkelly22@gmail.com> wrote:
>> > Hi:
>> >
>> > The size of our database is growing rather rapidly. We're concerned
>> > about how well Postgres will scale for OLAP-style queries over terabytes
>> > of data. Googling around doesn't yield great results for vanilla
>> > Postgres in this application, but generally links to other software like
>> > Greenplum, Netezza, and Aster Data (some of which are based off of
>> > Postgres). Too, there are solutions like Stado. But I'm concerned about
>> > the amount of effort to use such solutions and what we would have to
>> > give up feature-wise.
>>
>> If you want fastish OLAP on postgres you need to do several things.
>>
>> 1: Throw very fast disk arrays at it. Lots of spinners in a linux SW
>> RAID-10, or RAID-0 if your data is easily replaceable, work wonders
>> here.
>> 2: Throw lots of memory at it. Memory is pretty cheap. 256G is not
>> unusual for OLAP machines.
>> 3: Throw fast CPUs at it. Faster CPUs, especially fewer faster cores,
>> are often helpful.
> What do you mean by "fewer faster cores"? Wouldn't "more faster cores"
> be better?

If you can have, say, 32 Opteron cores at 2.2GHz each, or 8 Xeon cores at 3.3GHz each for about the same money, get the 8 faster Xeon cores, because under PostgreSQL you get one core per connection. There is no built-in parallelism to use a greater number of cores. Also, on machines with 2 or 4 sockets there are overhead costs for accessing different memory banks, so if you're never gonna have more than a handful of users / queries running at once, you're usually better off with a single-socket fast CPU with, say, 8 cores.
On Thu, Sep 27, 2012 at 11:34 AM, Ryan Kelly <rpkelly22@gmail.com> wrote:
>> At this time I would try:
>> - Postgres-XC
> From what I understand, that's more of a write-scalability-oriented solution. We
> mostly will need read scalability. I also don't think it really handles
> redundancy.
From my understanding it gets around the key read scalability issue in PostgreSQL, which is a lack of intraquery parallelism. Since components of a query can run on different storage nodes concurrently, this helps a great deal. It doesn't do the things a column store would help with but it is still a major step forward.
As for redundancy, Postgres-XC handles redundancy on the coordinator side, but on the storage node side, I believe you could use streaming replication and other standard PostgreSQL approaches to redundancy there.
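As a rough illustration of the distribution choices discussed above (Ondrej's DISTRIBUTE BY pointer and Chris's point about work running on different storage nodes), here is a sketch following the CREATE TABLE syntax in the Postgres-XC 1.0 documentation linked earlier; the tables are hypothetical and the exact clauses should be checked against those docs:

    -- Large fact table hashed across data nodes, so pieces of a scan or
    -- aggregate can run on different nodes concurrently:
    CREATE TABLE page_views (
        view_id   bigint,
        user_id   bigint,
        viewed_at timestamptz
    ) DISTRIBUTE BY HASH (user_id);

    -- Small dimension table replicated to every node so joins stay local:
    CREATE TABLE countries (
        code text PRIMARY KEY,
        name text
    ) DISTRIBUTE BY REPLICATION;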
Hope this helps,
Chris Travers