Thread: PostgreSQL suitable?
Hi

We are developing a data warehouse of which the integration layer will start with over 100 TB of data. There are not many entities, though we probably can partition, and foremost we should use inheritance for the lab results. I just was wondering if PostgreSQL was able to cope with that. In case it depends on the modelling kind, we have not yet decided between classic ERD, anchor modelling and data vault.

Does someone have experience with such a set up?

Kind regards

Thiemo
I think with 100TB the biggest challenge will be taking backups. PG has no real incremental backups like those offered in Oracle or SQL Server.
On Tue, Dec 19, 2017 at 3:07 PM, Kellner Thiemo <thiemo.kellner@usb.ch> wrote:
Hi
We are developing a data warehouse of which the integration layer will start with over 100 TB of data. There are not many entities though we probably can partition and foremost we should use inheritance for the lab results. I just was wondering if PostgreSQL was able to cope with that. In case it depends on the modelling kind, we have not yet decided between classic ERD, anchor modelling and data vault.
Does someone have experience with such a set up?
There are a significant number of issues that happen when you try to put that much data in PostgreSQL. Remember there is a hard limit of 32TB per table.
I currently help administer an analytics environment where 400TB of data is pre-aggregated into 32TB of ready-to-serve metrics. We generally try to keep our initial point-of-entry databases to under 20TB if possible.
Nonetheless it is quite possible either using distributed add-ons like Citus, forks like Postgres-XL, or (as we did at Adjust) a mixture of data consolidation and application-level sharding.
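Application-level sharding of the kind Chris describes usually means the application itself computes which physical table a row belongs to. A rough, hypothetical sketch of that routing step (the table-name prefix and the shard count of 16 are invented; hashtext() is a PostgreSQL built-in whose output is not guaranteed stable across major versions):

    -- Hypothetical routing step: hash the business key and pick one of 16
    -- shard tables.  hashtext() can return negative values, hence abs().
    SELECT 'lab_results_' || (abs(hashtext('sample-42')) % 16) AS target_table;

The application would then issue its INSERT or SELECT directly against the returned table name.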
As a plug, I expect to deliver talks about this in various places. Stay tuned ;-)
Kind regards
Thiemo
Best Wishes,
Chris Travers
Efficito: Hosted Accounting and ERP. Robust and Flexible. No vendor lock-in.
I think that table partitioning isn't suitable yet for a large number of partitions, or for a large number of partial indexes.
This will be a challenge too.
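For readers unfamiliar with the pattern being referred to, a date-scoped partial index looks roughly like the hypothetical sketch below (table and column names are invented). The concern is that the planner has to match every query against every such predicate, which is cheap for a handful of indexes but adds up with thousands of them:

    -- Hypothetical partial index: it only covers rows matching its WHERE
    -- clause, so the planner must compare each query against each predicate.
    CREATE INDEX lab_results_2017_idx
        ON lab_results (patient_id)
        WHERE result_date >= DATE '2017-01-01'
          AND result_date <  DATE '2018-01-01';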
On 19 Dec 2017 at 15:07, "Kellner Thiemo" <thiemo.kellner@usb.ch> wrote:
Hi
We are developing a data warehouse of which the integration layer will start with over 100 TB of data. There are not many entities though we probably can partition and foremost we should use inheritance for the lab results. I just was wondering if PostgreSQL was able to cope with that. In case it depends on the modelling kind, we have not yet decided between classic ERD, anchor modelling and data vault.
Does someone have experience with such a set up?
Kind regards
Thiemo
On 19.12.2017 at 15:07, Kellner Thiemo wrote:
> Hi
>
> We are developing a data warehouse of which the integration layer will start with over 100 TB of data. There are not many entities though we probably can partition and foremost we should use inheritance for the lab results. I just was wondering if PostgreSQL was able to cope with that. In case it depends on the modelling kind, we have not yet decided between classic ERD, anchor modelling and data vault.
>
> Does someone have experience with such a set up?
>
> Kind regards
>
> Thiemo

It depends at least on the data and the workload. PG 10 contains better solutions for table partitioning; up to 1000 (maybe more) child tables aren't that much of a problem. We have customers in that range.

Regards, Andreas

--
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com
I've not tested PG10. But it's not released for production yet!
On 19 Dec 2017 at 15:48, "Andreas Kretschmer" <andreas@a-kretschmer.de> wrote:
On 19.12.2017 at 15:07, Kellner Thiemo wrote:
Hi
We are developing a data warehouse of which the integration layer will start with over 100 TB of data. There are not many entities though we probably can partition and foremost we should use inheritance for the lab results. I just was wondering if PostgreSQL was able to cope with that. In case it depends on the modelling kind, we have not yet decided between classic ERD, anchor modelling and data vault.
Does someone have experience with such a set up?
Kind regards
Thiemo
It depends at least on the data and the workload. PG 10 contains better solutions for table partitioning; up to 1000 (maybe more) child tables aren't that much of a problem.
We have customers in that range.
Regards, Andreas
--
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com
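For context, the declarative partitioning syntax Andreas is referring to (new in PG 10) looks roughly like the sketch below; the table and column names are invented for illustration:

    -- Hypothetical sketch of PostgreSQL 10 declarative range partitioning.
    CREATE TABLE lab_results (
        sample_id    bigint NOT NULL,
        result_date  date   NOT NULL,
        result_value text
    ) PARTITION BY RANGE (result_date);

    CREATE TABLE lab_results_2017
        PARTITION OF lab_results
        FOR VALUES FROM ('2017-01-01') TO ('2018-01-01');

    CREATE TABLE lab_results_2018
        PARTITION OF lab_results
        FOR VALUES FROM ('2018-01-01') TO ('2019-01-01');

    -- Rows inserted through the parent are routed to the matching partition.
    INSERT INTO lab_results VALUES (1, '2017-06-01', 'ok');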
Vincenzo Romano wrote:
> I've not tested PG10. But it's not released for production yet!

It definitely is, for a couple of months now. 10.1 (the first bugfix release) has been out for over a month.

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
PG10 is happily in production - in active use here :)

Tim Clarke
IT Director, Manifest

On 19/12/17 14:50, Vincenzo Romano wrote:
> I've not tested PG10. But it's not released for production yet!
Sorry, my bad: I confused V10 with v11.
But according to a discussion with Bruce Momjian, table partitioning in V10 is little more than syntactic sugar around old-fashioned table partitioning.
Sub-table partition selection is linear in complexity.
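The pre-10 style being compared against is inheritance plus CHECK constraints, where the planner walks every child and tries to exclude it by its constraint. A minimal, hypothetical sketch (all names invented):

    -- Old-fashioned inheritance partitioning: one CHECK constraint per child.
    CREATE TABLE measurements (taken_on date NOT NULL, reading numeric);

    CREATE TABLE measurements_2017 (
        CHECK (taken_on >= DATE '2017-01-01' AND taken_on < DATE '2018-01-01')
    ) INHERITS (measurements);

    CREATE TABLE measurements_2018 (
        CHECK (taken_on >= DATE '2018-01-01' AND taken_on < DATE '2019-01-01')
    ) INHERITS (measurements);

    SET constraint_exclusion = partition;  -- the default setting

    -- With N children the planner examines N constraints, hence the linear cost.
    EXPLAIN SELECT * FROM measurements WHERE taken_on = DATE '2017-06-01';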
On 19 Dec 2017 at 15:55, "Alvaro Herrera" <alvherre@alvh.no-ip.org> wrote:
Vincenzo Romano wrote:
> I've not tested PG10. But it's not released for production yet!
It definitely is, for a couple of months now. 10.1 (the first bugfix
release) has been out for over a month.
--
Álvaro Herrera https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services
Greetings,

* Rakesh Kumar (rakeshkumar464@mail.com) wrote:
> I think with 100TB the biggest challenge will be taking backups. PG has no real incremental backups like those offered in Oracle or SQL Server.

There are multiple solutions to doing incremental backups with PostgreSQL, so I'm not sure why you're saying that they don't exist, because that's really not accurate.

Thanks!

Stephen
Greetings,

* Vincenzo Romano (vincenzo.romano@notorand.it) wrote:
> Sorry, my bad: I confused V10 with v11.
> But according to a discussion with Bruce Momjian, table partitioning in V10 is
> little more than syntactic sugar around old-fashioned table partitioning.

Well, it's a bit more than that since there's tuple-routing, but you're right that the partition elimination is the same as it was in earlier versions and based on constraint exclusion. That said, as noted in the email you replied to, reasonable numbers of partitions aren't too bad even with the planning cost; it's when you have many thousands of partitions that you get into cases where planning time for queries is really bad.

Also as noted on this thread, PG could handle this data volume, but to be efficient there would be work to be done in normalization, aggregation, and analyzing the system to ensure you're storing and querying on the data efficiently.

Thanks!

Stephen
On 19.12.2017 at 15:58, Vincenzo Romano wrote:
> But according to a discussion with Bruce Momjian, table partitioning
> in V10 is little more than syntactic sugar around old-fashioned table
> partitioning.

There is always room for improvement, but table partitioning in PG 10 is much, much better than before. (There was an impressive talk about that: https://www.postgresql.eu/events/schedule/pgconfeu2017/session/1643-everything-wrong-with-partitioning-in-postgresql-10/ )

Regards, Andreas

--
2ndQuadrant - The PostgreSQL Support Company.
www.2ndQuadrant.com
> There are multiple solutions to doing incremental backups with
> PostgreSQL, so I'm not sure why you're saying that they don't exist,
> because that's really not accurate.

PG's incremental backup is essentially WAL logs applied to a point in time. I am talking about true incremental backup where the db backs up only the changed blocks after the last full backup. In a DW system where most of the time it is append only, it makes a huge difference in backup time.

I believe there is one tool which looks at the mtime/atime of each of the data files and takes the decision to back it up or not. Not sure how robust it is.
2017-12-19 16:48 GMT+01:00 Stephen Frost <sfrost@snowman.net>:
> Well, it's a bit more than that since there's tuple-routing, but you're
> right that the partition elimination is the same as it was in earlier
> versions and based on constraint exclusion. That said, as noted in the
> email you replied to, reasonable numbers of partitions aren't too bad
> even with the planning cost; it's when you have many thousands of
> partitions that you get into cases where planning time for queries is
> really bad.

When you have to handle a 100TB table, the number of partitions shouldn't be 10 or 12 as seen in most examples and tests. This is the same type of issue you hit with partial indexes (this is why I mentioned them earlier). Sub-table (and partial index) selection should be logarithmic or sub-linear. As long as it is linear, you'll keep hearing about a "reasonable number of partitions". One thousand partitions for a 100TB table would make for "manageable" 100GB sub-tables. I could easily be wrong, but this is an area where PG needs improvements. One could maybe think about multi-level partitioning, though.

> Also as noted on this thread, PG could handle this data volume, but to
> be efficient there would be work to be done in normalization,
> aggregation, and analyzing the system to ensure you're storing and
> querying on the data efficiently.

Normalization will grow the number of tables (and later joins) and you will very likely end up with at least one table with a "gazillion" rows. I fear normalization, provided it's really needed, would provide little help.

With tables that big I usually do "software partitioning": I make the software aware of the partition schema so it can direct the queries to the needed (sub-)tables. So I have the sub-table selection algorithm in my language of choice, with effective supporting data structures. Of course this is far from being the right solution.

--
Vincenzo Romano - NotOrAnd.IT Information Technologies
--
NON QVIETIS MARIBVS NAVTA PERITVS
Would a storage block-level incremental like ZFS work?
--
Sent from my Android device with K-9 Mail. Please excuse my brevity.
On December 19, 2017 11:12:48 AM EST, Rakesh Kumar <rakeshkumar464@mail.com> wrote:
There are multiple solutions to doing incremental backups with
PostgreSQL, so I'm not sure why you're saying that they don't exist,
because that's really not accurate.
PG's incremental backup is essentially WAL logs applied to a point in time. I am talking about true incremental backup where the db backs up only the changed blocks after the last full backup. In a DW system where most of the time it is append only, it makes a huge difference in backup time.
I believe there is one tool which looks at the mtime/atime of each of the data files and takes the decision to back it up or not. Not sure how robust it is.
Greetings Rakesh,

* Rakesh Kumar (rakeshkumar464@mail.com) wrote:
> PG's incremental backup is essentially WAL logs applied to a point in time. I am talking about true incremental backup where the db backs up only the changed blocks after the last full backup. In a DW system where most of the time it is append only, it makes a huge difference in backup time.
>
> I believe there is one tool which looks at the mtime/atime of each of the data files and takes the decision to back it up or not. Not sure how robust it is.

Yes, there are multiple such solutions including pgBackRest and barman, both of which are robust, well supported projects that provide exactly the incremental backup capability you're asking about. I can't speak for barman, but I know that there are many people using pgBackRest to do incremental backups, some on very large PG databases.

I will point out that supporting incremental backups is tricky to do correctly, which is why using an existing project that's had years to work on getting it right is really the best approach. I wouldn't recommend trying to roll your own.

Thanks!

Stephen
Greetings,

* Vincenzo Romano (vincenzo.romano@notorand.it) wrote:
> When you have to handle a 100TB table, the number of partitions
> shouldn't be 10 or 12 as seen in most examples and tests.

Probably not, but it would depend on the data and what you're doing with it. While there are some general rules-of-thumb, there's no one right answer when it comes to the size of individual partitions.

> This is the same type of issue you hit with partial indexes (this is
> why I mentioned them earlier). Sub-table (and partial index) selection
> should be logarithmic or sub-linear.

Sure, and work is being done to improve PG in that exact area, but that doesn't mean it can't handle workloads like this today; you just have to be aware of the costs associated with today's partitions.

> As long as it is linear, you'll keep hearing about a "reasonable number of
> partitions". One thousand partitions for a 100TB table would make for
> "manageable" 100GB sub-tables.

Yup, and 100GB tables are certainly large and a bit awkward but they can be managed.

> I could easily be wrong, but this is an area where PG needs improvements.

Of course, and work is being done to improve it. What's important is knowing that there's a cost to having more partitions when querying through the parent when you get to a point where you have thousands of partitions. That cost may be perfectly fine in some use-cases and in others it might not be, and instead you'd likely have to build logic into the application layer to address it. That's not ideal, which is why there's work being done to improve PG, but it's not necessarily a big deal either. Dynamic SQL is another approach.

> One could maybe think about multi-level partitioning, though.

Not sure exactly what you're referring to here, but a simple multi-level partitioning setup with PG wouldn't actually change the cost for partition elimination today.

> Normalization will grow the number of tables (and later joins) and
> you will very likely end up with at least one table with a "gazillion" rows.

Naturally, but that "gazillion" rows table would be much smaller for having the data normalized; if you don't normalize it then the gazillion-row table is a huge amount of duplicated data, making the entire system much larger than necessary.

> I fear normalization, provided it's really needed, would provide little help.

I seriously doubt that's the case. Normalization might reduce that 100TB down to 10's of TB instead, or perhaps even smaller.

> With tables that big I usually do "software partitioning".
> I make the software aware of the partition schema so it can direct the
> queries to the needed (sub-)tables.

Yes, implementing partition elimination at the application level is certainly an approach that can work with PG. Even then, ideally, you'd normalize the data so that the footprint is much smaller, making more efficient use of the storage and memory available.

Thanks!

Stephen
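One hedged sketch of the "dynamic SQL" approach mentioned above, reusing the hypothetical measurements_YYYY children from the earlier inheritance example: the function builds the child table name itself, so the planner never has to consider the other partitions.

    -- Hypothetical illustration only: route the query straight to one child table.
    CREATE OR REPLACE FUNCTION readings_for(day date)
    RETURNS SETOF measurements
    LANGUAGE plpgsql AS $$
    BEGIN
        RETURN QUERY EXECUTE format(
            'SELECT * FROM %I WHERE taken_on = $1',
            'measurements_' || to_char(day, 'YYYY'))
        USING day;
    END;
    $$;

    -- SELECT * FROM readings_for(DATE '2017-06-01');  -- touches measurements_2017 only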
Greetings,

* James Keener (jim@jimkeener.com) wrote:
> Would a storage block-level incremental like ZFS work?

This really depends on what you want out of your backups and just exactly how the ZFS filesystem is set up. Remember that any backup of PG that doesn't use PG's start/stop backup must be atomic across all tablespaces, and even then that really just allows you to bring PG back up as of the point of the snapshot. I wouldn't recommend trying to play WAL forward from that kind of a backup. If you do use pg_start/stop_backup with ZFS snapshots, and make sure to track all of the WAL that's generated between the start/stop backup and ensure it's reliably stored, etc., then they can work, but it's not simple.

Thanks!

Stephen
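As a rough sketch of the low-level API Stephen mentions (the PG 9.6/10-era non-exclusive form; the label is invented, and the filesystem snapshot itself happens outside SQL between the two calls):

    -- Start a non-exclusive base backup; the same session must stay connected
    -- and later call pg_stop_backup().
    SELECT pg_start_backup('nightly-zfs-snapshot', false, false);

    -- ... take the ZFS snapshot of the data directory and every tablespace here ...

    -- Returns the stop LSN plus the backup_label and tablespace_map contents,
    -- which must be saved alongside the snapshot; all WAL up to the stop LSN
    -- must be archived for the backup to be usable.
    SELECT * FROM pg_stop_backup(false);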
Hi.

2017-12-19 17:32 GMT+01:00 Stephen Frost <sfrost@snowman.net>:
> Probably not, but it would depend on the data and what you're doing with
> it. While there are some general rules-of-thumb, there's no one right
> answer when it comes to the size of individual partitions.

I think partitioning is done in order to make smaller tables, or to balance table size and table number. You currently have to keep in mind a limit to the number of sub-tables. This makes partitioning less effective.

> Yup, and 100GB tables are certainly large and a bit awkward but they can
> be managed.

Of course. But I would bet they are still considered "very large tables".

> Not sure exactly what you're referring to here, but a simple multi-level
> partitioning setup with PG wouldn't actually change the cost for
> partition elimination today.

It'd be a tree of tables, pointing to logarithmic selection. This is why we love tree data structures.

> Naturally, but that "gazillion" rows table would be much smaller for
> having the data normalized; if you don't normalize it then the gazillion-row
> table is a huge amount of duplicated data, making the entire system
> much larger than necessary.

Partitioning is done for other reasons than de-duplicating data. The number of rows to be analyzed would still be the same, and the indexes over those columns would still be rather large.

> I seriously doubt that's the case. Normalization might reduce that
> 100TB down to 10's of TB instead, or perhaps even smaller.

A 10x duplication factor seems very bad to me. If that's the case then normalization would provide for some improvement, while eating resources to be done. Storage nowadays isn't a big issue. Querying is.

> Yes, implementing partition elimination at the application level is
> certainly an approach that can work with PG. Even then, ideally, you'd
> normalize the data so that the footprint is much smaller, making more
> efficient use of the storage and memory available.

Normalizing can save resources that you later need to join tables back. It's a matter of trade-offs, and it strictly depends on the queries to be run, in my opinion. Effective table partitioning, instead, seems to me to be of bigger importance.

My bottom line here is: table partitioning is nice, but not ready yet for the big data where it makes the most sense.

> Thanks!

I do thank you for your acute remarks.

--
Vincenzo Romano - NotOrAnd.IT Information Technologies
--
NON QVIETIS MARIBVS NAVTA PERITVS
On Tue, Dec 19, 2017 at 10:39 AM, Stephen Frost <sfrost@snowman.net> wrote:
Greetings,
* James Keener (jim@jimkeener.com) wrote:
> Would a storage block level incremental like zfs work?
This really depends on what you want out of your backups and just
exactly how the ZFS filesystem is set up. Remember that any backup of
PG that doesn't use PG's start/stop backup must be atomic across all
tablespaces and even then that really just allows you to bring PG back
up as of that point of the snapshot. I wouldn't recommend trying to
play WAL forward from that kind of a backup. If you use do use
pg_start/stop_backup with ZFS snapshots, and make sure to track all of
the WAL that's generated between the start/stop backup and ensure it's
reliably stored, etc, then they can work, but it's not simple.
I believe that the thread started with a data warehouse use case. That might be one application where data ingestion and processing can be stopped and started in a controlled manner, as opposed to a continuously live system where changes are going to continually accumulate in the WAL.
Best regards,
Alex
Greetings,

* Vincenzo Romano (vincenzo.romano@notorand.it) wrote:
> I think partitioning is done in order to make smaller tables, or to
> balance table size and table number. You currently have to keep in mind
> a limit to the number of sub-tables. This makes partitioning less effective.

Sure, but that doesn't make PG unsuitable for such use-cases, just that you have to realize these costs and address them.

> Of course. But I would bet they are still considered "very large tables".

They're not small. :)

> It'd be a tree of tables, pointing to logarithmic selection.
> This is why we love tree data structures.

Sure, but that doesn't matter when it comes to how constraint exclusion works with PostgreSQL today: all partitions under a given parent are considered when doing partition elimination.

> Partitioning is done for other reasons than de-duplicating data.

Yes, I wasn't suggesting to use partitioning to de-duplicate data, I was suggesting to use normalization for that.

> The number of rows to be analyzed would still be the same, and the
> indexes over those columns would still be rather large.

While true, the indexes would be smaller than they would be without the data being normalized.

> A 10x duplication factor seems very bad to me.

I'm not sure why, it's not actually that uncommon.

> If that's the case then normalization would provide for some
> improvement, while eating resources to be done.

It's unclear what you're referring to here regarding 'eating resources'. Clearly there's some CPU time that's consumed when doing a join, but that's generally far cheaper than the time required to pull data off of disk because it's not in memory.

> Storage nowadays isn't a big issue. Querying is.

Sure, and if all you had to pay was the storage cost then no one would care, but you also have to pay the memory cost and that's a much bigger deal. If you have to go out to disk constantly then you've basically already lost when it comes to query time.

> Normalizing can save resources that you later need to join tables back.

You were commenting above on how normalization eats resources and now it's saving them. :) In my experience, the join cost is nothing compared to the cost to pull data off of disk because you couldn't fit it all in memory.

> It's a matter of trade-offs, and it strictly depends on the queries to
> be run, in my opinion.

I agree that it's a matter of trade-offs and certainly depends on the queries being run, though none of that information is available here.

> Effective table partitioning, instead, seems to me to be of bigger importance.

They're both important, imv.

> My bottom line here is: table partitioning is nice, but not ready yet
> for the big data where it makes the most sense.

I agree that there's more work to be done there, certainly, but it's also getting better and there are ways to work around the existing limitations.

Thanks!

Stephen
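To make the normalization argument above concrete, here is a hypothetical before/after sketch (all names invented): the repeated descriptive text on every fact row is replaced by a small integer key into a lookup table, which is where the large space savings come from.

    -- Denormalized: test name and unit repeated on billions of rows.
    CREATE TABLE lab_results_wide (
        sample_id    bigint,
        test_name    text,      -- e.g. 'Hemoglobin A1c'
        test_unit    text,      -- e.g. 'mmol/mol'
        result_value numeric,
        result_date  date
    );

    -- Normalized: the text lives once in a small lookup table.
    CREATE TABLE lab_tests (
        test_id   smallint PRIMARY KEY,
        test_name text NOT NULL,
        test_unit text NOT NULL
    );

    CREATE TABLE lab_results_fact (
        sample_id    bigint,
        test_id      smallint REFERENCES lab_tests,
        result_value numeric,
        result_date  date
    );

The fact table and its indexes shrink accordingly, at the cost of a join back to lab_tests at query time.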
Hi

Thank you all for your thoughts and the light you have shed on this subject. I am very pleased to have got such engaged feedback, and also that PostgreSQL seems to be so capable. I am quite sad that we most probably will not set up our DWH with it as the RDBMS. It is, I fear, a rather in-house political decision.

Kind regards

Thiemo