Thread: PGSQL with high number of database rows?
Hey all,

I am possibly looking to use PostgreSQL in a project I am working on for a
very large client. The upshot of this is that the throughput of data will be
pretty massive: around 20,000 new rows per day in one of the tables. We also
have to keep this data online for a set period, so after 5 or 6 weeks it
could have nearly a million rows.

Are there any implications of doing this? Will PG handle it? Are there
real-world systems using PG that hold a massive amount of data?

All the best, and thanks for any advice up front,

Tim
On Tue, Apr 03, 2007 at 09:28:28AM +0100, Tim Perrett wrote:
> Hey all
>
> I am possibly looking to use PostgreSQL in a project I am working on for a
> very large client. The upshot of this is that the throughput of data will
> be pretty massive, around 20,000 new rows in one of the tables per day. We
> also have to keep this data online for a set period, so after 5 or 6 weeks
> it could have nearly a million rows.
>
> Are there any implications of doing this? Will PG handle it? Are there
> real-world systems using PG that have a massive amount of data in them?

This is in no way massive for PG. Many millions of rows are not a problem at
all, given that you have a proper schema and indexing, and run on reasonable
hardware (hint: it might be a bit slow on your laptop). 20,000 rows/day is
still no more than about 14 rows/minute, which is a very light load for a
server-grade machine to handle without any problem at all.

//Magnus
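[Editor's note: Magnus's back-of-the-envelope rate above can be checked in a
couple of lines; the figures here just restate his arithmetic.]

```python
# Converting the quoted insert rate: 20,000 rows/day to per-minute/per-second.
rows_per_day = 20_000
minutes_per_day = 24 * 60          # 1,440
seconds_per_day = minutes_per_day * 60

rows_per_minute = rows_per_day / minutes_per_day
rows_per_second = rows_per_day / seconds_per_day

print(f"{rows_per_minute:.1f} rows/minute")  # 13.9 rows/minute
print(f"{rows_per_second:.2f} rows/second")  # 0.23 rows/second
```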
Tim Perrett wrote:
> Hey all
>
> I am possibly looking to use PostgreSQL in a project I am working on for a
> very large client. The upshot of this is that the throughput of data will
> be pretty massive, around 20,000 new rows in one of the tables per day. We
> also have to keep this data online for a set period, so after 5 or 6 weeks
> it could have nearly a million rows.
>
> Are there any implications of doing this? Will PG handle it? Are there
> real-world systems using PG that have a massive amount of data in them?

In all honesty, that's really not that big. There are systems out there with
database sizes in the multiple-terabyte range and billions of rows. A few
million rows shouldn't cause you any issues, unless they're exceptionally
wide.

Regards, Dave.
> I am possibly looking to use PostgreSQL in a project I am working on for a
> very large client. The upshot of this is that the throughput of data will
> be pretty massive, around 20,000 new rows in one of the tables per day.
> We also have to keep this data online for a set period, so after 5 or 6
> weeks it could have nearly a million rows.
>
> Are there any implications of doing this? Will PG handle it?

What do you mean, massive? A mere 1,000,000 rows? I don't think that a small
database like this will be a worry. Try to avoid unnecessary table scans by
using indexes!

Yours,
Laurenz Albe
> Are there any implications of doing this? Will PG handle it? Are there
> real-world systems using PG that have a massive amount of data in them?

It's not how much data you have, it's how you query it. You can have a table
with 1,000 rows and be dead slow if those rows are big TEXT data and you
seq-scan the table in its entirety on every web page hit your server gets...
You can have a terabyte table with billions of rows and be fast if you know
what you're doing and have proper indexes.

Learning all this is very interesting. MySQL always seemed hostile to me, but
Postgres is friendly, has helpful error messages, the docs are great, and the
developer team is really nice.

The size of your data does not matter (unless your disk is full), but the
size of your working set does. So, if you intend to query your data for a
website, for instance, where users search the data using forms, you will need
to index it properly so that you only explore small sections of your data set
in order to be fast. If you intend to scan entire tables to generate reports
or statistics, you will be more interested in whether your RAM is larger or
smaller than your data set, and in your disk throughput.

So, what is your application?
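[Editor's note: the seq-scan-versus-index point above is easy to see in a
query plan. The sketch below uses Python's built-in sqlite3 purely so it runs
self-contained; in PostgreSQL you would use EXPLAIN on a live server the same
way. Table and column names are made up for the demo.]

```python
import sqlite3

# A small table queried by a non-indexed column.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE events (id INTEGER PRIMARY KEY, user_id INTEGER, body TEXT)"
)
conn.executemany(
    "INSERT INTO events (user_id, body) VALUES (?, ?)",
    [(i % 100, "x" * 50) for i in range(10_000)],
)

# Without an index on user_id, the planner must scan the whole table
# (the plan detail typically reads something like "SCAN events").
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM events WHERE user_id = 42"
).fetchall()
print(plan[0][-1])

# After indexing the column used in the WHERE clause, only a small slice
# of the data is touched (plan detail mentions the index).
conn.execute("CREATE INDEX events_user_idx ON events (user_id)")
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT * FROM events WHERE user_id = 42"
).fetchall()
print(plan[0][-1])
```

The same experiment against PostgreSQL (EXPLAIN before and after CREATE
INDEX) is the quickest way to check whether your working-set queries really
avoid full table scans.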
Tim,

> massive, around 20,000 new rows in one of the tables per day.

As an example, I'm doing about 4,000 inserts per minute, spread across about
1,800 tables. Pisses it in, with fsync off and the PC (IBM x3650, 1 CPU,
1 GB memory) on a UPS.

Allan
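[Editor's note: besides turning fsync off as Allan does, the usual first
lever for insert throughput is committing rows in batches rather than one
transaction per row (in PostgreSQL you would reach for COPY or multi-row
INSERTs). The sketch below shows the batching idea with Python's built-in
sqlite3 so it runs self-contained; the table and figures are illustrative,
loosely modelled on Allan's one-minute load.]

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (sensor INTEGER, value REAL)")

# Roughly one minute of Allan's load: 4,000 rows across ~1,800 "tables"
# (collapsed here into one table with a sensor column for simplicity).
rows = [(i % 1800, float(i)) for i in range(4000)]

# One transaction for the whole batch: one durable commit instead of 4,000
# per-row commits, which is where most of the insert cost usually goes.
with conn:
    conn.executemany("INSERT INTO readings VALUES (?, ?)", rows)

count = conn.execute("SELECT count(*) FROM readings").fetchone()[0]
print(count)  # 4000
```

Note that fsync = off trades durability for speed: a crash can corrupt the
database, which is why Allan mentions the UPS.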