Re: RFC: Very large scale postgres support - Mailing list pgsql-hackers
From | Keith Bottner |
---|---|
Subject | Re: RFC: Very large scale postgres support |
Date | |
Msg-id | 007f01c3ef1c$6a230ab0$7d00a8c0@juxtapose |
In response to | RFC: Very large scale postgres support ("Alex J. Avriette" <alex@posixnap.net>) |
Responses | Re: RFC: Very large scale postgres support |
List | pgsql-hackers |
Alex,

I agree that this is worth spending time on. It resembles Oracle RAC (Real Application Clusters). While other people may feel that the amount of data is unreasonable, I have a similar problem that will only be solved by such a solution.

As for how your database is designed: who cares? This is an RFC for a general discussion of how to design this level of functionality into Postgres. Ultimately, any solution would work without regard to the inserts, updates, or deletes being executed.

Alex, I think as a first step we should start putting together a feature list of what would be necessary to support this level of functionality. From there we could identify current Postgres development efforts that we could help with, as well as items that would need to be handled directly.

I am very interested in going forward with this discussion, and I believe I could have the company I work for put resources (i.e. people or money) toward developing the solution if we can come up with a workable plan.

Josh, thanks for the heads up on Clusgres; I will take a look and see how that fits.

Thanks,
Keith

-----Original Message-----
From: pgsql-hackers-owner@postgresql.org [mailto:pgsql-hackers-owner@postgresql.org] On Behalf Of Alex J. Avriette
Sent: Saturday, February 07, 2004 12:29 PM
To: pgsql-hackers@postgresql.org
Subject: [HACKERS] RFC: Very large scale postgres support

Recently I was tasked with creating a "distribution system" for postgres nodes here at work. This would allow us to simply bring up a new box, push postgres to it, and have a new database.

At the same time, we have started to approach the limits of what we can do with postgres on one machine. Our platform presently is the HP DL380. It is a reasonably fast machine, but in order to eke more performance out of postgres, we are going to have to upgrade the hardware substantially.

So the subject came up: wouldn't it be nice if, with replication and proxies, we could create postgres clusters? When we need more throughput, we could just put a new box in the cluster, dist a postgres instance to it, and tell the proxy about it. This is a very attractive idea for us from a scalability standpoint. It means that we don't have to buy $300,000 servers when we max out our 2- or 4-CPU machines (in the past, I would have suggested a Sun V880 for this database, but we are using Linux on x86).

We are left with one last option, and that is re-engineering our application to distribute load across several instances of postgres which are operating without any real knowledge of each other. I worry, though, that as our needs increase further, these application redesigns will become asymptotic.

I find myself wondering what other people are doing with postgres that this doesn't seem to have come up. When one searches for postgres clustering on Google, one finds lots of HA products. However, nobody seems to be attempting to create very high throughput clusters.

I feel that it would be a very good thing if some thinking on this subject were done. In the future, people will hopefully begin using postgres for more intense applications. We are looking at perhaps many tens of billions of transactions per day within the next year or two. To simply buy a "bigger box" each time we outgrow the one we're on is neither effective nor efficient. I simply don't believe we're the only ones pushing postgres this hard.

I understand there are many applications out there trying to achieve replication. Some of them seem fairly promising. However, it seems to me that if we want to see a true clustered database environment, there would have to be actual native support in the postmaster (inter-postmaster communication, if you will) for replication and cross-instance locking.

This is obviously a complicated problem, and probably not very many of us are doing anything near as large-scale as this. However, I am sure most of us can see the benefit of being able to provide support for these sorts of applications.

I've just submitted this RFC in the hopes that we can discuss both the best way to support very large scale databases and how to handle them presently.

Thanks again for your time.

alex

--
alex@posixnap.net
Alex J. Avriette, Solaris Systems Masseur
"I ... remain against the death penalty because I feel that eternal boredom with no hope of parole is a much worse punishment than just ending it all mercifully with that quiet needle." - Rachel Mills, NC Libertarian Gubernatorial Candidate

---------------------------(end of broadcast)---------------------------
TIP 6: Have you searched our list archives?

http://archives.postgresql.org