Thread: Splitting queries across servers
Hello, Our postgresql database is getting too big to be handled by one server. We need the database to be in RAM and cannot afford swapping. At the moment, we're using only 3GB or RAM, however our business growth is going to drive this number into the double digits zone, maybe triple digits. What are our options ? I am a little bit ignorant in this part of RDBM. Can we distribute our tables across multiple servers (read server+postgres) and perform distributed SQL queries ? If so, how does that work ? I am totally resourceless! Thank you in advance Max.
-----BEGIN PGP SIGNED MESSAGE----- Hash: SHA1 Wow. Your needs really *are* demanding. You might need to look into a high-end server to meet those kinds of needs - Sun has some *very* expensive hardware that can handle over 1/2TB of RAM, for example, and Cray makes supercomputers (even more *very* expensive) which can handle somewhere around 32 TB of RAM... You're sure you can't possibly put up with even a small amount of swapping? On Jan 26, 2005, at 11:00 AM, Max wrote: > Hello, > > Our postgresql database is getting too big to be handled by one > server. We > need the database to be in RAM and cannot afford swapping. At the > moment, > we're using only 3GB or RAM, however our business growth is going to > drive > this number into the double digits zone, maybe triple digits. > > What are our options ? > > I am a little bit ignorant in this part of RDBM. Can we distribute our > tables across multiple servers (read server+postgres) and perform > distributed SQL queries ? If so, how does that work ? I am totally > resourceless! > > Thank you in advance > > Max. > > > > ---------------------------(end of > broadcast)--------------------------- > TIP 9: the planner will ignore your desire to choose an index scan if > your > joining column's datatypes do not match > > - ----------------------------------------------------------- Frank D. Engel, Jr. <fde101@fjrhome.net> $ ln -s /usr/share/kjvbible /usr/manual $ true | cat /usr/manual | grep "John 3:16" John 3:16 For God so loved the world, that he gave his only begotten Son, that whosoever believeth in him should not perish, but have everlasting life. $ -----BEGIN PGP SIGNATURE----- Version: GnuPG v1.2.4 (Darwin) iD8DBQFB98Wx7aqtWrR9cZoRAtYNAJ0SHP+Y65TasCxvUxd9RjTXSJqyOgCdEi6X VfX0V8TKg4JEPPy6eJyYPVM= =zAYr -----END PGP SIGNATURE----- ___________________________________________________________ $0 Web Hosting with up to 120MB web space, 1000 MB Transfer 10 Personalized POP and Web E-mail Accounts, and much more. Signup at www.doteasy.com
We configured a slony1 cluster with a master and slave (both 7.4.6), and used the slave to serve read-only queries thereby offloading some of the work from the master database. This worked well for us. You could also take a look at pg_pool to distribute your load - but I haven't actually used it, so can't give any advice... Hope that helps. John Sidney-Woollett Max wrote: > Hello, > > Our postgresql database is getting too big to be handled by one server. We > need the database to be in RAM and cannot afford swapping. At the moment, > we're using only 3GB or RAM, however our business growth is going to drive > this number into the double digits zone, maybe triple digits. > > What are our options ? > > I am a little bit ignorant in this part of RDBM. Can we distribute our > tables across multiple servers (read server+postgres) and perform > distributed SQL queries ? If so, how does that work ? I am totally > resourceless! > > Thank you in advance > > Max. > > > > ---------------------------(end of broadcast)--------------------------- > TIP 9: the planner will ignore your desire to choose an index scan if your > joining column's datatypes do not match
> > Wow. Your needs really *are* demanding. You might need to look into a > high-end server to meet those kinds of needs - Sun has some *very* > expensive hardware that can handle over 1/2TB of RAM, for example, and > Cray makes supercomputers (even more *very* expensive) which can handle > somewhere around 32 TB of RAM... Well, that's too much money. We'd like to stay away from these options. > > You're sure you can't possibly put up with even a small amount of > swapping? Nop. Too many users. Xmas shopping brought us down for a few minutes, and the rest of the time we were too slow. > > > On Jan 26, 2005, at 11:00 AM, Max wrote: > > > Hello, > > > > Our postgresql database is getting too big to be handled by one > > server. We > > need the database to be in RAM and cannot afford swapping. At the > > moment, > > we're using only 3GB or RAM, however our business growth is going to > > drive > > this number into the double digits zone, maybe triple digits. > > > > What are our options ? > > > > I am a little bit ignorant in this part of RDBM. Can we distribute our > > tables across multiple servers (read server+postgres) and perform > > distributed SQL queries ? If so, how does that work ? I am totally > > resourceless! > > > > Thank you in advance > > > > Max. > > > > > > > > ---------------------------(end of > > broadcast)--------------------------- > > TIP 9: the planner will ignore your desire to choose an index scan if > > your > > joining column's datatypes do not match > > > > > - ----------------------------------------------------------- > Frank D. Engel, Jr. <fde101@fjrhome.net> > > $ ln -s /usr/share/kjvbible /usr/manual > $ true | cat /usr/manual | grep "John 3:16" > John 3:16 For God so loved the world, that he gave his only begotten > Son, that whosoever believeth in him should not perish, but have > everlasting life. > $ > -----BEGIN PGP SIGNATURE----- > Version: GnuPG v1.2.4 (Darwin) > > iD8DBQFB98Wx7aqtWrR9cZoRAtYNAJ0SHP+Y65TasCxvUxd9RjTXSJqyOgCdEi6X > VfX0V8TKg4JEPPy6eJyYPVM= > =zAYr > -----END PGP SIGNATURE----- > > > > ___________________________________________________________ > $0 Web Hosting with up to 120MB web space, 1000 MB Transfer > 10 Personalized POP and Web E-mail Accounts, and much more. > Signup at www.doteasy.com > > > ---------------------------(end of broadcast)--------------------------- > TIP 9: the planner will ignore your desire to choose an index scan if your > joining column's datatypes do not match >
> > We configured a slony1 cluster with a master and slave (both 7.4.6), and > used the slave to serve read-only queries thereby offloading some of the > work from the master database. > > This worked well for us. > > You could also take a look at pg_pool to distribute your load - but I > haven't actually used it, so can't give any advice... thanks, I am going to look into this. > > Hope that helps. > > John Sidney-Woollett > > Max wrote: > > Hello, > > > > Our postgresql database is getting too big to be handled by one > server. We > > need the database to be in RAM and cannot afford swapping. At > the moment, > > we're using only 3GB or RAM, however our business growth is > going to drive > > this number into the double digits zone, maybe triple digits. > > > > What are our options ? > > > > I am a little bit ignorant in this part of RDBM. Can we distribute our > > tables across multiple servers (read server+postgres) and perform > > distributed SQL queries ? If so, how does that work ? I am totally > > resourceless! > > > > Thank you in advance > > > > Max. > > > > > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 9: the planner will ignore your desire to choose an index > scan if your > > joining column's datatypes do not match >
With a Quad Opteron (4 memory slots per CPU), you could put 64GB of RAM onto a single machine using 4GB DIMMs in every slot. The other option is to explore static memory storage. It's probably too expensive to put your entire DB onto such a device but moving just the WAL there would give you a pretty decent jump in write performance. (I ran some tests way back simulating this by using a RAMDISK to store the WAL files and got ~60% increase in an update-heavy mix.) Max wrote: > Hello, > > Our postgresql database is getting too big to be handled by one server. We > need the database to be in RAM and cannot afford swapping. At the moment, > we're using only 3GB or RAM, however our business growth is going to drive > this number into the double digits zone, maybe triple digits. > > What are our options ? > > I am a little bit ignorant in this part of RDBM. Can we distribute our > tables across multiple servers (read server+postgres) and perform > distributed SQL queries ? If so, how does that work ? I am totally > resourceless! > > Thank you in advance > > Max. > > > > ---------------------------(end of broadcast)--------------------------- > TIP 9: the planner will ignore your desire to choose an index scan if your > joining column's datatypes do not match >
Isn't putting the WAL in memory dangerous in case of a power failure? I would think that RI would be compromised. -----Original Message----- From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of William Yu Sent: Thursday, January 27, 2005 12:41 PM To: pgsql-general@postgresql.org Subject: Re: [GENERAL] Splitting queries across servers With a Quad Opteron (4 memory slots per CPU), you could put 64GB of RAM onto a single machine using 4GB DIMMs in every slot. The other option is to explore static memory storage. It's probably too expensive to put your entire DB onto such a device but moving just the WAL there would give you a pretty decent jump in write performance. (I ran some tests way back simulating this by using a RAMDISK to store the WAL files and got ~60% increase in an update-heavy mix.)
Very dangerous. I only did it as a test to guestimate what kind of performance I might get if I got a static ram hard drive to put the WAL onto. Dann Corbit wrote: > Isn't putting the WAL in memory dangerous in case of a power failure? > I would think that RI would be compromised. > > -----Original Message----- > From: pgsql-general-owner@postgresql.org > [mailto:pgsql-general-owner@postgresql.org] On Behalf Of William Yu > Sent: Thursday, January 27, 2005 12:41 PM > To: pgsql-general@postgresql.org > Subject: Re: [GENERAL] Splitting queries across servers > > With a Quad Opteron (4 memory slots per CPU), you could put 64GB of RAM > onto a single machine using 4GB DIMMs in every slot. > > The other option is to explore static memory storage. It's probably too > expensive to put your entire DB onto such a device but moving just the > WAL there would give you a pretty decent jump in write performance. (I > ran some tests way back simulating this by using a RAMDISK to store the > WAL files and got ~60% increase in an update-heavy mix.)
> > > With a Quad Opteron (4 memory slots per CPU), you could put 64GB of RAM > onto a single machine using 4GB DIMMs in every slot. We were talking about a similar solution today, but not quite as good as this. You really got me thinking now. 64Gb.. hm that would solve many issues. what do you think the cost would be for such a beast ? Any idea ? Max
On Thu, 27 Jan 2005 20:17:29 -0800, Max <maxdl@adelphia.net> wrote: > > > > > > With a Quad Opteron (4 memory slots per CPU), you could put 64GB of RAM > > onto a single machine using 4GB DIMMs in every slot. > > We were talking about a similar solution today, but not quite as good as > this. You really got me thinking now. 64Gb.. hm that would solve many > issues. what do you think the cost would be for such a beast ? Any idea ? Depending on the vendor and the storage subsystem, between $20k and $50k, and if you start with less memory it will be much less. I can attest to the fact that 4 x Opterons work VERY nicely. ;) > > Max > > ---------------------------(end of broadcast)--------------------------- > TIP 8: explain analyze is your friend > -- Mike Rylander mrylander@gmail.com GPLS -- PINES Development Database Developer http://open-ils.org
Without memory, the prices are roughly: 4x846 = $5500 4x848 = $6500 4x850 = $8000 Memory costs would be: 16GB (1GB DIMMs) = $3000 32GB (2GB DIMMs) = $7500 64GB (4GB DIMMs) = $24000 128GB (8GB DIMMs) = $60000 The prices on the 4GB & 8GB DIMMs are bleeding edge of course. I see a wide range of for them on Froogle. You will want your vendor of course to pick memory is proper for the MB and pay whatever premium they'll charge ontop of it. (Looking at the Tyan website, they only list 2GB DIMMs as supported for their Quad Opteron MBs but whether that's because those specs were done before >= 4GB DIMMs were available or an actual hardware limitation, I dunno.) Max wrote: >> >>With a Quad Opteron (4 memory slots per CPU), you could put 64GB of RAM >>onto a single machine using 4GB DIMMs in every slot. > > > We were talking about a similar solution today, but not quite as good as > this. You really got me thinking now. 64Gb.. hm that would solve many > issues. what do you think the cost would be for such a beast ? Any idea ? > > Max > > > > > ---------------------------(end of broadcast)--------------------------- > TIP 8: explain analyze is your friend >
Suppose that you currently need 16 GB to cache everything now. I would install (perhaps) 32 GB ram for the initial configuration. The price of memory drops exponentially, and so waiting for the price to drop will give a much lower expense for the cost of the RAM. The reason to double the ram is the expense of upgrading in terms of labor and downtime for the computer. That can be very significant. So if we double the ram, that should give one or (hopefully) two years safety margin. If the database is expected to grow exponentially fast, then that is another issue. In such a case, if it can be cost justified, put on the largest memory volume that is possible given your financial limitations. In any case, plan to completely replace everything (CPU, memory, disk) every 5 years or less. All the value is really in the data. Because hardware increases compute power exponentially over time, that also means that the value of the hardware you bought decreases exponentially over time. -----Original Message----- From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org] On Behalf Of William Yu Sent: Friday, January 28, 2005 11:36 AM To: pgsql-general@postgresql.org Subject: Re: [GENERAL] Splitting queries across servers Without memory, the prices are roughly: 4x846 = $5500 4x848 = $6500 4x850 = $8000 Memory costs would be: 16GB (1GB DIMMs) = $3000 32GB (2GB DIMMs) = $7500 64GB (4GB DIMMs) = $24000 128GB (8GB DIMMs) = $60000 The prices on the 4GB & 8GB DIMMs are bleeding edge of course. I see a wide range of for them on Froogle. You will want your vendor of course to pick memory is proper for the MB and pay whatever premium they'll charge ontop of it. (Looking at the Tyan website, they only list 2GB DIMMs as supported for their Quad Opteron MBs but whether that's because those specs were done before >= 4GB DIMMs were available or an actual hardware limitation, I dunno.) Max wrote: >> >>With a Quad Opteron (4 memory slots per CPU), you could put 64GB of RAM >>onto a single machine using 4GB DIMMs in every slot. > > > We were talking about a similar solution today, but not quite as good as > this. You really got me thinking now. 64Gb.. hm that would solve many > issues. what do you think the cost would be for such a beast ? Any idea ? > > Max > > > > > ---------------------------(end of broadcast)--------------------------- > TIP 8: explain analyze is your friend > ---------------------------(end of broadcast)--------------------------- TIP 2: you can get off all lists at once with the unregister command (send "unregister YourEmailAddressHere" to majordomo@postgresql.org)
> -----Original Message----- > From: pgsql-general-owner@postgresql.org > [mailto:pgsql-general-owner@postgresql.org]On Behalf Of Mike Rylander > Sent: Friday, January 28, 2005 4:02 AM > To: Max; PgSql General > Subject: Re: [GENERAL] Splitting queries across servers > > > On Thu, 27 Jan 2005 20:17:29 -0800, Max <maxdl@adelphia.net> wrote: > > > > > > > > > With a Quad Opteron (4 memory slots per CPU), you could put > 64GB of RAM > > > onto a single machine using 4GB DIMMs in every slot. > > > > We were talking about a similar solution today, but not quite as good as > > this. You really got me thinking now. 64Gb.. hm that would solve many > > issues. what do you think the cost would be for such a beast ? > Any idea ? > > Depending on the vendor and the storage subsystem, between $20k and > $50k, and if you start with less memory it will be much less. I can > attest to the fact that 4 x Opterons work VERY nicely. ;) > Thanks. That would be quite an investment. Interesting. I wonder if we should put windows or linux on it, since postgresql 8.0 works on windows. What OS are you running on your quad opteron ? Anyone's run some postgres performance tests between two machines same config different OS ? Max
> -----Original Message----- > From: pgsql-general-owner@postgresql.org > [mailto:pgsql-general-owner@postgresql.org]On Behalf Of William Yu > Sent: Friday, January 28, 2005 11:36 AM > To: pgsql-general@postgresql.org > Subject: Re: [GENERAL] Splitting queries across servers > > > Without memory, the prices are roughly: > > 4x846 = $5500 > 4x848 = $6500 > 4x850 = $8000 > > Memory costs would be: > > 16GB (1GB DIMMs) = $3000 > 32GB (2GB DIMMs) = $7500 > 64GB (4GB DIMMs) = $24000 > 128GB (8GB DIMMs) = $60000 > > The prices on the 4GB & 8GB DIMMs are bleeding edge of course. I see a > wide range of for them on Froogle. You will want your vendor of course > to pick memory is proper for the MB and pay whatever premium they'll > charge ontop of it. (Looking at the Tyan website, they only list 2GB > DIMMs as supported for their Quad Opteron MBs but whether that's because > those specs were done before >= 4GB DIMMs were available or an actual > hardware limitation, I dunno.) > Thank you! That's very thorough. hm... we're talking 35K then. That could be in my budget. > > > > Max wrote: > >> > >>With a Quad Opteron (4 memory slots per CPU), you could put 64GB of RAM > >>onto a single machine using 4GB DIMMs in every slot. > > > > > > We were talking about a similar solution today, but not quite as good as > > this. You really got me thinking now. 64Gb.. hm that would solve many > > issues. what do you think the cost would be for such a beast ? > Any idea ? > > > > Max > > > > > > > > > > ---------------------------(end of broadcast)--------------------------- > > TIP 8: explain analyze is your friend > > > > ---------------------------(end of broadcast)--------------------------- > TIP 2: you can get off all lists at once with the unregister command > (send "unregister YourEmailAddressHere" to majordomo@postgresql.org) >
> -----Original Message----- > From: pgsql-general-owner@postgresql.org > [mailto:pgsql-general-owner@postgresql.org]On Behalf Of Dann Corbit > Sent: Friday, January 28, 2005 12:01 PM > To: William Yu; pgsql-general@postgresql.org > Subject: Re: [GENERAL] Splitting queries across servers > > > Suppose that you currently need 16 GB to cache everything now. > I would install (perhaps) 32 GB ram for the initial configuration. > Good point. Adding memory as I need it. > The price of memory drops exponentially, and so waiting for the price to > drop will give a much lower expense for the cost of the RAM. > > The reason to double the ram is the expense of upgrading in terms of > labor and downtime for the computer. That can be very significant. So > if we double the ram, that should give one or (hopefully) two years > safety margin. Downtime is a big deal, however I am planning on using replication with pgpool. > If the database is expected to grow exponentially fast, then that is > another issue. In such a case, if it can be cost justified, put on the > largest memory volume that is possible given your financial limitations. We can't really forecast the growing curve. My bet is that we have a short term (6 months) need of 32 GB, so I'll just double that and it should give us visibility for about a year. I hope! I just realized I never asked that question: What is the maximum size of a postgresql DB. Can it be anything ? Max
> > > -----Original Message----- > > From: pgsql-general-owner@postgresql.org > > [mailto:pgsql-general-owner@postgresql.org]On Behalf Of Mike Rylander > > Sent: Friday, January 28, 2005 4:02 AM > > To: Max; PgSql General > > Subject: Re: [GENERAL] Splitting queries across servers > > > > > > On Thu, 27 Jan 2005 20:17:29 -0800, Max <maxdl@adelphia.net> wrote: > > > > > > > > > > > > With a Quad Opteron (4 memory slots per CPU), you could put > > 64GB of RAM > > > > onto a single machine using 4GB DIMMs in every slot. > > > > > > We were talking about a similar solution today, but not quite > as good as > > > this. You really got me thinking now. 64Gb.. hm that would solve many > > > issues. what do you think the cost would be for such a beast ? > > Any idea ? > > > > Depending on the vendor and the storage subsystem, between $20k and > > $50k, and if you start with less memory it will be much less. I can > > attest to the fact that 4 x Opterons work VERY nicely. ;) > > > > Thanks. > > That would be quite an investment. Interesting. I wonder if we should put > windows or linux on it, since postgresql 8.0 works on windows. > > What OS are you running on your quad opteron ? > > Anyone's run some postgres performance tests between two machines same > config different OS ? > > Max > I want to mention that this last question is not invitation to flame about windows against linux ;) Personal preferences set aside, I am looking at performance consideration. thanks! Max
On Sat, 29 Jan 2005 09:55:15 -0800, Max <maxdl@adelphia.net> wrote: > > > > -----Original Message----- > > From: pgsql-general-owner@postgresql.org > > [mailto:pgsql-general-owner@postgresql.org]On Behalf Of Mike Rylander > > Sent: Friday, January 28, 2005 4:02 AM > > To: Max; PgSql General > > Subject: Re: [GENERAL] Splitting queries across servers > > > > > > On Thu, 27 Jan 2005 20:17:29 -0800, Max <maxdl@adelphia.net> wrote: > > > > > > > > > > > > With a Quad Opteron (4 memory slots per CPU), you could put > > 64GB of RAM > > > > onto a single machine using 4GB DIMMs in every slot. > > > > > > We were talking about a similar solution today, but not quite as good as > > > this. You really got me thinking now. 64Gb.. hm that would solve many > > > issues. what do you think the cost would be for such a beast ? > > Any idea ? > > > > Depending on the vendor and the storage subsystem, between $20k and > > $50k, and if you start with less memory it will be much less. I can > > attest to the fact that 4 x Opterons work VERY nicely. ;) > > > > Thanks. > > That would be quite an investment. Interesting. I wonder if we should put > windows or linux on it, since postgresql 8.0 works on windows. To the best of my knowledge there is no 64-bit Windows for AMD64, but I've not been forced to use Windows in 5 or 6 years... > > What OS are you running on your quad opteron ? We are using the x86-64 build of Gentoo Linux and compiling our own copy of PG 8.0. -- Mike Rylander mrylander@gmail.com GPLS -- PINES Development Database Developer http://open-ils.org
> -----Original Message----- > From: pgsql-general-owner@postgresql.org > [mailto:pgsql-general-owner@postgresql.org]On Behalf Of Mike Rylander > Sent: Saturday, January 29, 2005 11:01 AM > To: Max > Cc: PgSql General > Subject: Re: [GENERAL] Splitting queries across servers > > > On Sat, 29 Jan 2005 09:55:15 -0800, Max <maxdl@adelphia.net> wrote: > > > > > > > -----Original Message----- > > > From: pgsql-general-owner@postgresql.org > > > [mailto:pgsql-general-owner@postgresql.org]On Behalf Of Mike Rylander > > > Sent: Friday, January 28, 2005 4:02 AM > > > To: Max; PgSql General > > > Subject: Re: [GENERAL] Splitting queries across servers > > > > > > > > > On Thu, 27 Jan 2005 20:17:29 -0800, Max <maxdl@adelphia.net> wrote: > > > > > > > > > > > > > > > With a Quad Opteron (4 memory slots per CPU), you could put > > > 64GB of RAM > > > > > onto a single machine using 4GB DIMMs in every slot. > > > > > > > > We were talking about a similar solution today, but not > quite as good as > > > > this. You really got me thinking now. 64Gb.. hm that would > solve many > > > > issues. what do you think the cost would be for such a beast ? > > > Any idea ? > > > > > > Depending on the vendor and the storage subsystem, between $20k and > > > $50k, and if you start with less memory it will be much less. I can > > > attest to the fact that 4 x Opterons work VERY nicely. ;) > > > > > > > Thanks. > > > > That would be quite an investment. Interesting. I wonder if we > should put > > windows or linux on it, since postgresql 8.0 works on windows. > > To the best of my knowledge there is no 64-bit Windows for AMD64, but > I've not been forced to use Windows in 5 or 6 years... > Actually winXP and win server 2003 do support AMD64, it's intel 64 that's not supported yet. > > > > What OS are you running on your quad opteron ? > > We are using the x86-64 build of Gentoo Linux and compiling our own > copy of PG 8.0. Thanks for reminding me that 64bit translates to: recompile everything you need! I think this is exactly the choice of configuration we are going to make. Someone just reminded me that windows and linux come down to the same performance, but that the real overhead is on maintenance. It's true that linux implies less downtime, and you can always upgrade. On top of that, there's more granularity control on linux. Thanks for the info. Max > > > -- > Mike Rylander > mrylander@gmail.com > GPLS -- PINES Development > Database Developer > http://open-ils.org > > ---------------------------(end of broadcast)--------------------------- > TIP 6: Have you searched our list archives? > > http://archives.postgresql.org >
Max wrote: > That would be quite an investment. Interesting. I wonder if we should put > windows or linux on it, since postgresql 8.0 works on windows. > > What OS are you running on your quad opteron ? Well here's one thing that you can bank on. You can get 64-bit Linux now while 64-bit Windows is still beta. (The word is that 64-bit Windows is pretty stable except for the lack of drivers but I'd still feel queazy about spending this much money to run a beta OS.) And you absolutely want a 64-bit OS with that much memory. Even on our development server with just 3GB of RAM, we see quite big jumps in performance after moving from 32-bit Linux to 64-bit Linux. I'd hate to think about 64GB of RAM being swapped in and out using PAE.
> Thanks for reminding me that 64bit translates to: recompile everything > you > need! > > I think this is exactly the choice of configuration we are going to make. > Someone just reminded me that windows and linux come down to the same > performance, but that the real overhead is on maintenance. It's true that > linux implies less downtime, and you can always upgrade. On top of that, > there's more granularity control on linux. I won't say anything on linux vs windows performance (although I sure do have an opinion), but simply keep in mind that postgres on window is young and postgres on linux is vary mature ; therefore one is likely to have a lot more performance refinements than the other. Also Linux is not that harder to administer (especially gentoo). After all, all you want is a DB server, you'll install few software. Not like if you needed 50 different apps. In your case recompiling everything is also a bonus because you'll be sure that everything is 64-bit optimized down to the last driver. I wonder how long your killer machine will take to compile the whole OS. And Postgres too... could you post that ? If you are concerned about security, you can compile everything with the stack protector (Propolice SSP) enabled. It's extremely advisable. All the other options like grsec et al are nice but require careful thinking, propolice just needs to be activated. If you don't know what it is I'll post docs.
> > I won't say anything on linux vs windows performance > (although I sure do > have an opinion), but simply keep in mind that postgres on window > is young > and postgres on linux is vary mature ; therefore one is likely to have a > lot more performance refinements than the other. This is a good point. > > In your case recompiling everything is also a bonus because > you'll be > sure that everything is 64-bit optimized down to the last driver. > I wonder how long your killer machine will take to compile > the whole OS. > And Postgres too... could you post that ? You probably want to ask Mike Rylander (who posted before) because he's already done it. > If you are concerned about security, you can compile > everything with the > stack protector (Propolice SSP) enabled. It's extremely > advisable. All the > other options like grsec et al are nice but require careful thinking, > propolice just needs to be activated. If you don't know what it is I'll > post docs. > I have never heard of Propolice SSP. What is it ? Any relation to the honey 'Propolys'. just kidding. Max
> And you absolutely want a 64-bit OS with that much memory. Even on our > development server with just 3GB of RAM, we see quite big jumps in > performance after moving from 32-bit Linux to 64-bit Linux. I'd hate to > think about 64GB of RAM being swapped in and out using PAE. > What's PAE ? Here's a question for the developers: what's the memory consumption difference when you move a 3GB database from a 32 bit machine to a 64 ? given that the whole thing must be in RAM. Isn't the whole data taking up more RAM because pointers are now 64 bits instead of 32 ? Max
> I have never heard of Propolice SSP. What is it ? Any relation to the > honey > 'Propolys'. just kidding. > > Max The name says little although I like it. http://www.gentoo.org/proj/en/hardened/ I was out of date -- Propolice has been renamed PaX. The hardened project has many parts, you should read the help on grsecurity, but PaX is very interesting : ------------------------------------------------------------------- from http://www.gentoo.org/proj/en/hardened/docs/pax-howto.xml : What is PaX? PaX is a patch to the Linux kernel that provides hardening in two ways. The first, ASLR (Address Space Layout Randomization) provides a means to randomize the addressing scheme of all data loaded into memory. When an application is built as a PIE (Position Independent Executable), PaX is able to also randomize the addresses of the application base in addition. The second protection provided by PaX is non-executable memory. This prevents a common form of attack where executable code is inserted into memory by an attacker. More information on PaX can be found throughout this guide, but the homepage can be found at http://pax.grsecurity.net. At run time, when a buffer is created, SSP adds a secret random value, the canary, to the end of the buffer. When the function returns, SSP makes sure that the canary is still intact. If an attacker were to perform a buffer overflow, he would overwrite this value and trigger that stack smashing handler. ------------------------------------------------------------------- For instance, imagine you have a version of Samba with the latest unpatched hole. An attacker can enter. Now if you have PaX all he can do is crash the process, and his intrusion attempt is detected and logged. It's not the final cure for everything, but it covers unpatched holes.
On Sat, 29 Jan 2005 21:18:38 +0100, PFC <lists@boutiquenumerique.com> wrote: [snip] > I wonder how long your killer machine will take to compile the whole OS. > And Postgres too... could you post that ? Well, I can't rebuild the whole OS (I'm using the box ... ) but: postgresql-8.0.0beta4 $ time make -j 5 ... lots of output ... real 0m41.274s user 1m36.315s sys 0m15.451s -- Mike Rylander mrylander@gmail.com GPLS -- PINES Development Database Developer http://open-ils.org
> -----Original Message----- > From: pgsql-general-owner@postgresql.org > [mailto:pgsql-general-owner@postgresql.org]On Behalf Of PFC > Sent: Saturday, January 29, 2005 1:25 PM > To: Max; PgSql General > Subject: Re: [GENERAL] Splitting queries across servers > > > > > I have never heard of Propolice SSP. What is it ? Any relation to the > > honey > > 'Propolys'. just kidding. > > > > Max > > The name says little although I like it. > > http://www.gentoo.org/proj/en/hardened/ > > I was out of date -- Propolice has been renamed PaX. > The hardened project has many parts, you should read the help on > grsecurity, but PaX is very interesting : > > ------------------------------------------------------------------- > from http://www.gentoo.org/proj/en/hardened/docs/pax-howto.xml : > This stuf is really cool. Thanks for the info. Max
> postgresql-8.0.0beta4 $ time make -j 5 > ... lots of output ... > real 0m41.274s > user 1m36.315s > sys 0m15.451s > Yikes.
Max wrote: >>And you absolutely want a 64-bit OS with that much memory. Even on our >>development server with just 3GB of RAM, we see quite big jumps in >>performance after moving from 32-bit Linux to 64-bit Linux. I'd hate to >>think about 64GB of RAM being swapped in and out using PAE. >> > > > What's PAE ? 32-bit OS can only address up to 4GB of RAM. Of that 4GB, operating systems usually split up the address range as 2GB for userspace and 2GB for kernel. Depending on what OS/version you use, you might be able to change this to 3/1 -- and I even recall a wacky mode under Linux that can get you 4/4 (probably using PAE to swap both areas in and out). So if you have more than 2/3/4GB of memory, the OS has to create a "window" under the 4GB space and "swap" memory > 4GB in/out. It's not really an actual swap of course -- that would absolutely cripple performance to be unusable. It's more like an moving window that can point to memory areas > 4GB. Still, there is a performance hit. If process one needs memory at 6-7GB and process two needs it at 10-11GB, there's going to be some contention for that window. > Here's a question for the developers: what's the memory consumption > difference when you move a 3GB database from a 32 bit machine to a 64 ? > given that the whole thing must be in RAM. > > Isn't the whole data taking up more RAM because pointers are now 64 bits > instead of 32 ? Yes and no. Yes in that the memory used directly by Postgres takes twice the amount of space. No in that the optimal setup for Postgres is to have small settings for shared_buffers and leave the lion's share of memory for the OS cache. I have a bunch of servers -- both 32-bit and 64-bit. The 32-bit servers @ 10K shared_buffers use 72MB of RAM. The 64-bit servers @ 10K shared_buffers use 144MB of RAM. If these servers only had 256MB, it would be a big deal. But since they're all in the gigabytes, the difference between 3500MB of cache versus 3428 is rather insignificant. Now if files on 64-bit OSes took up twice the space as files on 32-bit OSes, the cache could only hold half the amount of data. But as far as I can see, they don't. I don't think even the blocksizes are twice the amount because most filesystems under Linux can already store files > 4GB.
"Max" <maxdl@adelphia.net> writes: > Here's a question for the developers: what's the memory consumption > difference when you move a 3GB database from a 32 bit machine to a 64 ? Not a lot. > Isn't the whole data taking up more RAM because pointers are now 64 bits > instead of 32 ? There are no pointers in on-disk data, so there's no difference at all in file sizes, and thus none in shared-buffer requirements --- at least not from the size of pointers. (64-bit machines tend to have MAXALIGN of 8 bytes instead of 4, which implies more alignment padding at the ends of rows. But that's usually a few-percent kind of cost, not a factor of 2.) The pointer size difference has a larger influence on Postgres' other internal data structures, such as the shared lock table. But by comparison to the shared disk buffers, those generally don't amount to anything. regards, tom lane
Max wrote: > Actually winXP and win server 2003 do support AMD64, it's intel 64 > that's not supported yet. To be accurate, Windows AMD64 support is still in beta. There has been a IA64 production release for quite some time, though Microsoft recently announced that they were discontinuing it. -- Guy Rouillier
On Wed, 26 Jan 2005 08:00:36 -0800, Max <maxdl@adelphia.net> wrote: > Hello, > > Our postgresql database is getting too big to be handled by one server. We > need the database to be in RAM and cannot afford swapping. At the moment, > we're using only 3GB or RAM, however our business growth is going to drive > this number into the double digits zone, maybe triple digits. > > What are our options ? > Depending on what your queries are like, you might take a look at memcached (www.danga.com). It's a distributed memory cache that uses simple key/value storage, and it's extremely fast and uses very little cpu. Chris