Thread: Sun vs. Mac
Hello,

I'm having some serious performance issues with PostgreSQL on our newish
SunFire 280R (1 x 900MHz UltraSPARC III, 1 GB RAM). It's painfully slow.
It took me almost a week of tuning to get it into the range of an old Mac
G3 laptop. Now, a few days later, after tweaking nearly every parameter
in /etc/system and $PGDATA/postgresql.conf (only noting decreased
performance on some), it's about as fast as I can make it, but still
horribly slow. A few simple queries that take 1.5-7 minutes on the G3
take 1-1.5 minutes on the Sun. A bulk load of a roughly 2.4 GB database
dump takes ~1 hour on each machine. It took almost 2 hours on the Sun
before I turned off fsync.

We have plans to add another CPU, RAM and another disk, which should all
help, but in its current state, I (and many others) would think that it
should run circles around the G3. I'm thinking that I'm missing something
big and obvious, because this can't be right. Otherwise we might as well
just get a bunch of iBooks to run our databases - they're a lot smaller
and much quieter.

Can someone please point me in the right direction?

Thanks,

-X

__________________________________________________
Do you Yahoo!?
Yahoo! Mail Plus - Powerful. Affordable. Sign up now.
http://mailplus.yahoo.com
Hi CaptainXOr,

Which version of PostgreSQL, and which release of Solaris are you running?

Regards and best wishes,

Justin Clift

CaptainX0r wrote:
> I'm having some serious performance issues with PostgreSQL on
> our newish SunFire 280R (1 900MHz UltraSPARC III, 1 GB RAM).
> It's painfully slow. [...]

--
"My grandfather once told me that there are two kinds of people: those
who work and those who take the credit. He told me to try to be in the
first group; there was less competition there."
- Indira Gandhi
On Tue, Jan 14, 2003 at 07:00:08AM -0800, CaptainX0r wrote:
> I'm having some serious performance issues with PostgreSQL on
> our newish SunFire 280R (1 900MHz UltraSPARC III, 1 GB RAM).
> It's painfully slow. [...]

You should tell us what version of Solaris you're running, what version
of Postgres, and what options you have used.

Did you split the WAL onto its own filesystem? You'll get a big win that
way. Also, what fsync setting are you using (open_datasync is the
fastest in my experience)?

Finally, the bottleneck on Solaris is both disk and process forking
(fork() is notoriously slow on Solaris). Also, certain sort routines are
abysmal. Replace the Solaris-provided qsort().

I have to say, however, that my experience indicates that Solaris is
slower than the competition for Postgres. It still shouldn't be that bad.

A

--
----
Andrew Sullivan                         204-4141 Yonge Street
Liberty RMS                           Toronto, Ontario Canada
<andrew@libertyrms.info>                              M2P 2A8
                                         +1 416 646 3304 x110
Sorry - I meant to include that. I'm running PG 7.3.1 on Solaris 8.

Thanks,

-X

--- Justin Clift <justin@postgresql.org> wrote:
> Which version of PostgreSQL, and which release of Solaris are
> you running?
All,

> You should tell us what version of Solaris you're running, what
> version of Postgres, and what options you have used.

You're right, sorry. PG 7.3.1 on Solaris 8. I've got the default
recommended /etc/system but with shmmax cranked way up, which seems to
have helped. I don't have the system in front of me (and it's down, so I
can't get to it), but from memory max_connections was increased to 64,
shared_buffers up to 65536, sort_mem and vacuum_mem were doubled, and I
think that's it. I changed every seemingly relevant one, and spent a lot
of time on the *cost section trying various factors of n*10 on each, with
no joy.

> Did you split the WAL onto its own filesystem? You'll get a big
> win that way.

I have not. What exactly do you mean by "own filesystem"? Another
filesystem? I was planning on putting pg_xlog on the OS disk and moving
$PGDATA off to a second disk.

> Also, what fsync setting are you using (open_datasync is the
> fastest in my experience)?

I've read that somewhere (maybe in the archives?) and I got no change
with any of them. But now I'm thinking back - do I need fsync=true for
that to have an effect? I'm not worried about the cons of having
fsync=false at all - and I'm assuming that should be better than true and
open_datasync. Or am I confusing things?

> Also, certain sort routines are abysmal. Replace the
> Solaris-provided qsort().

I've read about this as well - but haven't even gotten that far on the
testing/configuring yet.

> I have to say, however, that my experience indicates that Solaris
> is slower than the competition for Postgres. It still shouldn't
> be that bad.

I agree completely.

Thanks for your input,

-X
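[For readers following along: the /etc/system tweaks being discussed are the Solaris System V IPC kernel tunables. A sketch of what "shmmax cranked way up" looks like in practice - the values below are illustrative only, not what the poster used; size shmmax to cover shared_buffers, and note a reboot is needed for /etc/system changes to take effect:]

```
* /etc/system - System V IPC limits for PostgreSQL (example values only)
set shmsys:shminfo_shmmax=536870912    * max shared memory segment, 512 MB
set shmsys:shminfo_shmmni=256          * max number of shared segments
set semsys:seminfo_semmni=256          * max number of semaphore sets
set semsys:seminfo_semmns=512          * max number of semaphores
```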
CaptainX0r <captainx0r@yahoo.com> writes:
> I've read that somewhere (maybe in the archives?) and I got no
> change with any of them. But now I'm thinking back - do I need
> fsync=true for that to have an effect? I'm not worried about
> the cons of having fsync=false at all - and I'm assuming that
> should be better than true and open_datasync.

You are right that fsync_method is a no-op if you've got fsync turned off.

Let me get this straight: the Sun is slower even with fsync off? That
shoots down the first theory that I had, which was that the Sun's disk
drives were actually honoring fsync while the laptop's drive does not.
(See archives for more discussion of that, but briefly: IDE drives are
commonly set up to claim write complete as soon as they've absorbed data
into their onboard buffers. SCSI drives usually tell the truth about when
they've completed a write.)

Andrew Sullivan's nearby recommendation to replace qsort() is a good one,
but PG 7.3 is already configured to do that by default. (Look in
src/Makefile.global to confirm that qsort.o is mentioned in LIBOBJS.)

I'd suggest starting with some elementary measurements, for example
looking at I/O rates and CPU idle percentage while running the same task
on both Solaris and G3. That would at least give us a clue whether I/O or
CPU is the bottleneck.

			regards, tom lane
On Tue, Jan 14, 2003 at 07:41:21AM -0800, CaptainX0r wrote:
> You're right, sorry. PG 7.3.1 on Solaris 8. I've got the
> default recommended /etc/system but with shmmax cranked way up

Ok, I have no experience with 7.3.1 in a production setting - we're using
7.2. But here are some things.

> which seems to have helped. [...] from memory max_connections
> was increased to 64, shared_buffers up to 65536, sort_mem and
> vacuum_mem were doubled, and I think that's it.

You'll need to increase the number of available semaphores more than
likely, if you add any connections. You do indeed need to fix shmmax,
but if the postmaster starts, you're fine.

I would worry slightly about sort_mem. I have managed to make Solaris
boxes with _lots_ of memory start swapping by setting that too high
(while experimenting). Look for problems in your I/O.

> the *cost section trying various factors of n*10 on each, with
> no joy.

These are fine-tuning knobs. You have a different problem :)

> > Did you split the WAL onto its own filesystem? You'll get a
> > big win that way.
>
> I have not. What exactly do you mean by "own filesystem"?
> Another filesystem? I was planning on putting pg_xlog on the OS
> disk and moving $PGDATA off to a second disk.

That's what you need. Without any doubt at all. The xlog on the same UFS
filesystem (and disk) as the rest of $PGDATA is a nightmare.

Interestingly, by the way, there is practically _no difference_ if you do
this with an A5200 managed by Veritas. I have tried dozens of things. It
never matters. The array is too fast.

> > Also, what fsync setting are you using (open_datasync is the
> > fastest in my experience)?
>
> I've read that somewhere (maybe in the archives?) and I got no
> change with any of them. But now I'm thinking back - do I need
> fsync=true for that to have an effect?

Yes, if you change the fsync method but have fsync turned off, it will
make no difference.

> > Also, certain sort routines are abysmal. Replace the
> > Solaris-provided qsort().
>
> I've read about this as well - but haven't even gotten that far
> on the testing/configuring yet.

If you're doing any sorting that is not by an index, forget about it.
Change it now. It's something like a multiple of 40 slower.

A
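[In concrete terms, putting the WAL on its own filesystem means relocating $PGDATA/pg_xlog onto a different disk and leaving a symlink behind. A minimal sketch - the paths here are throwaway scratch directories so the move can be rehearsed safely; on a real server you would use the actual $PGDATA and a mount point on the second disk, and stop the postmaster first:]

```shell
#!/bin/sh
# Rehearse moving pg_xlog onto a separate filesystem using scratch dirs.
# In real life PGDATA would be e.g. /export/pgdata and WALDIR a mount
# point on a second spindle, with the postmaster stopped first.
PGDATA=$(mktemp -d)/pgdata
WALDIR=$(mktemp -d)/wal_disk
mkdir -p "$PGDATA/pg_xlog" "$WALDIR"
touch "$PGDATA/pg_xlog/0000000000000000"   # stand-in for a WAL segment

mv "$PGDATA/pg_xlog" "$WALDIR/pg_xlog"     # move WAL onto the other disk
ln -s "$WALDIR/pg_xlog" "$PGDATA/pg_xlog"  # leave a symlink behind

# The server still opens the same path; the writes land elsewhere.
ls -l "$PGDATA/pg_xlog/" && echo "pg_xlog relocated"
```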
CaptainX0r writes:
> I'm having some serious performance issues with PostgreSQL on
> our newish SunFire 280R (1 900MHz UltraSPARC III, 1 GB RAM).
> It's painfully slow.

What was PostgreSQL compiled with? Personal past experience has shown the
Sun Workshop C compiler to result in much better performance compared to
GCC...

L.
--- Lee Kindness <lkindness@csl.co.uk> wrote:
> What was PostgreSQL compiled with? Personal past experience has
> shown the Sun Workshop C compiler to result in much better
> performance compared to GCC...

I used gcc - mostly because I have in the past, but also because I've
read that it is "the one to use". Am I wrong on this one? I'm certainly
willing to try the one from Sun Workshop.

Thanks for the input,

-X
> You'll need to increase the number of available semaphores more
> than likely, if you add any connections. You do indeed need to
> fix shmmax, but if the postmaster starts, you're fine.

Thanks, I'll take a closer look at this.

> That's what you need. Without any doubt at all. The xlog on
> the same UFS filesystem (and disk) as the rest of $PGDATA is a
> nightmare.

The disks are on order - but that can't be the only thing holding it up,
can it? I've got to check out the I/O, as you suggest.

> > > Also, certain sort routines are abysmal. Replace the
> > > Solaris-provided qsort().
> >
> > I've read about this as well - but haven't even gotten that
> > far on the testing/configuring yet.
>
> If you're doing any sorting that is not by an index, forget
> about it. Change it now. It's something like a multiple of 40
> slower.

I double checked, and this was one of the reasons I was glad to try 7.3
on Solaris - it's already got it built in.

Thanks much for the input,

-X
All,

> Let me get this straight: the Sun is slower even with fsync
> off?

Correct. This really helped a lot, especially with the dump load, but
I've clearly got some more work ahead of me.

> Andrew Sullivan's nearby recommendation to replace qsort() is a
> good one, but PG 7.3 is already configured to do that by
> default. (Look in src/Makefile.global to confirm that qsort.o
> is mentioned in LIBOBJS.)

Thanks for confirming. I've got:

	LIBOBJS = isinf.o qsort.o

> I'd suggest starting with some elementary measurements, for
> example looking at I/O rates and CPU idle percentage while
> running the same task on both Solaris and G3. That would at
> least give us a clue whether I/O or CPU is the bottleneck.

Good thoughts - I'm working on it right now, though I'm not really sure
how to check I/O rates....

Thanks much for the input,

-X
On Tue, 2003-01-14 at 15:00, CaptainX0r wrote:
> I'm having some serious performance issues with PostgreSQL on
> our newish SunFire 280R (1 900MHz UltraSPARC III, 1 GB RAM).
> It's painfully slow. [...] after tweaking nearly every
> parameter (only noting decreased performance on some) in
> /etc/system and $PGDATA/postgresql.conf

Could you post your $PGDATA/postgresql.conf for our viewing pleasure?

> it's about as fast as I can make it, but still horribly slow.
> [...]
> We have plans to add another CPU, RAM and another disk, which
> should all help,

Another CPU will probably not help with bulk loads or other single-user
stuff.

> but in its current state, I (and many others) would think that
> it should run circles around the G3. [...]

For single-user tasks you will probably be better off by getting a gray
box with an Athlon 2600+ with 3 gigs of memory and IDE disks, running
Linux or *BSD.

--
Hannu Krosing <hannu@tm.ee>
> I'd suggest starting with some elementary measurements, for
> example looking at I/O rates and CPU idle percentage while
> running the same task on both Solaris and G3. That would at
> least give us a clue whether I/O or CPU is the bottleneck.

Well, I've got the Sun box now, but I don't really have access to the G3.
FWIW, top shows postgres slowly taking up all the CPU - over the course
of a minute or so it gradually ramps up to around 90%. Once the query is
complete, however, top shows the CPU ramping down slowly, ~1-2% per
second over the next 2 minutes, which I find very strange. The CPU idle
is 0% for the duration of the query, while the user state is around 100%
for the same period. This kind of makes me think top is wrong (100% idle
and 75% postgres?).

iostat gives:

# iostat -DcxnzP
     cpu
 us sy wt id
 10  1  4 85
                     extended device statistics
    r/s    w/s   kr/s   kw/s wait actv wsvc_t asvc_t  %w  %b device
   11.2    1.0   65.5   13.1  0.1  0.1    9.5    9.6   0   3 c1t0d0s0
    0.0    0.0    0.0    0.1  0.0  0.0    0.0    6.0   0   0 c1t0d0s1
    7.3    0.1  502.3    0.5  0.0  0.0    0.0    2.3   0   1 c1t0d0s3
    0.0    0.0    0.0    0.0  0.0  0.0    0.0    0.1   0   0 host:vold(pid313)

This doesn't really tell me much, except I'm guessing that PG is CPU
bound?

-X
On Tue, Jan 14, 2003 at 09:50:04AM -0800, CaptainX0r wrote:
> the G3. FWIW, top shows postgres slowly taking up all the CPU -

Don't use top on Solaris 8. It's inaccurate, and it affects the results
itself. Use prstat instead.

> This doesn't really tell me much, except I'm guessing that PG
> is CPU bound?

It looks that way. I've had iostat show CPU-bound, however, when the
problem actually turned out to be memory contention. I think you may want
to have a poke with vmstat, and also have a look at the SE toolkit.

A
> Could you post your $PGDATA/postgresql.conf for our viewing
> pleasure?

max_connections = 64
shared_buffers = 65536          # 1/2 total RAM /8K
sort_mem = 100000               # min 64, size in KB
checkpoint_timeout = 300        # range 30-3600, in seconds
fsync = false
effective_cache_size = 65536    # typically 8KB each
log_timestamp = true
# notice, warning, error
stats_command_string = true
stats_row_level = true
stats_block_level = true
LC_MESSAGES = 'C'
LC_MONETARY = 'C'
LC_NUMERIC = 'C'
LC_TIME = 'C'

I've stripped out the default lines (grep -v '^#'), comments and blank
lines.

> Another CPU will probably not help with bulk loads or other
> single-user stuff.
> [snip]
> For single-user tasks you will probably be better off by getting
> a gray box with an Athlon 2600+ with 3 gigs of memory and IDE
> disks, running Linux or *BSD.

Hannu brings up a good point - one that was debated before my attempts at
making Solaris faster. If you were going to make a fast postgres server,
what would you use? Assuming you could afford a SunFire 280R (~$8k?),
would that money be better spent on a (say) Dell server running (say)
Linux? We're doing light multiuser (I guess effectively single user) but
at some point (years) this may grow considerably.

I'm not partial to Macs, but I've got to say that stock out of the box,
postgres loves it. That old G3 was faster than the Sun, and still is
faster than my (years newer) Linux laptop (on which I've done no
performance tweaking). So maybe a dual G4 Xserve would scream?

Any suggestions? It's still not too late for us to change our minds on
this one.

Thanks much,

-X
CaptainX0r <captainx0r@yahoo.com> writes:
> Well, I've got the Sun box now, but I don't really have access
> to the G3. FWIW, top shows postgres slowly taking up all the
> CPU - over the course of a minute or so it gradually ramps up to
> around 90%. Once the query is complete, however, top shows the
> CPU ramping down slowly, ~1-2% per second over the next 2
> minutes which I find very strange.

I believe top's percent-of-CPU numbers for individual processes are time
averages over a minute or so, so the ramping effect is unsurprising.

> This doesn't really tell me much, except I'm guessing that PG is
> CPU bound?

Yup, that seems pretty clear. Next step is to find out what the heck it's
doing. My instinct would be to use gprof. Recompile with profiling
enabled --- if you're using gcc, this should work:

	cd postgres-distribution/src/backend
	make clean
	make PROFILE=-pg all
	make install-bin	-- may need to stop postmaster before install

Next run some sample queries (put them all into one session). After
quitting the session, find gmon.out in the $PGDATA/base/nnn/ subdirectory
corresponding to your database, and feed it to gprof. The results should
show where the code hotspot is.

			regards, tom lane
On Tue, 14 Jan 2003, CaptainX0r wrote:
> I'm having some serious performance issues with PostgreSQL on
> our newish SunFire 280R (1 900MHz UltraSPARC III, 1 GB RAM).
> [...] A bulk load of roughly 2.4 GB database dump takes ~1 hour
> on each machine. It took almost 2 hours on the Sun before I
> turned off fsync.

Just for giggles, do you have a spare drive or something you can try
loading Debian or some other SPARC-compatible Linux distro on, and get
some numbers? My experience has been that on the same basic hardware,
Linux runs postgresql about twice as fast as Solaris, and no amount of
tweaking seemed to ever get postgresql up to the same performance on
Solaris.

It's so bad that a SPARC 20 with 256 meg RAM and a 50 MHz 32 bit CPU
running Linux was outrunning our Sun Ultra 1 with 512 meg RAM and a 150
MHz 64 bit CPU by about 50%. That was with the 2.0.x kernel for Linux,
and Solaris 7 on the Ultra, I believe. Could have been older on the
Solaris version, as I wasn't the SA on that box.
--- Andrew Sullivan <andrew@libertyrms.info> wrote:
> Don't use top on Solaris 8. It's inaccurate, and it affects the
> results itself. Use prstat instead.

Thanks for the tip. Unfortunately it shows the same exact thing.

> It looks that way. I've had iostat show CPU-bound, however, when
> the problem actually turned out to be memory contention. I think
> you may want to have a poke with vmstat, and also have a look at
> the SE toolkit.

I'll have a look at the SE toolkit - thanks. vmstat shows me this:

# vmstat -s
        0 swap ins
        0 swap outs
        0 pages swapped in
        0 pages swapped out
   125452 total address trans. faults taken
    35245 page ins
       60 page outs
   194353 pages paged in
      229 pages paged out
   184621 total reclaims
   184563 reclaims from free list
        0 micro (hat) faults
   125452 minor (as) faults
    31764 major faults
    10769 copy-on-write faults
    80220 zero fill page faults
        0 pages examined by the clock daemon
        0 revolutions of the clock hand
      170 pages freed by the clock daemon
      601 forks
       19 vforks
      577 execs
   370612 cpu context switches
  1288431 device interrupts
   148288 traps
  1222653 system calls
   294090 total name lookups (cache hits 48%)
    43510 user   cpu
     4002 system cpu
   480912 idle   cpu
    13805 wait   cpu

 procs     memory            page            disk          faults      cpu
 r b w   swap   free  re mf  pi po fr de sr s6 sd -- --   in   sy  cs us sy id
 0 0 0 815496 538976  31 21 261  0  0  0  0  0 12  0  0  136  209  65  7  1 92

I've not much experience with this; it looks like there are considerably
more page ins than outs as compared to our other Solaris boxen, but
otherwise pretty normal.

-X
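[Those four cumulative cpu tick counters at the end of the `vmstat -s` output can be turned into an overall CPU breakdown; a quick sketch using the numbers above (note these counters run since boot, so this is a long-run average, not a per-query figure):]

```shell
# Convert the cumulative cpu tick counters reported by `vmstat -s`
# (user / system / idle / wait) into percentages of total ticks.
echo "43510 4002 480912 13805" | awk '{
    total = $1 + $2 + $3 + $4
    printf "user %.1f%%  sys %.1f%%  idle %.1f%%  wait %.1f%%\n",
           100*$1/total, 100*$2/total, 100*$3/total, 100*$4/total
}'
# -> user 8.0%  sys 0.7%  idle 88.7%  wait 2.5%
```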
On Tue, Jan 14, 2003 at 10:10:54AM -0800, CaptainX0r wrote:
> > Could you post your $PGDATA/postgresql.conf for our viewing
> > pleasure?
>
> max_connections = 64
> shared_buffers = 65536          # 1/2 total RAM /8K
> sort_mem = 100000               # min 64, size in KB
             ^^^^^^
There's your problem. Don't set that anywhere near that high. If you run
2 queries that require sorting, _each sort_ can use up to 100000 K. Which
can chew up all your memory pretty fast.

> effective_cache_size = 65536    # typically 8KB each

What basis did you have to change this? Have you done work figuring out
how big the kernel's disk cache regularly is on that system?

> Hannu brings up a good point - one that was debated before my
> attempts at making Solaris faster. If you were going to make a
> fast postgres server, what would you use? Assuming you could
> afford a SunFire 280R (~$8k?), would that money be better spent
> on a (say) Dell server running (say) Linux? We're doing light

I've been finding FreeBSD way faster than Linux. But yes.

A
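[Pulling the thread's advice together, a more conservative postgresql.conf for a 1 GB box might look something like the sketch below. The specific numbers are illustrative guesses, not values endorsed anywhere in this thread - the key points are that sort_mem is charged per sort operation, and effective_cache_size merely describes the kernel's disk cache rather than allocating anything:]

```
max_connections = 64
shared_buffers = 16384          # 128 MB - well under half of RAM
sort_mem = 4096                 # 4 MB; each concurrent sort can use this much
vacuum_mem = 16384
checkpoint_timeout = 300
fsync = true                    # turn back on for production
wal_sync_method = open_datasync # worth benchmarking on Solaris
effective_cache_size = 32768    # ~256 MB; estimate of the OS disk cache
```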
> > For single-user tasks you will probably be better off by
> > getting a gray box with an Athlon 2600+ with 3 gigs of memory
> > and IDE disks, running Linux or *BSD.
>
> I'm not partial to Macs, but I've got to say that stock out of
> the box, postgres loves it. That old G3 was faster than the Sun,
> and still is faster than my (years newer) Linux laptop (on which
> I've done no performance tweaking). So maybe a dual G4 Xserve
> would scream?
>
> Any suggestions? It's still not too late for us to change our
> minds on this one.

I can't recommend Macs for either brute force speed or
price/performance. My current flock of machines are mostly OS X G4 boxes
(single 400s and a dual 800), with a couple of Linux boxen thrown in for
good measure. The Macs' biggest issues are:

1) Tweakability - you've got one file system, and it doesn't really do
useful mount options like noatime.

2) There are bugs in mount and traversing symlinks that make it hard to
move pg_xlog onto another file system and retain performance (in 10.1.5;
I don't have test hardware with enough drives to test 10.2).

3) vm_stat gives vm status, iostat gives nothing. Looks like this is
working on my 10.2 laptop, but it's annoyed me for a while on 10.1.5.

4) SW RAID is not that much of a help for speed.

5) Bus bandwidth is a good factor behind x86 Linux boxen. (DDR RAM isn't
really taken advantage of in current designs.)

Having said that, I'm getting reasonable performance out of all the Macs
- in fact, I'm getting reasonably similar performance out of all of them
despite the 4x difference in processor power. And that's because they
basically have the same low-end disk system.

I'm piecing together something that I hope will be faster out of an x86
box loaded with more drives in a SW RAID mirroring setup (tentative is 1x
system+logs, mirror for pg_xlog, mirror for pg data). I'm planning on
running some comparative benchmarks prior to going live, so I should be
able to tell how much faster it is.

eric
Hi,

> Just for giggles, do you have a spare drive or something you can
> try loading Debian or some other SPARC-compatible Linux distro
> on, and get some numbers?

This was on the todo list, "just to see", but I'm not sure how much time
we want to spend trying a myriad of options when concentrating on one
should (maybe?) do the trick.

> My experience has been that on the same basic hardware, Linux
> runs postgresql about twice as fast as Solaris, and no amount of
> tweaking seemed to ever get postgresql up to the same performance
> on Solaris. It's so bad that a SPARC 20 with 256 meg RAM and a
> 50 MHz 32 bit CPU running Linux was outrunning our Sun Ultra 1
> with 512 meg RAM and a 150 MHz 64 bit CPU by about 50%. That was
> with the 2.0.x kernel for Linux...

This is not encouraging..... We may be revisiting the Linux option.

Thanks much for the input,

-X
--- Andrew Sullivan <andrew@libertyrms.info> wrote:
> > sort_mem = 100000               # min 64, size in KB
>              ^^^^^^
> There's your problem. Don't set that anywhere near that high.
> If you run 2 queries that require sorting, _each sort_ can use
> up to 100000 K. Which can chew up all your memory pretty fast.

I changed back to the default 1024, and down to the minimum, 64 - no
change. I think that was changed simultaneously with some other parameter
(bad, I know) that actually had an effect. I guess I can remove it.

> > effective_cache_size = 65536    # typically 8KB each
>
> What basis did you have to change this? Have you done work
> figuring out how big the kernel's disk cache regularly is on
> that system?

I read somewhere that this should be set to half the system RAM size:
64k * 8k = 512 MB = 1/2 of the 1 gig of RAM. I guess this is way off,
since you're saying that it's disk cache. This agrees with the
documentation. I can't really rely on the (precious little Solaris
postgres) info I find on the net.... ;) Unfortunately, setting it back to
1000 doesn't appear to help.

> I've been finding FreeBSD way faster than Linux. But yes.

I like to hear this since I'm a big FreeBSD fan. So far I think I've
understood this as: FreeBSD > Linux > OSX > Solaris.

Thanks much for the input,

-X
On Tue, Jan 14, 2003 at 11:01:49AM -0800, CaptainX0r wrote:
> I changed back to the default 1024, and down to the minimum, 64
> - no change. I think that was changed simultaneously with some
> other parameter (bad, I know) that actually had an effect. I
> guess I can remove it.

Very bad to change two things at once. You think it's saving you time,
but now . . . well, you already know what happens ;-) Anyway, you _still_
shouldn't have it that high.

> I read somewhere that this should be set to half the system RAM
> size... I guess this is way off, since you're saying that it's
> disk cache.

I think you should rely on the Postgres documentation, which has way
fewer errors than just about any other technical documentation I've ever
seen.

Yes, it's disk cache. I wouldn't set _anything_ to half the system RAM.
It'd be real nice if your disk cache was half your RAM, but I'd be amazed
if anyone's system were that efficient.

It sounds like you need to follow Tom Lane's advice, though, and do some
profiling.

A
> I believe top's percent-of-CPU numbers for individual processes
> are time averages over a minute or so, so the ramping effect is
> unsurprising.

Thanks - this makes much more sense.

> > This doesn't really tell me much, except I'm guessing that
> > PG is CPU bound?
>
> Yup, that seems pretty clear.  Next step is to find out what the
> heck it's doing.  My instinct would be to use gprof.  Recompile
> with profiling enabled --- if you're using gcc, this should work
>	cd postgres-distribution/src/backend
>	make clean
>	make PROFILE=-pg all
>	make install-bin	-- may need to stop postmaster
> Next run some sample queries (put them all into one session).
> After quitting the session, find gmon.out in the
> $PGDATA/base/nnn/ subdirectory corresponding to your database,
> and feed it to gprof.  The results should show where the code
> hotspot is.

Well if that isn't a fancy bit of info....  Thanks!  gprof says:

Fatal ELF error: can't read ehdr (Request error: class file/memory
mismatch)

I'm guessing that's not what we're expecting...  I'm using
/usr/ccs/bin/gprof - maybe there's a better one?

-X
Andrew Sullivan wrote:
<snip>
> I've been finding FreeBSD way faster than Linux.  But yes.

Out of curiosity, have you been trying FreeBSD 4.7x or the
developer releases of 5.0?

Apparently there are new kernel scheduling improvements in FreeBSD
5.0 that will help certain types of tasks and might boost our
performance further.

Would be interested in seeing if the profiling/optimisation
options of GCC 3.2.x are useful as well.

:-)

Regards and best wishes,

Justin Clift
On Wed, Jan 15, 2003 at 10:29:46AM +1030, Justin Clift wrote:
> Andrew Sullivan wrote:
> <snip>
> > I've been finding FreeBSD way faster than Linux.  But yes.
>
> Out of curiosity, have you been trying FreeBSD 4.7x or the developer

Just 4.7x.  And mostly for little jobs for myself, so I can't
speak about testing it in a production case.

A
CaptainX0r <captainx0r@yahoo.com> writes:
> gprof says:
> Fatal ELF error: can't read ehdr (Request error: class
> file/memory mismatch)

Hm, that's a new one on me.  Just to eliminate the obvious: you
did read the gprof man page?  It typically needs both the pathname
of the postgres executable and that of the gmon.out file.

If that's not it, I fear you need a gprof expert, which I ain't.

			regards, tom lane
All,

> Hm, that's a new one on me.  Just to eliminate the obvious:
> you did read the gprof man page?  It typically needs both the
> pathname of the postgres executable and that of the gmon.out

I read it, but apparently not very well.  It appears that as long
as gmon.out is in the current dir, all that's needed is the name
of the executable (with full path).  The way it's formatted I read
it as all that's needed is the image-file.  Anyways...

There's a ton of output, so I'm picking what appear to be the
highlights.

granularity: each sample hit covers 4 byte(s) for 0.00% of 705.17 seconds

                                 called/total       parents
index  %time    self descendents called+self    name                 index
                                 called/total       children

[1]     63.6  446.05     2.65  44386289+57463869  <cycle 1 as a whole> [1]
              442.08     2.50  59491100+2566048020    <external> <cycle 1> [2]
                2.91     0.00  23045572+366           _fini <cycle 1> [23]
                0.57     0.00  17763681               _rl_input_available <cycle 1> [42]
                0.11     0.15    478216+7             history_expand <cycle 1> [54]
                0.21     0.00         7               history_tokenize_internal <cycle 1> [56]
                0.10     0.00   1071137               tilde_expand <cycle 1> [68]
                0.07     0.00       397               rl_gather_tyi <cycle 1> [70]
                0.00     0.00        31               qsort <cycle 1> [81]
                0.00     0.00        17               rl_insert_close <cycle 1> [82]
-----------------------------------------------
                                                   <spontaneous>
[3]     32.3    0.65   227.47                      rl_get_termcap [3]
              226.13     1.34  22502064/44386289       <external> <cycle 1> [2]
-----------------------------------------------
                                                   <spontaneous>
[4]     26.3    0.78   184.35                      rl_stuff_char [4]
              178.51     1.06  17763681/44386289       <external> <cycle 1> [2]
                4.78     0.00  17763681/17763681       rl_clear_signals [18]
-----------------------------------------------
                                                   <spontaneous>
[5]     15.8  111.61     0.00                      rl_signal_handler [5]
                0.00     0.00         1/44386289       <external> <cycle 1> [2]
-----------------------------------------------
                                                   <spontaneous>
[6]      4.3   30.57     0.00                      rl_sigwinch_handler [6]
-----------------------------------------------

And:

granularity: each sample hit covers 4 byte(s) for 0.00% of 705.17 seconds

   %  cumulative    self              self    total
 time    seconds  seconds     calls  ms/call  ms/call  name
 15.8     111.61   111.61                              rl_signal_handler [5]
  4.3     142.18    30.57                              rl_sigwinch_handler [6]
  1.9     155.42    13.24                              rl_set_sighandler [8]
  1.9     168.52    13.10                              rl_maybe_set_sighandler [9]
  1.1     176.37     7.85                              _rl_next_macro_key [11]
  0.9     182.38     6.01                              rl_read_key [12]
  0.8     188.07     5.69                              rl_backward_kill_line [13]
  0.8     193.56     5.49                              rl_unix_word_rubout [14]
  0.8     198.91     5.35                              _rl_pop_executing_macro [15]
  0.7     203.73     4.82                              _rl_fix_last_undo_of_type [17]
  0.7     208.51     4.78  17763681     0.00     0.00  rl_clear_signals [18]
  0.6     212.87     4.36                              rl_modifying [19]
  0.6     216.95     4.08                              rl_begin_undo_group [20]
  0.6     221.00     4.05                              rl_tilde_expand [21]
  0.4     223.98     2.98                              region_kill_internal [22]
  0.4     226.89     2.91  23045572     0.00     0.00  _fini <cycle 1> [23]

So.  Any thoughts?  This looks really useful in the hands of
someone who knows what it all means.  Looks like some signal
handlers are using up most of the time.  Good?  Bad?  Am I reading
that first part correctly in that a good part of the time spent is
external to Postgres?

This report also seems to verify that qsort isn't a problem since
it was the 81st index, with 31 calls (not much) and 0.00 self
seconds.

Thanks much,

-X
CaptainX0r <captainx0r@yahoo.com> writes:
>    %  cumulative    self              self    total
>  time    seconds  seconds     calls  ms/call  ms/call  name
>  15.8     111.61   111.61                              rl_signal_handler [5]
>   4.3     142.18    30.57                              rl_sigwinch_handler [6]
>   1.9     155.42    13.24                              rl_set_sighandler [8]
>   1.9     168.52    13.10                              rl_maybe_set_sighandler [9]
>   1.1     176.37     7.85                              _rl_next_macro_key [11]
>   0.9     182.38     6.01                              rl_read_key [12]
>   0.8     188.07     5.69                              rl_backward_kill_line [13]

All of these names correspond to internal routines in libreadline.
It'd not be surprising for libreadline to suck a good deal of the
runtime of psql ... but I don't believe the backend will call it
at all.  So, either this trace is erroneous, or you profiled the
wrong process (client instead of backend), or there's something
truly weird going on.

			regards, tom lane
> It'd not be surprising for libreadline to suck a good deal of
> the runtime of psql ... but I don't believe the backend will
> call it at all.  So, either this trace is erroneous, or you
> profiled the wrong process (client instead of backend), or
> there's something truly weird going on.

You're right, I got the client, here's the backend:

   %  cumulative    self               self    total
 time    seconds  seconds      calls  ms/call  ms/call  name
 23.4     125.31   125.31                               internal_mcount [13]
 21.2     239.07   113.76   79296415     0.00     0.00  ExecMakeFunctionResult <cycle 4> [14]
  7.8     280.68    41.61   98971658     0.00     0.00  AllocSetReset [23]
  6.8     317.13    36.45  193735603     0.00     0.00  ExecEvalVar [18]
  5.2     345.21    28.08  280731963     0.00     0.00  ExecEvalExpr <cycle 4> [15]
  2.7     359.93    14.72   38140599     0.00     0.00  nocachegetattr [35]
  2.7     374.28    14.35     320207     0.04     0.04  _read [38]
  2.2     385.97    11.69   78969393     0.00     0.00  ExecQual <cycle 4> [34]
  2.1     397.46    11.49   79296415     0.00     0.00  ExecEvalFuncArgs <cycle 4> [42]
  1.4     404.71     7.25                               _mcount (6219)
  1.3     411.73     7.02   11293115     0.00     0.00  heapgettup [31]
  1.2     418.34     6.61   98975017     0.00     0.00  ExecClearTuple [43]
  1.0     423.93     5.59   98971592     0.00     0.00  ExecStoreTuple [33]
  0.9     428.87     4.94  197952332     0.00     0.00  MemoryContextSwitchTo [53]
  0.9     433.53     4.66    7612547     0.00     0.00  heap_formtuple [39]
  0.8     437.96     4.43    7609318     0.00     0.01  ExecScanHashBucket <cycle 4> [17]
  0.8     442.34     4.38          8   547.50   547.50  .rem [55]
  0.8     446.64     4.30   79296261     0.00     0.00  ExecEvalOper <cycle 4> [56]

I'm not sure what to make of this.

Thanks,

-X