Thread: RE: [HACKERS] sort on huge table
Now that's as close to linear as you are going to get.  Pretty good I
think: a sort of one billion rows in half an hour.

Mikea

>> -----Original Message-----
>> From: Tatsuo Ishii [mailto:t-ishii@sra.co.jp]
>> Sent: Thursday, November 04, 1999 10:30 AM
>> To: Tom Lane
>> Cc: t-ishii@sra.co.jp; pgsql-hackers@postgreSQL.org
>> Subject: Re: [HACKERS] sort on huge table
>>
>> >Tatsuo Ishii <t-ishii@sra.co.jp> writes:
>> >> I have compared current with 6.5 using a 1000000-tuple table (243MB)
>> >> (I wanted to try a 2GB+ table, but 6.5 does not work in that case).
>> >> The result was strange in that current is *faster* than 6.5!
>> >
>> >> RAID5
>> >>    current 2:29
>> >>    6.5.2   3:15
>> >
>> >> non-RAID
>> >>    current 1:50
>> >>    6.5.2   2:13
>> >
>> >> Seems my previous testing was done in a wrong way, or the behavior of
>> >> sorting might be different if the table size is changed?
>> >
>> >Well, I feel better now, anyway ;-).  I thought that my first cut
>> >ought to have been about the same speed as 6.5, and after I added
>> >the code to slurp up multiple tuples in sequence, it should've been
>> >faster than 6.5.  The above numbers seem to be in line with that
>> >theory.  Next question: is there some additional effect that comes
>> >into play once the table size gets really huge?  I am thinking maybe
>> >there's some glitch affecting performance once the temp file size
>> >goes past one segment (1Gb).  Tatsuo, can you try sorts of say
>> >0.9 and 1.1 Gb to see if something bad happens at 1Gb?  I could
>> >try rebuilding here with a small RELSEG_SIZE, but right at the
>> >moment I'm not certain I'd see the same behavior you do...
>>
>> Ok. I have run some tests with various amounts of data.
>>
>> RedHat Linux 6.0
>> Kernel 2.2.5-smp
>> 512MB RAM
>> Sort mem: 80MB
>> RAID5
>>
>> 100 million tuples   1:31
>> 200                  4:24
>> 300                  7:27
>> 400                 11:11  <-- 970MB
>> 500                 14:01  <-- 1.1GB (segmented files)
>> 600                 18:31
>> 700                 22:24
>> 800                 24:36
>> 900                 28:12
>> 1000                32:14
>>
>> I didn't see anything bad at 1.1GB (500 million).
>> --
>> Tatsuo Ishii
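For readers who want the segment arithmetic spelled out: PostgreSQL stores
large relation and sort temp files as a series of fixed-size segment files,
and Tom's concern above is the point where the sort's temp file spills into
a second segment.  Here is a minimal sketch of that arithmetic, assuming the
1 GB segment size he quotes and the two file sizes Tatsuo annotates in his
results; the code and names are illustrative, not from the thread:

    # Not from the original thread: illustrate the file segmentation Tom
    # is asking about.  PostgreSQL splits large relation/temp files into
    # fixed-size segments; he quotes 1 GB as the segment size, so a temp
    # file that grows past 1 GB starts a second segment file.
    SEGMENT_BYTES = 1 << 30   # 1 GB, the segment size mentioned above

    def segments_needed(file_bytes):
        """Number of segment files needed to hold file_bytes."""
        return max(1, -(-file_bytes // SEGMENT_BYTES))   # ceiling division

    # The two temp-file sizes Tatsuo annotates in his results
    for label, mbytes in [("970MB run", 970), ("~1.1GB run", 1126)]:
        print(f"{label}: {segments_needed(mbytes * 1024 * 1024)} segment file(s)")

Running it shows the 970MB case fitting in one segment and the ~1.1GB case
needing two, which is exactly the boundary Tom wants probed.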
>Now that's as close to linear as you are going to get. Pretty good I think:
>a sort of one billion rows in half an hour.

Oops! It's not one billion but 10 million. Sorry.

 1 million tuples   1:31
 2                  4:24
 3                  7:27
 4                 11:11  <-- 970MB
 5                 14:01  <-- 1.1GB (segmented files)
 6                 18:31
 7                 22:24
 8                 24:36
 9                 28:12
10                 32:14
--
Tatsuo Ishii
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
>> Now that's as close to linear as you are going to get. Pretty good I think:
>> a sort of one billion rows in half an hour.

> Oops! It's not one billion but 10 million. Sorry.

Actually, the behavior ought to be O(N log N), not O(N).  With a little
bit of arithmetic, we get

 MTuples   Time     Sec   Delta    Sec/       MTuple/sec
                          sec      MTuple
    1      1:31      91     91      91         0.010989
    2      4:24     264    173     132         0.00757576
    3      7:27     447    183     149         0.00671141
    4     11:11     671    224     167.75      0.00596125
    5     14:01     841    170     168.2       0.0059453
    6     18:31    1111    270     185.167     0.00540054
    7     22:24    1344    233     192         0.00520833
    8     24:36    1476    132     184.5       0.00542005
    9     28:12    1692    216     188         0.00531915
   10     32:14    1934    242     193.4       0.00517063

which is obviously nonlinear.  Column 5 should theoretically be a log(N)
curve, and it's not too hard to draw one that matches it pretty well
(see attached plot).

It's pretty clear that we don't have any specific problem with the
one-versus-two-segment issue, which is good (I looked again at the code
and couldn't see any reason for such a problem to exist).  But there's
still the question of whether the old code is faster.  Tatsuo, could you
run another set of numbers using 6.5.* and the same test conditions, as
far up as you can get with 6.5?  (I think you ought to be able to reach
2Gb, though not pass it, so most of this curve can be compared to 6.5.)

			regards, tom lane
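To make the derivation reproducible, here is a small stand-alone script
(not from the original thread) that recomputes Tom's derived columns from
the raw timings Tatsuo posted and does a least-squares fit of sec/MTuple
against ln(N), the shape an O(N log N) sort predicts.  Only the timing
pairs come from the table above; everything else is illustrative:

    # Not from the original thread: recompute the derived columns from
    # Tatsuo's raw timings and fit sec/MTuple to a + b*ln(N), the shape
    # an O(N log N) external sort predicts.
    import math

    # (million tuples, elapsed time) as posted above
    timings = [
        (1, "1:31"), (2, "4:24"), (3, "7:27"), (4, "11:11"), (5, "14:01"),
        (6, "18:31"), (7, "22:24"), (8, "24:36"), (9, "28:12"), (10, "32:14"),
    ]

    def to_seconds(mmss):
        m, s = mmss.split(":")
        return int(m) * 60 + int(s)

    rows = []
    prev = 0
    for n, t in timings:
        sec = to_seconds(t)
        rows.append((n, sec, sec - prev, sec / n, n / sec))
        prev = sec

    print("MTuples   Sec  Delta  Sec/MTuple  MTuple/sec")
    for n, sec, delta, per, rate in rows:
        print(f"{n:7d} {sec:5d} {delta:6d} {per:11.2f} {rate:11.6f}")

    # Simple least-squares fit of sec/MTuple = a + b * ln(N)
    xs = [math.log(n) for n, _, _, _, _ in rows]
    ys = [per for _, _, _, per, _ in rows]
    k = len(xs)
    sx, sy = sum(xs), sum(ys)
    sxx = sum(x * x for x in xs)
    sxy = sum(x * y for x, y in zip(xs, ys))
    b = (k * sxy - sx * sy) / (k * sxx - sx * sx)
    a = (sy - b * sx) / k
    print(f"fit: sec/MTuple ~= {a:.1f} + {b:.1f} * ln(millions of tuples)")

The per-million-tuple cost growing roughly logarithmically, rather than
staying constant, is the "obviously nonlinear" pattern Tom points out.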