RE: [HACKERS] sort on huge table - Mailing list pgsql-hackers

From Ansley, Michael
Subject RE: [HACKERS] sort on huge table
Date
Msg-id 1BF7C7482189D211B03F00805F8527F748C208@S-NATH-EXCH2
Responses Re: [HACKERS] sort on huge table  (Tatsuo Ishii <t-ishii@sra.co.jp>)
List pgsql-hackers
Now that's as close to linear as you are going to get.  Pretty good, I think:
a sort of one billion rows in half an hour.
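
A quick way to check the scaling is to take the timings quoted below and work out the cost of each 100-unit step. This is only a rough sketch: the helper names are made up for illustration, the sizes are in whatever unit the quoted table uses, and the times are read as minutes:seconds. Perfectly linear scaling would give a constant cost per step; here it rises from 91 s at the smallest size to about 193 s at the largest.

```python
# Timings copied from the quoted table: (size as labelled there, elapsed "m:ss").
# Assumption: elapsed values are minutes:seconds.
TIMINGS = [
    (100, "1:31"), (200, "4:24"), (300, "7:27"), (400, "11:11"),
    (500, "14:01"), (600, "18:31"), (700, "22:24"), (800, "24:36"),
    (900, "28:12"), (1000, "32:14"),
]


def to_seconds(elapsed):
    """Convert an "m:ss" string to a number of seconds."""
    minutes, seconds = elapsed.split(":")
    return int(minutes) * 60 + int(seconds)


def cost_per_step(timings):
    """Return (size, average seconds per 100 size units) for each row."""
    return [(size, to_seconds(t) / (size / 100)) for size, t in timings]


def increments(timings):
    """Return the extra seconds taken by each successive 100-unit step."""
    secs = [to_seconds(t) for _, t in timings]
    return [b - a for a, b in zip([0] + secs, secs)]


if __name__ == "__main__":
    rows = zip(cost_per_step(TIMINGS), increments(TIMINGS))
    for (size, per_step), delta in rows:
        print(f"{size:>5}  {per_step:7.1f} s per 100 units  (+{delta} s)")
```

The increments column is the more telling one for the 1Gb question: a jump there right after the 400 row would point at the segment boundary.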

Mikea

>> -----Original Message-----
>> From: Tatsuo Ishii [mailto:t-ishii@sra.co.jp]
>> Sent: Thursday, November 04, 1999 10:30 AM
>> To: Tom Lane
>> Cc: t-ishii@sra.co.jp; pgsql-hackers@postgreSQL.org
>> Subject: Re: [HACKERS] sort on huge table 
>> 
>> 
>> >
>> >Tatsuo Ishii <t-ishii@sra.co.jp> writes:
>> >> I have compared current with 6.5 using a 1000000-tuple table (243MB)
>> >> (I wanted to try a 2GB+ table but 6.5 does not work in this case). The
>> >> result was strange in that current is *faster* than 6.5!
>> >
>> >> RAID5
>> >>     current    2:29
>> >>     6.5.2    3:15
>> >
>> >> non-RAID
>> >>     current    1:50
>> >>     6.5.2    2:13
>> >
>> >> It seems my previous testing was done in the wrong way, or the behavior
>> >> of sorting might be different if the table size is changed?
>> >
>> >Well, I feel better now, anyway ;-).  I thought that my first cut
>> >ought to have been about the same speed as 6.5, and after I added
>> >the code to slurp up multiple tuples in sequence, it should've been
>> >faster than 6.5.  The above numbers seem to be in line with that
>> >theory.  Next question: is there some additional effect that comes
>> >into play once the table size gets really huge?  I am thinking maybe
>> >there's some glitch affecting performance once the temp file size
>> >goes past one segment (1Gb).  Tatsuo, can you try sorts of say
>> >0.9 and 1.1 Gb to see if something bad happens at 1Gb?  I could
>> >try rebuilding here with a small RELSEG_SIZE, but right at the
>> >moment I'm not certain I'd see the same behavior you do...
>> 
>> Ok. I have run some tests with various amounts of data.
>> 
>> RedHat Linux 6.0
>> Kernel 2.2.5-smp
>> 512MB RAM
>> Sort mem: 80MB
>> RAID5
>> 
>> tuples (million)    elapsed (m:ss)
>> 100                  1:31
>> 200                  4:24
>> 300                  7:27
>> 400                 11:11  <-- 970MB
>> 500                 14:01  <-- 1.1GB (segmented files)
>> 600                 18:31
>> 700                 22:24
>> 800                 24:36
>> 900                 28:12
>> 1000                32:14
>> 
>> I didn't see anything bad at 1.1GB (500 million).
>> --
>> Tatsuo Ishii

