Re: [HACKERS] sort on huge table - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] sort on huge table
Date
Msg-id 20416.939824280@sss.pgh.pa.us
In response to sort on huge table  (Tatsuo Ishii <t-ishii@sra.co.jp>)
Responses Re: [HACKERS] sort on huge table  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
Tatsuo Ishii <t-ishii@sra.co.jp> writes:
> I came across problems with sorting a huge (2.4GB) table.

The current sorting code will fail if the data volume exceeds the
maximum file size on your OS.  (Actually, if long is 32 bits, it might
fail at 2GB even if your OS can handle 4GB files; I'm not sure, but the
code is doing signed-long arithmetic with byte offsets...)
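
To illustrate the 2GB concern, here is a minimal sketch (not the actual
backend code) of a byte offset kept in a signed 32-bit integer.  The
helper name advance_offset32 is hypothetical; the point is only that
such an offset cannot move past 2^31 - 1 bytes, whatever the OS allows.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, for illustration only: advance a byte offset that
 * is stored in a signed 32-bit integer (what "long" is on many platforms).
 * Returns 0 on success, or -1 if the new offset would exceed INT32_MAX.
 * The check is done before adding, because signed overflow is undefined
 * behavior in C.
 */
static int
advance_offset32(int32_t *offset, int32_t nbytes)
{
    assert(*offset >= 0 && nbytes >= 0);

    if (nbytes > INT32_MAX - *offset)
        return -1;              /* would pass the 2GB boundary */

    *offset += nbytes;
    return 0;
}
```

Code that skips a check like this and simply adds to the offset gets
undefined behavior once the total reaches 2GB.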

I am just about to commit code that fixes this by allowing temp files
to have multiple segments like tables can.
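
As a rough sketch of the multi-segment idea (again not the code being
committed), a 64-bit logical offset can be split into a segment number
and an offset within that segment, so each physical file stays small.
The names SEG_BYTES, SegPos and locate_segment are hypothetical, and the
1GB segment size is an assumption made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed segment size for this sketch: 1GB per physical file. */
#define SEG_BYTES ((int64_t) 1 << 30)

/* Position of a byte inside a segmented temp file (hypothetical type). */
typedef struct SegPos
{
    int32_t segno;      /* which segment file */
    int32_t offset;     /* byte offset inside that segment */
} SegPos;

/*
 * Map a logical byte offset onto (segment, offset-in-segment).
 * Because every segment is smaller than 2GB, the per-file offset
 * always fits in a signed 32-bit integer.
 */
static SegPos
locate_segment(int64_t logical_offset)
{
    SegPos pos;

    assert(logical_offset >= 0);
    pos.segno = (int32_t) (logical_offset / SEG_BYTES);
    pos.offset = (int32_t) (logical_offset % SEG_BYTES);
    return pos;
}
```

With these assumptions, byte 2,400,000,000 of a logical file lands in
segment 2 at offset 252,516,352, and no single file ever has to be
addressed past 1GB.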

> o it took 46 minutes to complete following query:

What -S setting are you using?  Increasing it should reduce the time
to sort, so long as you don't make it so large that the backend starts
to swap.  The current default seems to be 512 (kB), which is probably
on the conservative side for modern machines.

> o it produced 7 sort temp files each having size of 1.4GB (total 10GB)

Yes, I've been seeing space consumption of about 4x the actual data
volume.  Next step is to revise the merge algorithm to reduce that.
        regards, tom lane

