Re: [HACKERS] sort on huge table - Mailing list pgsql-hackers

From Tatsuo Ishii
Subject Re: [HACKERS] sort on huge table
Date
Msg-id 199910141459.XAA09758@srapc451.sra.co.jp
In response to Re: [HACKERS] sort on huge table  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: [HACKERS] sort on huge table  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: [HACKERS] sort on huge table  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
>> I will test it with my 2GB table. Creating 4GB would probably be
>> possible, but I don't have enough sort space for that:-)
>
>OK.  I am working on reducing the space requirement, but it would be
>nice to test the bottom-level multi-temp-file code before layering
>more stuff on top of it.  Anyone else have a whole bunch of free
>disk space they could try a big sort with?
>
>> I ran my previous test on 6.5.2, not on current. I hope current is
>> stable enough to perform my testing.
>
>It seems reasonably stable here, though I'm not doing much except
>testing... main problem is you'll need to initdb, which means importing
>your large dataset...

I have done the 2GB test on current (with your fixes). This time the
sorting query worked great! I saw lots of temp files, but the total
disk usage was almost the same as before (~10GB). So I assume this is ok.

>> Talking about -S, I used the default, since setting -S seems to
>> consume too much memory. For example, if I set it to 128MB, the backend
>> process grows over 512MB and is killed because swap space runs
>> out. Maybe the 4x law also applies to -S?
>
>If the code is working correctly then -S should be obeyed ---
>approximately, anyway, since psort.c only counts the actual tuple data;
>it doesn't know anything about AllocSet overhead &etc.  But it looked
>to me like there might be some plain old memory leaks in psort.c, which
>could account for actual usage being much more than intended.  I am
>going to work on cleaning up psort.c after I finish building
>infrastructure for it.
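
To illustrate the accounting gap described above (a limit that counts
only tuple data while the allocator adds overhead on top), here is a
minimal sketch in Python. The function name, the 16-byte per-allocation
overhead, and the tuple sizes are all assumptions made up for
illustration; none of them are taken from psort.c or AllocSet, and the
sketch does not model the suspected leaks.

```python
# Hypothetical figure: bytes of allocator bookkeeping per allocation.
# Chosen only for illustration, not measured from any real allocator.
ASSUMED_OVERHEAD_PER_ALLOC = 16


def sort_memory_usage(tuple_sizes, overhead=ASSUMED_OVERHEAD_PER_ALLOC):
    """Return (counted, actual) bytes for a list of tuple sizes.

    counted: what a limit that sums only tuple data would see.
    actual:  tuple data plus the assumed per-allocation overhead.
    """
    counted = sum(tuple_sizes)
    actual = sum(size + overhead for size in tuple_sizes)
    return counted, actual


# 1000 hypothetical tuples of 40 bytes each.
counted, actual = sort_memory_usage([40] * 1000)
print(counted, actual)  # prints "40000 56000"
```

With small tuples the uncounted overhead is a large fraction of the
total, so real usage can exceed the configured limit even without any
leak.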

I set -S to 8MB, and it seems to boost performance. The sort took
only 22:37 (the previous result was ~45:00).
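
For reference, the ratio between the two timings can be worked out as
below. This is only a sketch: the helper name is made up, the two
figures are read as MM:SS (the ratio is the same if they are HH:MM),
the ~45:00 figure is approximate, and the two runs may differ in more
than the -S setting.

```python
def to_seconds(clock: str) -> int:
    """Convert an 'MM:SS' string to a number of seconds."""
    minutes, seconds = clock.split(":")
    return int(minutes) * 60 + int(seconds)


previous_run = to_seconds("45:00")  # approximate earlier result
tuned_run = to_seconds("22:37")     # run with -S set to 8MB

speedup = previous_run / tuned_run
print(f"speedup: {speedup:.2f}x")  # prints "speedup: 1.99x"
```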
---
Tatsuo Ishii


