Thread: Tuning Question sort_mem vs pgsql_tmp
Is pgsql_tmp.* used for anything other than sorts? The only thing I see in a quick review of the sources is hash joins, but that shouldn't be relevant here.

I've raised sort_mem to 49152 (1 kB units, as I understand it), yet pgsql_tmp files are still being created. They range in size from 7M up to 34M. I don't understand why postgres would need 34M of temporary sort space if it has 48M of sort memory available. The results weren't much different before I raised sort_mem.

The query being run is of the form INSERT INTO () (SELECT DISTINCT ON...). Does the insert have to accumulate temporary data somewhere? Does sort_mem have to be larger than the corresponding pgsql_tmp area that would be used if postgres runs out of sort_mem?

-- greg
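[A quick sanity check of the units described above, as a sketch in Python; it assumes sort_mem really is measured in 1 kB units, and uses the 34M figure for the largest temp file seen on disk.]

```python
# Sanity-checking the units: if sort_mem is in 1 kB units, then 49152
# should mean 48 MB of sort memory, which is comfortably larger than
# the biggest ~34 MB pgsql_tmp file observed on disk.
sort_mem_kb = 49152
sort_mem_bytes = sort_mem_kb * 1024
largest_tmp_file = 34 * 1024**2           # largest observed temp file, ~34M

print(sort_mem_bytes // 1024**2, "MB")    # 48 MB
print(sort_mem_bytes > largest_tmp_file)  # True -- hence the confusion
```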
Greg Stark <gsstark@mit.edu> writes:
> Does sort_mem have to be larger than the corresponding pgsql_tmp area that
> would be used if postgres runs out of sort_mem?

Probably. At least in recent versions, the "do we still fit in sort_mem" logic tries to account for palloc overhead and alignment padding, neither of which is present in the on-disk representation of the same tuples. So data unloaded to disk should be more compact than it was in memory. You didn't say what you were sorting, but if it's narrow rows (like maybe just an int or two), the overhead could easily be more than the actual data size.

			regards, tom lane
Tom Lane <tgl@sss.pgh.pa.us> writes:
> Greg Stark <gsstark@mit.edu> writes:
> > Does sort_mem have to be larger than the corresponding pgsql_tmp area that
> > would be used if postgres runs out of sort_mem?
>
> Probably. At least in recent versions, the "do we still fit in
> sort_mem" logic tries to account for palloc overhead and alignment
> padding, neither of which are present in the on-disk representation
> of the same tuples. So data unloaded to disk should be more compact
> than it was in memory. You didn't say what you were sorting, but
> if it's narrow rows (like maybe just an int or two) the overhead
> could easily be more than the actual data size.

Thank you. 64M seems to be enough after all; 48M just wasn't big enough. At 64M I don't see any more usage of pgsql_tmp.

The largest on-disk sort was 35,020,800 bytes, so that translates to a 44%-92% space overhead. It turns out it was the same data structure as in my earlier message, which puts it at 53-byte records in practice: two integers, a float, and a varchar with up to 12 characters.

-- greg
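[The 44%-92% range above follows directly from the figures in the thread: the 35,020,800-byte on-disk sort did not fit in 48 MB of sort_mem but did fit in 64 MB, which bounds the in-memory overhead. A quick check of that arithmetic:]

```python
# The on-disk sort was 35,020,800 bytes. It overflowed 48 MB of sort_mem
# but fit in 64 MB, so the in-memory representation must be between
# roughly 48 MB and 64 MB -- bounding the overhead relative to disk size.
on_disk = 35_020_800
low, high = 48 * 1024**2, 64 * 1024**2

print(f"{low / on_disk - 1:.0%} to {high / on_disk - 1:.0%}")  # 44% to 92%
```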