Tuning Question sort_mem vs pgsql_tmp

From: Greg Stark
Is pgsql_tmp.* used for anything other than sorts? The only thing I see in a
quick review of the sources is hash joins, but that shouldn't be relevant here.

I've raised sort_mem to 49152 (units of 1kB, as I understand it), yet pgsql_tmp
files are still being created. They range in size from 7M up to 34M. I don't
understand why postgres would need a 34M temporary sort file if it has 48M of
sort memory available. The results weren't much different before I raised
sort_mem.
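
For reference, a minimal sketch of the setting involved (that sort_mem is
measured in 1kB units is my understanding, and the spill-file location below
is the usual per-database layout):

    -- ask for ~48M of in-memory sort space
    SET sort_mem = 49152;
    SHOW sort_mem;

While the query runs, the spill files appear under
$PGDATA/base/<database oid>/pgsql_tmp/.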

The query being run is of the form INSERT INTO () (SELECT DISTINCT ON ...).
Does the insert have to accumulate temporary data somewhere?
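
Concretely, a hypothetical example of that shape (table and column names are
invented for illustration):

    -- keep one row per id, the most recent by ts; DISTINCT ON implies a
    -- sort on the ORDER BY keys, which is where sort_mem comes in
    INSERT INTO latest_events (id, ts, score, label)
    SELECT DISTINCT ON (id) id, ts, score, label
    FROM raw_events
    ORDER BY id, ts DESC;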

Does sort_mem have to be larger than the corresponding pgsql_tmp area that
would be used if postgres runs out of sort_mem?

--
greg

Re: Tuning Question sort_mem vs pgsql_tmp

From: Tom Lane
Greg Stark <gsstark@mit.edu> writes:
> Does sort_mem have to be larger than the corresponding pgsql_tmp area that
> would be used if postgres runs out of sort_mem?

Probably.  At least in recent versions, the "do we still fit in
sort_mem" logic tries to account for palloc overhead and alignment
padding, neither of which are present in the on-disk representation
of the same tuples.  So data unloaded to disk should be more compact
than it was in memory.  You didn't say what you were sorting, but
if it's narrow rows (like maybe just an int or two) the overhead
could easily be more than the actual data size.
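
To put rough numbers on that (the per-tuple sizes here are assumptions for
illustration, not measured values; real overheads vary by version and
platform):

    -- hypothetical accounting for a narrow row: a ~24-byte tuple header
    -- carrying 8 bytes of data, written as-is on disk, but charged an
    -- assumed 16-byte palloc chunk header plus rounding up to the next
    -- 8-byte boundary while in memory
    SELECT 24 + 8                        AS bytes_on_disk,
           16 + (((24 + 8) + 7) / 8) * 8 AS bytes_in_memory;
    -- 32 vs 48: memory needs ~50% more space than the disk copy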

            regards, tom lane

Re: Tuning Question sort_mem vs pgsql_tmp

From: Greg Stark
Tom Lane <tgl@sss.pgh.pa.us> writes:

> Greg Stark <gsstark@mit.edu> writes:
> > Does sort_mem have to be larger than the corresponding pgsql_tmp area that
> > would be used if postgres runs out of sort_mem?
>
> Probably.  At least in recent versions, the "do we still fit in
> sort_mem" logic tries to account for palloc overhead and alignment
> padding, neither of which are present in the on-disk representation
> of the same tuples.  So data unloaded to disk should be more compact
> than it was in memory.  You didn't say what you were sorting, but
> if it's narrow rows (like maybe just an int or two) the overhead
> could easily be more than the actual data size.

Thank you. 64M seems to be enough after all; 48M just wasn't big enough. At
64M I don't see any more usage of pgsql_tmp. The largest on-disk sort was
35,020,800 bytes, so that translates to a 44%-92% space overhead.
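
(The arithmetic behind that range, as a quick check anyone can run; the
1kB-unit interpretation of sort_mem is my assumption:)

    -- 48M (49152 kB) was too small, 64M (65536 kB) sufficed, against a
    -- 35,020,800-byte on-disk sort
    SELECT round((49152 * 1024.0 / 35020800 - 1) * 100) AS min_overhead_pct, -- 44
           round((65536 * 1024.0 / 35020800 - 1) * 100) AS max_overhead_pct; -- 92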

It turns out it was the same data structure as in my earlier message, which
puts it at 53-byte records in practice: two integers, a float, and a varchar
of up to 12 characters.

--
greg