Re: changing sort_mem on the fly? - Mailing list pgsql-general
From: Jim C. Nasby
Subject: Re: changing sort_mem on the fly?
Date:
Msg-id: 20050130220936.GQ64304@decibel.org
In response to: Re: changing sort_mem on the fly? (Tom Lane <tgl@sss.pgh.pa.us>)
List: pgsql-general
On Sun, Jan 30, 2005 at 04:49:39PM -0500, Tom Lane wrote:
> "Jim C. Nasby" <decibel@decibel.org> writes:
> > Since the planner knows how many rows will be going into the sort and
> > how wide they are, ISTM it should be able to estimate how much memory
> > will be needed.
>
> ... which is different from how much will be available.  See cost_sort():

OK, I wasn't sure which you were referring to. As I mentioned in the earlier thread, there would have to be some means of accounting for how much memory the active sorts in the system are using. One possibility is that a global counter is updated any time a sort allocates or frees memory. If allocations are done 8k at a time that would probably be too expensive, but I suspect it wouldn't be too bad with larger allocations that are done less frequently. And bear in mind that if these changes prevent even a few sorts an hour from spilling to disk, it's probably still a net gain.

> * If the total volume of data to sort is less than work_mem, we will do
> * an in-memory sort, which requires no I/O and about t*log2(t) tuple
> * comparisons for t tuples.
> *
> * If the total volume exceeds work_mem, we switch to a tape-style merge
> * algorithm. There will still be about t*log2(t) tuple comparisons in
> * total, but we will also need to write and read each tuple once per
> * merge pass. We expect about ceil(log6(r)) merge passes where r is the
> * number of initial runs formed (log6 because tuplesort.c uses six-tape
> * merging). Since the average initial run should be about twice work_mem,
> * we have
> *     disk traffic = 2 * relsize * ceil(log6(p / (2*work_mem)))
> *     cpu = comparison_cost * t * log2(t)
>
> The actual cost of a sort is therefore *highly* sensitive to how much
> memory it is allowed to use. Blowing off any ability to estimate this
> in advance is not going to improve matters...

Doesn't the sort code also have a provision for starting an in-memory sort and then, on realizing that it's going to exceed work_mem, spilling to disk at that point?

Yes, without something as simple as work_mem it is harder to determine how much memory you can use, though I don't think it's impossible. First, a global variable/counter could be used to keep track of how much memory all the active sorts estimate they'll need. That would be a first-order guess. This could also be compared to how much memory is actually being used, though I suspect that would require globally tracking both how much memory each sort estimates it will need and how much it's already using. Of course there would also be some kind of policy limiting sorts from taking over all of sort_mem; possibly a GUC, possibly something more dynamic. Whatever it is, that gives you an upper bound on how much memory would be in use at a given point in time.

BTW, something that just occurred to me... with the new cache management code (ARC and whatever's going to replace it), presumably it will now be better for PostgreSQL to do most of the caching itself rather than leaving it to the OS. This means you'd want to give PostgreSQL the lion's share of memory, but with how sorts currently acquire memory, doing so would be risky. Should there be an option in 8.0 for sorts to use shared memory instead of allocating their own from a (possibly small) pool of memory the OS has?

--
Jim C. Nasby, Database Consultant               decibel@decibel.org
Give your computer some brain candy! www.distributed.net Team #1828

Windows: "Where do you want to go today?"
Linux: "Where do you want to go tomorrow?"
FreeBSD: "Are you guys coming, or what?"
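To make the global-counter idea described above a bit more concrete, here is a minimal standalone C sketch of that kind of bookkeeping. It is an assumption-laden illustration, not PostgreSQL code: the 1 MB chunk size, the pthread mutex (a real version would have to live in shared memory behind PostgreSQL's own locking), and every function name are made up for the example.

/*
 * Hypothetical sketch only: a shared counter that active sorts update in
 * coarse-grained chunks, so a sort can see roughly how much memory all
 * in-progress sorts are using before deciding how much to grab itself.
 */
#include <stddef.h>
#include <stdio.h>
#include <pthread.h>

#define SORT_MEM_CHUNK (1024 * 1024)    /* account in 1 MB steps, not 8k */

static size_t total_sort_mem = 0;       /* in real life: a counter in shared memory */
static pthread_mutex_t sort_mem_lock = PTHREAD_MUTEX_INITIALIZER;

/* Called when a sort grows past another chunk boundary. */
static void
sort_mem_reserve(size_t nchunks)
{
    pthread_mutex_lock(&sort_mem_lock);
    total_sort_mem += nchunks * SORT_MEM_CHUNK;
    pthread_mutex_unlock(&sort_mem_lock);
}

/* Called when a sort finishes or spills to disk. */
static void
sort_mem_release(size_t nchunks)
{
    pthread_mutex_lock(&sort_mem_lock);
    total_sort_mem -= nchunks * SORT_MEM_CHUNK;
    pthread_mutex_unlock(&sort_mem_lock);
}

/* How much of a global sort-memory budget is still unclaimed? */
static size_t
sort_mem_available(size_t global_limit)
{
    size_t used;

    pthread_mutex_lock(&sort_mem_lock);
    used = total_sort_mem;
    pthread_mutex_unlock(&sort_mem_lock);

    return (used >= global_limit) ? 0 : global_limit - used;
}

int
main(void)
{
    size_t limit = 256UL * 1024 * 1024;   /* pretend the global budget is 256 MB */

    sort_mem_reserve(64);                 /* one sort claims 64 MB worth of chunks */
    printf("headroom: %zu MB\n", sort_mem_available(limit) / SORT_MEM_CHUNK);

    sort_mem_release(64);                 /* that sort finishes */
    printf("headroom: %zu MB\n", sort_mem_available(limit) / SORT_MEM_CHUNK);
    return 0;
}

The chunked accounting is exactly the trade-off mentioned above: updating a single shared counter on every 8k allocation would likely be too expensive, but touching it once per megabyte (or once up front, for a sort's whole estimate) keeps the bookkeeping cheap while still giving each new sort a usable picture of how much memory is already spoken for.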