Re: [HACKERS] \dt and disk access - Mailing list pgsql-hackers
| From | Bruce Momjian |
|---|---|
| Subject | Re: [HACKERS] \dt and disk access |
| Date | |
| Msg-id | f336f4eae22a60b3f2c09348a2230c8b |
| In response to | [HACKERS] \dt and disk access (Bruce Momjian <maillist@candle.pha.pa.us>) |
| List | pgsql-hackers |

>
> Bruce Momjian wrote:
> >
> > > Can you make the size of the result set above which diskfiles will be used
> > > configurable? That way ppl with loads of RAM can use huge buffers, and ppl
> > > with little RAM can keep that RAM free for other processes.
> >
> > If I need a configuration value, I will either determine the amount of
> > RAM portably, or base the value on the number of shared buffers
> > requested with -B.
>
> Why don't use additional flag as it suggested by Mark ?
> Using -B is not good: the size of shared memory segment may be
> limited by OS (under FreeBSD 2.1.0 it's limited to 4M) or system
> administrator and so backend will use 2-4 M of RAM for sort on
> box with 192 M RAM ?

OK, I will use a new flag.

>
> This flag may be useful for joinpath.c:EnoughMemoryForHashjoin() too...
>
> Also note that following
>
> - make psort read directly from the executor node below it
>   (instead of an input relation)
>
> it will be impossible to know the size of result before sort startup.
> So you have to use palloc up to in-memory limit and switch to
> 'tape' files dynamically.

I like this idea. I was struggling on how I was going to determine the
size of the result anyway.

I have checked the Mariposa source changes Paul mentioned. They do
indeed change the behavior of psort(). It still uses tape files, so I
will need to increase the memory used for each sort, and only create the
tape files if the initial sort does not fit within the allocated memory.

>
> Also
>
> - makes the Sort node read directly from the last set of psort runs
>   (instead of an output relation)
>
> require changes to ExecSortMarkPos()/ExecSortRestrPos() which
> use heap_markpos()/heap_restrpos() (because of last set of
> psort is not normal heap relation).
>
> But both changes of nodeSort.c are what we really need.

With the new psort(), you can do multiple sorts at the same time.

The new psort() comment says:

 *      The old psort.c's routines formed a temporary relation from the merged
 * sort files. This version keeps the files around instead of generating the
 * relation from them, and provides interface functions to the file so that
 * you can grab tuples, mark a position in the file, restore a position in the
 * file. You must now explicitly call an interface function to end the sort,
 * psort_end, when you are done.
 *      Now most of the global variables are stuck in the Sort nodes, and
 * accessed from there (they are passed to all the psort routines) so that
 * each sort running has its own separate state. This is facilitated by having
 * the Sort nodes passed in to all the interface functions.
 *      The one global variable that all the sorts still share is SortMemory.
 *      You should now be allowed to run two or more psorts concurrently,
 * so long as the memory they eat up is not greater than SORTMEM, the initial
 * value of SortMemory.                                 -Rex 2.15.1995
 *
 *      Use the tape-splitting method (Knuth, Vol. III, pp281-86) in the future.

I am uploading mariposa-alpha-1.tar.gz to the postgreSQL ftp incoming
directory because I think I am going to need help on this one. The
official mariposa ftp site is very, very slow and unreliable. This
release is dated June, 1996, and is the newest available.

--
Bruce Momjian
maillist@candle.pha.pa.us
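
For readers following the thread, the "palloc up to in-memory limit and switch to 'tape' files dynamically" idea can be sketched roughly as below. This is only an illustration under stated assumptions, not the psort.c or Mariposa code: plain ints stand in for tuples, malloc() stands in for palloc(), tmpfile() stands in for the tape files, and the names SortState, sort_begin, sort_put, sort_finish and MAX_RUNS are all hypothetical. The final step is a simple linear k-way merge, not the tape-splitting method from Knuth that the comment mentions.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

#define MAX_RUNS 16                 /* hypothetical cap on spilled runs */

typedef struct SortState {
    int    *buf;                    /* in-memory "tuples" (plain ints here) */
    size_t  count;                  /* tuples currently held in buf */
    size_t  limit;                  /* in-memory limit, in tuples */
    FILE   *runs[MAX_RUNS];         /* sorted runs spilled to temp files */
    int     nruns;                  /* number of spilled runs so far */
} SortState;

static int cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a, y = *(const int *) b;
    return (x > y) - (x < y);
}

/* Allocate the in-memory buffer. Returns 0 on success, -1 on failure. */
int sort_begin(SortState *s, size_t limit)
{
    assert(limit > 0);
    s->buf = malloc(limit * sizeof(int));
    s->count = 0;
    s->limit = limit;
    s->nruns = 0;
    return s->buf ? 0 : -1;
}

/* Sort the full buffer and write it out as one run in a temp file. */
static int spill(SortState *s)
{
    FILE *f;

    if (s->nruns >= MAX_RUNS || (f = tmpfile()) == NULL)
        return -1;
    qsort(s->buf, s->count, sizeof(int), cmp_int);
    if (fwrite(s->buf, sizeof(int), s->count, f) != s->count) {
        fclose(f);
        return -1;
    }
    rewind(f);                      /* ready to be read back during the merge */
    s->runs[s->nruns++] = f;
    s->count = 0;
    return 0;
}

/* Add one tuple; a temp file is created only once the buffer is full. */
int sort_put(SortState *s, int value)
{
    if (s->count == s->limit && spill(s) != 0)
        return -1;
    s->buf[s->count++] = value;
    return 0;
}

/* Merge the in-memory remainder with any spilled runs into out[].
 * Returns the number of tuples written (at most cap) and frees resources. */
size_t sort_finish(SortState *s, int *out, size_t cap)
{
    int    head[MAX_RUNS];          /* next unread value of each run */
    int    live[MAX_RUNS];          /* nonzero while head[i] is valid */
    size_t mem = 0, n = 0;
    int    i;

    qsort(s->buf, s->count, sizeof(int), cmp_int);
    for (i = 0; i < s->nruns; i++)
        live[i] = fread(&head[i], sizeof(int), 1, s->runs[i]) == 1;

    while (n < cap) {
        int best = -2;              /* -2: nothing left, -1: memory, >=0: run */
        int bestval = 0;

        if (mem < s->count) {
            best = -1;
            bestval = s->buf[mem];
        }
        for (i = 0; i < s->nruns; i++)
            if (live[i] && (best == -2 || head[i] < bestval)) {
                best = i;
                bestval = head[i];
            }
        if (best == -2)
            break;
        out[n++] = bestval;
        if (best == -1)
            mem++;
        else
            live[best] = fread(&head[best], sizeof(int), 1, s->runs[best]) == 1;
    }
    for (i = 0; i < s->nruns; i++)
        fclose(s->runs[i]);
    free(s->buf);
    s->buf = NULL;
    s->count = 0;
    s->nruns = 0;
    return n;
}
```

Because all state lives in the SortState passed to each call, two such sorts can run side by side, which is the same property the psort() comment describes for the Sort nodes. A sort that fits within the limit never touches a file at all.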