Re: [HACKERS] \dt and disk access - Mailing list pgsql-hackers
| From | Bruce Momjian |
|---|---|
| Subject | Re: [HACKERS] \dt and disk access |
| Date | |
| Msg-id | 640f2198ab5800447460bf75cbfc2d09 Whole thread Raw |
| In response to | [HACKERS] \dt and disk access (Bruce Momjian <maillist@candle.pha.pa.us>) |
| List | pgsql-hackers |
> I don't fully understand what do you mean... > btree-build uses qsort to sort items on a page and has own > priority queue methods (instead of lselect.c - leftist tree > selection algorithm). > Do you want use qsort for in-memory sorts ? When I said btree-build, I meant the leftist tree sort, so I am considering replacing the leftist-tree selection algorithm with a qsort of an array of HeapTuple pointers. Any comments on this? The only performance problem I see is re-allocing the array as the number of entries grows, but if I double the size each time I exceed the current size, I don't believe it will be much of a problem. The leftist tree looked very funny to me, with recursion and stuff. I saw the new FASTBUILD sort used qsort, so I thought it may be a good idea. > > BTW, I may suggest additional trick: > if size of sort-keys is (much?) smaller then size of tuple > (select _int4_field_, _big_text_field_ from ... order by _int4_field_) > and you are going to switch to tape-methods (i.e. you've pallocted > all sort-memory) then you may try to create temp-file to store > (unsorted) tuples there and to deal with sort-keys only. After > you'll got sort-keys (or better say - records which keep sort-keys > and pointers - temp-file offsets - to original tuples) sorted, you > may use these pointers to get requested tuple from temp-file. I am a little lost here. The Mariposa sort code allows ExecSort to read directly from final sorted tape file, preventing the copy to a temp table. So, if I don't store the full tuple in the tape file, I can't pass a REAL tuple back to ExecSort. I assume you are suggesting putting all the tuples in one tape file, then doing the sort with tape file ONLY with the indexed fields, then using those indexed fields to retrieve the data from the tape. The problem I see is that the final read of sorted rows from the full tape file will be seeking all over the file, causing poor performance. Now, not having to move around extra data during the sort is going to be good for performance, but I am not sure what the result will be. I think you are correct that it will be a win to only sort the needed columns, but considering that any sort that uses tape files will be big, and hence will not fit in the file system cache, and considering the extra code to do this, I am uncertain about its usefulness. > > Even for select int4_field from ... order by int4_field > current psort palloc 48(header) + 4 (field) + 4 (aligning) = 56 bytes > for each tuple - why don't work with 4(field) + 4 (offset) = 8 bytes > of sort-records (tm-:)) ? Again, can I do this if I need to pass back the actual row? Also, another confusing issue is that when I start the sort, I don't know if it is going to fit into memory or not, so I must store the entire row into memory because I might be returning a pointer and not using tape files. Can I then switch in the middle of the sort to tape files and start doing my comparisons differently. Seems hard to me. I am just guessing on these issues. Any comments? - -- Bruce Momjian maillist@candle.pha.pa.us ------------------------------
pgsql-hackers by date: