"Robert Bedell" <robert@friendlygenius.com> writes:
> 1) When does the optimizer set the nodeAgg plan to HASHED?
See grouping_planner() in src/backend/optimizer/plan/planner.c
particularly the logic around use_hashed_grouping.
> 2) What mechanism would be best to use for storing the data on disk? I know
> there is a temporary table mechanism, I'll be hunting for that shortly.
Temp files, not temp tables. You could look at
src/backend/utils/sort/tuplesort.c or src/backend/executor/nodeHash.c
for examples.
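As a rough illustration of the temp-file idea, here is a minimal sketch that writes data out and reads it back. It uses plain C stdio (tmpfile) rather than the backend's own temp-file code, and spill_and_reload is a made-up name, not anything in the PostgreSQL tree:

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical illustration only: write n ints to an anonymous temp
 * file, then read them back into out.  Returns the number of items
 * read back, or 0 on failure.
 */
size_t
spill_and_reload(const int *vals, size_t n, int *out)
{
    FILE   *fp;
    size_t  nread = 0;

    assert(vals != NULL && out != NULL);

    fp = tmpfile();             /* file is removed automatically on close */
    if (fp == NULL)
        return 0;

    /* Write everything out, then rewind and read it back in. */
    if (fwrite(vals, sizeof(int), n, fp) == n && fflush(fp) == 0)
    {
        rewind(fp);
        nread = fread(out, sizeof(int), n, fp);
    }

    fclose(fp);
    return nread;
}
```

A real spill path would write tuples in batches and read them back one batch at a time, but the write, rewind, read cycle is the same.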
> 3) What should define the spillover point.
sort_mem.
> The documentation points to the
> 'sort_mem' parameter for this, but the code doesn't look to actually
> implement that yet.
Well, yeah, that's sort of exactly the point ... it's considered during
planning but the executor code has no fallback if the planner guesses
wrong.
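To make the planning-time check concrete, here is a minimal sketch of that kind of test. It is not the actual code in planner.c; hash_agg_fits and its parameters are hypothetical, and the per-entry size is an assumed input rather than something computed here. The only fact taken from the server is that sort_mem is expressed in kilobytes:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical illustration only, not the PostgreSQL planner code.
 * Decide whether a hashed aggregate is expected to fit in memory.
 *
 * est_groups   estimated number of distinct groups
 * entry_bytes  assumed size of one hash table entry, in bytes
 * sort_mem_kb  memory budget in kilobytes
 */
bool
hash_agg_fits(double est_groups, size_t entry_bytes, long sort_mem_kb)
{
    double  est_bytes;
    double  budget_bytes;

    assert(est_groups >= 0.0 && sort_mem_kb >= 0);

    est_bytes = est_groups * (double) entry_bytes;
    budget_bytes = (double) sort_mem_kb * 1024.0;

    return est_bytes <= budget_bytes;
}
```

If est_groups is a bad underestimate, a check like this says "fits" and the hash table then grows past the budget at run time, which is the missing-fallback case described above.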
> 4) Should LookupTupleHashEntry() be worried about the pointers it
> receives...similarly for hash_search()?
Eh?
regards, tom lane