I mean without doing further calculation to account for other operations.
IMHO the best thing you can do is call estimate_num_groups() and combine that with the number of input rows to get the average group size. That should benefit from ndistinct coefficients when available, etc. I've been thinking that, given the unreliability of grouping estimates, we should use a multiple of the average size (because there may be much larger groups), but I think that's quite unprincipled and I'd much rather try without it first.
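
To make the arithmetic concrete, here is a rough sketch in Python rather than planner C. It is not a patch: avg_group_size() and its "fudge" parameter are hypothetical names, and num_groups merely stands in for whatever estimate_num_groups() would return. The clamping of the group count is my assumption too.

```python
def avg_group_size(input_rows: float, num_groups: float, fudge: float = 1.0) -> float:
    """Estimate the average number of rows per group.

    input_rows: estimated number of input rows.
    num_groups: estimated number of groups (stand-in for the value
                estimate_num_groups() would produce).
    fudge:      hypothetical multiplier for the "multiple of the average
                size" idea; 1.0 means plain average, i.e. no fudging.
    """
    if input_rows <= 0:
        return 0.0
    # Assumption: keep the group count in [1, input_rows] so the
    # average never drops below one row per group.
    groups = min(max(num_groups, 1.0), input_rows)
    return fudge * input_rows / groups


# Plain average: 10000 rows in an estimated 100 groups.
print(avg_group_size(10000, 100))
# Same estimate, padded by a factor of 2 in case some groups are much larger.
print(avg_group_size(10000, 100, fudge=2.0))
```

With fudge left at 1.0 this is just rows / groups, which is what I'd try first.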