Tiago Antão <tra@fct.unl.pt> writes:
> About analyze.c:
> If it is taken out of vacuum, couldn't it be taken out of pg completely?
> Say, to an external program?
Not if you want to do anything useful with it --- direct access to the
database is only possible within the context of a backend, because of
all the locking, buffering, etc. behavior that you must adhere to.
> What's the big reason not to do that? I know that
> there is some code in analyze.c (like comparing) that uses other parts of
> pg, but that seems to be easily fixed.
Are you proposing not to do any comparisons? It will be interesting to
see how you can compute a histogram without any idea of equality or
ordering. But if you want that, then you still need the function-call
manager as well as the type-specific comparison routines for every
datatype that you might be asked to operate on (don't forget
user-defined types here).
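
To make the dependency concrete, here is a minimal sketch in Python (not
backend code) of picking equi-depth histogram boundaries from a sample.
The names equi_depth_bounds, int_cmp and ci_cmp are hypothetical, made up
for illustration; the point is only that the sort cannot happen without a
comparison routine supplied per datatype.

```python
from functools import cmp_to_key


def equi_depth_bounds(sample, cmp, num_buckets):
    """Return num_buckets + 1 boundary values of an equi-depth histogram.

    `cmp(a, b)` is the type-specific comparison routine: negative, zero,
    or positive, in the style of a C comparator. Nothing here works
    without it, which is the dependency described above.
    """
    if not sample or num_buckets < 1:
        raise ValueError("need a non-empty sample and at least one bucket")
    # Ordering the sample is the step that requires the comparator.
    ordered = sorted(sample, key=cmp_to_key(cmp))
    last = len(ordered) - 1
    # Pick evenly spaced positions in the ordered sample as boundaries.
    return [ordered[(i * last) // num_buckets] for i in range(num_buckets + 1)]


def int_cmp(a, b):
    """Comparator for a built-in type (integers)."""
    return (a > b) - (a < b)


def ci_cmp(a, b):
    """Comparator standing in for a user-defined type:
    strings ordered case-insensitively."""
    la, lb = a.lower(), b.lower()
    return (la > lb) - (la < lb)


int_bounds = equi_depth_bounds([5, 1, 4, 2, 3, 9, 7, 8, 6], int_cmp, 4)
str_bounds = equi_depth_bounds(["b", "A", "c"], ci_cmp, 2)
```

Swapping int_cmp for ci_cmp changes the resulting order without touching
the histogram code, which is roughly the role the function-call manager
and the per-type comparison routines play inside the backend.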
In short, I doubt you can build a useful analyze-engine that's
significantly smaller than a full backend. Besides, having ANALYZE
available as a regular SQL command is just too useful to want to see
it moved out to some outside program that would have to be run
separately.
> I'm leaning toward the implementation of end-biased histograms. There is
> an introductory reference in the IEEE Data Engineering Bulletin, September
> 1995 (available on the Microsoft Research site).
Sounds interesting. Can you give us an exact URL?
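
For readers unfamiliar with the term, the following is a rough sketch in
Python of one common reading of an end-biased histogram: keep exact counts
for the values at the end of the frequency distribution and summarize
everything else with a single average. It is a simplification under stated
assumptions, not the method from the cited bulletin: it keeps only the k
most frequent values (the usual definition also allows singling out the
least frequent ones), and the name end_biased_histogram is hypothetical.

```python
from collections import Counter


def end_biased_histogram(values, k):
    """Return (exact, rest_avg) for a simplified end-biased histogram.

    exact    -- dict mapping each of the k most frequent values to its count
    rest_avg -- average count of all remaining distinct values
                (0.0 if there are none)

    Assumption: only the high-frequency end is kept exactly.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    counts = Counter(values)
    # most_common() orders distinct values by descending count.
    ranked = counts.most_common()
    exact = dict(ranked[:k])
    rest = ranked[k:]
    rest_avg = sum(c for _, c in rest) / len(rest) if rest else 0.0
    return exact, rest_avg


sample = ["a"] * 5 + ["b"] * 3 + ["c", "d"]
exact, rest_avg = end_biased_histogram(sample, 2)
```

An estimator built on this would answer an equality lookup with the exact
count when the value is in exact, and with rest_avg otherwise.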
regards, tom lane