Re: analyze.c - Mailing list pgsql-hackers

From: Tom Lane
Subject: Re: analyze.c
Msg-id: 28154.967041988@sss.pgh.pa.us
In response to: analyze.c  (Tiago Antão <tra@fct.unl.pt>)
Responses: Re: analyze.c  (Tiago Antão <tra@fct.unl.pt>)
List: pgsql-hackers

Tiago Antão <tra@fct.unl.pt> writes:
>   About analyze.c:
>   If taken out of vacuum, couldn't it be completely taken out of pg? Say,
> to an external program?

Not if you want to do anything useful with it --- direct access to the
database is only possible within the context of a backend, because of
all the locking, buffering, etc. behavior that you must adhere to.

> What's the big reason not to do that? I know that
> there is some code in analyze.c (like comparing) that uses other parts of
> pg, but that seems to be easily fixed.

Are you proposing not to do any comparisons?  It will be interesting to
see how you can compute a histogram without any idea of equality or
ordering.  But if you do want comparisons, then you still need the function-call
manager as well as the type-specific comparison routines for every
datatype that you might be asked to operate on (don't forget
user-defined types here).
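
To make the dependency on comparison routines concrete, here is a minimal
sketch in plain C. It is not code from analyze.c. The names cmp_fn, cmp_int
and equidepth_bounds are hypothetical, and the comparator pointer only stands
in for the type-specific routines that a backend would reach through the
function-call manager. The sketch sorts a sample and picks evenly spaced
values as bucket boundaries; without the comparator it cannot do either.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical comparator type: returns <0, 0 or >0, like strcmp(). */
typedef int (*cmp_fn)(const void *a, const void *b);

/* Example comparator for C ints; each datatype would need its own. */
int cmp_int(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;

    return (x > y) - (x < y);
}

/*
 * Sort nvals sample values (each 'width' bytes) in place, then copy
 * nbounds evenly spaced values into 'bounds' as bucket boundaries.
 * Returns 0 on success, -1 on invalid arguments.
 */
int equidepth_bounds(void *vals, size_t nvals, size_t width,
                     cmp_fn cmp, void *bounds, size_t nbounds)
{
    size_t i;

    if (vals == NULL || bounds == NULL || cmp == NULL)
        return -1;
    if (nvals == 0 || width == 0 || nbounds < 2)
        return -1;

    /* Ordering is only defined by the datatype's comparator. */
    qsort(vals, nvals, width, cmp);

    for (i = 0; i < nbounds; i++)
    {
        /* Position of the i-th boundary in the sorted sample. */
        size_t pos = i * (nvals - 1) / (nbounds - 1);

        memcpy((char *) bounds + i * width,
               (const char *) vals + pos * width, width);
    }
    return 0;
}
```

Called with the five ints 9, 1, 5, 3, 7 and nbounds = 3, this sketch yields
the boundaries 1, 5, 9. Supporting another datatype means supplying another
comparator, which is the part that ties an analyzer to the type system.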

In short, I doubt you can build a useful analyze-engine that's
significantly smaller than a full backend.  Besides, having ANALYZE
available as a regular SQL command is just too useful to want to see
it moved out to some outside program that would have to be run
separately.

>   I'm leaning toward the implementation of end-biased histograms. There is
> an introductory reference in the IEEE Data Engineering Bulletin, September
> 1995 (available on the Microsoft Research site).

Sounds interesting.  Can you give us an exact URL?
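
For readers who have not met the term: as it is usually described in the
histogram literature, an end-biased histogram stores the values at the
extreme ends of the frequency distribution in singleton buckets and lumps
everything else into one bucket. The following is a rough sketch of the
high-frequency end only, over C ints. It is an illustration under those
assumptions, not the scheme from the cited bulletin and not anything in
analyze.c, and the names freq_entry, end_biased_summary and end_biased are
hypothetical.

```c
#include <stddef.h>
#include <stdlib.h>

typedef struct
{
    int    value;   /* a distinct sample value */
    size_t count;   /* how many times it occurred */
} freq_entry;

typedef struct
{
    size_t nsingle;        /* singleton buckets actually filled */
    size_t rest_rows;      /* rows lumped into the remaining bucket */
    size_t rest_distinct;  /* distinct values in the remaining bucket */
} end_biased_summary;

static int cmp_int_asc(const void *a, const void *b)
{
    int x = *(const int *) a;
    int y = *(const int *) b;

    return (x > y) - (x < y);
}

/* Higher counts first; ties broken by ascending value. */
static int cmp_freq_desc(const void *a, const void *b)
{
    const freq_entry *x = a;
    const freq_entry *y = b;

    if (x->count != y->count)
        return (x->count < y->count) - (x->count > y->count);
    return (x->value > y->value) - (x->value < y->value);
}

/*
 * Sorts vals in place, keeps the k most frequent values in top[]
 * (which must have room for k entries) and summarizes the rest.
 * Returns 0 on success, -1 on invalid arguments or allocation failure.
 */
int end_biased(int *vals, size_t nvals, freq_entry *top, size_t k,
               end_biased_summary *out)
{
    freq_entry *freqs;
    size_t      ndistinct = 0;
    size_t      i;

    if (vals == NULL || top == NULL || out == NULL || nvals == 0)
        return -1;
    freqs = malloc(nvals * sizeof *freqs);
    if (freqs == NULL)
        return -1;

    /* Sorting groups equal values so one pass can count them. */
    qsort(vals, nvals, sizeof *vals, cmp_int_asc);
    for (i = 0; i < nvals; i++)
    {
        if (ndistinct > 0 && freqs[ndistinct - 1].value == vals[i])
            freqs[ndistinct - 1].count++;
        else
        {
            freqs[ndistinct].value = vals[i];
            freqs[ndistinct].count = 1;
            ndistinct++;
        }
    }

    qsort(freqs, ndistinct, sizeof *freqs, cmp_freq_desc);

    out->nsingle = (k < ndistinct) ? k : ndistinct;
    out->rest_distinct = ndistinct - out->nsingle;
    out->rest_rows = 0;
    for (i = 0; i < ndistinct; i++)
    {
        if (i < out->nsingle)
            top[i] = freqs[i];
        else
            out->rest_rows += freqs[i].count;
    }

    free(freqs);
    return 0;
}
```

Note that even this variant relies on equality to count occurrences, and on
ordering as a cheap way to group equal values, so it does not escape the
need for type-specific comparison routines.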
        regards, tom lane

