Re: Collect frequency statistics for arrays - Mailing list pgsql-hackers

From Noah Misch
Subject Re: Collect frequency statistics for arrays
Msg-id 20120123155810.GA12821@tornado.leadboat.com
In response to Re: Collect frequency statistics for arrays  (Alexander Korotkov <aekorotkov@gmail.com>)
Responses Re: Collect frequency statistics for arrays
List pgsql-hackers
On Mon, Jan 23, 2012 at 01:21:20AM +0400, Alexander Korotkov wrote:
> Updated patch is attached. I've updated comment
> of mcelem_array_contained_selec with more detailed description of
> probability distribution assumption. Also, I found that "rest" behavior
> should be better described by Poisson distribution, relevant changes were
> made.

Thanks.  That makes more of the math clear to me.  I do not follow all of it,
but I feel that the comments now have enough information that I could go about
doing so.

> +     /* Take care about events with low probabilities. */
> +     if (rest > DEFAULT_CONTAIN_SEL)
> +     {

Why the change from "rest > 0" to this in the latest version?

> +         /* emit some statistics for debug purposes */
> +         elog(DEBUG3, "array: target # mces = %d, bucket width = %d, "
> +              "# elements = %llu, hashtable size = %d, usable entries = %d",
> +              num_mcelem, bucket_width, element_no, i, track_len);

That should be UINT64_FMT.  (I introduced that error in v0.10.)
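
To illustrate the point outside the tree: "%llu" assumes the 64-bit counter is an unsigned long long, which is not true on every platform, so the format has to come from a macro that matches the type. The sketch below is standalone and only an approximation of the patch. SKETCH_UINT64_FMT is a hypothetical stand-in for PostgreSQL's UINT64_FMT (here assumed equivalent to C99's PRIu64), format_array_stats() is an invented helper, and snprintf() takes the place of elog().

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for PostgreSQL's UINT64_FMT.  The real macro is
 * provided by the PostgreSQL headers; this sketch assumes C99's PRIu64
 * is an acceptable substitute.
 */
#define SKETCH_UINT64_FMT "%" PRIu64

/*
 * Format the same debug line as the quoted elog() call, using a format
 * specifier that matches uint64_t instead of a hard-coded "%llu".
 * Returns the value of snprintf().
 */
int
format_array_stats(char *buf, size_t buflen,
				   int num_mcelem, int bucket_width,
				   uint64_t element_no, int table_size, int track_len)
{
	return snprintf(buf, buflen,
					"array: target # mces = %d, bucket width = %d, "
					"# elements = " SKETCH_UINT64_FMT
					", hashtable size = %d, usable entries = %d",
					num_mcelem, bucket_width, element_no,
					table_size, track_len);
}
```

The specifier is spliced in by string-literal concatenation, so the rest of the message text stays unchanged.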


I've attached a new version that includes the UINT64_FMT fix, some edits of
your newest comments, and a rerun of pgindent on the new files.  I see no
other issues precluding commit, so I am marking the patch Ready for Committer.
If I made any of the comments worse, please post another update.

Thanks,
nm
