Re: ALTER TABLE ... ALTER COLUMN ... SET DISTINCT - Mailing list pgsql-hackers

From Robert Haas
Subject Re: ALTER TABLE ... ALTER COLUMN ... SET DISTINCT
Date
Msg-id 603c8f070904051854h6e6e380dg4c10118c4b1e6f66@mail.gmail.com
In response to Re: ALTER TABLE ... ALTER COLUMN ... SET DISTINCT  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: ALTER TABLE ... ALTER COLUMN ... SET DISTINCT  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sun, Apr 5, 2009 at 7:56 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Sat, Apr 4, 2009 at 11:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> * Using an integer is bogus.  Use a float4 and forget the weird scaling;
>>> it should have exactly the same interpretation as stadistinct, except
>>> for 0 meaning "unset" instead of "unknown".
>
>> I have a deep-seated aversion to storing important values as float,
>
> [ shrug... ]  Precision is not important for this value: we are not
> anywhere near needing more than six significant digits for our
> statistical estimates.  Range, on the other hand, could be important
> when dealing with really large tables.  So I'm much more concerned
> about whether the definition is too restrictive than about whether
> some uninformed person complains about exactness.

I thought about that, and if you think that's better, I can implement
it that way.  Personally, I'm unconvinced.  The use case for
specifying a number of distinct values in excess of 2 billion as an
absolute number rather than as a percentage of the table size seems
pretty weak to me.  I would rather use integers and have it be clean.
But I would rather have it your way than not have it at all.

...Robert
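
For readers following the integer-versus-float4 question, here is a minimal sketch of the trade-off under discussion. It assumes the override would be read with the same convention pg_statistic.stadistinct uses (positive means an absolute count, negative means a negated fraction of the row count), with 0 meaning "unset" as proposed in the quoted text. The names estimated_distinct and as_float4 are hypothetical and are not part of any patch in this thread.

```python
import struct

# Largest value a signed 32-bit integer can hold (roughly 2.1 billion).
INT32_MAX = 2**31 - 1


def estimated_distinct(setting, row_count):
    """Interpret an override using the stadistinct convention.

    setting > 0: absolute number of distinct values
    setting < 0: negated multiplier on the row count (-1 means all rows distinct)
    setting == 0: unset, so return None and let the caller fall back to ANALYZE
    """
    if setting == 0:
        return None
    if setting > 0:
        return float(setting)
    return -setting * row_count


def as_float4(x):
    """Round-trip a value through IEEE 754 single precision (float4)."""
    return struct.unpack("f", struct.pack("f", x))[0]


if __name__ == "__main__":
    rows = 10_000_000_000  # a hypothetical 10-billion-row table

    # The fractional form scales with the table, whatever its size.
    print(estimated_distinct(-0.5, rows))

    # An absolute count above INT32_MAX cannot be stored in a 32-bit integer,
    # but float4 holds it with a small relative error.
    wanted = 3_000_000_001.0
    stored = as_float4(wanted)
    print(wanted > INT32_MAX, abs(stored - wanted) / wanted)
```

The sketch shows both positions: float4 covers absolute counts beyond a 32-bit integer's range at the cost of exactness, while the fractional form already handles very large tables without needing that range.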

