Re: ALTER TABLE ... ALTER COLUMN ... SET DISTINCT - Mailing list pgsql-hackers

From Robert Haas
Subject Re: ALTER TABLE ... ALTER COLUMN ... SET DISTINCT
Date
Msg-id 603c8f070904050412vc8d0ff2y14bc7c913cbfbb@mail.gmail.com
In response to Re: ALTER TABLE ... ALTER COLUMN ... SET DISTINCT  (Tom Lane <tgl@sss.pgh.pa.us>)
Responses Re: ALTER TABLE ... ALTER COLUMN ... SET DISTINCT  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-hackers
On Sat, Apr 4, 2009 at 11:14 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Robert Haas <robertmhaas@gmail.com> writes:
>> On Sat, Apr 4, 2009 at 7:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>>> I'm not thrilled about adding a column to pg_attribute for this.
>
>> What is the specific nature of your concern?
>
> Actually, I'm more worried about the TupleDesc data structure than
> the catalogs.  There are TupleDescs all over the backend, and I've
> seen evidence in profiles that setting them up is a nontrivial cost.
>
> You're very possibly right that four more bytes is in the noise,
> though.
>
> Two other comments now that I've read a little further:
>
> * This isn't happening for 8.4, so adjust the pg_dump code.

I thought about writing 80500, but the effect of that would have been
to render the patch impossible to test, so I didn't. :-)

I think I'll be very lucky if that's the most bitrot this accumulates
between now and when the tree is open for 8.5 development.  System
catalog changes stink in that regard.  I suppose we could tag and
branch the tree now, but that would just move the work of fixing any
subsequent conflicts from patch authors to committers, which is sort
of a zero-sum game.

> * Using an integer is bogus.  Use a float4 and forget the weird scaling;
> it should have exactly the same interpretation as stadistinct, except
> for 0 meaning "unset" instead of "unknown".
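
For reference, the interpretation proposed above can be sketched as follows. This is only an illustration, not code from the patch: the function name and signature are hypothetical. It follows the documented meaning of pg_statistic.stadistinct (a positive value is an absolute count of distinct values, a negative value is the negated fraction of the row count), with 0 taken to mean "unset" as suggested.

```python
from typing import Optional


def distinct_override(setting: float, row_count: float) -> Optional[float]:
    """Hypothetical helper: turn a per-column distinct setting into an estimate.

    Same convention as pg_statistic.stadistinct, except that 0 means
    "no override set" rather than "unknown".
    """
    if setting > 0:
        # Positive: the value is the number of distinct values itself.
        return setting
    if setting < 0:
        # Negative: the magnitude is a multiplier of the table's row count.
        return -setting * row_count
    # Zero: unset, so the caller falls back to whatever ANALYZE computed.
    return None
```

With this convention, a setting of -0.5 on a 1000-row table yields 500 distinct values, and the same setting keeps scaling as the table grows.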

I think there's a pretty good chance that will lead to a complaint
that is some variant of the following: "I ran this command and then I
did a pg_dump and the output doesn't match what I put in." Or maybe,
"I did a dump and a restore on a different machine with a different
architecture and then another dump and then I diffed them and this
popped out."
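
A minimal sketch of that concern, assuming the value is stored as a 4-byte IEEE 754 float and dumped as decimal text. The helper names are made up for illustration and this is not how pg_dump itself is written. It shows the two sides of the problem: printing too few digits can fail to reproduce the stored value on reload, while printing enough digits to reproduce it can give text that differs from what the user typed.

```python
import struct


def to_float4(x: float) -> float:
    """Round a Python float (a double) to the nearest single-precision value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def dump_text(stored: float, digits: int) -> str:
    """Render a stored float4 with the given number of significant digits."""
    return f"{stored:.{digits}g}"


def roundtrips(value: float, digits: int) -> bool:
    """True if store -> dump as text -> reload yields the same float4."""
    stored = to_float4(value)
    reloaded = to_float4(float(dump_text(stored, digits)))
    return reloaded == stored


# Too few digits: the reloaded value differs from the stored one.
print(roundtrips(1 / 3, 6))
# Nine significant digits are always enough for single precision.
print(roundtrips(1 / 3, 9))
# But with nine digits the dump no longer matches the user's input "0.1".
print(dump_text(to_float4(0.1), 9))
```

An integer column avoids both effects, which is the trade-off being argued here: exact round trips in exchange for a scaling convention of its own.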

I have a deep-seated aversion to storing important values as float,
and we seem to have no other floats anywhere in our DDL, so I was a
little leery about breaking new ground.  There's nothing particularly
special about the scaling that the pg_statistic stuff uses, and it's
basically pretty obscure internal stuff anyway, so I think the
consistency argument is fairly weak.

...Robert

