Re: ALTER TABLE ... ALTER COLUMN ... SET DISTINCT - Mailing list pgsql-hackers

From Robert Haas
Subject Re: ALTER TABLE ... ALTER COLUMN ... SET DISTINCT
Msg-id 603c8f070904041956i18fc51e6p2236bb06aaab36d6@mail.gmail.com
In response to Re: ALTER TABLE ... ALTER COLUMN ... SET DISTINCT  (Alvaro Herrera <alvherre@commandprompt.com>)
List pgsql-hackers
On Sat, Apr 4, 2009 at 10:31 PM, Alvaro Herrera
<alvherre@commandprompt.com> wrote:
> Robert Haas wrote:
>> On Sat, Apr 4, 2009 at 7:04 PM, Tom Lane <tgl@sss.pgh.pa.us> wrote:
>> > Robert Haas <robertmhaas@gmail.com> writes:
>> >> Per previous discussion.
>> >> http://archives.postgresql.org/message-id/8066.1229106059@sss.pgh.pa.us
>> >> http://archives.postgresql.org/message-id/603c8f070904021926g92eb55sdfc68141133957c1@mail.gmail.com
>> >
>> > I'm not thrilled about adding a column to pg_attribute for this.
>> > Isn't there some way of keeping it in pg_statistic?
>>
>> I don't like the idea of keeping it in pg_statistic.  Right now, all
>> of the data in pg_statistic is transient, so you could theoretically
>> truncate the table at any time without losing anything permanent.
>
> Maybe use a new catalog?

If we go that route, it would probably make sense to move
attstattarget there as well.  Obviously it wouldn't make sense to move
anything that's in the critical path of ordinary database operations,
but maybe attislocal or attinhcount could be moved as well.  But I'm
not sure it's really warranted because, AFAIK, we have no evidence
that this is a real as opposed to a theoretical problem, and even if
we moved all of that stuff, that's only 12 bytes, and now you have
another table that's competing for space in the system cache.  If
someone could demonstrate (say, by reducing NAMEDATALEN) that a
smaller pg_attribute structure would generate a real performance
benefit, then it would be worth spending the time to figure out a way
to make that happen (obviously without actually reducing NAMEDATALEN,
that's only a possible way to measure the impact).

>> What is the specific nature of your concern?  I thought about the
>> possibility of a distributed performance penalty that might be
>> associated with enlarging pg_attribute, but increasing the size of a
>> structure that is already 112 bytes by another 4 doesn't seem likely
>> to be significant, especially since we're not crossing a power-of-two
>> boundary.
>
> FWIW it has been said that whoever is concerned about pg_attribute bloat
> should be first looking at getting rid of the redundant entries for
> system columns, for each and every table.

That's a different kind of bloat (more rows vs. larger rows) but a
valid point all the same.  I suspect neither type has much practical
impact, and that if we listed all the performance problems that
PostgreSQL has today, neither would be in the top 500.  Bad ndistinct
estimates would be, however.
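
For what it's worth, the power-of-two point quoted above can be checked
with a few lines of arithmetic. This is only a sketch: it assumes that
allocation sizes round up to the next power of two, which is a
simplification, and the function names are made up for illustration.
The 112 and 4 byte figures are the ones from the quoted text.

```python
def next_power_of_two(n: int) -> int:
    """Smallest power of two that is >= n (for n >= 1)."""
    return 1 << (n - 1).bit_length()


def crosses_power_of_two(old_size: int, added: int) -> bool:
    """True if growing a structure by `added` bytes moves it into a
    larger power-of-two size class (hypothetical rounding model)."""
    return next_power_of_two(old_size + added) != next_power_of_two(old_size)


# Figures from the quoted text: a 112-byte structure growing by 4 bytes.
old_size, added = 112, 4
print(next_power_of_two(old_size))            # size class before the change
print(next_power_of_two(old_size + added))    # size class after the change
print(crosses_power_of_two(old_size, added))  # whether the class changed
```

Under that rounding model both 112 and 116 bytes land in the same
128-byte class, so the extra 4 bytes would not change the size class.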

...Robert

