Re: [HACKERS] Re: attdisbursiont - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [HACKERS] Re: attdisbursiont
Date
Msg-id 334.937604670@sss.pgh.pa.us
In response to Re: [HACKERS] Re: attdisbursiont  (Bruce Momjian <maillist@candle.pha.pa.us>)
Responses Re: [HACKERS] Re: attdisbursiont  (Bruce Momjian <maillist@candle.pha.pa.us>)
List pgsql-hackers
Bruce Momjian <maillist@candle.pha.pa.us> writes:
>>>> * change VACUUM ANALYZE to use btree comparison functions, not <,=,> calls
>> 
>> There are several places that know more than they should about the
>> meaning of "<" etc operators.  For example, the parser assumes it
>> should use "<" and ">" to implement ORDER BY [DESC].  Making VACUUM
>> not depend on specific names for the ordering operators will not
>> improve life unless we fix *all* of these places.

> Actually, I thought it would be good for performance reasons, not for
> portability.  We would call one function per attribute instead of three.

No such luck: what VACUUM wants to do is figure out whether the current
value is less than the min-so-far (one "<" call), greater than the
max-so-far (one ">" call), and/or equal to the candidate most-frequent
values it has (one "=" call apiece).  Same number of function calls if
it's using a "compare" function.
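
To make the call-count argument concrete, here is a minimal sketch that counts comparison calls both ways. It is not the VACUUM code; the helper names (make_counted, scan_with_boolean_ops, scan_with_compare) and the sample data are made up for illustration, and plain Python comparisons stand in for a type's operator functions.

```python
def make_counted(fn):
    """Wrap a two-argument function so each invocation is counted."""
    def wrapper(a, b):
        wrapper.calls += 1
        return fn(a, b)
    wrapper.calls = 0
    return wrapper


def scan_with_boolean_ops(values, candidates):
    """Track min, max and candidate frequencies with separate <, >, = calls."""
    lt = make_counted(lambda a, b: a < b)
    gt = make_counted(lambda a, b: a > b)
    eq = make_counted(lambda a, b: a == b)
    lo = hi = values[0]
    freq = {c: 0 for c in candidates}
    for v in values:
        if lt(v, lo):           # one "<" call
            lo = v
        if gt(v, hi):           # one ">" call
            hi = v
        for c in candidates:    # one "=" call per candidate
            if eq(v, c):
                freq[c] += 1
    return lo, hi, freq, lt.calls + gt.calls + eq.calls


def scan_with_compare(values, candidates):
    """Same bookkeeping, using one strcmp-like function returning -1, 0 or 1."""
    cmp = make_counted(lambda a, b: (a > b) - (a < b))
    lo = hi = values[0]
    freq = {c: 0 for c in candidates}
    for v in values:
        if cmp(v, lo) < 0:
            lo = v
        if cmp(v, hi) > 0:
            hi = v
        for c in candidates:
            if cmp(v, c) == 0:
                freq[c] += 1
    return lo, hi, freq, cmp.calls


values = [5, 3, 9, 3, 7]       # hypothetical column values
candidates = [3]               # hypothetical most-frequent-value candidates
bool_result = scan_with_boolean_ops(values, candidates)
cmp_result = scan_with_compare(values, candidates)
```

Both scans make 2 + len(candidates) calls per value, so the only difference is how many distinct functions have to be looked up beforehand.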

I suppose you'd save a little time by only looking up one operator
function per column instead of three, but it's hard to think that'd
be measurable let alone significant.  There's not going to be any
per-tuple savings.

>> While we are at it we could think about saying that there is just one
>> "standard ordering operator" for a type and it yields a strcmp-like
>> result (minus, zero, plus) rather than several ops yielding booleans.
>> But that'd take a lot of changes in btree and everywhere else...

> The btree comparison functions do just that, returning -1,0,1 like
> strcmp, for each type btree supports.

Right, and that's useful for btree because it saves compares, but it
doesn't really help VACUUM noticeably.

After writing the above quote, I realized that you can't really define
a type's ordering just in terms of a strcmp-like operator with no other
baggage.  That might be enough for building a btree index, but in order
to *do* anything with the index, the optimizer and executor have to
understand the relationship of the index ordering to the things that
a user would write in a query, such as "WHERE A >= 12 AND A < 100"
or "ORDER BY column USING >".  So there has to be information relating
these user-available operators to the type's ordering, as well.
(We do have that, in the form of the pg_amop table entries.  The point
is that you can't get away with much less information than is contained
in pg_amop.)
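
As a rough illustration of why the extra information is needed, the sketch below pairs a strcmp-like compare function with a small table that says what each user-visible operator means in terms of the compare result. The table is only a loose stand-in for the kind of information pg_amop carries, not its actual layout, and the names (compare, OPERATOR_MEANING, satisfies, range_scan) are invented for this example.

```python
def compare(a, b):
    """strcmp-like ordering function: negative, zero or positive."""
    return (a > b) - (a < b)


# Hypothetical operator table: how each operator a user can write
# relates to the sign of the compare() result.
OPERATOR_MEANING = {
    "<":  lambda c: c < 0,
    "<=": lambda c: c <= 0,
    "=":  lambda c: c == 0,
    ">=": lambda c: c >= 0,
    ">":  lambda c: c > 0,
}


def satisfies(value, op, constant):
    """Evaluate 'value op constant' using only compare() plus the table."""
    return OPERATOR_MEANING[op](compare(value, constant))


def range_scan(sorted_values, conditions):
    """Return the values that satisfy every (operator, constant) condition."""
    return [
        v for v in sorted_values
        if all(satisfies(v, op, k) for op, k in conditions)
    ]


# WHERE A >= 12 AND A < 100, over made-up data
matches = range_scan([5, 12, 50, 99, 100, 200], [(">=", 12), ("<", 100)])
```

Without something like OPERATOR_MEANING, the compare function alone gives no way to connect ">=" or "<" in a query to the ordering the index was built with.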

As far as I can see, the only thing that's really at stake here is not
hardwiring the semantics of the operator names "<", "=", ">" into the
system.  While that'd be nice from a cleanliness/data-type-independence
point of view, it's not clear that it has any real practical
significance.  Any data type designer who didn't make "=" mean equals
ought to be shot anyway ;-).  So upon second thought I think I'd put
this *way* down the to-do list...
        regards, tom lane

