Thread: type coersion

type coersion

From
Bruce Momjian
Date:
Let me ask about type coersion.

When you have an int2 column called x, how do you place the conversion
functions when it is being compared to an in4 constant?

    x = int2(500)

    int4(x) = 500

The first is good for indexing, the second is not.  If they are both
variables or both constants, the handling does not matter.


--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: type coersion (was OR clause status)

From
"Thomas G. Lockhart"
Date:
<I changed "x" to "i2" in the example for clarity>

> ... have an int2 column called i2, how do you place the conversion
> functions when it is being compared to an in4 constant?
>         i2 = int2(500)
>         int4(i2) = 500
> The first is good for indexing, the second is not.  If they are both
> variables or both constants, the handling does not matter.

Yes the handling does matter *in general*, since
  while i4 = 500.1
cannot be evaluated as
  while i4 = int4(500.1)
and get the right result.
Even for the types in your example,
  while i2 = 4000000
should execute correctly, even though we both know that *for this case*
it doesn't make a lot of sense since the constant exceeds the range of
the column.

Actually, there are a lot of mixed-type comparison functions in pg_proc.
For example, there is a int24eq function which would be used for your
query outside of a "where clause". But, read on...

*slaps forehead*

It dawned on me while I was waking up this morning that we'd forgotten a
major bit of info about indexing. The pg_proc routines which are
declared for =, <, >, etc. are _not_ directly used to access indices! It
is of course the pg_am, pg_amop, and pg_amproc tables which are used for
this.

Here are examples from v6.3.2:

regression=> explain select * from tenk1 where unique1 = 3000;
Index Scan on tenk1  (cost=2.05 size=1 width=100)
regression=> explain select * from tenk1 where unique1 = 3000+1;
Seq Scan on tenk1  (cost=512.00 size=1000 width=100)
regression=> explain select * from tenk1 where unique1 = int4(3000.1);
Seq Scan on tenk1  (cost=512.00 size=1000 width=100)

The *only case* I've noticed so far which does better in v6.3.2 than
"v6.4today" (not yet v6.4alpha :) is the one involving OIDs:
regression=> explain select * from tenk1 where oid = 3000;
Index Scan on tenk1  (cost=2.05 size=1 width=100)

And remember that there are a lot of cases which do better in v6.4today
than in earlier versions.

Let's try to figure out how to get constant expressions using indices,
rather than just patching to get mismatched constant values using them.
And even for that I should be able to fix up the "binary compatible"
cases such as OID and int4 without too much trouble. Just need to find
the right spot in the index handling.

However, I should be able to find that by looking for references to the
pg_am* tables in the source code now that my brain is partly unwedged :)
Will do that, and we might still want to figure out how to use Vadim's
new PARAM_EXEC nodes for function calls on constants and constant
expressions (e.g. func1(func2(const1+const2))+const3) in the parser.

                   - Tom

Re: [HACKERS] Re: type coersion (was OR clause status)

From
Tom Lane
Date:
"Thomas G. Lockhart" <lockhart@alumni.caltech.edu> writes:
> The *only case* I've noticed so far which does better in v6.3.2 than
> "v6.4today" (not yet v6.4alpha :) is the one involving OIDs:
> regression=> explain select * from tenk1 where oid = 3000;
> Index Scan on tenk1  (cost=2.05 size=1 width=100)

Actually, I'm aware of another one: comparisons using "> constant"
or "< constant" seem more likely to use an index in 6.3.2 than they
do in the sources I have.  I have examples involving both datetime
and int4 columns where "where x = constant" will be implemented by
index scan, but "where x > constant" will not.  Explicit casts of
the righthand side make no difference.  And it works fine in 6.3.2.

I'm a couple of weeks behind the CVS tree, so I was going to wait
until I'd confirmed it with up-to-the-minute sources before complaining.
But if you're proceeding on the assumption that the "oid = integer" case
is the only thing that's broken, you may be misled.

The fact that casting doesn't affect this makes me think it is a
different problem than the must-cast-to-get-an-index-scan cases
we've discussed so far.

            regards, tom lane

Re: [HACKERS] Re: type coersion (was OR clause status)

From
Tom Lane
Date:
"Thomas G. Lockhart" <lockhart@alumni.caltech.edu> writes:
>> ... have an int2 column called i2, how do you place the conversion
>> functions when it is being compared to an in4 constant?
>> i2 = int2(500)
>> int4(i2) = 500
>> The first is good for indexing, the second is not.  If they are both
>> variables or both constants, the handling does not matter.

> Yes the handling does matter *in general*, since
>   while i4 = 500.1
> cannot be evaluated as
>   while i4 = int4(500.1)
> and get the right result.
> Even for the types in your example,
>   while i2 = 4000000
> should execute correctly, even though we both know that *for this case*
> it doesn't make a lot of sense since the constant exceeds the range of
> the column.

This is all a good point.  I wonder whether it wouldn't help to make the
parser's initial assignment of types to constants depend on how big the
constants are.  In other words, if I write "400" the parser would
implicitly mark this as "int2", whereas if I write "400000" it would be
implicitly marked "int4".  Then, subsequent implicit upward promotion
would cast 400::int2 to 400::int4 if int4 was actually the most
appropriate thing to use *in context*.

It seems to me that this handles all the cases cited correctly:
    where i2 = 400
will get implemented directly as int2 eq int2,
    where i4 = 400
will promote 400::int2 to 400::int4 and then use an int4 eq int4
comparison op (of course we have to address the current failure to
reduce int4(constant) to a constant, but IMHO that has to happen
anyway),
    where i2 = 400000
will get promoted to where int4(i2) = 400000::int4 and executed
correctly (in this case returning no rows, but that's not a reason
not to do it right --- "where i2 < 32768" might be a more compelling
example).  Likewise
    where i4 = 500.1
will be executed with correct semantics as float8(i4) = 500.1::float8
since the parser will know that float8 is a "wider" type than int4.

It'd be nice if the same answer would work for
    where f4 = 500.1
but I'm not sure that I care to see the parser deciding that "float4
is good enough for this constant".  We could too easily find that in
    where f8 = 500.1
the parser might cast the constant down to float4 and back up to float8
with catastrophic loss of precision.

In the float case, is there some way that a constant might be marked
as "float of unspecified width" (with the actual value held as a float8)
until the type resolution pass is done?  This would need to be
considered "equivalent" to both float4 and float8, so that in
    where f4 = 500.1
the type resolver doesn't decide it must rewrite f4 as float8(f4).
Then at some late stage of the game we resolve the constant to float4
rather than float8 if the context is float4.  I'm not sure how the
details should work, however.  If we *could* make this work it might
be best to handle int2 vs. int4 constants the same way.

            regards, tom lane

Re: [HACKERS] Re: type coersion (was OR clause status)

From
"Thomas G. Lockhart"
Date:
> > Yes the handling does matter *in general*...
> This is all a good point. I wonder whether it wouldn't help to make
> the parser's initial assignment of types to constants depend on how
> big the constants are.

I agree that there is something to fix, including both of the indexing
example cases you are following. I've been seeing this as an opportunity
to fix sub-optimal utilization of indices, and my recent changes to
enhance the type conversion capabilities in the parser just put the
index handling in a harsher light.

If we come near release time, and no one has been able to penetrate the
index handling (to fix cases as simple as "where x = 1 + 1" which have
never worked) the we could fairly easily go back into the parser and
brute-force some non-general workarounds to get the few "worse than
before" cases hacked around.

If we _can_ get the index handling to work more generally, then we have
substantially enhanced the overall capabilities of Postgres.

So far, I haven't seen cases where the parser has tried to do "the wrong
thing", only cases where "the right thing" causes the index handling to
miss the boat...

                       - Tom

Re: [HACKERS] Re: type coersion (was OR clause status)

From
Tom Lane
Date:
"Thomas G. Lockhart" <lockhart@alumni.caltech.edu> writes:
> So far, I haven't seen cases where the parser has tried to do "the wrong
> thing", only cases where "the right thing" causes the index handling to
> miss the boat...

I disagree, actually.  In the example

    select ... where i2 = 400;

I claim the parser is doing the wrong thing by representing this as
"where int4(i2) = 400::int4" rather than "where i2 = 400::int2".
It is not really reasonable to expect the optimizer to clean up
after that initial mistake.

            regards, tom lane

Re: [HACKERS] Re: type coersion (was OR clause status)

From
Bruce Momjian
Date:
> > > Yes the handling does matter *in general*...
> > This is all a good point. I wonder whether it wouldn't help to make
> > the parser's initial assignment of types to constants depend on how
> > big the constants are.
>
> I agree that there is something to fix, including both of the indexing
> example cases you are following. I've been seeing this as an opportunity
> to fix sub-optimal utilization of indices, and my recent changes to
> enhance the type conversion capabilities in the parser just put the
> index handling in a harsher light.
>
> If we come near release time, and no one has been able to penetrate the
> index handling (to fix cases as simple as "where x = 1 + 1" which have
> never worked) the we could fairly easily go back into the parser and
> brute-force some non-general workarounds to get the few "worse than
> before" cases hacked around.
>
> If we _can_ get the index handling to work more generally, then we have
> substantially enhanced the overall capabilities of Postgres.
>
> So far, I haven't seen cases where the parser has tried to do "the wrong
> thing", only cases where "the right thing" causes the index handling to
> miss the boat...

I have found what is happening in the optimizer, and will explain soon.

--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: [HACKERS] Re: type coersion (was OR clause status)

From
Bruce Momjian
Date:
> If we come near release time, and no one has been able to penetrate the
> index handling (to fix cases as simple as "where x = 1 + 1" which have
> never worked) the we could fairly easily go back into the parser and
> brute-force some non-general workarounds to get the few "worse than
> before" cases hacked around.
>
> If we _can_ get the index handling to work more generally, then we have
> substantially enhanced the overall capabilities of Postgres.
>
> So far, I haven't seen cases where the parser has tried to do "the wrong
> thing", only cases where "the right thing" causes the index handling to
> miss the boat...

OK, here is what I have.  I have a table called test, with one
attribute.  I execute:

    select * from test where oid = 3;

and the output plan is:

    (
       { EXPR
       :typeOid 0
       :opType op
       :oper
          { OPER
          :opno 1137
          :opid 0
          :opresulttype 16
          }

       :args (
          { VAR
          :varno 1
          :varattno -2
          :vartype 26
          :vartypmod -1
          :varlevelsup 0
          :varnoold 1
          :varoattno -2
          }

          { CONST
          :consttype 23
          :constlen 4
          :constisnull false
          :constvalue  4 [  3  0  0  0 ]
          :constbyval true
          }
       )
       }
    )

Why does the Var have a type 26(int), and the constant a type of
23(oid)?  Where's the conversion function?

Now, the existance of the function doesn't help either, but that is a
different problem:

    test=> explain select * from test where oid = oid(3);
    NOTICE:  QUERY PLAN:

    Seq Scan on test  (cost=1936.05 size=4916 width=8)

Is it because there is a int4eqoid() function?  How to use an index on that?

--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: [HACKERS] Re: type coersion (was OR clause status)

From
"Thomas G. Lockhart"
Date:
> and the output plan is:
>         (
>            { EXPR
>            :typeOid 0
>            :opType op
>            :oper
>               { OPER
>               :opno 1137
>               :opid 0
>               :opresulttype 16
>               }
>
>            :args (
>               { VAR
>               :varno 1
>               :varattno -2
>               :vartype 26
>               :vartypmod -1
>               :varlevelsup 0
>               :varnoold 1
>               :varoattno -2
>               }
>
>               { CONST
>               :consttype 23
>               :constlen 4
>               :constisnull false
>               :constvalue  4 [  3  0  0  0 ]
>               :constbyval true
>               }
>            )
>            }
>         )
>
> Why does the Var have a type 26(int), and the constant a type of
> 23(oid)?  Where's the conversion function?

A conversion function is not necessary; the operator in the "opno" field
(1137) corresponds to the oid of the entry in pg_operator for "=" with
the correct arguments.

> Now, the existance of the function doesn't help either, but that is a
> different problem:
>         test=> explain select * from test where oid = oid(3);
>         NOTICE:  QUERY PLAN:
>         Seq Scan on test  (cost=1936.05 size=4916 width=8)
> Is it because there is a int4eqoid() function?

Yes. The function is called int4eqoid() (good guess :). And there is
also a function oideqint4().

There is a chance that this case would actually work if we just removed
those functions, since (in my test code only) I've made int4 and oid
"binary compatible" so the int4eq or oideq routines would be used
instead. The index support code might actually behave properly then.

> How to use an index on that?

So that is the problem for this case; there is actually a function which
matches the arguments exactly, so it is specified. *And* the same
function is (probably) not mentioned in the index configuration tables
pg_am*.

However, if we allowed the index support code to look for possible
matches on indices for the non-constant terms, and then look for the
best possible match for conversion routines on other terms, and then did
an "optimizer substitution", we might get better behavior. We would want
code to do the same kind of analysis for constant terms with function
calls and constant expressions too.

I'd be happy to work on the actual substitution code, but still don't
know what the planner does with indices. I'm starting to poke through it
like you are, but am farther behind.

I thought a good start would be to try addressing a case like this, and
allow the planner/indexer/optimizer to substitute "binary compatible"
indices. If we can succeed at that, then we would know what places need
to be touched to do more, like handling function calls and expressions
with constants.

                      - Tom

Re: [HACKERS] Re: type coersion (was OR clause status)

From
Bruce Momjian
Date:
> "Thomas G. Lockhart" <lockhart@alumni.caltech.edu> writes:
> > So far, I haven't seen cases where the parser has tried to do "the wrong
> > thing", only cases where "the right thing" causes the index handling to
> > miss the boat...
>
> I disagree, actually.  In the example
>
>     select ... where i2 = 400;
>
> I claim the parser is doing the wrong thing by representing this as
> "where int4(i2) = 400::int4" rather than "where i2 = 400::int2".
> It is not really reasonable to expect the optimizer to clean up
> after that initial mistake.

I don't see that.  If I do:

    select * from test2 where i2 = 3;

I don't see any constants being converted.  The Var is still i2, and the
Const is i4.  This is as it is seen by the optimizer.  It is using
int24eq(opno = 532).  This is valid when doing comparisons in the
executor.

The problem is how do we use indexes for this?  I am still researching
this.


---------------------------------------------------------------------------


(
   { EXPR
   :typeOid 0
   :opType op
   :oper
      { OPER
      :opno 532
      :opid 0
      :opresulttype 16
      }

   :args (
      { VAR
      :varno 1
      :varattno 1
      :vartype 21
      :vartypmod -1
      :varlevelsup 0
      :varnoold 1
      :varoattno 1
      }

       { CONST
      :consttype 23
      :constlen 4
      :constisnull false
      :constvalue  4 [  3  0  0  0 ]
      :constbyval true
      }
   )
   }
)


--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: [HACKERS] Re: type coersion (was OR clause status)

From
"Thomas G. Lockhart"
Date:
> > I claim the parser is doing the wrong thing by representing this as
> > "where int4(i2) = 400::int4" rather than "where i2 = 400::int2".
> > It is not really reasonable to expect the optimizer to clean up
> > after that initial mistake.
> I don't see that.

Yup. The parser is behaving as Bruce describes. The new type conversion
stuff isn't the fundamental problem. It's the original features in the
planner when trying to use indices.

> The problem is how do we use indexes for this?  I am still researching
> this.

OK, let me know if I can help look into anything. In the meantime, I'll
keep poking at it a bit...

                         - Tom

Re: [HACKERS] Re: type coersion (was OR clause status)

From
Bruce Momjian
Date:
> > > I claim the parser is doing the wrong thing by representing this as
> > > "where int4(i2) = 400::int4" rather than "where i2 = 400::int2".
> > > It is not really reasonable to expect the optimizer to clean up
> > > after that initial mistake.
> > I don't see that.
>
> Yup. The parser is behaving as Bruce describes. The new type conversion
> stuff isn't the fundamental problem. It's the original features in the
> planner when trying to use indices.
>
> > The problem is how do we use indexes for this?  I am still researching
> > this.
>
> OK, let me know if I can help look into anything. In the meantime, I'll
> keep poking at it a bit...

The optimizer does a loop for each index on every relation:

In match_clause_to_indexkey(), there is code that takes the
operator, in the case of "oid = 3", value 1137, oideqint4:

        if ((rightop && IsA(rightop, Const)) ||
            (rightop && IsA(rightop, Param)))
        {
            restrict_op = ((Oper *) ((Expr *) clause)->oper)->opno;
            isIndexable = (op_class(restrict_op, xclass, index->relam) &&
                            IndexScanableOperand(leftop,
                                                  indexkey,
                                                  rel,
                                                  index));
        }


and calls opclass(), which does a lookup in the pg_amop cache, passing
the operator oid (1137), the access method class oid, and the index
access method:

    #0  op_class (opno=1137, opclass=427, amopid=403) at lsyscache.c:58
                      oideqint4    oid_ops   btree_am_oid


and it returns false because there is no access operator for oid_ops and
btree_am_oid that matches oideqint4.

The fundamental problem is that index scans are made to compare columns
all of the same type.  That is how indexes are built and traversed.
Now, we want to bring in a constant of another type, and have it use an
index.

Sounds like if we add the proper pg_am functions for binary
compatability, the optimizer should then use the proper indexes.

Thomas?

[I am going to bed now.]

--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: [HACKERS] Re: type coersion (was OR clause status)

From
Bruce Momjian
Date:
> > > I claim the parser is doing the wrong thing by representing this as
> > > "where int4(i2) = 400::int4" rather than "where i2 = 400::int2".
> > > It is not really reasonable to expect the optimizer to clean up
> > > after that initial mistake.
> > I don't see that.
>
> Yup. The parser is behaving as Bruce describes. The new type conversion
> stuff isn't the fundamental problem. It's the original features in the
> planner when trying to use indices.
>
> > The problem is how do we use indexes for this?  I am still researching
> > this.
>
> OK, let me know if I can help look into anything. In the meantime, I'll
> keep poking at it a bit...

I have also looked at x = oid(3) to see why functions are not being
used.  That same function I mentioned match_clause_to_indexkey clearly
shows it does not support this:

    /*
     * If this is not a join clause, check for clauses of the form:
     * (operator var/func constant) and (operator constant var/func)
     */
    if (!join)
    {
        /*
         * Check for standard s-argable clause
         */
        if ((rightop && IsA(rightop, Const)) ||
            (rightop && IsA(rightop, Param)))
        {
            restrict_op = ((Oper *) ((Expr *) clause)->oper)->opno;
            isIndexable = (op_class(restrict_op, xclass, index->relam) &&


They want to use a functional index for oid(), but we are computing it
on a constant.  Some code will have to be added somewhere to handle
this.  If we want it done in the executor, perhaps we can look inside
the function to see if they are all consts, and handle it somehow.

Of course, if a function index does match, we should use that first.

--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: [HACKERS] Re: type coersion (was OR clause status)

From
Bruce Momjian
Date:
> The fundamental problem is that index scans are made to compare columns
> all of the same type.  That is how indexes are built and traversed.
> Now, we want to bring in a constant of another type, and have it use an
> index.
>
> Sounds like if we add the proper pg_am functions for binary
> compatability, the optimizer should then use the proper indexes.

FOLLOWUP:

This may fix our int4/oid problem, but I don't think it addresses the
more common problems like int2/int4.  I can't see how that would work
with just additions to pg_amop.

Yes, we have the int2eqint4 function, but how does that work in an
index.  Actually, it was my understanding that the new conversion code
was going to get rid of the int2eqint4 style functions, and replace them
with conversions.

Thomas, perhaps a quick summary of the logic behind your parser
conversion changes would help.

--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: [HACKERS] Re: type coersion (was OR clause status)

From
Bruce Momjian
Date:
> Sounds like if we add the proper pg_am functions for binary
> compatability, the optimizer should then use the proper indexes.
>

I just tried this and it failed.  Not sure why yet:

    test=> explain select * from test where oid = 3;
    NOTICE:  QUERY PLAN:

    Seq Scan on test  (cost=1936.05 size=1 width=8)

    EXPLAIN
    test=> create index i_test3 on test(oid oidint4ops);
    ERROR:  DefineIndex: oidint4ops class not found
    test=> create index i_test3 on test(oid oidint4_ops);
    vacuum ;
    CREATE
    test=> vacuum ;
    VACUUM
    test=> explain select * from test where oid = 3;
    NOTICE:  QUERY PLAN:

    Seq Scan on test  (cost=1936.05 size=1 width=8)

    EXPLAIN


--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: [HACKERS] Re: type coersion (was OR clause status)

From
Bruce Momjian
Date:
>     #0  op_class (opno=1137, opclass=427, amopid=403) at lsyscache.c:58
>                       oideqint4    oid_ops   btree_am_oid
>
>
> and it returns false because there is no access operator for oid_ops and
> btree_am_oid that matches oideqint4.
>
> The fundamental problem is that index scans are made to compare columns
> all of the same type.  That is how indexes are built and traversed.
> Now, we want to bring in a constant of another type, and have it use an
> index.
>
> Sounds like if we add the proper pg_am functions for binary
> compatability, the optimizer should then use the proper indexes.

I think I have found part of the cause.  We have duplicate type
conversion functions, and the parser is choosing the one that is not in
the access method tables.


---------------------------------------------------------------------------


test=> select * from pg_operator where oid = 1137;

oprname|oprowner|oprprec|oprkind|oprisleft|oprcanhash|oprleft|oprright|oprresult|oprcom|oprnegate|oprlsortop|oprrsortop|oprcode
 |oprrest|oprjoin

-------+--------+-------+-------+---------+----------+-------+--------+---------+------+---------+----------+----------+---------+-------+---------
=      |     139|      0|b      |t        |t         |     26|      23|
     16|  1136|        0|         0|         0|oideqint4|eqsel
|eqjoinsel (1 row)


test=> select * from pg_operator where oid = 932;

oprname|oprowner|oprprec|oprkind|oprisleft|oprcanhash|oprleft|oprright|oprresult|oprcom|oprnegate|oprlsortop|oprrsortop|oprcode
 |oprrest |oprjoin

-------+--------+-------+-------+---------+----------+-------+--------+---------+------+---------+----------+----------+---------+--------+------------
=      |     139|      0|b      |t        |f         |    910|     910|
     16|   932|      935|         0|         0|oidint4eq|intltsel|intltjoinsel (1 row)

t

--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: [HACKERS] Re: type coersion (was OR clause status)

From
Bruce Momjian
Date:
> test=> select * from pg_operator where oid = 1137;
>
oprname|oprowner|oprprec|oprkind|oprisleft|oprcanhash|oprleft|oprright|oprresult|oprcom|oprnegate|oprlsortop|oprrsortop|oprcode
>  |oprrest|oprjoin
>
-------+--------+-------+-------+---------+----------+-------+--------+---------+------+---------+----------+----------+---------+-------+---------
> =      |     139|      0|b      |t        |t         |     26|      23|
>      16|  1136|        0|         0|         0|oideqint4|eqsel
> |eqjoinsel (1 row)
>
>
> test=> select * from pg_operator where oid = 932;
>
oprname|oprowner|oprprec|oprkind|oprisleft|oprcanhash|oprleft|oprright|oprresult|oprcom|oprnegate|oprlsortop|oprrsortop|oprcode
>  |oprrest |oprjoin
>
-------+--------+-------+-------+---------+----------+-------+--------+---------+------+---------+----------+----------+---------+--------+------------
> =      |     139|      0|b      |t        |f         |    910|     910|
>      16|   932|      935|         0|         0|oidint4eq|intltsel|intltjoinsel (1 row)

I was wrong about the above entries.  oidint4eq is for a custom type
that is made up of and oid and an int4.  Strange but true.  Probably
should be removed.  Can't imagine why someone would use this.  Probably
used when we didn't have multi-key indexes or something.

Anyway, here is a query that shows the operators defined for access
methods.

    SELECT proname
    FROM pg_operator, pg_amop,pg_proc
    WHERE amopopr = pg_operator.oid AND
          RegprocToOid(pg_operator.oprcode) = pg_proc.oid;


Perhaps if i add oid am functions to use the int4 functions, it would
work.  int4int2eq assumes promotion of int2 to int4.  Should work.

proname
----------------
chareq
chareq
nameeq
nameeq
int2eq
int2eq
int2lt
int4eq
int4eq
int4lt
texteq
texteq
poly_left
poly_overleft
poly_overright
poly_right
poly_contained
poly_contain
poly_same
poly_overlap
box_left
box_left
box_overleft
box_overleft
box_overright
box_overright
box_right
box_right
box_contained
box_contained
box_contain
box_contain
box_same
box_same
box_overlap
box_overlap
int2gt
int4gt
int2le
int4le
int2ge
int4ge
int24eq
int42eq
int24lt
int42lt
int24gt
int42gt
int24le
int42le
int24ge
int42ge
abstimeeq
abstimelt
abstimegt
abstimele
abstimege
oideq
oideq
int4lt
int4gt
int4le
int4ge
float4eq
float4eq
float4lt
float4gt
float4le
float4ge
charlt
charle
chargt
charge
namelt
namele
namegt
namege
text_lt
text_le
text_gt
text_ge
float8eq
float8eq
float8lt
float8le
float8gt
float8ge
oidnamelt
oidnamele
oidnameeq
oidnamege
oidnamegt
oidint2lt
oidint2le
oidint2eq
oidint2ge
oidint2gt
oidint4lt
oidint4le
oidint4eq
oidint4ge
oidint4gt
bpchareq
bpchareq
bpcharlt
bpcharle
bpchargt
bpcharge
varchareq
varchareq
varcharlt
varcharle
varchargt
varcharge
date_eq
date_eq
date_lt
date_le
date_gt
date_ge
time_eq
time_eq
time_lt
time_le
time_gt
time_ge
datetime_eq
datetime_eq
datetime_lt
datetime_le
datetime_gt
datetime_ge
timespan_eq
timespan_eq
timespan_lt
timespan_le
timespan_gt
timespan_ge
circle_left
circle_overleft
circle_overright
circle_right
circle_contained
circle_contain
circle_same
circle_overlap
(146 rows)




--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: [HACKERS] Re: type coersion (was OR clause status)

From
"Thomas G. Lockhart"
Date:
> I think I have found part of the cause.  We have duplicate type
> conversion functions, and the parser is choosing the one that is not
> in the access method tables.

I don't think so for this case; I got stuck on this for awhile too. It
seems that "oidint4" is an actual data type, so has an "oidint4eq"
comparison function. The parser/planner/optimizer is finding the correct
functions, which are "oideqint4" and "int4eqoid".

Don't know what "oidint4" actually does or how it's used. Confusing...

                       - Tom

---------------------------------------------------------------------------
>
> test=> select * from pg_operator where oid = 1137;
>
oprname|oprowner|oprprec|oprkind|oprisleft|oprcanhash|oprleft|oprright|oprresult|oprcom|oprnegate|oprlsortop|oprrsortop|oprcode
>  |oprrest|oprjoin
>
-------+--------+-------+-------+---------+----------+-------+--------+---------+------+---------+----------+----------+---------+-------+---------
> =      |     139|      0|b      |t        |t         |     26|      23|
>      16|  1136|        0|         0|         0|oideqint4|eqsel
> |eqjoinsel (1 row)
>
> test=> select * from pg_operator where oid = 932;
>
oprname|oprowner|oprprec|oprkind|oprisleft|oprcanhash|oprleft|oprright|oprresult|oprcom|oprnegate|oprlsortop|oprrsortop|oprcode
>  |oprrest |oprjoin
>
-------+--------+-------+-------+---------+----------+-------+--------+---------+------+---------+----------+----------+---------+--------+------------
> =      |     139|      0|b      |t        |f         |    910|     910|
>      16|   932|      935|         0|         0|oidint4eq|intltsel|intltjoinsel (1 row)

Re: [HACKERS] Re: type coersion (was OR clause status)

From
Bruce Momjian
Date:
> > I think I have found part of the cause.  We have duplicate type
> > conversion functions, and the parser is choosing the one that is not
> > in the access method tables.
>
> I don't think so for this case; I got stuck on this for awhile too. It
> seems that "oidint4" is an actual data type, so has an "oidint4eq"
> comparison function. The parser/planner/optimizer is finding the correct
> functions, which are "oideqint4" and "int4eqoid".
>
> Don't know what "oidint4" actually does or how it's used. Confusing...

OK, new information.  The following query shows the pg_proc names for
oid.*int4:

select pg_operator.oid as pg_operator_oid, pg_proc.proname from pg_proc,
pg_operator where proname ~ 'oid.*int4' and pg_proc.oid =
RegprocToOid(pg_operator.oprcode)

pg_operator_oid|proname
---------------+---------
           1137|oideqint4
            930|oidint4lt
            931|oidint4le
            932|oidint4eq
            933|oidint4ge
            934|oidint4gt
            935|oidint4ne
(7 rows)

The pg_operator_oid is the value stored in pg_amop.amopopr field.  The
optimizer is finding a function that works for the query, in our case
oideqint4, with a pg_operator oid of 1137.

The problem is that there is no pg_am.amopopr for 1137.  What is defined
for oid is this:

/*
 *  nbtree oid_ops
 */

DATA(insert OID = 0 (  403 427 609 1 btreesel btreenpage ));
DATA(insert OID = 0 (  403 427 611 2 btreesel btreenpage ));
DATA(insert OID = 0 (  403 427 607 3 btreesel btreenpage ));
DATA(insert OID = 0 (  403 427 612 4 btreesel btreenpage ));
DATA(insert OID = 0 (  403 427 610 5 btreesel btreenpage ));

For example, the third number in the third line is 607, which is the
amopopr, and is oideq.  We know we can use this for the query because
int4 and oid are identical, but the parser has already chosen the more
appopriate 1137/oideqint4.  If we could add an extra line to this file,
perhaps:

DATA(insert OID = 0 (  403 427 607 3 btreesel btreenpage ));

However, I don't think the access methods allow more than one line in
this way.

The other problem is that there is not a <, >, etc for int4/oid, so a
separate op type can not be created.  In fact, I am not sure it would
work anyway.  I think the operator oid chosen by the backend must
EXACTLY match the oid stored in pg_amop.amopopr for the index to be
used.

That is what match_clause_to_indexkey is doing by calling op_class().

[Update you source trees.  The op_class parameter names were wrong,
called opid instead of opno, so it was very confusing.  Fixed now.]

--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: [HACKERS] Re: type coersion (was OR clause status)

From
Bruce Momjian
Date:
> For example, the third number in the third line is 607, which is the
> amopopr, and is oideq.  We know we can use this for the query because
> int4 and oid are identical, but the parser has already chosen the more
> appopriate 1137/oideqint4.  If we could add an extra line to this file,
> perhaps:
>
> DATA(insert OID = 0 (  403 427 607 3 btreesel btreenpage ));
>
> However, I don't think the access methods allow more than one line in
> this way.

I just tried adding the extra line, and initdb failed.  I wonder if we
remove the oideqint4, if the parser will go for oideq?

--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: [HACKERS] Re: type coersion (was OR clause status)

From
"Thomas G. Lockhart"
Date:
> I just tried adding the extra line, and initdb failed.  I wonder if we
> remove the oideqint4, if the parser will go for oideq?

If you want to try this as an experiment, so we can tell if this is a
possible solution, then great. I still have hopes that we can substitute
other parse trees and strategies within the index utilization routines,
so how about not committing things until we've tried a few options.

match_clause_to_indexkey() seems to be a starting point for what I want
to do. Will let you know how it goes...

                       - Tom

Re: [HACKERS] Re: type coersion (was OR clause status)

From
Bruce Momjian
Date:
> > I just tried adding the extra line, and initdb failed.  I wonder if we
> > remove the oideqint4, if the parser will go for oideq?
>
> If you want to try this as an experiment, so we can tell if this is a
> possible solution, then great. I still have hopes that we can substitute
> other parse trees and strategies within the index utilization routines,
> so how about not committing things until we've tried a few options.
>
> match_clause_to_indexkey() seems to be a starting point for what I want
> to do. Will let you know how it goes...

OK.  Check out my other posts on how op_class/pg_amop.amopopr is used to
look up if the current expression operator is a member of the
class(btree), and index opclass(oid_ops).  I could probably code
something so you could put multiple entries in pg_amop.  My issue is
that there are not that many interchangable operators to make that
useful.

Now, if you are thinking of doing some type of constant conversion
there, that may be a bigger win.  The issue is at this point in the
code, there is no guarentee the index will be used.  It is just checking
the index usability.

I think I see where you are going with this.  For queries not involving
indexes(many joins don't), it may be better to leave things for the
executor.

Also, if you can, do a fresh cvs update if you can, because the code is
a little cleaner now in that area.  Particularly, the op_class()
parameters were badly named, causing me confusion.


--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: [HACKERS] Re: type coersion (was OR clause status)

From
Bruce Momjian
Date:
> > However, I don't think the access methods allow more than one line in
> > this way.
>
> I just tried adding the extra line, and initdb failed.  I wonder if we
> remove the oideqint4, if the parser will go for oideq?
>

One other item.  oid indexes use int4lt(), etc. for their internal
indexing, not oidlt() because they do not exist.


--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: [HACKERS] Re: type coersion (was OR clause status)

From
"Thomas G. Lockhart"
Date:
> > match_clause_to_indexkey() seems to be a starting point for what I
> > want to do. Will let you know how it goes...
> Check out my other posts on how op_class/pg_amop.amopopr is used to
> look up if the current expression operator is a member of the
> class(btree), and index opclass(oid_ops).  I could probably code
> something so you could put multiple entries in pg_amop.  My issue is
> that there are not that many interchangable operators to make that
> useful.
> Now, if you are thinking of doing some type of constant conversion
> there, that may be a bigger win.  The issue is at this point in the
> code, there is no guarentee the index will be used.  It is just
> checking the index usability.
> I think I see where you are going with this.  For queries not
> involving indexes(many joins don't), it may be better to leave things
> for the executor.

The first thing I want to try is to substitute the operator for types
which are known to be binary-compatible and do not have their own index
defined. The next step would be to substitute an operator _and_ insert
an explicit type conversion using a function call (which may not work
yet for other reasons). Don't know where Vadim's PARAM_EXEC node comes
in, but if it does it might be around here.

> ... if you can, do a fresh cvs update if you can, because the code is
> a little cleaner now in that area.  Particularly, the op_class()
> parameters were badly named, causing me confusion.

OK. Does the current source tree compile now? Haven't had much time to
look at it the last few days, but will start poking at it this evening a
bit.

                         - Tom

Re: [HACKERS] Re: type coersion (was OR clause status)

From
Bruce Momjian
Date:
> The first thing I want to try is to substitute the operator for types
> which are known to be binary-compatible and do not have their own index
> defined. The next step would be to substitute an operator _and_ insert
> an explicit type conversion using a function call (which may not work
> yet for other reasons). Don't know where Vadim's PARAM_EXEC node comes
> in, but if it does it might be around here.
>
> > ... if you can, do a fresh cvs update if you can, because the code is
> > a little cleaner now in that area.  Particularly, the op_class()
> > parameters were badly named, causing me confusion.
>
> OK. Does the current source tree compile now? Haven't had much time to
> look at it the last few days, but will start poking at it this evening a
> bit.

Yep.  Only been down for a few hours in the past month, at least for me.
I am working on another huge patch, but it will be tested before
install, of course.

--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: [HACKERS] Re: type coersion (was OR clause status)

From
"Thomas G. Lockhart"
Date:
> > The first thing I want to try is to substitute the operator for
> > types which are known to be binary-compatible and do not have their
> > own index defined.

Got a start on it :)

regression=> explain select * from tenk1 where oid = 3000;
NOTICE:  QUERY PLAN:

Index Scan using tenk1_oid on tenk1  (cost=2.05 size=1 width=148)

EXPLAIN

So far, I've just done the right-hand form (the one with the constant on
the rhs of the expression). The left-hand form should be easy now.

So, just curious: if we stop here, and only match up binary-compatible
built-in types with available indices, then what v6.3.2 features/good
behaviors are still missing?

                       - Tom

The example is from the regression test database with an extra btree
index built on the tenk1 table...

regression=> \d tenk1

Table    = tenk1
+----------------------------------+----------------------------------+-------+
|              Field               |              Type                |
Length|
+----------------------------------+----------------------------------+-------+
| unique1                          | int4
|     4 |
| unique2                          | int4
|     4 |
| two                              | int4
|     4 |
| four                             | int4
|     4 |
| ten                              | int4
|     4 |
| twenty                           | int4
|     4 |
| hundred                          | int4
|     4 |
| thousand                         | int4
|     4 |
| twothousand                      | int4
|     4 |
| fivethous                        | int4
|     4 |
| tenthous                         | int4
|     4 |
| odd                              | int4
|     4 |
| even                             | int4
|     4 |
| stringu1                         | name
|    32 |
| stringu2                         | name
|    32 |
| string4                          | name
|    32 |
+----------------------------------+----------------------------------+-------+
Indices:  tenk1_hundred
          tenk1_oid
          tenk1_unique1
          tenk1_unique2
regression=>

Re: [HACKERS] Re: type coersion (was OR clause status)

From
Bruce Momjian
Date:
> > > The first thing I want to try is to substitute the operator for
> > > types which are known to be binary-compatible and do not have their
> > > own index defined.
>
> Got a start on it :)
>
> regression=> explain select * from tenk1 where oid = 3000;
> NOTICE:  QUERY PLAN:
>
> Index Scan using tenk1_oid on tenk1  (cost=2.05 size=1 width=148)
>
> EXPLAIN
>
> So far, I've just done the right-hand form (the one with the constant on
> the rhs of the expression). The left-hand form should be easy now.
>
> So, just curious: if we stop here, and only match up binary-compatible
> built-in types with available indices, then what v6.3.2 features/good
> behaviors are still missing?

Did you put it in the optimizer?

I think binary compatable types converted is going to be the easiest
thing to do for index use.  Not sure how you could try and break it.
How about character string comparisons using indexes?

> |    32 |
> +----------------------------------+----------------------------------+-------+
> Indices:  tenk1_hundred
>           tenk1_oid
>           tenk1_unique1
>           tenk1_unique2
  ^^^^^^^^^^^^^^^^^^^^^^^^^

How about that new display?


--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: [HACKERS] Re: type coersion (was OR clause status)

From
"Thomas G. Lockhart"
Date:
> > > > The first thing I want to try is to substitute the operator for
> > > > types which are known to be binary-compatible and do not have
> > > > their own index defined.
> > Got a start on it :)
> > regression=> explain select * from tenk1 where oid = 3000;
> > Index Scan using tenk1_oid on tenk1  (cost=2.05 size=1 width=148)
> > So, just curious: if we stop here, and only match up
> > binary-compatible built-in types with available indices, then what
> > v6.3.2 features/good behaviors are still missing?
> Did you put it in the optimizer?

It is inside backend/optimizer/path/indxpath.c. I'm using a bit of the
parser support code to help out. It has to be where the backend actually
is checking to see if an index is usable, which is in the optimization
step. Any earlier and we would have to look-ahead at the indices which
seems inappropriate.

> I think binary compatable types converted is going to be the easiest
> thing to do for index use.  Not sure how you could try and break it.
> How about character string comparisons using indexes?

Will try some more tests, but it seems like it will be hard to break
since it only comes into effect with built-in datatypes which are
supposed to be binary compatible.

It would be interesting to try to do constant expressions and function
calls on constants next, though I'm thinking that it isn't required for
v6.4.

Vadim, will the executor know how to use a PARAM_EXEC node in any
context, or will we have to do some coding to get it recognized outside
of subselects? I'll need to figure out how to build one too, I
suppose...

                     - Tom

Re: [HACKERS] Re: type coersion (was OR clause status)

From
"Thomas G. Lockhart"
Date:
Thomas G. Lockhart wrote:
>
> > > > > The first thing I want to try is to substitute the operator for
> > > > > types which are known to be binary-compatible and do not have
> > > > > their own index defined.
> > > Got a start on it :)
> > > regression=> explain select * from tenk1 where oid = 3000;
> > > Index Scan using tenk1_oid on tenk1  (cost=2.05 size=1 width=148)
> > > So, just curious: if we stop here, and only match up
> > > binary-compatible built-in types with available indices, then what
> > > v6.3.2 features/good behaviors are still missing?
> > Did you put it in the optimizer?
>
> It is inside backend/optimizer/path/indxpath.c. I'm using a bit of the
> parser support code to help out. It has to be where the backend actually
> is checking to see if an index is usable, which is in the optimization
> step. Any earlier and we would have to look-ahead at the indices which
> seems inappropriate.
>
> > I think binary compatable types converted is going to be the easiest
> > thing to do for index use.  Not sure how you could try and break it.
> > How about character string comparisons using indexes?
>
> Will try some more tests, but it seems like it will be hard to break
> since it only comes into effect with built-in datatypes which are
> supposed to be binary compatible.

OK, I've just committed to the source tree changes for looking for
binary-compatible indices when either the left- or right-hand side of
the restriction clause is a constant.

I've #ifndef'd them so we can disable it if necessary. But, all of the
regression tests pass, except for the select_view test, which has been
core dumping for weeks. Anyone else seeing that, or is it just me? :(

                    - Tom

Regression test status (was type coersion)

From
Tom Lane
Date:
"Thomas G. Lockhart" <lockhart@alumni.caltech.edu> writes:
> ... all of the
> regression tests pass, except for the select_view test, which has been
> core dumping for weeks. Anyone else seeing that, or is it just me? :(

I rebuilt the system from current sources today, and ran the regression
tests for the first time in a long time.  select_views works fine for
me, but there are several other tests that look badly broken:
SELECT ... ORDER BY upper(c) is misordering the results in select_implicit,
GROUP BY on a datetime is not working right in select_having, and the
random test is failing because it "can't look up operator 713".

I'm on HP-UX 9.03, PA-RISC 1.1, gcc 2.7.2.2 if that helps.

            regards, tom lane


*** expected/select_implicit.out    Sat Aug 15 11:56:03 1998
--- results/select_implicit.out    Sat Aug 15 13:44:16 1998
***************
*** 213,226 ****
  QUERY: SELECT a FROM test_missing_target ORDER BY upper(c);
  a
  -
- 1
  2
  3
  4
  5
  6
- 7
  8
  9
  0
  (10 rows)
--- 213,226 ----
  QUERY: SELECT a FROM test_missing_target ORDER BY upper(c);
  a
  -
  2
+ 1
  3
  4
  5
  6
  8
+ 7
  9
  0
  (10 rows)

----------------------

*** expected/select_having.out    Wed Jul  8 10:29:09 1998
--- results/select_having.out    Sat Aug 15 13:44:16 1998
***************
*** 2,12 ****
    GROUP BY d1 HAVING count(*) > 1;
  d1                          |count
  ----------------------------+-----
! Thu Jun 13 00:00:00 1957 PDT|    2
! Mon Feb 10 09:32:01 1997 PST|    3
! Mon Feb 10 17:32:01 1997 PST|   13
  Sun Feb 16 17:32:01 1997 PST|    2
  Sat Mar 01 17:32:01 1997 PST|    2
! invalid                     |    2
! (6 rows)

--- 2,13 ----
    GROUP BY d1 HAVING count(*) > 1;
  d1                          |count
  ----------------------------+-----
! Thu Jun 13 00:00:00 1957 PST|    2
! Mon Feb 10 17:32:01 1997 PST|    4
! Mon Feb 10 09:32:01 1997 PST|    2
! Mon Feb 10 17:32:01 1997 PST|    2
! Mon Feb 10 17:32:01 1997 PST|    7
  Sun Feb 16 17:32:01 1997 PST|    2
  Sat Mar 01 17:32:01 1997 PST|    2
! (7 rows)


----------------------

*** expected/random.out    Tue Apr 29 10:23:40 1997
--- results/random.out    Sat Aug 15 13:44:19 1998
***************
*** 5,18 ****
  (1 row)

  QUERY: SELECT count(*) FROM onek where oidrand(onek.oid, 10);
! count
! -----
!    92
! (1 row)

  QUERY: SELECT count(*) FROM onek where oidrand(onek.oid, 10);
! count
! -----
!    98
! (1 row)

--- 5,12 ----
  (1 row)

  QUERY: SELECT count(*) FROM onek where oidrand(onek.oid, 10);
! ERROR:  can't look up operator 713

  QUERY: SELECT count(*) FROM onek where oidrand(onek.oid, 10);
! ERROR:  can't look up operator 713


----------------------

Re: [HACKERS] Regression test status (was type coersion)

From
"Thomas G. Lockhart"
Date:
> > ... all of the
> > regression tests pass, except for the select_view test, which has
> > been core dumping for weeks. Anyone else seeing that?
> I ... ran the regression
> tests for the first time in a long time.  select_views works fine for
> me, but there are several other tests that look badly broken:
> I'm on HP-UX 9.03, PA-RISC 1.1, gcc 2.7.2.2 if that helps.

Hmm. I'm on Linux-libc5, i686, gcc 2.7.2.1, and select_views is still
core dumping:

QUERY: SELECT * FROM toyemp WHERE name = 'sharon';
pqReadData() -- backend closed the channel unexpectedly.

> --- results/select_implicit.out Sat Aug 15 13:44:16 1998
> --- results/select_having.out   Sat Aug 15 13:44:16 1998

These two pass on my machine. Your select_having result looks similar to
the sorting results back when the in-memory/on-disk sorting routines
were broken; some identical values are not getting grouped together
while others are.

> --- results/random.out  Sat Aug 15 13:44:19 1998
>   QUERY: SELECT count(*) FROM onek where oidrand(onek.oid, 10);
> ! ERROR:  can't look up operator 713

Ah. I am getting failures on this too but didn't actually look at why,
just assuming that the randomizer was doing better nowadays. This is
probably related to something I, or possibly Bruce or David, had done.
If you change the query to

  SELECT count(*) FROM onek where oidrand(onek.oid, 10) = TRUE;

then it works. I'll look at it. Darn, I had just stripped out the
debugging stuff in the code before submitting it, since leaving it
hanging around was annoying Bruce (with good reason btw).

                     - Tom

Re: [HACKERS] Regression test status (was type coersion)

From
"Thomas G. Lockhart"
Date:
> > --- results/random.out  Sat Aug 15 13:44:19 1998
> >   QUERY: SELECT count(*) FROM onek where oidrand(onek.oid, 10);
> > ! ERROR:  can't look up operator 713
> Ah. I am getting failures on this too but didn't actually look at why,
> just assuming that the randomizer was doing better nowadays. This is
> probably related to something I, or possibly Bruce or David, had done.

OK, I just committed patches which fixes this problem. The index
strategy code used to just check for an available operator, but I had
changed it to try an alternate strategy by looking for the same operator
with different types. But the same code gets executed even if the
restriction clause is just a function returning a boolean, with no
operator involved at all.

I now allow the operator name lookup to return a null pointer if an
alternate does not exist, and then the code stops looking for
alternatives.

So now the "random" test still fails, but in a good way with the
expected results...

                      - Tom

Re: [HACKERS] Regression test status (was type coersion)

From
David Hartwig
Date:

Tom Lane wrote:

> I rebuilt the system from current sources today, and ran the regression
> tests for the first time in a long time.  select_views works fine for
> me, but there are several other tests that look badly broken:
> SELECT ... ORDER BY upper(c) is misordering the results in select_implicit,
> GROUP BY
>                         regards, tom lane
>
> *** expected/select_implicit.out        Sat Aug 15 11:56:03 1998
> --- results/select_implicit.out Sat Aug 15 13:44:16 1998
> ***************
> *** 213,226 ****
>   QUERY: SELECT a FROM test_missing_target ORDER BY upper(c);
>   a
>   -
> - 1
>   2
>   3
>   4
>   5
>   6
> - 7
>   8
>   9
>   0
>   (10 rows)
> --- 213,226 ----
>   QUERY: SELECT a FROM test_missing_target ORDER BY upper(c);
>   a
>   -
>   2
> + 1
>   3
>   4
>   5
>   6
>   8
> + 7
>   9
>   0
>   (10 rows)
>


Interesting.   I do not recall my exact data set in the regression, but I
believe both results are correct.   In some sense, on your machine
upper('CCCC') and upper('cccc') are sorting in a different order then my
machine.    I realize that internally they are actually ordinally tied.   But I
thought they should still produce a predictable, uniform, result set.  Either,
there is a bug or I need more reliable test data.   I will verify this when I
get home.


Re: [HACKERS] Re: type coersion (was OR clause status)

From
Vadim Mikheev
Date:
Thomas G. Lockhart wrote:
>
> It is inside backend/optimizer/path/indxpath.c. I'm using a bit of the
> parser support code to help out. It has to be where the backend actually
> is checking to see if an index is usable, which is in the optimization
> step. Any earlier and we would have to look-ahead at the indices which
> seems inappropriate.

Just let me note that function calls on constants is problem not
only for indices using. Call lower() for each tuple in
WHERE a = lower('bbb') is always bad - lower() eats memory...

> > I think binary compatable types converted is going to be the easiest
> > thing to do for index use.  Not sure how you could try and break it.
> > How about character string comparisons using indexes?

Parser could use type_in()/type_out() funcs to do type
coersion...

>
> Will try some more tests, but it seems like it will be hard to break
> since it only comes into effect with built-in datatypes which are
> supposed to be binary compatible.
>
> It would be interesting to try to do constant expressions and function
> calls on constants next, though I'm thinking that it isn't required for
> v6.4.
>
> Vadim, will the executor know how to use a PARAM_EXEC node in any
> context, or will we have to do some coding to get it recognized outside
> of subselects? I'll need to figure out how to build one too, I
> suppose...

I'm not sure... But imho, PARAM_EXEC could be usefull for
now() etc funcs - for non-variant funcs I would suggest
just evaluate them in parser...

Vadim

Re: [HACKERS] Re: type coersion (was OR clause status)

From
"Thomas G. Lockhart"
Date:
> Just let me note that function calls on constants is problem not
> only for indices using. Call lower() for each tuple in
> WHERE a = lower('bbb') is always bad - lower() eats memory...

Sure, so we want to evaluate as a constant. I've found a place in the
code where a function gets evaluated immediately; don't remember where
but it looked like an interesting possibility.

> > > How about character string comparisons using indexes?
> Parser could use type_in()/type_out() funcs to do type
> coersion...

That is what it used to try to do. But I don't think that generalizes
very well. For example, at the moment floating point numbers without
fractional parts are printed without a decimal point or trailing digits,
but we could decide to format them with a ".0" at the end. Then they
couldn't be converted to an integer...

> > Vadim, will the executor know how to use a PARAM_EXEC node in any
> > context, or will we have to do some coding to get it recognized
> > outside of subselects? I'll need to figure out how to build one too,
> > I suppose...
> I'm not sure... But imho, PARAM_EXEC could be usefull for
> now() etc funcs - for non-variant funcs I would suggest
> just evaluate them in parser...

But it could be used for every function called with constants, right? If
it works for everything, why bother with other special cases?

                     - Tom

Re: [HACKERS] Regression test status (was type coersion)

From
Bruce Momjian
Date:
> then it works. I'll look at it. Darn, I had just stripped out the
> debugging stuff in the code before submitting it, since leaving it
> hanging around was annoying Bruce (with good reason btw).

I have an idea on this.  If we change #ifdef PARSEDEBUG to something
else, I think we can keep it.  If in an include file we say:

#ifdef PARSEDEBUG
#define ParseDebug(x)    x
#else
#define ParseDebug(x)
#endif

we can then do

    x=3;
    y=4;
    ParseDebug(printf("test));

and it will look good, and we can always enable it.  What do you think?
I can do it if you want?

--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: [HACKERS] Regression test status (was type coersion)

From
Bruce Momjian
Date:
Do any of these problems still exist?

> "Thomas G. Lockhart" <lockhart@alumni.caltech.edu> writes:
> > ... all of the
> > regression tests pass, except for the select_view test, which has been
> > core dumping for weeks. Anyone else seeing that, or is it just me? :(
>
> I rebuilt the system from current sources today, and ran the regression
> tests for the first time in a long time.  select_views works fine for
> me, but there are several other tests that look badly broken:
> SELECT ... ORDER BY upper(c) is misordering the results in select_implicit,
> GROUP BY on a datetime is not working right in select_having, and the
> random test is failing because it "can't look up operator 713".
>
> I'm on HP-UX 9.03, PA-RISC 1.1, gcc 2.7.2.2 if that helps.
>
>             regards, tom lane
>
>
> *** expected/select_implicit.out    Sat Aug 15 11:56:03 1998
> --- results/select_implicit.out    Sat Aug 15 13:44:16 1998
> ***************
> *** 213,226 ****
>   QUERY: SELECT a FROM test_missing_target ORDER BY upper(c);
>   a
>   -
> - 1
>   2
>   3
>   4
>   5
>   6
> - 7
>   8
>   9
>   0
>   (10 rows)
> --- 213,226 ----
>   QUERY: SELECT a FROM test_missing_target ORDER BY upper(c);
>   a
>   -
>   2
> + 1
>   3
>   4
>   5
>   6
>   8
> + 7
>   9
>   0
>   (10 rows)
>
> ----------------------
>
> *** expected/select_having.out    Wed Jul  8 10:29:09 1998
> --- results/select_having.out    Sat Aug 15 13:44:16 1998
> ***************
> *** 2,12 ****
>     GROUP BY d1 HAVING count(*) > 1;
>   d1                          |count
>   ----------------------------+-----
> ! Thu Jun 13 00:00:00 1957 PDT|    2
> ! Mon Feb 10 09:32:01 1997 PST|    3
> ! Mon Feb 10 17:32:01 1997 PST|   13
>   Sun Feb 16 17:32:01 1997 PST|    2
>   Sat Mar 01 17:32:01 1997 PST|    2
> ! invalid                     |    2
> ! (6 rows)
>
> --- 2,13 ----
>     GROUP BY d1 HAVING count(*) > 1;
>   d1                          |count
>   ----------------------------+-----
> ! Thu Jun 13 00:00:00 1957 PST|    2
> ! Mon Feb 10 17:32:01 1997 PST|    4
> ! Mon Feb 10 09:32:01 1997 PST|    2
> ! Mon Feb 10 17:32:01 1997 PST|    2
> ! Mon Feb 10 17:32:01 1997 PST|    7
>   Sun Feb 16 17:32:01 1997 PST|    2
>   Sat Mar 01 17:32:01 1997 PST|    2
> ! (7 rows)
>
>
> ----------------------
>
> *** expected/random.out    Tue Apr 29 10:23:40 1997
> --- results/random.out    Sat Aug 15 13:44:19 1998
> ***************
> *** 5,18 ****
>   (1 row)
>
>   QUERY: SELECT count(*) FROM onek where oidrand(onek.oid, 10);
> ! count
> ! -----
> !    92
> ! (1 row)
>
>   QUERY: SELECT count(*) FROM onek where oidrand(onek.oid, 10);
> ! count
> ! -----
> !    98
> ! (1 row)
>
> --- 5,12 ----
>   (1 row)
>
>   QUERY: SELECT count(*) FROM onek where oidrand(onek.oid, 10);
> ! ERROR:  can't look up operator 713
>
>   QUERY: SELECT count(*) FROM onek where oidrand(onek.oid, 10);
> ! ERROR:  can't look up operator 713
>
>
> ----------------------
>
>


--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: [HACKERS] Re: type coersion (was OR clause status)

From
Bruce Momjian
Date:
> OK, I've just committed to the source tree changes for looking for
> binary-compatible indices when either the left- or right-hand side of
> the restriction clause is a constant.
>
> I've #ifndef'd them so we can disable it if necessary. But, all of the
> regression tests pass, except for the select_view test, which has been
> core dumping for weeks. Anyone else seeing that, or is it just me? :(

Is this fixed?  Are there any open problems with the regress tests?

--
Bruce Momjian                          |  830 Blythe Avenue
maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
  +  If your life is a hard drive,     |  (610) 353-9879(w)
  +  Christ can be your backup.        |  (610) 853-3000(h)

Re: [HACKERS] Regression test status (was type coersion)

From
David Hartwig
Date:
Tom,

I was never able to reproduce the ORDER BY anomaly on my Linux box.   I will try
to reproduce it on my AIX box.  I have some porting issues I need to get through
first.

In the mean time, if you still see a problem, could you run the query as:
     SELECT a, upper(c) FROM test_missing_target ORDER BY upper;
to eliminate the junkfilter as a culprit.

Bruce Momjian wrote:

> Do any of these problems still exist?
>
> > "Thomas G. Lockhart" <lockhart@alumni.caltech.edu> writes:
> > > ... all of the
> > > regression tests pass, except for the select_view test, which has been
> > > core dumping for weeks. Anyone else seeing that, or is it just me? :(
> >
> > I rebuilt the system from current sources today, and ran the regression
> > tests for the first time in a long time.  select_views works fine for
> > me, but there are several other tests that look badly broken:
> > SELECT ... ORDER BY upper(c) is misordering the results in select_implicit,
> > GROUP BY on a datetime is not working right in select_having, and the
> > random test is failing because it "can't look up operator 713".
> >
> > I'm on HP-UX 9.03, PA-RISC 1.1, gcc 2.7.2.2 if that helps.
> >
> >                       regards, tom lane
> >
> >
> > *** expected/select_implicit.out      Sat Aug 15 11:56:03 1998
> > --- results/select_implicit.out       Sat Aug 15 13:44:16 1998
> > ***************
> > *** 213,226 ****
> >   QUERY: SELECT a FROM test_missing_target ORDER BY upper(c);
> >   a
> >   -
> > - 1
> >   2
> >   3
> >   4
> >   5
> >   6
> > - 7
> >   8
> >   9
> >   0
> >   (10 rows)
> > --- 213,226 ----
> >   QUERY: SELECT a FROM test_missing_target ORDER BY upper(c);
> >   a
> >   -
> >   2
> > + 1
> >   3
> >   4
> >   5
> >   6
> >   8
> > + 7
> >   9
> >   0
> >   (10 rows)
> >
> > ----------------------
> >
> > *** expected/select_having.out        Wed Jul  8 10:29:09 1998
> > --- results/select_having.out Sat Aug 15 13:44:16 1998
> > ***************
> > *** 2,12 ****
> >     GROUP BY d1 HAVING count(*) > 1;
> >   d1                          |count
> >   ----------------------------+-----
> > ! Thu Jun 13 00:00:00 1957 PDT|    2
> > ! Mon Feb 10 09:32:01 1997 PST|    3
> > ! Mon Feb 10 17:32:01 1997 PST|   13
> >   Sun Feb 16 17:32:01 1997 PST|    2
> >   Sat Mar 01 17:32:01 1997 PST|    2
> > ! invalid                     |    2
> > ! (6 rows)
> >
> > --- 2,13 ----
> >     GROUP BY d1 HAVING count(*) > 1;
> >   d1                          |count
> >   ----------------------------+-----
> > ! Thu Jun 13 00:00:00 1957 PST|    2
> > ! Mon Feb 10 17:32:01 1997 PST|    4
> > ! Mon Feb 10 09:32:01 1997 PST|    2
> > ! Mon Feb 10 17:32:01 1997 PST|    2
> > ! Mon Feb 10 17:32:01 1997 PST|    7
> >   Sun Feb 16 17:32:01 1997 PST|    2
> >   Sat Mar 01 17:32:01 1997 PST|    2
> > ! (7 rows)
> >
> >
> > ----------------------
> >
> > *** expected/random.out       Tue Apr 29 10:23:40 1997
> > --- results/random.out        Sat Aug 15 13:44:19 1998
> > ***************
> > *** 5,18 ****
> >   (1 row)
> >
> >   QUERY: SELECT count(*) FROM onek where oidrand(onek.oid, 10);
> > ! count
> > ! -----
> > !    92
> > ! (1 row)
> >
> >   QUERY: SELECT count(*) FROM onek where oidrand(onek.oid, 10);
> > ! count
> > ! -----
> > !    98
> > ! (1 row)
> >
> > --- 5,12 ----
> >   (1 row)
> >
> >   QUERY: SELECT count(*) FROM onek where oidrand(onek.oid, 10);
> > ! ERROR:  can't look up operator 713
> >
> >   QUERY: SELECT count(*) FROM onek where oidrand(onek.oid, 10);
> > ! ERROR:  can't look up operator 713
> >
> >
> > ----------------------
> >
> >
>
> --
> Bruce Momjian                          |  830 Blythe Avenue
> maillist@candle.pha.pa.us              |  Drexel Hill, Pennsylvania 19026
>   +  If your life is a hard drive,     |  (610) 353-9879(w)
>   +  Christ can be your backup.        |  (610) 853-3000(h)




Re: [HACKERS] Regression test status (was type coersion)

From
David Hartwig
Date:
I check  two these of these problems.    Read ahead  --->

Bruce Momjian wrote:

> Do any of these problems still exist?
>
> > "Thomas G. Lockhart" <lockhart@alumni.caltech.edu> writes:
> > > ... all of the
> > > regression tests pass, except for the select_view test, which has been
> > > core dumping for weeks. Anyone else seeing that, or is it just me? :(
> >
> > I rebuilt the system from current sources today, and ran the regression
> > tests for the first time in a long time.  select_views works fine for
> > me, but there are several other tests that look badly broken:
> > SELECT ... ORDER BY upper(c) is misordering the results in select_implicit,
> > GROUP BY on a datetime is not working right in select_having, and the
> > random test is failing because it "can't look up operator 713".
> >
> > I'm on HP-UX 9.03, PA-RISC 1.1, gcc 2.7.2.2 if that helps.
> >
> >                       regards, tom lane
> >
> >
> > *** expected/select_implicit.out      Sat Aug 15 11:56:03 1998
> > --- results/select_implicit.out       Sat Aug 15 13:44:16 1998
> > ***************
> > *** 213,226 ****
> >   QUERY: SELECT a FROM test_missing_target ORDER BY upper(c);
> >   a
> >   -
> > - 1
> >   2
> >   3
> >   4
> >   5
> >   6
> > - 7
> >   8
> >   9
> >   0
> >   (10 rows)
> > --- 213,226 ----
> >   QUERY: SELECT a FROM test_missing_target ORDER BY upper(c);
> >   a
> >   -
> >   2
> > + 1
> >   3
> >   4
> >   5
> >   6
> >   8
> > + 7
> >   9
> >   0
> >   (10 rows)
> >
> > ----------------------
> >

Both sort orders are correct.     I seems that different machines are resolving
equivalent strings differently.    I got the same result as Tom on an RS/6000
box.   I could be an big vs little endian, signed vs unsigned chars issue.  In
any case, the actual sort order is indeterminate.  Therefore I will submit a new
regression test with reliable test data.

> > *** expected/select_having.out        Wed Jul  8 10:29:09 1998
> > --- results/select_having.out Sat Aug 15 13:44:16 1998
> > ***************
> > *** 2,12 ****
> >     GROUP BY d1 HAVING count(*) > 1;
> >   d1                          |count
> >   ----------------------------+-----
> > ! Thu Jun 13 00:00:00 1957 PDT|    2
> > ! Mon Feb 10 09:32:01 1997 PST|    3
> > ! Mon Feb 10 17:32:01 1997 PST|   13
> >   Sun Feb 16 17:32:01 1997 PST|    2
> >   Sat Mar 01 17:32:01 1997 PST|    2
> > ! invalid                     |    2
> > ! (6 rows)
> >
> > --- 2,13 ----
> >     GROUP BY d1 HAVING count(*) > 1;
> >   d1                          |count
> >   ----------------------------+-----
> > ! Thu Jun 13 00:00:00 1957 PST|    2
> > ! Mon Feb 10 17:32:01 1997 PST|    4
> > ! Mon Feb 10 09:32:01 1997 PST|    2
> > ! Mon Feb 10 17:32:01 1997 PST|    2
> > ! Mon Feb 10 17:32:01 1997 PST|    7
> >   Sun Feb 16 17:32:01 1997 PST|    2
> >   Sat Mar 01 17:32:01 1997 PST|    2
> > ! (7 rows)
> >
> >
> > ----------------------
> >

These results are also correct.  Somewhat.    I do not know much about datatime
porting issues, but if  I do a:
    SELECT d1 FROM  DATETIME_TBL
I get time reported to the 1/100 of a second.    If  GROUP BY d1 the hundredths
are not shown.   Thus, the counts and groupings are correct.   Its just not
showing the hundredths portion.

> > *** expected/random.out       Tue Apr 29 10:23:40 1997
> > --- results/random.out        Sat Aug 15 13:44:19 1998
> > ***************
> > *** 5,18 ****
> >   (1 row)
> >
> >   QUERY: SELECT count(*) FROM onek where oidrand(onek.oid, 10);
> > ! count
> > ! -----
> > !    92
> > ! (1 row)
> >
> >   QUERY: SELECT count(*) FROM onek where oidrand(onek.oid, 10);
> > ! count
> > ! -----
> > !    98
> > ! (1 row)
> >
> > --- 5,12 ----
> >   (1 row)
> >
> >   QUERY: SELECT count(*) FROM onek where oidrand(onek.oid, 10);
> > ! ERROR:  can't look up operator 713
> >
> >   QUERY: SELECT count(*) FROM onek where oidrand(onek.oid, 10);
> > ! ERROR:  can't look up operator 713
> >
> >

Don't know about this one.


Re: [HACKERS] Regression test status (was type coersion)

From
Tom Lane
Date:
David Hartwig <daveh@insightdist.com> writes:
> Both sort orders are correct.  I seems that different machines are
> resolving equivalent strings differently.  I got the same result as
> Tom on an RS/6000 box.  I could be an big vs little endian, signed vs
> unsigned chars issue.  In any case, the actual sort order is
> indeterminate.  Therefore I will submit a new regression test with
> reliable test data.

OK.

> These results are also correct.  Somewhat.  I do not know much about
> datatime porting issues, but if I do a:
>     SELECT d1 FROM  DATETIME_TBL
> I get time reported to the 1/100 of a second.  If GROUP BY d1 the
> hundredths are not shown.  Thus, the counts and groupings are correct.
> Its just not showing the hundredths portion.

The issue here is why you (or whoever it was prepared the regression
test) got different results from me...

>>>> QUERY: SELECT count(*) FROM onek where oidrand(onek.oid, 10);
>>>> ! ERROR:  can't look up operator 713
>
> Don't know about this one.

That bug should be fixed in the current sources, I believe.

            regards, tom lane

Re: [HACKERS] Regression test status (was type coersion)

From
David Hartwig
Date:

Tom Lane wrote:

> > These results are also correct.  Somewhat.  I do not know much about
> > datatime porting issues, but if I do a:
> >     SELECT d1 FROM  DATETIME_TBL
> > I get time reported to the 1/100 of a second.  If GROUP BY d1 the
> > hundredths are not shown.  Thus, the counts and groupings are correct.
> > Its just not showing the hundredths portion.
>
> The issue here is why you (or whoever it was prepared the regression
> test) got different results from me...
>

I do get the same results - on my RS/6000.   And I did not prepare the
test.    But, to be more precise:

1. The failure does not indicate any problems w/ the HAVING clause.

2.  This test will never work if there is a machine dependence on the
ability to resolve datatime down to the 1/100 second.

3.   There appears to be something strange going on in the way GROUPed BY
datetime columns do not display the 1/100 portion of seconds.   Is this a
known problem?

Unless someone objects, I will submit a  test suit  to, at least remove the
false reading on select_having.