help debugging an issue with selectivity - Mailing list pgsql-general

From Greg Hennessy
Subject help debugging an issue with selectivity
Date
Msg-id 8e4f0d76-9e77-4819-96f7-cbc64a9d2b5a@gmail.com
Whole thread Raw
In response to Re: How to properly use TRIM()?  ("David G. Johnston" <david.g.johnston@gmail.com>)
Responses Re: help debugging an issue with selectivity
Re: help debugging an issue with selectivity
List pgsql-general
I am not sure if this belongs in pgsql-general or pgsql-hackers, I am 
trying first in psgl-general.

I am trying to understand setting a selectivity function that gets 
applied to an operator (to hopefully
provide better information for the planner/optmizer). This is for the 
q3c extension, source code found at
https://github.com/segasai/q3c.

There are functions for selectivity, and for an operator.

-- A dummy type used in the selectivity operator
create type q3c_type as (ra double precision, dec double precision,
        ra1 double precision, dec1 double precision);

-- A dummy operator function (always returns true)
CREATE OR REPLACE FUNCTION q3c_seloper(double precision, q3c_type)
         RETURNS bool
         AS 'MODULE_PATHNAME', 'pgq3c_seloper'
         LANGUAGE C STRICT IMMUTABLE COST 1000;

-- A selectivity function for the q3c operator
CREATE OR REPLACE FUNCTION q3c_sel(internal, oid, internal, int4)
         RETURNS float8
         AS 'MODULE_PATHNAME', 'pgq3c_sel'
         LANGUAGE C IMMUTABLE STRICT ;

-- A selectivity function for the q3c operator
CREATE OR REPLACE FUNCTION q3c_seljoin(internal, oid, internal, int2, 
internal)
         RETURNS float8
         AS 'MODULE_PATHNAME', 'pgq3c_seljoin'
         LANGUAGE C IMMUTABLE STRICT ;

-- distance operator with correct selectivity
CREATE OPERATOR ==<<>>== (
         LEFTARG = double precision,
         RIGHTARG = q3c_type,
         PROCEDURE = q3c_seloper,
         RESTRICT = q3c_sel,
         JOIN = q3c_seljoin
);

The C portions are declared as:

/* The actual selectivity function, it returns the ratio of the
  * search circle to the whole sky area
  */
PG_FUNCTION_INFO_V1(pgq3c_sel);
Datum pgq3c_sel(PG_FUNCTION_ARGS)

where the actual calculation portion is (not showing the setup portion):

         ratio = 3.14 * rad * rad / 41252.;  /* pi*r^2/whole_sky_area */
         /* clamp at 0, 1*/
         CLAMP_PROBABILITY(ratio);
         elog(WARNING, "HERE0 pgq3c_sel.... %e", ratio);
         PG_RETURN_FLOAT8(ratio);
}

The join function is declared as:
PG_FUNCTION_INFO_V1(pgq3c_seljoin);
Datum pgq3c_seljoin(PG_FUNCTION_ARGS)
{

where the meat portion is:
        ratio = 3.14 * rad * rad / 41252.;  /* pi*r^2/whole_sky_area */
         /* clamp at 0, 1*/
         CLAMP_PROBABILITY(ratio);
         elog(WARNING, "HERE0 pgq3c_seljoin.... %e", ratio);
         PG_RETURN_FLOAT8(ratio);
}

The two elog statements aren't in the orig code, I've added them to help 
me trace
the code. As far as I can tell, the these selectivity functions are 
called in
src/backend/optimizer/path/clausesel.c in the routine 
clause_selectivity_ext.
If I add similar elog statements, at about line 836, the code says:

               if (treat_as_join_clause(root, clause, rinfo, varRelid, 
sjinfo))
         {
             /* Estimate selectivity for a join clause. */
             if (opno > 6000)
                 elog(WARNING, "clause_selectivity: join_selectivity 
opno %d",opno);
             s1 = join_selectivity(root, opno,
                                   opclause->args,
                                   opclause->inputcollid,
                                   jointype,
                                   sjinfo);
             if (opno > 6000){
                 elog(WARNING, "join_selectivity: s1 %f", s1);
             }
         }
         else
         {
             /* Estimate selectivity for a restriction clause. */
             if (opno > 6000)
                 elog(WARNING, "clause_selectivity: 
restriction_selectivity opno %d", opno);
             s1 = restriction_selectivity(root, opno,
                                          opclause->args,
                                          opclause->inputcollid,
                                          varRelid);
             if (opno > 6000){
                 elog(WARNING, "restriction_selectivity: s1 %lf", s1);
             }
         }


When I actually execute this, I get output to the terminal of the form:
WARNING:  join_selectivity: operator id 16818 jointype 0 0
WARNING:  HERE0 pgq3c_seljoin.... 5.873266e-12
WARNING:  datum result 4438812783922730423 0.000000
WARNING:  HERE0 pgq3c_seljoin.... 5.873266e-12
WARNING:  join_selectivity: 0.000000 16818 jointype 0
WARNING:  join_selectivity: s1 0.000000
WARNING:  clause_selectivity: s1 0.000000

where it seems to me the q3c code is returning a non zero value, but in 
the guts of
postgres what is found is a zero value. If I want to verify I have the 
correct opr,
which is 16818, I can verify via:
q3c_test=# select 
oid,oprname,oprnamespace,oprowner,oprkind,oprleft,oprright,oprresult,oprcode 
from pg_operator where oid = 16818;
   oid  | oprname  | oprnamespace | oprowner | oprkind | oprleft | 
oprright | oprresult |   oprcode
-------+----------+--------------+----------+---------+---------+----------+-----------+-------------
  16818 | ==<<>>== |         2200 |    16391 | b       |    701 |    
16814 |        16 | q3c_seloper

which yeilds what I expect.

The join_selectivity is essentially a call in 
src/backend/optimizer/util/plancat.c of:
     result = DatumGetFloat8(OidFunctionCall5Coll(oprjoin,
             inputcollid,
             PointerGetDatum(root),
             ObjectIdGetDatum(operatorid),
             PointerGetDatum(args),
             Int16GetDatum(jointype),
             PointerGetDatum(sjinfo)));

     if (result < 0.0 || result > 1.0)
         elog(ERROR, "invalid join selectivity: %f", result);

while restriction_selectivity is a call to:
     result = DatumGetFloat8(OidFunctionCall4Coll(oprrest,
                     inputcollid,
                     PointerGetDatum(root),
                     ObjectIdGetDatum(operatorid),
                     PointerGetDatum(args),
                     Int32GetDatum(varRelid)));

This is the point where I run out of steam. The basic issue I have is 
that q3c code is attempting
to return a small, but non-zero value for the selectivity in two 
functions, but the guts of
postgresql has both the join_selectivity  and restriction_selectivity 
function return zero where
I think they shouldn't.

Any advice in how to make progress on this is welcome.  I'm using 
19devel (I can probably do a
git merge to move to a more up to  date version), and I'm running Fedora 
release 43 in case which
exact OS I'm using is relavent.

Greg




pgsql-general by date:

Previous
From: Shaheed Haque
Date:
Subject: Re: Can I use pg_dump to save a sequence for a table that is not also being saved?
Next
From: Adrian Klaver
Date:
Subject: Re: Can I use pg_dump to save a sequence for a table that is not also being saved?