Specifying attribute slot for storing/reading statistics - Mailing list pgsql-hackers

From Esteban Zimanyi
Subject Specifying attribute slot for storing/reading statistics
Date
Msg-id CAPqRbE5F3Hft8+e8prrc7Ew8N5uydrrFD2Xy-CCONVTC+zFnvA@mail.gmail.com
Whole thread Raw
Responses Re: Specifying attribute slot for storing/reading statistics
List pgsql-hackers
Dear all

We are developing MobilityDB, an open source PostgreSQL/PostGIS extension that provides temporal and spatio-temporal types. The source code, manuals, and related publications are available at the address
https://github.com/ULB-CoDE-WIT/MobilityDB/

In MobilityDB temporal types are types derived from PostgreSQL/PostGIS types to which a time dimension is added. MobilityDB provides the following temporal types: tbool (temporal boolean), tint (temporal int), tfloat (temporal float), text (temporal text), tgeompoint (temporal geometric points) and tgeogpoint (temporal geographic points). For example, we can define a tfloat and a tgeompoint as follows

SELECT tfloat '[1.5@2000-01-01, 2.5@2000-01-02, 1.5@2000-01-03]';
SELECT tgeompoint '[Point(0 0)@2000-01-01 08:00, Point(1 0)@2000-01-02 08:05, Point(1 1)@2000-01-03 08:10]';

We are developing the analyze/selectivity functions for those types. Our approach is to use the standard PostgreSQL/PostGIS functions for the value and the time dimensions where the slots starting from 0 will be used for the value dimension, and the slots starting from 2 will be used for the time dimension. For example, for tfloat we use range_typanalyze and related functions for
* collecting in slots 0 and 1, STATISTIC_KIND_BOUNDS_HISTOGRAM and STATISTIC_KIND_RANGE_LENGTH_HISTOGRAM for the float ranges of the value dimension
*  collecting in slots 2 and 3, STATISTIC_KIND_BOUNDS_HISTOGRAM and STATISTIC_KIND_RANGE_LENGTH_HISTOGRAM for the periods (similar to tstzranges) of the time dimension

However, we end up copying several PostgreSQL functions to which we only add an additional parameter stating the slot number from which the specific statistic kind should be found (either 0 or 2)

bool
get_attstatsslot_mobdb(AttStatsSlot *sslot, HeapTuple statstuple,
int reqkind, Oid reqop, int flags, int startslot)
{
    [...]
    for (i = startslot; i < STATISTIC_NUM_SLOTS; i++)
    {
        if ((&stats->stakind1)[i] == reqkind &&
            (reqop == InvalidOid || (&stats->staop1)[i] == reqop))
            break;
    }
    [...]
}

double
var_eq_const_mobdb(VariableStatData *vardata, Oid operator, Datum constval,
    bool negate, int startslot)
{
    [...]
}
Selectivity
scalarineqsel_mobdb(PlannerInfo *root, Oid operator, bool isgt, bool iseq,
VariableStatData *vardata, Datum constval, Oid consttype,
    int startslot)
{
    [...]
}

static Selectivity
mcv_selectivity_mobdb(VariableStatData *vardata, FmgrInfo *opproc,
Datum constval, Oid atttype, bool varonleft,
    double *sumcommonp, int startslot)
{
    [...]
}
static double
ineq_histogram_selectivity_mobdb(PlannerInfo *root, VariableStatData *vardata,
FmgrInfo *opproc, bool isgt, bool iseq, Datum constval,
    Oid consttype, int startslot)
{
    [...]
}

in addition to copying other functions needed by the above functions since they are not exported (defined as static)

static bool
get_actual_variable_range(PlannerInfo *root, VariableStatData *vardata,
Oid sortop, Datum *min, Datum *max)

static bool
get_actual_variable_endpoint(Relation heapRel,
Relation indexRel, ScanDirection indexscandir,
ScanKey scankeys, int16 typLen,
bool typByVal, MemoryContext outercontext,
Datum *endpointDatum)

[...]

Is there a better way to do this ?

Is there any chance that the API for accessing the typanalyze and selectivity functions will be enhanced in a future release ?

Regards

Esteban

-- 
------------------------------------------------------------
Prof. Esteban Zimanyi
Department of Computer & Decision Engineering  (CoDE) CP 165/15   
Universite Libre de Bruxelles           
Avenue F. D. Roosevelt 50               
B-1050 Brussels, Belgium                
fax: + 32.2.650.47.13
tel: + 32.2.650.31.85
e-mail: ezimanyi@ulb.ac.be
Internet: http://code.ulb.ac.be/
------------------------------------------------------------

pgsql-hackers by date:

Previous
From: Dilip Kumar
Date:
Subject: Re: CVE-2017-7484-induced bugs, or, btree cmp functions are not leakproof?
Next
From: Amit Langote
Date:
Subject: Re: CVE-2017-7484-induced bugs, or, btree cmp functions are not leakproof?