Re: PoC/WIP: Extended statistics on expressions - Mailing list pgsql-hackers
From | Dean Rasheed |
---|---|
Subject | Re: PoC/WIP: Extended statistics on expressions |
Date | |
Msg-id | CAEZATCW_7nfeHmLtV6TzSfmFG+i=io_Qo0NL8BiBnmEqq3CQnQ@mail.gmail.com Whole thread Raw |
In response to | Re: PoC/WIP: Extended statistics on expressions (Tomas Vondra <tomas.vondra@enterprisedb.com>) |
Responses |
Re: PoC/WIP: Extended statistics on expressions
|
List | pgsql-hackers |
Looking over the statscmds.c changes, there are a few XXX's and FIXME's that need resolving, and I had a couple of other minor comments: + /* + * An expression using mutable functions is probably wrong, + * since if you aren't going to get the same result for the + * same data every time, it's not clear what the index entries + * mean at all. + */ + if (CheckMutability((Expr *) expr)) + ereport(ERROR, That comment is presumably copied from the index code, so needs updating. + /* + * Disallow data types without a less-than operator + * + * XXX Maybe allow this, but only for EXPRESSIONS stats and + * prevent building e.g. MCV etc. + */ + atttype = exprType(expr); + type = lookup_type_cache(atttype, TYPECACHE_LT_OPR); + if (type->lt_opr == InvalidOid) + ereport(ERROR, + (errcode(ERRCODE_FEATURE_NOT_SUPPORTED), + errmsg("expression cannot be used in statistics because its type %s has no default btree operator class", + format_type_be(atttype)))); As the comment suggests, it's probably worth skipping this check if numcols is 1 so that single-column stats can be built for more types of expressions. (I'm assuming that it's basically no more effort to make that work, so I think it falls into the might-as-well-do-it category.) + /* + * Parse the statistics kinds. Firstly, check that this is not the + * variant building statistics for a single expression, in which case + * we don't allow specifying any statistis kinds. The simple variant + * only has one expression, and does not allow statistics kinds. + */ + if ((list_length(stmt->exprs) == 1) && (list_length(stxexprs) == 1)) + { Typo: "statistis" Nit-picking, this test could just be: + if ((numcols == 1) && (list_length(stxexprs) == 1)) which IMO is a little more readable, and matches a similar test a little further down. + /* + * If there are no simply-referenced columns, give the statistics an + * auto dependency on the whole table. In most cases, this will + * be redundant, but it might not be if the statistics expressions + * contain no Vars (which might seem strange but possible). + * + * XXX This is copied from index_create, not sure if it's applicable + * to extended statistics too. + */ Seems right to me. + /* + * FIXME use 'expr' for expressions, which have empty column names. + * For indexes this is handled in ChooseIndexColumnNames, but we + * have no such function for stats. + */ + if (!name) + name = "expr"; In theory, this function could be made to duplicate the logic used for indexes, creating names like "expr1", "expr2", etc. To be honest though, I don't think it's worth the effort. The code for indexes isn't really bulletproof anyway -- for example there might be a column called "expr" that is or isn't included in the index, which would make the generated name ambiguous. And in any case, a name like "tbl_cola_expr_colb_expr1_colc_stat" isn't really any more useful than "tbl_cola_expr_colb_expr_colc_stat". So I'd be tempted to leave that code as it is. + +/* + * CheckMutability + * Test whether given expression is mutable + * + * FIXME copied from indexcmds.c, maybe use some shared function? + */ +static bool +CheckMutability(Expr *expr) +{ As the comment says, it's quite messy duplicating this code, but I'm wondering whether it would be OK to just skip this check entirely. I think someone else suggested that elsewhere, and I think it might not be a bad idea. For indexes, it could easily lead to wrong query results, but for stats the most likely problem is that the stats would get out of date (which they tend to do all by themselves anyway) and need rebuilding. If you ignore intentionally crazy examples (which are still possible even with this check), then there are probably many legitimate cases where someone might want to use non-immutable functions in stats, and this check just forces them to create an immutable wrapper function. Regards, Dean
pgsql-hackers by date: