Thread: Re: [HACKERS] patch: function xmltable

Re: [HACKERS] patch: function xmltable

From
Pavel Stehule
Date:
Hi

2017-01-11 22:53 GMT+01:00 Alvaro Herrera <alvherre@2ndquadrant.com>:
Alvaro Herrera wrote:

> The more I look at this, the less I like using NameArgExpr for
> namespaces.  It looks all wrong to me, and it causes ugly code all over.
> Maybe I just need to look at it a bit longer.

I think it'd be cleaner to use ResTarget for the namespaces, like
xml_attribute_el does, and split the names from actual exprs in the same
way.  So things like ExecInitExpr become simpler because you just
recurse to initialize the list, without having to examine each element
individually.  tabexprInitialize can just do forboth().

The main reason I don't like abusing NamedArgExpr is that the whole
comment that explains it becomes a lie.

I used your proposed way based on Restarget

Updated patch attached.

Regards

Pavel
 

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: [HACKERS] patch: function xmltable

From
Alvaro Herrera
Date:
Pavel Stehule wrote:
> 
> I used your proposed way based on Restarget

Thanks.  Some more tweaking to go yet before I consider this
committable, but it's much better now.  Here's v28.  I changed a few
things:

- make expression evaluation code more orthodox:
  1. avoid PG_TRY, use a ExprContext shutdown callback instead
  2. use a "Fast" evaluator, for calls past the first one
  3. don't look up fmgrinfos until execution time
  4. don't duplicate get_expr_result_type
- make parser accept DEFAULT namespace. Only xml implementation barfs.
  (this means we lost the errposition pointer, but I don't really
   care. We could fix it if we cared)
- clean up parse analysis code a little bit
- move decls/struct defs to better locations in source code
- remove leftover "namespaces" in TableExprState
- pgindent the whole mess.

I don't like the xml.c code and the "evalcols" flag.  That's next on my
list to fix.

I don't think to_xmlstr() is necessary, considering xml_text2xmlChar.
We could just apply a cast of the source cstring to xmlChar.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

-- 
Sent via pgsql-hackers mailing list (pgsql-hackers@postgresql.org)
To make changes to your subscription:
http://www.postgresql.org/mailpref/pgsql-hackers

Attachment

Re: [HACKERS] patch: function xmltable

From
Pavel Stehule
Date:
Hi

2017-01-13 21:32 GMT+01:00 Alvaro Herrera <alvherre@2ndquadrant.com>:
Pavel Stehule wrote:
>
> I used your proposed way based on Restarget

Thanks.  Some more tweaking to go yet before I consider this
committable, but it's much better now.  Here's v28.  I changed a few
things:

- make expression evaluation code more orthodox:
  1. avoid PG_TRY, use a ExprContext shutdown callback instead
  2. use a "Fast" evaluator, for calls past the first one
  3. don't look up fmgrinfos until execution time
  4. don't duplicate get_expr_result_type
- make parser accept DEFAULT namespace. Only xml implementation barfs.
  (this means we lost the errposition pointer, but I don't really
   care. We could fix it if we cared)
- clean up parse analysis code a little bit
- move decls/struct defs to better locations in source code
- remove leftover "namespaces" in TableExprState
- pgindent the whole mess.


I checked the changes and looks correct - although for some I had not courage :) - like dynamic change of exprstate->evalfunc

I fixed test, and append forgotten header file


 
I don't like the xml.c code and the "evalcols" flag.  That's next on my
list to fix.

You need some flag to specify if column paths are valid or not.  


I don't think to_xmlstr() is necessary, considering xml_text2xmlChar.
We could just apply a cast of the source cstring to xmlChar.

is it safe? For one byte encodings?

Regards

Pavel
 

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services

Attachment

Re: [HACKERS] patch: function xmltable

From
Pavel Stehule
Date:


2017-01-14 14:43 GMT+01:00 Pavel Stehule <pavel.stehule@gmail.com>:
Hi

2017-01-13 21:32 GMT+01:00 Alvaro Herrera <alvherre@2ndquadrant.com>:
Pavel Stehule wrote:
>
> I used your proposed way based on Restarget

Thanks.  Some more tweaking to go yet before I consider this
committable, but it's much better now.  Here's v28.  I changed a few
things:

- make expression evaluation code more orthodox:
  1. avoid PG_TRY, use a ExprContext shutdown callback instead
  2. use a "Fast" evaluator, for calls past the first one
  3. don't look up fmgrinfos until execution time
  4. don't duplicate get_expr_result_type
- make parser accept DEFAULT namespace. Only xml implementation barfs.
  (this means we lost the errposition pointer, but I don't really
   care. We could fix it if we cared)
- clean up parse analysis code a little bit
- move decls/struct defs to better locations in source code
- remove leftover "namespaces" in TableExprState
- pgindent the whole mess.


I checked the changes and looks correct - although for some I had not courage :) - like dynamic change of exprstate->evalfunc

I fixed test, and append forgotten header file


 
I don't like the xml.c code and the "evalcols" flag.  That's next on my
list to fix.

You need some flag to specify if column paths are valid or not.  


I don't think to_xmlstr() is necessary, considering xml_text2xmlChar.
We could just apply a cast of the source cstring to xmlChar.

is it safe? For one byte encodings?

Looks so this patch breaks regression tests

estoring database schemas in the new cluster
  \"\ cdefghijklmnopqrstuvwxyz{|}~                      

  regression                                                
*failure*

Consult the last few lines of "pg_upgrade_dump_16387.log" for
the probable cause of the failure.
Failure, exiting
+ rm -rf /tmp/pg_upgrade_check-wSfzCh
Makefile:39: návod pro cíl „check“ selhal
make[2]: *** [check] Chyba 1
make[2]: Opouští se adresář „/home/pavel/src/postgresql/src/bin/pg_upgrade“

 
pg_restore: [archiver (db)] Error while PROCESSING TOC:
pg_restore: [archiver (db)] Error from TOC entry 496; 1259 47693 VIEW xmltableview2 pavel
pg_restore: [archiver (db)] could not execute query: ERROR:  syntax error at or near "("
LINE 15: ...XMLTABLE(XMLNAMESPACES('http://x.y'::"text" AS zz)('/zz:rows...

Fixed in attached patch 


 
Regards

Pavel
 

--
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services


Attachment

Re: [HACKERS] patch: function xmltable

From
Alvaro Herrera
Date:
I just realized that your new xml_xmlnodetostr is line-by-line identical
to the existing xml_xmlnodetoxmltype except for two or three lines.
I think that's wrong.  I'm going to patch the existing function so that
they can share code.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] patch: function xmltable

From
Alvaro Herrera
Date:
Given
https://www.postgresql.org/message-id/20170116210019.a3glfwspg5lnfrnm@alap3.anarazel.de
which is going to heavily change how the executor works in this area, I
am returning this patch to you again.  I would like a few rather minor
changes:

1. to_xmlstr can be replaced with calls to xmlCharStrdup.
2. don't need xml_xmlnodetostr either -- just use xml_xmlnodetoxmltype  (which returns text*) and extract the cstring
fromthe varlena.  It's  a bit more wasteful in terms of cycles, but I don't think we care.  If we do care, change the
functionso that it returns cstring, and  have the callers that want text wrap it in cstring_to_text.
 
3. have a new perValueCxt memcxt in TableExprState, child of buildercxt,  and switch to it just before GetValue()
(resetit just before  switching).  Then, don't worry about leaks in GetValue.  This way,  the text* conversions et al
don'tmatter.
 

After that I think we're going to need to get this working on top of
Andres' changes.  Which I'm afraid is going to be rather major surgery,
but I haven't looked.

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services



Re: [HACKERS] patch: function xmltable

From
Alvaro Herrera
Date:
In case this still matters, I think GetValue should look more or less
like this (untested):


/** Return the value for column number 'colnum' for the current row.  If column* -1 is requested, return representation
ofthe whole row.** This leaks memory, so be sure to reset often the context in which it's* called.*/
 
static Datum
XmlTableGetValue(TableExprState *state, int colnum, bool *isnull)
{
#ifdef USE_LIBXMLXmlTableBuilderData *xtCxt;Datum        result = (Datum) 0;xmlNodePtr    cur;char       *cstr =
NULL;volatilexmlXPathObjectPtr xpathobj;
 
xtCxt = GetXmlTableBuilderPrivateData(state, "XmlTableGetValue");
Assert(xtCxt->xpathobj &&       xtCxt->xpathobj->type == XPATH_NODESET &&       xtCxt->xpathobj->nodesetval != NULL);
/* Propagate context related error context to libxml2 */xmlSetStructuredErrorFunc((void *) xtCxt->xmlerrcxt,
xml_errorHandler);
cur = xtCxt->xpathobj->nodesetval->nodeTab[xtCxt->row_count - 1];if (cur->type != XML_ELEMENT_NODE)    elog(ERROR,
"unexpectedxmlNode type");
 
/* Handle whole row case the easy way. */if (colnum == -1){    text       *txt;
    txt = xml_xmlnodetoxmltype(cur, xtCxt->xmlerrcxt);    result = InputFunctionCall(&state->in_functions[0],
                   text_to_cstring(txt),                               state->typioparams[0],
   -1);    *isnull = false;
 
    return result;}
Assert(xtCxt->xpathscomp[colnum] != NULL);
xpathobj = NULL;PG_TRY();{    Form_pg_attribute attr;
    attr = state->resultSlot->tts_tupleDescriptor->attrs[colnum];
    /* Set current node as entry point for XPath evaluation */    xmlXPathSetContextNode(cur, xtCxt->xpathcxt);
    /* Evaluate column path */    xpathobj = xmlXPathCompiledEval(xtCxt->xpathscomp[colnum], xtCxt->xpathcxt);    if
(xpathobj== NULL || xtCxt->xmlerrcxt->err_occurred)        xml_ereport(xtCxt->xmlerrcxt, ERROR, ERRCODE_INTERNAL_ERROR,
                  "could not create XPath object");
 
    if (xpathobj->type == XPATH_NODESET)    {        int            count;        Oid            targettypid =
attr->atttypid;
        if (xpathobj->nodesetval != NULL)            count = xpathobj->nodesetval->nodeNr;
        /*         * There are four possible cases, depending on the number of         * nodes returned by the XPath
expressionand the type of the         * target column: a) XPath returns no nodes.  b) One node is         * returned,
andcolumn is of type XML.  c) One node, column type         * other than XML.  d) Multiple nodes are returned.
*/       if (xpathobj->nodesetval == NULL)        {            *isnull = true;        }        else if (count == 1 &&
targettypid== XMLOID)        {            textstr = xml_xmlnodetoxmltype(xpathobj->nodesetval->nodeTab[0],
                            xtCxt->xmlerrcxt);            cstr = text_to_cstring(textstr);        }        else if
(count== 1)        {            xmlChar    *str;
 
            str = xmlNodeListGetString(xtCxt->doc,
xpathobj->nodesetval->nodeTab[0]->xmlChildrenNode,                                      1);            if (str)
  {                PG_TRY();                {                    cstr = pstrdup(str);                }
PG_CATCH();               {                    xmlFree(str);                    PG_RE_THROW();                }
      PG_END_TRY();                xmlFree(str);            }            else                cstr = pstrdup("");
}       else        {            StringInfoData buf;            int            i;
 
            Assert(count > 1);
            /*             * When evaluating the XPath expression returns multiple             * nodes, the result is
theconcatenation of them all.             * The target type must be XML.             */            if (targettypid !=
XMLOID)               ereport(ERROR,                        (errcode(ERRCODE_CARDINALITY_VIOLATION),
    errmsg("more than one value returned by column XPath expression")));
 
            initStringInfo(&buf);            for (i = 0; i < count; i++)                /* worth freeing the text here?
Naahh ... */                appendStringInfoText(&buf,
xml_xmlnodetoxmltype(xpathobj->nodesetval->nodeTab[i],
xtCxt->xmlerrcxt));           cstr = buf.data;        }    }    else if (xpathobj->type == XPATH_STRING)    {
cstr= (char *) xpathobj->stringval;        *isnull = false;    }    else        elog(ERROR, "unexpected XPath object
type%u", xpathobj->type);
 
    /*     * By here, either cstr contains the result value, or the isnull flag     * has been set.     */
Assert(cstr|| *isnull);
 
    if (!*isnull)        result = InputFunctionCall(&state->in_functions[colnum],
cstr,                                  state->typioparams[colnum],
attr->atttypmod);}PG_CATCH();{   if (xpathobj != NULL)        xmlXPathFreeObject(xpathobj);
PG_RE_THROW();}PG_END_TRY();
if (xpathobj)    xmlXPathFreeObject(xpathobj);
return result;
#elseNO_XML_SUPPORT();
#endif   /* not USE_LIBXML */
}

-- 
Álvaro Herrera                https://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Remote DBA, Training & Services