Re: jsonpath - Mailing list pgsql-hackers

From Andres Freund
Subject Re: jsonpath
Date
Msg-id 20190130022826.of7w6xv5lbvzqxfp@alap3.anarazel.de
Whole thread Raw
In response to Re: jsonpath  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
Responses Re: jsonpath  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
Re: jsonpath  (Alexander Korotkov <a.korotkov@postgrespro.ru>)
List pgsql-hackers
Hi,

On 2019-01-29 04:00:19 +0300, Alexander Korotkov wrote:
> +/*
> + * Make datetime type from 'date_txt' which is formated at argument 'fmt'.
> + * Actual datatype (returned in 'typid', 'typmod') is determined by
> + * presence of date/time/zone components in the format string.
> + */
> +Datum
> +to_datetime(text *date_txt, text *fmt, char *tzname, bool strict, Oid *typid,
> +            int32 *typmod, int *tz)

Given other to_<type> functions being SQL callable, I'm not a fan of the
naming here.

> +{
> +    struct pg_tm tm;
> +    fsec_t        fsec;
> +    int            fprec = 0;
> +    int            flags;
> +
> +    do_to_timestamp(date_txt, fmt, strict, &tm, &fsec, &fprec, &flags);
> +
> +    *typmod = fprec ? fprec : -1;    /* fractional part precision */
> +    *tz = 0;
> +
> +    if (flags & DCH_DATED)
> +    {
> +        if (flags & DCH_TIMED)
> +        {
> +            if (flags & DCH_ZONED)
> +            {
> +                TimestampTz result;
> +
> +                if (tm.tm_zone)
> +                    tzname = (char *) tm.tm_zone;
> +
> +                if (tzname)
> +                {
> +                    int            dterr = DecodeTimezone(tzname, tz);
> +
> +                    if (dterr)
> +                        DateTimeParseError(dterr, tzname, "timestamptz");


Do we really need 6/7 indentation levels in new code?




> +jsonpath_scan.c: FLEXFLAGS = -CF -p -p

Why -CF, and why is -p repeated?


> -                JsonEncodeDateTime(buf, val, DATEOID);
> +                JsonEncodeDateTime(buf, val, DATEOID, NULL);

ISTM API changes like this ought to be done in a separate patch, to ease
review.


>  }
>  
> +
>  /*
>   * Compare two jbvString JsonbValue values, a and b.
>   *

Random WS change.


> +/*****************************INPUT/OUTPUT*********************************/

Why do we need this much code to serialize data that we initially got in
serialized manner? Couldn't we just keep the original around?


> +Datum
> +jsonpath_in(PG_FUNCTION_ARGS)
> +{

No binary input support? I'd suggest adding that, but keeping the
representation the same. Otherwise just adds unnecesary complexity for
driver authors wishing to use the binar protocol.



> +/********************Support functions for JsonPath**************************/
> +
> +/*
> + * Support macroses to read stored values
> + */

s/macroses/macros/


> +++ b/src/backend/utils/adt/jsonpath_exec.c

Gotta say, I'm unenthusiastic about yet another execution engine in some
PG subsystem.


> @@ -0,0 +1,2776 @@
> +/*-------------------------------------------------------------------------
> + *
> + * jsonpath_exec.c
> + *     Routines for SQL/JSON path execution.
> + *
> + * Copyright (c) 2019, PostgreSQL Global Development Group
> + *
> + * IDENTIFICATION
> + *    src/backend/utils/adt/jsonpath_exec.c
> + *
> + *-------------------------------------------------------------------------
> + */

Somewhere there needs to be higher level documentation explaining how
this stuff is supposed to work on a code level.

> +
> +/* strict/lax flags is decomposed into four [un]wrap/error flags */
> +#define jspStrictAbsenseOfErrors(cxt)    (!(cxt)->laxMode)
> +#define jspAutoUnwrap(cxt)                ((cxt)->laxMode)
> +#define jspAutoWrap(cxt)                ((cxt)->laxMode)
> +#define jspIgnoreStructuralErrors(cxt)    ((cxt)->ignoreStructuralErrors)
> +#define jspThrowErrors(cxt)                ((cxt)->throwErrors)

What's the point?


> +#define ThrowJsonPathError(code, detail) \
> +    ereport(ERROR, \
> +             (errcode(ERRCODE_ ## code), \
> +              errmsg(ERRMSG_ ## code), \
> +              errdetail detail))
> +
> +#define ThrowJsonPathErrorHint(code, detail, hint) \
> +    ereport(ERROR, \
> +             (errcode(ERRCODE_ ## code), \
> +              errmsg(ERRMSG_ ## code), \
> +              errdetail detail, \
> +              errhint hint))

These seem ill-advised, just making it harder to understand the code,
and grepping for it, without actually meaningfully simplifying anything.


> +/*
> + * Find value of jsonpath variable in a list of passing params
> + */

What is that comment supposed to mean?


> +/*
> + * Get the type name of a SQL/JSON item.
> + */
> +static const char *
> +JsonbTypeName(JsonbValue *jb)
> +{

Wasn't there pretty similar infrastructure elsewhere?



> +/*
> + * Cross-type comparison of two datetime SQL/JSON items.  If items are
> + * uncomparable, 'error' flag is set.
> + */

Sounds like it'd not raise an error, but it does in a bunch of scenarios.


> @@ -206,6 +206,22 @@ Section: Class 22 - Data Exception
>  2200N    E    ERRCODE_INVALID_XML_CONTENT                                    invalid_xml_content
>  2200S    E    ERRCODE_INVALID_XML_COMMENT                                    invalid_xml_comment
>  2200T    E    ERRCODE_INVALID_XML_PROCESSING_INSTRUCTION                     invalid_xml_processing_instruction
> +22030    E    ERRCODE_DUPLICATE_JSON_OBJECT_KEY_VALUE                        duplicate_json_object_key_value
> +22031    E    ERRCODE_INVALID_ARGUMENT_FOR_JSON_DATETIME_FUNCTION
invalid_argument_for_json_datetime_function
> +22032    E    ERRCODE_INVALID_JSON_TEXT                                      invalid_json_text
> +22033    E    ERRCODE_INVALID_JSON_SUBSCRIPT                                 invalid_json_subscript
> +22034    E    ERRCODE_MORE_THAN_ONE_JSON_ITEM                                more_than_one_json_item
> +22035    E    ERRCODE_NO_JSON_ITEM                                           no_json_item
> +22036    E    ERRCODE_NON_NUMERIC_JSON_ITEM                                  non_numeric_json_item
> +22037    E    ERRCODE_NON_UNIQUE_KEYS_IN_JSON_OBJECT                         non_unique_keys_in_json_object
> +22038    E    ERRCODE_SINGLETON_JSON_ITEM_REQUIRED                           singleton_json_item_required
> +22039    E    ERRCODE_JSON_ARRAY_NOT_FOUND                                   json_array_not_found
> +2203A    E    ERRCODE_JSON_MEMBER_NOT_FOUND                                  json_member_not_found
> +2203B    E    ERRCODE_JSON_NUMBER_NOT_FOUND                                  json_number_not_found
> +2203C    E    ERRCODE_JSON_OBJECT_NOT_FOUND                                  object_not_found
> +2203F    E    ERRCODE_JSON_SCALAR_REQUIRED                                   json_scalar_required
> +2203D    E    ERRCODE_TOO_MANY_JSON_ARRAY_ELEMENTS                           too_many_json_array_elements
> +2203E    E    ERRCODE_TOO_MANY_JSON_OBJECT_MEMBERS
>  too_many_json_object_members

Are all of these invented by us?


Greetings,

Andres Freund


pgsql-hackers by date:

Previous
From: Hubert Zhang
Date:
Subject: Re: Control your disk usage in PG: Introduction to Disk Quota Extension
Next
From: David Fetter
Date:
Subject: Re: Early WIP/PoC for inlining CTEs