Re: [HACKERS] [PATCH] Generic type subscripting - Mailing list pgsql-hackers

From Dmitry Dolgov
Subject Re: [HACKERS] [PATCH] Generic type subscripting
Date
Msg-id CA+q6zcUX=Om75kgZXkpYPhnwUCo4osFpzd83dKPzjSGfCmxEgA@mail.gmail.com
Whole thread Raw
In response to Re: [HACKERS] [PATCH] Generic type subscripting  (David Steele <david@pgmasters.net>)
Responses Re: [PATCH] Generic type subscripting  (David Steele <david@pgmasters.net>)
List pgsql-hackers
> On 21 March 2017 at 18:16, David Steele <david@pgmasters.net> wrote:
>
> This thread has been idle for over a week.

Yes, sorry for the late reply. I'm still trying to find a better solution for
some of the problems, that arose in this patch.

> On 15 March 2017 at 00:10, Tom Lane <tgl@sss.pgh.pa.us> wrote:
> Dmitry Dolgov <9erthalion6@gmail.com> writes:
>
> I looked through this a bit.
>

Thank you for the feedback.

> > On 10 March 2017 at 06:20, Peter Eisentraut <peter.eisentraut@2ndquadrant.com> wrote:
> > I see a possible problem here:  This design only allows one subscripting
> > function.  But what you'd really want in this case is at least two: one
> > taking an integer type for selecting by array index, and one taking text
> > for selecting by field name.
>
> I would guess that what we really want for jsonb is the ability to
> intermix integer and text subscripts, so that
>          jsonbcol['foo'][42]['bar']
> would extract the "bar" field of an object in position 42 of an
> array in field "foo" of the given jsonb value.
>

Maybe I misunderstood you, but isn't it already possible?

```
=# select ('{"a": [{"b": 1}, {"c": 2}]}'::jsonb)['a'][0]['b'];
 jsonb
-------
 1
(1 row)

=# select * from test;
            data
-----------------------------
 {"a": [{"b": 1}, {"c": 2}]}
(1 row)

=# update test set data['a'][0]['b'] = 42;
UPDATE 1

=# select * from test;
            data
-----------------------------
 {"a": [{"b": 42}, {"c": 2}]}
(1 row)
```

> I'm not totally excited about the naming you've chosen though,
> particularly the function names "array_subscripting()" and
> "jsonb_subscripting()" --- those are too generic, and a person coming to
> them for the first time would probably expect that they actually execute
> subscripting, when they do no such thing.  Names like
> "array_subscript_parse()" might be better.  Likewise the name of the
> new pg_type column doesn't really convey what it does, though I suppose
> "typsubscriptparse" is too much of a mouthful.  "typsubparse" seems short
> enough but might be confusing too.

It looks quite tempting for me to replace the word "subscript" by an
abbreviation all over the patch. Then it will become something like
"typsbsparse" which is not that mouthful, but I'm not sure if it will be easily
recognizable.

> I wonder also if we should try to provide some helper functions rather
> than expecting every data type to know all there is to know about parsing
> and execution of subscripting.  Not sure what would be helpful, however.

I don't really see what details we can hide behind this helper, because almost
all code there is type specific (e.g. to check if subscript indexes are
correct), can you elaborate on that?

> The documentation needs a lot of work of course, and I do not think
> you're doing either yourself or anybody else any favors with the proposed
> additions to src/tutorial/.

Yes, unfortunately, I forget to update documentation from the previous version
of the patch. I'll fix it soon in the next version.

> Aren't SBS_VALIDATION and SBS_EXEC just hangovers from the previous
> design?  They're still in parse_node.h, and they're still mentioned in
> the docs, but I don't see them used in actual code anywhere.

Yes, these are from the previous version too, I'll remove them.

> Another question here is whether every datatype will be on board
> with the current rules about null subscripts, or whether we need to
> delegate null-handling to the datatype-specific execution function.
> I'm not sure; it would complicate the API significantly for what might be
> useless flexibility.

It looks for me that it's a great idea to perform null-handling inside datatype
specific code and I'm not sure, what would be complicated? All necessary
information for that is already in `SubscriptingExecData` (I just have to use
`eisnull` in a different way).

> I do not think that the extra SubscriptingBase data structure is paying
> for itself either; you're taking a hit in readability from the extra level
> of struct, and as far as I can find it's not buying you one single thing,
> because there's no code that operates on a SubscriptingBase argument.
> I'd just drop that idea and make two independent struct types, or else
> stick with the original ArrayRef design that made one struct serve both
> purposes.  (IOW, my feeling that a separate assign struct would be a
> readability improvement isn't exactly getting borne out by what you've
> done here.  But maybe there's a better way to do that.)

I'm thinking to replace these structures by more meaningful ones, something like:

```
typedef struct SubscriptContainerRef
{
Expr xpr;
Oid refcontainertype;
Oid refelemtype;
int32 reftypmod;
Oid refcollid;
} SubscriptBase;

typedef struct SubscriptExtractRef
{
Expr        xpr;
SubscriptContainerRef *containerExpr;
List                *refupperindexpr;
List                *reflowerindexpr;
Oid            refevalfunc;
} SubscriptExtractRef;

typedef struct SubscriptAssignRef
{
Expr        xpr;
SubscriptContainerRef *containerExpr;
Expr        *refassgnexpr;
List                *refupperindexpr;
List                *reflowerindexpr;
Oid            refevalfunc;
} SubscriptAssignRef;

```

It's close to the previous version, but here will be a bit less duplicated code
in comparison with two independent structures, we still can use different nodes
for data assignment and data extraction, and node processing for these nodes
can be little bit more separated, e.g.:

```
    case T_SubscriptExtractRef:
        {
            // extract specific logic
            sbsref->containerExpr = ExecInitExpr((Expr *) sbsref->containerExpr, parent);
        }
    case T_SubscriptAssignRef:
        {
            // assign specific logic
            sbsref->containerExpr = ExecInitExpr((Expr *) sbsref->containerExpr, parent);
        }
    case T_SubscriptContainerRef:
        {
            // subscript container logic 
        }
```

pgsql-hackers by date:

Previous
From: Vitaly Burovoy
Date:
Subject: Re: [HACKERS] identity columns
Next
From: Denish Patel
Date:
Subject: Re: [HACKERS] Monitoring roles patch