Re: Ltree syntax improvement - Mailing list pgsql-hackers

From Nikolay Shaplov
Subject Re: Ltree syntax improvement
Date
Msg-id 2996702.6AG0pjhCY8@x200m
Whole thread Raw
In response to Re: Ltree syntax improvement  (Dmitry Belyavsky <beldmit@gmail.com>)
Responses Re: Ltree syntax improvement
List pgsql-hackers
В письме от воскресенье, 24 февраля 2019 г. 14:31:55 MSK пользователь Dmitry
Belyavsky написал:

Hi! Am back here again.

I've been thinking about this patch a while... Come to some conclusions and
wrote some examples...

First I came to idea that the best way to simplify the code is change the
state machine from if to switch/case. Because in your code a lot of repetition
is done just because you can't say "Thats all, let's go to next symbol" in any
place in the code. Now it should follow all the branches of if-else tree that
is inside the state-machine "node" to get to the next symbol.

To show how simpler the things would be I changed the state machine processing
in lquery_in form if to switch/case, and changed the code for LQPRS_WAITLEVEL
state processing, removing duplicate code, using "break" wherever it is
possible

(The indention in this example is unproper to make diff more clear)

so from that much of code
=================
        if (state == LQPRS_WAITLEVEL)
        {
            if (t_isspace(ptr))
            {
                ptr += charlen;
                pos++;
                continue;
            }

            escaped_count = 0;
            real_levels++;
            if (charlen == 1)
            {
                if (t_iseq(ptr, '!'))
                {
                    GETVAR(curqlevel) = lptr = (nodeitem *)
palloc0(sizeof(nodeitem) * numOR);
                    lptr->start = ptr + 1;
                    state = LQPRS_WAITDELIM;
                    curqlevel->numvar = 1;
                    curqlevel->flag |= LQL_NOT;
                    hasnot = true;
                }
                else if (t_iseq(ptr, '*'))
                    state = LQPRS_WAITOPEN;
                else if (t_iseq(ptr, '\\'))
                {
                    GETVAR(curqlevel) = lptr = (nodeitem *)
palloc0(sizeof(nodeitem) * numOR);
                    lptr->start = ptr;
                    curqlevel->numvar = 1;
                    state = LQPRS_WAITESCAPED;
                }
                else if (strchr(".|@%{}", *ptr))
                {
                    UNCHAR;
                }
                else
                {
                    GETVAR(curqlevel) = lptr = (nodeitem *)
palloc0(sizeof(nodeitem) * numOR);
                    lptr->start = ptr;
                    state = LQPRS_WAITDELIM;
                    curqlevel->numvar = 1;
                    if (t_iseq(ptr, '"'))
                    {
                        lptr->flag |= LVAR_QUOTEDPART;
                    }
                }
            }
            else
            {
                GETVAR(curqlevel) = lptr = (nodeitem *) palloc0(sizeof(nodeitem) *
numOR);
                lptr->start = ptr;
                state = LQPRS_WAITDELIM;
                curqlevel->numvar = 1;
            }
        }
=====================
I came to this
=====================
         case LQPRS_WAITLEVEL:
            if (t_isspace(ptr))
                break; /* Just go to next symbol */
            escaped_count = 0;
            real_levels++;

            if (charlen == 1)
            {
                if (strchr(".|@%{}", *ptr))
                    UNCHAR;
                if (t_iseq(ptr, '*'))
                {
                    state = LQPRS_WAITOPEN;
                    break;
                }
            }
            GETVAR(curqlevel) = lptr = (nodeitem *) palloc0(sizeof(nodeitem) *
numOR);
            lptr->start = ptr;
            curqlevel->numvar = 1;
            state = LQPRS_WAITDELIM;
            if (charlen == 1)
            {
                if (t_iseq(ptr, '\\'))
                {
                    state = LQPRS_WAITESCAPED;
                    break;
                }
                if (t_iseq(ptr, '!'))
                {
                    lptr->start += 1 /*FIXME explain why */;
                    curqlevel->flag |= LQL_NOT;
                    hasnot = true;
                }
                else if (t_iseq(ptr, '"'))
                {
                    lptr->flag |= LVAR_QUOTEDPART;
                }
            }
            break;
=======
And this code is much more readable.... You can just read it aloud:
-Spaces we just skip
- on .|@%{} throws and exception
- for asterisk change state and nothing else
then for others allocate a buffer
- for slash we just set a flag
- for exclamation mark we set a flag and skip it
- for double quote set a flag.

All the logic are clear. And in your version of the code it is difficult to get
the idea from the code that simple.

So my suggestion is following:

1. Submit and commit the patch that changes state machines from ifs to switch/
cases, and nothing else. This will reindent the code, so in order to keep
further changes visible, we should change nothing else.

2. Change your code to switch/cases, use break to deduplicate code wherever is
possible, and commit it.

I can join you while working with this (but then I am not sure I would be able
to remain a reviewer...)

I've also attached an example I've quoted above, formatted as a diff, so it
would be easy to check how does it work. It should be applied after your
patch.
(I gave it a strange name ltree.not-a-patch hoping that commitfest software
will not treat it a new version of a patch)
Attachment

pgsql-hackers by date:

Previous
From: David Steele
Date:
Subject: Re: 2019-03 Starts Tomorrow
Next
From: Pierre Ducroquet
Date:
Subject: [Patch] Invalid permission check in pg_stats for functional indexes