Re: [PATCH] Allow multiple recursive self-references - Mailing list pgsql-hackers

From Tom Lane
Subject Re: [PATCH] Allow multiple recursive self-references
Date
Msg-id 988140.1616527782@sss.pgh.pa.us
Whole thread Raw
In response to Re: [PATCH] Allow multiple recursive self-references  (Denis Hirn <denis.hirn@uni-tuebingen.de>)
Responses Re: [PATCH] Allow multiple recursive self-references  (Denis Hirn <denis.hirn@uni-tuebingen.de>)
List pgsql-hackers
Denis Hirn <denis.hirn@uni-tuebingen.de> writes:
>> I am not at all sure what the standard says about such recursion [...]

> as far as I know, the standard does not constraint the number of self-references
> of recursive common table expressions. However, I could be wrong here.

As far as I can see, the spec flat-out forbids this.  In SQL:2021,
it's discussed in 7.17 <query expression> syntax rule 3) j) ix), which
defines

  ix) If WLEi is recursive, then:
      1) Let S be the stratum that contains WQNi.
      2) If WQEi does not contain a <query specification> that contains
         more than one <query name> referencing members of S, then WLEi is
         linearly recursive.

("stratum" means a set of mutually-recursive WITH items), and they
helpfully explain

  NOTE 308 — For example, if WLEi contains the <query specification>
    SELECT * FROM A AS A1, A AS A2
  where A is a <query name> in S, then WLEi is not linearly recursive. The
  point is that this <query specification> contains two references to
  members of S. It is irrelevant that they reference the same member of S.

and then the next rule says

  x) If WLEi is recursive, then WLEi shall be linearly recursive.


So the problem with extending the spec here is that someday they might
extend it with some other semantics, and then we're between a rock and
a hard place.

If there were really compelling reasons why (a) we have to have this
feature and (b) there's only one reasonable way for it to act, hence
no likelihood that the SQL committee will choose something else, then
I'd be willing to think about getting out front of the committee.
But you haven't made either case.

>> I don't think any other DBMS has implemented this, except MariaDB. Tested here:

> There are a few recent DBMSs that I know of that support this: HyPer, Umbra, DuckDB, and NoisePage.

Do they all act the same?  Has anybody that sits on the SQL committee
done it?  (If you could point to DB2, in particular, I'd have some
confidence that the committee might standardize on their approach.)


A totally independent question is whether you've even defined a
well-defined behavior.  There's an interesting comment in the spec:

  NOTE 310 — The restrictions insure that each WLEi, viewed as a
  transformation of the query names of the stratum, is monotonically
  increasing. According to Tarski’s fixed point theorem, this insures that
  there is a fixed point. The General Rules use Kleene’s fixed point
  theorem to define a sequence that converges to the minimal fixed
  point.

I don't know any of the underlying math here, but if you've got a
join of multiple recursion references then the number of output
rows could certainly be nonlinear, which very possibly breaks the
argument that there's a well-defined interpretation.

            regards, tom lane



pgsql-hackers by date:

Previous
From: Jan Wieck
Date:
Subject: Re: pg_upgrade failing for 200+ million Large Objects
Next
From: Tom Lane
Date:
Subject: Re: pg_upgrade failing for 200+ million Large Objects