A modest proposal vis hierarchical queries: MINUS in the column list - Mailing list pgsql-hackers

From Mark Zellers
Subject A modest proposal vis hierarchical queries: MINUS in the column list
Date
Msg-id BYAPR06MB4600BFBF43879247EFCDF3D888389@BYAPR06MB4600.namprd06.prod.outlook.com
Whole thread Raw
Responses Re: A modest proposal vis hierarchical queries: MINUS in the column list
List pgsql-hackers

One of the friction points I have found in migrating from Oracle to PostgreSQL is in the conversion of hierarchical queries from the Oracle START WITH/CONNECT BY/ORDER SIBLINGS by pattern to using the ANSI recursive subquery form.

Once you wrap your head around it, the ANSI form is not so bad with one major exception.  In order to achieve the equivalent of Oracle’s  ORDER SIBLINGS BY clause, you need to add an additional column containing an array with the accumulated path back to the root of the hierarchy for each row.  The problem with that is that it leaves you with an unfortunate choice: either accept the inefficiency of returning the array with the path back to the client (which the client doesn’t need or want), or requiring the application to explicitly list all of the columns that it wants just to exclude the hierarchy column, which can be hard to maintain, especially if your application needs to support both databases.  If you have a ORM model where there could be multiple queries that share the same client code to read the result set, you might have to change multiple queries when new columns are added to a table or view even though you have centralized the processing of the result set.

The ideal solution for this would be for PostgreSQL to support the Oracle syntax and internally convert it to the ANSI form.  Failing that, I have a modest suggestion that I would like to start a discussion around.  What if you could use the MINUS keyword in the column list of a select statement to remove a column from the result set returned to the client?  What I have in mind is something like this:

To achieve the equivalent of the following Oracle query:


      SELECT T.*
          FROM T
       START WITH T.ParentID IS NULL
       CONNECT BY T.ParentID = PRIOR T.ID
      ORDER SIBLINGS BY T.OrderVal

You could use

      WITH RECURSIVE TT AS (
              SELECT T0.*, ARRAY[]::INTEGER[] || T.OrderVal AS Sortable
                 FROM T T0
             UNION ALL
                SELECT T1.*, TT.Sortable || T1 AS Sortable
                   FROM TT
      INNER JOIN T T1 ON (T1.ParentID = TT.ID)
    )
   SELECT TT.* MINUS TT.Sortable
      FROM TT
ORDER BY TT.Sortable

Now the Sortable column can be used to order the result set but is not returned to the client.

Not knowing the internals of the parser, I’m assuming that the use of MINUS in this construct would be distinguishable from the set difference use case because the expression being subtracted is a column (or perhaps even a lst of columns) rather than a SELECT expression.





            

pgsql-hackers by date:

Previous
From: Stephen Frost
Date:
Subject: Re: Duplicate history file?
Next
From: Robert Haas
Date:
Subject: Re: Make unlogged table resets detectable