Re: Implement targetlist SRFs using ROWS FROM() (was Changed SRF in targetlist handling) - Mailing list pgsql-hackers

Hi,

Attached is a significantly updated patch series (see the mail one up
for details about what this is, I don't want to quote it in its
entirety).

There's still some corner cases (DISTINCT + SRF, UNION/INTERSECT with
SRF) to test / implement and a good bit of code cleanup to do. But
feature wise it's pretty much complete.

It currently consists of the following patches:

0001-Add-some-more-targetlist-srf-tests.patch
  Add some test.

0002-Shore-up-some-weird-corner-cases-for-targetlist-SRFs.patch
  Forbid UPDATE ... SET foo = SRF() and ORDER BY / GROUP BY containing
  SRFs that would change the number of returned rows.  Without the
  latter e.g. SELECT 1 ORDER BY generate_series(1,10); returns 10 rows.

0003-Avoid-materializing-SRFs-in-the-FROM-list.patch
  To avoid performance regressions from moving SRFM_ValuePerCall SRFs to
  ROWS FROM, nodeFunctionscan.c needs to support not materializing
  output.

  In my present patch I've *ripped out* the support for materialization
  in nodeFunctionscan.c entirely. That means that rescans referencing
  volatile functions can change their behaviour (if a function is
  rescanned, without having it's parameters changed), and that native
  backward scan support is gone.  I don't think that's actually an issue.

  This temporarily duplicates a bit of code from execQual.c, but that's
  removed again in 0006.

0004-Allow-ROWS-FROM-to-return-functions-as-single-record.patch
  To allow transforming SELECT record_srf(); nodeFunctionscan.c needs to
  learn to return the result as a record.  I chose
  ROWS FROM (record_srf() AS ()) as the syntax for that. It doesn't
  necessarily have to be SQL exposed, but it does make testing easier.

0005-Basic-implementation-of-targetlist-SRFs-via-ROWS-FRO.patch
  Convert all targetlist SRFs to ROWS FROM() expression, referencing the
  original query (without the SRFs) via a subquery.

  Note that this changes the behaviour of queries in a few cases. Namely
  the "least-common-multiple" behaviour of targetlist SRFs is gone, a
  function can now accept multiple set returning functions as input, SRF
  references in COALESCE / CASE are evaluated a bit more "eagerly". The
  last one I have to think a bit more about.

0006-Remove-unused-code-related-to-targetlist-SRFs.patch
  Now that there's no targetlist SRFs at execution time anymore, rip out
  executor and planner code related to that.  There's possibly more, but
  that's what I could find in a couple passes of searching around.

  This actually speeds up tpch-h queries by roughly 4% for me.


My next steps are to work on cleaning up the code a bit more, and
increase the regression coverage.

Input is greatly welcome.

Greetings,

Andres Freund

Attachment

pgsql-hackers by date:

Previous
From: Tom Lane
Date:
Subject: Re: Set log_line_prefix and application name in test drivers
Next
From: Peter Geoghegan
Date:
Subject: Re: WAL consistency check facility