Hi,
I represent a small group of developers. We are working on an open-source PostgreSQL extension to enable processing of genomic data inside Postgres. We have an extensive knowledge of molecular biology or data science and none of the Postgres internals.
I don’t know if this mailing list is a good place to ask this question, but if it’s not, just correct me.
The problem:
We currently have a one-to-many function (an operation that produces multiple rows per one one input row). Now we would like to translate that functionality to a sensible many-to-many. We need to know how we are constrained by the internals of Postgres itself and what syntax we should use.
Also, the operation we are implementing requires knowing the full set of inputs before it can be computed.
Current solution:
There is ValuePerCall (1/0 returned rows) or Materialize mode (any number of returned rows), however the second one does not offer any invocation counter (like ValuePerCall does). Hence to provide any persistence between subcalls we introduced the following syntax:
SELECT _ FROM table t, my_function(t.a, t.b, t.c, number_of_rows);
Where by FROM a, b we mean cartesian product a times b. And my_function for first (number_of_rows - 1) invocations returns an empty set and the full result set for the last one.
Sadly this syntax requires us to enter a number of rows which is not very convenient.
Do you know how to handle this situation correctly? We looked for example at the code of tablefunc but the syntax there requires a full SQL query as an input, so that wasn’t useful.