Thread: Order of rows in simple "select r from table_fn()"

Order of rows in simple "select r from table_fn()"

From

Bryn Llewellyn

Date:

15 February 2023, 02:49:18

In general, the order of the rows in a result set is stated to be unpredictable without an "order by" at the outermost level. Famously, beginners observe what seems to be reliably reproducible ordering in some queries that don't have an "order by"—and it can take some effort to persuade them that they cannot rely on what seems to be a reliable order unless they clutter the SQL (and add the cost of sorting) by adding an "order by" clause.

I've found that a table function with "returns table(r text)" provides a convenient way to write a nicely formatted report using psql that can be easily directed to a file with the "\o" metacommand. In general, for cases like this, I can't write a useful "order by r" because the values of "r" interleave, for example, rule-offs between sections of the report, various sub-headings, and actual query results. The required order is exactly the order in which my code produces the rows.

Here's a trivial, artificial, example:

create function report_1()
  returns table(r text)
  language plpgsql
as $body$
declare
  rule constant text not null := lpad('—', 40, '—');
begin
  r := rule;   return next;
  r := 'MY REPORT';     return next;
  r := rule;   return next;

  r := '';   return next;
  r := 'Section 1';     return next;
  r := '---------';     return next;
  for r in (
select v::text
from t1
order by v
) loop
  /**/   return next;
  end loop;

  r := '';   return next;
  r := 'Section 2';     return next;
  r := '---------';     return next;
  for r in (
select v::text
from t2
order by v desc
) loop
  /**/   return next;
  end loop;
  r := rule;   return next;
end;
$body$;

And this is the output, given some suitable content in t1 and t2, from "select r from report_1()" with no "order by":

————————————————————————————————————————
MY REPORT
————————————————————————————————————————

Section 1
---------
10
12
14
16

Section 2
---------
27
24
21
————————————————————————————————————————

I've written no end of reports this way. And I've never, ever, seen the rows come out in an order that differs from the order in which they're written. (Of course, I know that this proves nothing.) Here's a variant that lets me say "select r from report_1() order by k":

create function report_2()
  returns table(k int, r text)
  language plpgsql
as $body$
declare
  rule constant text not null := lpad('—', 40, '—');
begin
  k = 1;     r := rule;     return next;
  k = k + 1; r := 'MY REPORT';   return next;
  k = k + 1; r := rule;     return next;

  k = k + 1; r := '';     return next;
  k = k + 1; r := 'Section 1';   return next;
  k = k + 1; r := '---------';   return next;
  for r in (
select v::text
from t1
order by v
) loop
  k = k + 1;   return next;
  end loop;

  k = k + 1; r := '';     return next;
  k = k + 1; r := 'Section 2';   return next;
  k = k + 1; r := '---------';   return next;
  for r in (
select v::text
from t2
order by v desc
) loop
  k = k + 1;   return next;
  end loop;
  k = k + 1; r := rule;     return next;
end;
$body$;

It adds an uncomfortable amount of clutter.

* Is it essential for correctness? *

It's annoying that the use of "return next" prevents the pattern that each "print line" follows from being encapsulated into a procedure. But it is what it is, yes?

Re: Order of rows in simple "select r from table_fn()"

From

"David G. Johnston"

Date:

15 February 2023, 03:08:30

On Tue, Feb 14, 2023 at 4:49 PM Bryn Llewellyn <bryn@yugabyte.com> wrote:

I've found that a table function with "returns table(r text)" provides a convenient way to write a nicely formatted report using psql that can be easily directed to a file with the "\o" metacommand. In general, for cases like this, I can't write a useful "order by r" because the values of "r" interleave, for example, rule-offs between sections of the report, various sub-headings, and actual query results. The required order is exactly the order in which my code produces the rows.

Seems safe enough to rely upon if the query is indeed: SELECT * FROM fn(); The system has to consume the output of the function call in its serial order and has no reason to then reorder things prior to producing the final result set. Though I'd probably still consider adding a "report line number" column to the output for end-user usability or if they want to sort the report and then return to the physical order.

I am curious whether a user-defined set-returning function is allowed to specify "WITH ORDINALITY" like the built-in UNNEST function does to produce the output row numbers external to the function body and signature.

David J.

Re: Order of rows in simple "select r from table_fn()"

From

Pavel Stehule

Date:

15 February 2023, 07:34:43

It adds an uncomfortable amount of clutter.

* Is it essential for correctness? *

It's annoying that the use of "return next" prevents the pattern that each "print line" follows from being encapsulated into a procedure. But it is what it is, yes?

RETURN NEXT, RETURN QUERY (RETURN QUERY is +/- RETURN NEXT) materializes result in tuple store. This can be a source of performance problems sometimes (SELECT * FROM fc() LIMIT n) , but you can believe in persistent order. Order of reading from the tuple store is exactly like reading from a file, and there are not any other processes that can modify the order of reading (tuple store is not visible from other processes). I can imagine different mechanisms in the future - without materialization based on pipes - but it is a very far future, and still there will not be a reason for change of order.

Regards

Pavel

Re: Order of rows in simple "select r from table_fn()"

From

Bryn Llewellyn

Date:

15 February 2023, 08:01:20

david.g.johnston@gmail.com wrote:

bryn@yugabyte.com wrote:

I've found that a table function with "returns table(r text)" provides a convenient way to write a nicely formatted report using psql that can be easily directed to a file with the "\o" metacommand. In general, for cases like this, I can't write a useful "order by r" because the values of "r" interleave, for example, rule-offs between sections of the report, various sub-headings, and actual query results. The required order is exactly the order in which my code produces the rows.

Seems safe enough to rely upon if the query is indeed: SELECT * FROM fn(); The system has to consume the output of the function call in its serial order and has no reason to then reorder things prior to producing the final result set. Though I'd probably still consider adding a "report line number" column to the output for end-user usability or if they want to sort the report and then return to the physical order.

I am curious whether a user-defined set-returning function is allowed to specify "WITH ORDINALITY" like the built-in UNNEST function does to produce the output row numbers external to the function body and signature.

Thanks, David. Thanks, too, to pavel.stehule@gmail.com for your separate reply that also says that I can rely on seeing the order in which I produce the rows in the function's implementation. And yes, I realize that Postgres table functions are not pipelined in the way that they can be, if you choose this, in Oracle Database.

Given that the order is pre-calculated, it seems that "with ordinality" can add line numbering "after the fact" reliably and with minimum clutter when it's needed. I tried these two variants:

create function f1()
returns setof text
language sql
as $body$
values ('skiing'), ('cycling'), ('running');
$body$;
and:

create function f2()
  returns table(r text)
  language plpgsql
as $body$
begin
  r := 'skiing'; return next;
  r := 'cycling'; return next;
  r := 'running'; return next;
end;
$body$;

select t.line_no, t.report_text
from f1() with ordinality as t(report_text, line_no);

Each supports this same query

select t.line_no, t.report_text
from fN() with ordinality as t(report_text, line_no);

and gets this same result:

line_no | report_text
---------+-------------
1 | skiing
2 | cycling
3 | running