Set-Returning functions in a select list - Mailing list pgsql-general

From J Lumby
Subject Set-Returning functions in a select list
Date
Msg-id DM5PR06MB3082AEC288F546F7AF5393E3A3E69@DM5PR06MB3082.namprd06.prod.outlook.com
Whole thread Raw
Responses Re: Set-Returning functions in a select list  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
I am unclear about the rules which postgresql uses in building a result 
set when the select list ( see note ) contains several different 
Set-Returning functions,  possibly with sets of different sizes.
At a very high level,   what I see happening is that each Set-Returning 
expression causes the result set to be expanded with additional rows,  
one per member of the set.
When there is only one such expression,     this is clear  -   the size 
of the final result set is N times the size of the output of the final 
operator where N is the size of the set   (i.e. N times the size if that 
expression had been a scalar).
But it becomes unclear where there is more than one such expression.

I can't find much on this point in the doc  (I am looking in PostgreSQL 
13.2 Documentation).    There is one small example in
"9.25. Set Returning Functions"
but in this example there is only one Set-Returning expression in the 
select list.

Also the description of the SELECT command mentions something called a 
"row group" :
"The actual output rows are computed using the SELECT output expressions 
for each selected row or row group. (See SELECT List below.)"
but no mention of Set-Returning expressions in the referenced section.

I tried an experiment :

<code>
echo -e "DROP SCHEMA testing cascade;
CREATE SCHEMA testing;
CREATE TABLE testing.set_returning_in_select_list (id int4 , bitmask_a 
int , bitmask_b int );
insert into testing.set_returning_in_select_list values ( 1 , 
x'1010'::int , x'0101'::int ) , ( 2 , x'2020'::int , x'0202'::int ) , ( 
3 , x'3030'::int , x'0303'::int );
select id , to_hex(bitmask_a) as hex_a , unnest( array [ ( bitmask_a & 
16 ) ,  ( bitmask_a & 32 ) , ( bitmask_a & 48 ) ] ) as bits_of_a ,
                to_hex(bitmask_b) as hex_b from 
testing.set_returning_in_select_list order by bits_of_a;
select id , to_hex(bitmask_a) as hex_a , unnest( array [ ( bitmask_a & 
16 ) ,  ( bitmask_a & 32 ) , ( bitmask_a & 48 ) ] ) as bits_of_a ,
                to_hex(bitmask_b) as hex_b , unnest( array [ ( bitmask_b 
& 1 ) ,  ( bitmask_b & 2 ) , ( bitmask_b & 3 ) ] ) as bits_of_b
  from testing.set_returning_in_select_list order by bits_of_a;
select id , to_hex(bitmask_a) as hex_a , unnest( array [ ( bitmask_a & 
16 ) ,  ( bitmask_a & 32 ) , ( bitmask_a & 48 ) ] ) as bits_of_a ,
                to_hex(bitmask_b) as hex_b , unnest( array [ ( bitmask_b 
& 3 ) ,  ( bitmask_b & 2 ) , ( bitmask_b & 1 ) ] ) as bits_of_b ,
                unnest( array [ ( bitmask_b & 3 ) ,  ( bitmask_b & 2 ) , 
( bitmask_b & 1 ) , ( bitmask_b & 17 ) ] ) as extraneous_bits_of_b
  from testing.set_returning_in_select_list order by bits_of_a;" | psql -e
</code>

The output of the last query is,  firstly,  different on postgresql-9.6 
than on 13.3,   and secondly,     hard to explain in either case.
my outputs appended below.

I *think* :

  . postgresql-9.6 applies a kind of triple cartesian join on the 
elements of each set, joining one set with the other set and then again 
with itself,   ending up with many duplicates
  . postgresql-13.3 applies a kind of null-padding extension of the 
elements of the smaller set and then an arbitrary linear pairing of the 
two sets,   ending up with one duplicate.

Which one,  if any,  is correct?    And why?

(note : I am not asking about the case of Set-Returning functions in the 
FROM clause,  where things seem clear(er).)

Cheers,   John Lumby

My outputs of the final query :


  . postgresql-9.6

<code>

  id | hex_a | bits_of_a | hex_b | bits_of_b | extraneous_bits_of_b
----+-------+-----------+-------+-----------+----------------------
   1 | 1010  |         0 | 101   |         0 |                    0
   1 | 1010  |         0 | 101   |         0 |                    1
   1 | 1010  |         0 | 101   |         0 |                    1
   1 | 1010  |         0 | 101   |         0 |                    1
   2 | 2020  |         0 | 202   |         2 |                    2
   2 | 2020  |         0 | 202   |         2 |                    2
   2 | 2020  |         0 | 202   |         2 |                    0
   2 | 2020  |         0 | 202   |         2 |                    0
   1 | 1010  |        16 | 101   |         1 |                    1
   1 | 1010  |        16 | 101   |         1 |                    1
   1 | 1010  |        16 | 101   |         1 |                    1
   1 | 1010  |        16 | 101   |         1 |                    0
   1 | 1010  |        16 | 101   |         1 |                    1
   1 | 1010  |        16 | 101   |         1 |                    1
   1 | 1010  |        16 | 101   |         1 |                    0
   1 | 1010  |        16 | 101   |         1 |                    1
   3 | 3030  |        16 | 303   |         3 |                    3
   3 | 3030  |        16 | 303   |         3 |                    1
   3 | 3030  |        16 | 303   |         3 |                    1
   3 | 3030  |        16 | 303   |         3 |                    2
   2 | 2020  |        32 | 202   |         2 |                    0
   2 | 2020  |        32 | 202   |         2 |                    2
   2 | 2020  |        32 | 202   |         2 |                    0
   2 | 2020  |        32 | 202   |         0 |                    0
   2 | 2020  |        32 | 202   |         2 |                    2
   2 | 2020  |        32 | 202   |         0 |                    2
   2 | 2020  |        32 | 202   |         0 |                    0
   2 | 2020  |        32 | 202   |         0 |                    2
   3 | 3030  |        32 | 303   |         2 |                    2
   3 | 3030  |        32 | 303   |         2 |                    3
   3 | 3030  |        32 | 303   |         2 |                    1
   3 | 3030  |        32 | 303   |         2 |                    1
   3 | 3030  |        48 | 303   |         1 |                    3
   3 | 3030  |        48 | 303   |         1 |                    2
   3 | 3030  |        48 | 303   |         1 |                    1
   3 | 3030  |        48 | 303   |         1 |                    1
(36 rows)

</code>


  . postgresql-13.3

<code>

  id | hex_a | bits_of_a | hex_b | bits_of_b | extraneous_bits_of_b
----+-------+-----------+-------+-----------+----------------------
   1 | 1010  |         0 | 101   |         0 |                    0
   2 | 2020  |         0 | 202   |         2 |                    2
   1 | 1010  |        16 | 101   |         1 |                    1
   1 | 1010  |        16 | 101   |         1 |                    1
   3 | 3030  |        16 | 303   |         3 |                    3
   2 | 2020  |        32 | 202   |         0 |                    0
   2 | 2020  |        32 | 202   |         2 |                    2
   3 | 3030  |        32 | 303   |         2 |                    2
   3 | 3030  |        48 | 303   |         1 |                    1
   1 | 1010  |           | 101   |           |                    1
   3 | 3030  |           | 303   |           |                    1
   2 | 2020  |           | 202   |           |                    0
(12 rows)
</code>



pgsql-general by date:

Previous
From: Mayan
Date:
Subject: Re: pg_restore (fromuser -> touser)
Next
From: Tom Lane
Date:
Subject: Re: Set-Returning functions in a select list