Hello,
I'm sorry if this message brings up once more an already settled issue, but there's no public list of bug reports for PostgreSQL (the release notes contain only acknowledged, solved bugs).
This problem was noticed in version 15.1, as I don't have the latest version at hand, but I've read the release notes for 15.2 and 15.3 and there's no sign of any related change. It's about the outcome of a SELECT statement which involves computing aggregate functions like json_agg, jsonb_agg and array_agg over columns that originate from outer-joined entities, when these entities are missing.
Here's a very simple schema to illustrate the problem:
create table a (
id serial2 primary key,
name text
);
create table b (
name text primary key,
a int2 references a(id)
);
insert into a (name) values ('a1'), ('a2');
insert into b (name, a) values
('b for a1', 1), ('another b for a1', 1);
Notice that entity named
a2 in table
a has no corresponding
bs. Now run this query:
select a.*, json_agg(b) from
a
left join b on b.a = a.id
group by a.id; The aggregation column for entity a2 has value [null] (if array_agg() was used, the obtained value would be {NULL}).
Since the query without aggregation
select a.*, b.* from
a
left join b on b.a = a.id;
produces an entry for entity a2 with NULL values for columns belonging to table b, I expect the aggregation to produce either [{"name":null,"a":null}] or the SQL NULL for it, preferrably the latter.
I suppose there's a dillema regarding what to return: the aggregation is computed over one row, but its values are actually missing. The curent choice of PostgreSQL is somewhere in-between, which is inappropriate from both perspectives - a political compromise. Moreover, in the case of array_agg(), the returned value could have also originated from a single-row single-column actual NULL value!
Although there is a workaround for this problem, it implies checking a stupid condition for absolutely every row in the JOIN result:
select a.*, json_agg(b) filter (where b.name is not null) from
a
left join b on b.a = a.id
group by a.id;
Thoughts?