Thread: subselect removes rows

subselect removes rows

From
"Poot, Bas (B.J.)"
Date:

Hi there,

We recently ran into the following bug. Which is very annoying since you don't expect it to happen.

The bug is as follows:
When using certain commands in a sub-select, it removes the row.
This should never happen. It should return null instead. It worked in postgres 10, but somehow doesn't work anymore on postgres 13 (which we currently have)

Example query:

select 

col1, col2, jsonb_each_text(col2) 

from (

select 1 as col1, null::jsonb as col2

union all

select 1 as col1, '{"a":"2"}'::jsonb as col2

) t1


I haven't had the opportunity to test it on postgres 14, but I couldn't find anything about it in the release notes, so I don't expect it works any different in 14.

The workaround is obviously simple if we could use set returning functions in a case:
select 
col1, col2, 
case 
when col2 is null then null 
else jsonb_each_text(col2) 
end
from (
select 1 as col1, null::jsonb as col2
union all
select 1 as col1, '{"a":"2"}'::jsonb as col2
) t1

but we can't.. so that kinda makes it problematic..

Can you please look into it?

Thanks,
Bas Poot

------------------------- Disclaimer ----------------------------
De informatie verzonden met dit e-mailbericht (en bijlagen) is uitsluitend bestemd voor de geadresseerde(n) en zij die van de geadresseerde(n) toestemming kregen dit bericht te lezen.
Kennisneming door anderen is niet toegestaan.
De informatie in dit e-mailbericht (en bijlagen) kan vertrouwelijk van aard zijn en binnen het bereik van een geheimhoudingsplicht en/of een verschoningsrecht vallen.
Indien dit e-mailbericht niet voor u bestemd is, wordt u verzocht de afzender daarover onmiddellijk te informeren en het e-mailbericht (en bijlagen) te vernietigen.
Conform het beveiligingsbeleid van de Politie wordt e-mail van en naar de politie gecontroleerd op virussen, spam en phishing en moet deze e-mail voldoen aan de voor de overheid verplichte mailbeveiligingsstandaarden die zijn vastgesteld door het Forum Standaardisatie.
Mail die niet voldoet aan het beveiligingsbeleid kan worden geblokkeerd waardoor deze de geadresseerde niet bereikt. De geadresseerde wordt hiervan niet in kennis gesteld.
---------------------------------------------------------------------
The information sent in this E-mail message (including any attachments) is exclusively intended for the individual(s) to whom it is addressed and for the individual(s) who has/have had permission from the recipient(s) to read this message.
Access by others is not permitted.
The information in this E-mail message (including any attachments) may be of a confidential nature and may form part of the duty of confidentiality and/or the right of non-disclosure.
If you have received this E-mail message in error, please notify the sender without delay and delete the E-mail message (including any attachments).
In conformity with the security policy of the Police, E-mails from and to the Police are checked for viruses, spam and phishing and this E-mail must meet the standards of the government-imposed E-mail security as set by the Standardization Forum.
Any E-mail failing to meet said security policy may be blocked as a result of which it will not reach the intended recipient. The recipient concerned will not be notified.
---------------------------------------------------------------------

Re: subselect removes rows

From
Tom Lane
Date:
"Poot, Bas (B.J.)" <bas.poot@politie.nl> writes:
> The bug is as follows:
> When using certain commands in a sub-select, it removes the row.
> This should never happen. It should return null instead. It worked in postgres 10, but somehow doesn't work anymore
onpostgres 13 (which we currently have) 

> Example query:

> select
> col1, col2, jsonb_each_text(col2)
> from (
> select 1 as col1, null::jsonb as col2
> union all
> select 1 as col1, '{"a":"2"}'::jsonb as col2
> ) t1

Your claim is quite unclear ... but AFAICS, that query produces exactly
the same results in v10 as in later versions.

regression=# select
col1, col2, jsonb_each_text(col2)
from (
select 1 as col1, null::jsonb as col2
union all
select 1 as col1, '{"a":"2"}'::jsonb as col2
) t1
;
 col1 |    col2    | jsonb_each_text
------+------------+-----------------
    1 | {"a": "2"} | (a,2)
(1 row)

            regards, tom lane



Re: subselect removes rows

From
"Poot, Bas (B.J.)"
Date:

Hi there,


Thanks for looking in to it.


The problem is that I don't expect (and can't imagine it is correct) that a (any) sub-select removes rows from the resultset.


my table (t1 in my example) has 3 rows.

I don't use any where clauses or inner joins, so I expect 3 rows in my outer query.


An other argument is, that I have 2 columns in my table (col1 and col2)

My action on col2 (in this case using jsonb_each_text) should not have ANY effect on my col1, yet it has.


By the way, I also noticed that it happens with empty json objects.


select 
col1, col2, jsonb_each_text(col2) 
from (
select 1 as col1, null::jsonb as col2
union all
select 1 as col1, '{}'::jsonb as col2
union all
select 1 as col1, '{"a":"2"}'::jsonb as col2
) t1

With kind regards,

Bas






Van: Tom Lane <tgl@sss.pgh.pa.us>
Verzonden: maandag 29 november 2021 16:01
Aan: Poot, Bas (B.J.)
CC: pgsql-bugs@lists.postgresql.org
Onderwerp: Re: subselect removes rows
 
"Poot, Bas (B.J.)" <bas.poot@politie.nl> writes:
> The bug is as follows:
> When using certain commands in a sub-select, it removes the row.
> This should never happen. It should return null instead. It worked in postgres 10, but somehow doesn't work anymore on postgres 13 (which we currently have)

> Example query:

> select
> col1, col2, jsonb_each_text(col2)
> from (
> select 1 as col1, null::jsonb as col2
> union all
> select 1 as col1, '{"a":"2"}'::jsonb as col2
> ) t1

Your claim is quite unclear ... but AFAICS, that query produces exactly
the same results in v10 as in later versions.

regression=# select
col1, col2, jsonb_each_text(col2)
from (
select 1 as col1, null::jsonb as col2
union all
select 1 as col1, '{"a":"2"}'::jsonb as col2
) t1
;
 col1 |    col2    | jsonb_each_text
------+------------+-----------------
    1 | {"a": "2"} | (a,2)
(1 row)

                        regards, tom lane
------------------------- Disclaimer ----------------------------
De informatie verzonden met dit e-mailbericht (en bijlagen) is uitsluitend bestemd voor de geadresseerde(n) en zij die van de geadresseerde(n) toestemming kregen dit bericht te lezen.
Kennisneming door anderen is niet toegestaan.
De informatie in dit e-mailbericht (en bijlagen) kan vertrouwelijk van aard zijn en binnen het bereik van een geheimhoudingsplicht en/of een verschoningsrecht vallen.
Indien dit e-mailbericht niet voor u bestemd is, wordt u verzocht de afzender daarover onmiddellijk te informeren en het e-mailbericht (en bijlagen) te vernietigen.
Conform het beveiligingsbeleid van de Politie wordt e-mail van en naar de politie gecontroleerd op virussen, spam en phishing en moet deze e-mail voldoen aan de voor de overheid verplichte mailbeveiligingsstandaarden die zijn vastgesteld door het Forum Standaardisatie.
Mail die niet voldoet aan het beveiligingsbeleid kan worden geblokkeerd waardoor deze de geadresseerde niet bereikt. De geadresseerde wordt hiervan niet in kennis gesteld.
---------------------------------------------------------------------
The information sent in this E-mail message (including any attachments) is exclusively intended for the individual(s) to whom it is addressed and for the individual(s) who has/have had permission from the recipient(s) to read this message.
Access by others is not permitted.
The information in this E-mail message (including any attachments) may be of a confidential nature and may form part of the duty of confidentiality and/or the right of non-disclosure.
If you have received this E-mail message in error, please notify the sender without delay and delete the E-mail message (including any attachments).
In conformity with the security policy of the Police, E-mails from and to the Police are checked for viruses, spam and phishing and this E-mail must meet the standards of the government-imposed E-mail security as set by the Standardization Forum.
Any E-mail failing to meet said security policy may be blocked as a result of which it will not reach the intended recipient. The recipient concerned will not be notified.
---------------------------------------------------------------------

Re: subselect removes rows

From
Tom Lane
Date:
"Poot, Bas (B.J.)" <bas.poot@politie.nl> writes:
> The problem is that I don't expect (and can't imagine it is correct) that a (any) sub-select removes rows from the
resultset.

I'm really not sure what you're saying here.  If you're complaining
about jsonb_each_text returning no rows for empty input, that behavior
hasn't changed, and it's hard to see what else it could do.  Your
example isn't showing any other behavior that seems odd.

            regards, tom lane



Re: subselect removes rows

From
Pantelis Theodosiou
Date:


On Mon, Nov 29, 2021 at 5:34 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
"Poot, Bas (B.J.)" <bas.poot@politie.nl> writes:
> The problem is that I don't expect (and can't imagine it is correct) that a (any) sub-select removes rows from the resultset.

I'm really not sure what you're saying here.  If you're complaining
about jsonb_each_text returning no rows for empty input, that behavior
hasn't changed, and it's hard to see what else it could do.  Your
example isn't showing any other behavior that seems odd.

                        regards, tom lane


As Tom explained, jsonb_each_text expands the json and may produce more rows (if the json object has more than one items) or none (if it's empty or null).
If you want something else, perhaps you can use a lateral join, to keep at least one row always. Something like:

select
  col1, col2, j.col2_item
from (
  select 1 as col1, null::jsonb as col2
  union all
  select 1 as col1, '{"a":"2"}'::jsonb as col2
) t1
  left join lateral
  ( select jsonb_each_text(t1.col2) as col2_item
  ) as j on true ;

Re: subselect removes rows

From
"David G. Johnston"
Date:
On Mon, Nov 29, 2021 at 10:09 AM Poot, Bas (B.J.) <bas.poot@politie.nl> wrote:

The problem is that I don't expect (and can't imagine it is correct) that a (any) sub-select removes rows from the resultset.


Ignoring your specific example for the moment your basic complaint seems to be that:

select 1, unnest(array[]::integer[]);   --zero rows
select 1, (select unnest(array[]::integer[])); -- one row, second column is null

This is actually the reverse of what you are saying though - the subselect actually allows the row to be returned, not the opposite.  In short, it turns the implicit join between the input row and the set returning function into a left join instead of an inner join.

This is how things work today - an empty SRF function implicitly inner joins on the rows of the main query and so, like any inner join, will remove rows from the output if there are no records on the SRF side of the join.

Others, or the mailing list archives where this has come up many times, can provide further insight into the why.  I choose not to remember such details here and just accept it as the behavior.  With the addition of lateral joins the cases where you have an SRF in the select-list should tend toward zero anyway (and then you get to be explicit as to inner or outer join).

David J.

Re: subselect removes rows

From
"Poot, Bas (B.J.)"
Date:

Thanks for the explanation.


oké I read your message 3 times and i think we're lost in translation.


What I call a subquery is anything in the 'select' part that is not a column.

apparently that's not correct, and I have to use (select xxx) to call it a subquery.


That also solves my problem, since:


select 
col1, col2, (select jsonb_each_text(col2)) 
from (
select 1 as col1, null::jsonb as col2
union all
select 1 as col1, '{}'::jsonb as col2
union all
select 1 as col1, '{"a":"2"}'::jsonb as col2
) t1

Works as I expected.


I am sorry for the disturbance.. I guess the error was on my part...


Kind regards,

Bas


Van: David G. Johnston <david.g.johnston@gmail.com>
Verzonden: maandag 29 november 2021 19:21
Aan: Poot, Bas (B.J.)
CC: Tom Lane; pgsql-bugs@lists.postgresql.org
Onderwerp: Re: subselect removes rows
 
On Mon, Nov 29, 2021 at 10:09 AM Poot, Bas (B.J.) <bas.poot@politie.nl> wrote:

The problem is that I don't expect (and can't imagine it is correct) that a (any) sub-select removes rows from the resultset.


Ignoring your specific example for the moment your basic complaint seems to be that:

select 1, unnest(array[]::integer[]);   --zero rows
select 1, (select unnest(array[]::integer[])); -- one row, second column is null

This is actually the reverse of what you are saying though - the subselect actually allows the row to be returned, not the opposite.  In short, it turns the implicit join between the input row and the set returning function into a left join instead of an inner join.

This is how things work today - an empty SRF function implicitly inner joins on the rows of the main query and so, like any inner join, will remove rows from the output if there are no records on the SRF side of the join.

Others, or the mailing list archives where this has come up many times, can provide further insight into the why.  I choose not to remember such details here and just accept it as the behavior.  With the addition of lateral joins the cases where you have an SRF in the select-list should tend toward zero anyway (and then you get to be explicit as to inner or outer join).

David J.

------------------------- Disclaimer ----------------------------
De informatie verzonden met dit e-mailbericht (en bijlagen) is uitsluitend bestemd voor de geadresseerde(n) en zij die van de geadresseerde(n) toestemming kregen dit bericht te lezen.
Kennisneming door anderen is niet toegestaan.
De informatie in dit e-mailbericht (en bijlagen) kan vertrouwelijk van aard zijn en binnen het bereik van een geheimhoudingsplicht en/of een verschoningsrecht vallen.
Indien dit e-mailbericht niet voor u bestemd is, wordt u verzocht de afzender daarover onmiddellijk te informeren en het e-mailbericht (en bijlagen) te vernietigen.
Conform het beveiligingsbeleid van de Politie wordt e-mail van en naar de politie gecontroleerd op virussen, spam en phishing en moet deze e-mail voldoen aan de voor de overheid verplichte mailbeveiligingsstandaarden die zijn vastgesteld door het Forum Standaardisatie.
Mail die niet voldoet aan het beveiligingsbeleid kan worden geblokkeerd waardoor deze de geadresseerde niet bereikt. De geadresseerde wordt hiervan niet in kennis gesteld.
---------------------------------------------------------------------
The information sent in this E-mail message (including any attachments) is exclusively intended for the individual(s) to whom it is addressed and for the individual(s) who has/have had permission from the recipient(s) to read this message.
Access by others is not permitted.
The information in this E-mail message (including any attachments) may be of a confidential nature and may form part of the duty of confidentiality and/or the right of non-disclosure.
If you have received this E-mail message in error, please notify the sender without delay and delete the E-mail message (including any attachments).
In conformity with the security policy of the Police, E-mails from and to the Police are checked for viruses, spam and phishing and this E-mail must meet the standards of the government-imposed E-mail security as set by the Standardization Forum.
Any E-mail failing to meet said security policy may be blocked as a result of which it will not reach the intended recipient. The recipient concerned will not be notified.
---------------------------------------------------------------------