Thread: BUG #16356: Can't perform json operations on casted text in where clause when a join is used. WHERE is too early
BUG #16356: Can't perform json operations on casted text in where clause when a join is used. WHERE is too early
From
PG Bug reporting form
Date:
The following bug has been logged on the website: Bug reference: 16356 Logged by: Vale Violet Mote Email address: valefbonetwo@gmail.com PostgreSQL version: 12.2 Operating system: Win10 Description: Live question: https://stackoverflow.com/questions/61147921 Paste of text: Postgres (12.2) Setup: ```sql CREATE TABLE public.test_table ( id int NOT NULL, value_type text NOT NULL, value text NOT NULL ); ``` ```sql INSERT INTO public.test_table (id, value_type, value) VALUES (1, 'string', 'a'), (2, 'json', '{"hello":"world"}'), (3, 'json', '{"color":"blue"}'); ``` Initial Queries: ```sql select value::jsonb as json_value from test_table where value_type = 'json' ``` ``` json_value | ------------------| {"hello": "world"}| {"color": "blue"} | ``` But I'm only interested in ones with 'color'. Moving it to a subquery so that I can get only 'color', also just fine: ```sql select only_json.json_value from( select value::jsonb as json_value from test_table where value_type = 'json' ) only_json where only_json.json_value ? 'color' = true ``` ``` json_value | ------------------| {"color": "blue"} | ``` Now let's break that main table up into two, and suddenly effectively the same query has trouble: ```sql CREATE TABLE public.test_table ( id INT PRIMARY KEY, value TEXT NOT NULL ); CREATE TABLE public.test_types ( id INT PRIMARY KEY REFERENCES public.test_table (id), value_type TEXT NOT NULL ); INSERT INTO public.test_table (id, value) VALUES (1, 'a'), (2, '{"hello":"world"}'), (3, '{"color":"blue"}'); insert into public.test_types (id, value_type) values (1, 'string'), (2, 'json'), (3, 'json'); ``` Now this query: ```sql select id, value from ( select id, value::jsonb from public.test_table natural join public.test_types where value_type = 'json') only_json ``` returns, as expected: ``` id|value | --|------------------| 2|{"hello": "world"}| 3|{"color": "blue"} | ``` But as soon as I attach the where clause, it fails: ```sql select id, value from ( select id, value::jsonb from public.test_table natural join public.test_types where value_type = 'json') only_json where only_json.value ? 'color' = true ``` ``` SQL Error [22P02]: ERROR: invalid input syntax for type json Detail: Token "a" is invalid. Where: JSON data, line 1: a ``` It's somehow resurrected the value of 'a' that was well-eliminated prior to this where clause. So what gives? Why does the join cause it to apply the last where clause (which should happen logically last) too early? Failed workarounds I've tried: - Using left join instead of natural join. - Applying `where value_type = 'json'` to the joined table first, prior to the join. - Moving it to a "with". - Creating a view and then applying the where clause to a select from the view. - Creating a column via select called `is_color_holder` with `SELECT only_json.value ? 'color' as is_color_holder`. This column populates correctly, but if I use a where clause, `WHERE is_color_holder = true`, I receive the same error.
Re: BUG #16356: Can't perform json operations on casted text in where clause when a join is used. WHERE is too early
From
Tom Lane
Date:
PG Bug reporting form <noreply@postgresql.org> writes: > It's somehow resurrected the value of 'a' that was well-eliminated prior to > this where clause. So what gives? Why does the join cause it to apply the > last where clause (which should happen logically last) too early? The documentation explicitly disclaims any specific evaluation order for WHERE clauses, cf https://www.postgresql.org/docs/current/sql-expressions.html#SYNTAX-EXPRESS-EVAL If you really need to, you can force the matter by putting some type of optimization fence into the sub-select ("OFFSET 0" is the traditional way, or you can use WITH ... AS MATERIALIZED in v12 and up). This is typically disastrous for performance, of course. Not being able to filter rows before joining would make the join much slower. I'll just note that the sort of EAV schema that you have here is widely agreed to be an anti-pattern in database design. regards, tom lane
Re: BUG #16356: Can't perform json operations on casted text in whereclause when a join is used. WHERE is too early
From
"David G. Johnston"
Date:
On Friday, April 10, 2020, PG Bug reporting form <noreply@postgresql.org> wrote:
The following bug has been logged on the website:
Bug reference: 16356
Logged by: Vale Violet Mote
Email address: valefbonetwo@gmail.com
PostgreSQL version: 12.2
Operating system: Win10
Description:
Live question: https://stackoverflow.com/questions/61147921
Paste of text:
But as soon as I attach the where clause, it fails:
```sql
select id, value from (
select id, value::jsonb from public.test_table natural join
public.test_types
where value_type = 'json') only_json
where only_json.value ? 'color' = true
```
```
SQL Error [22P02]: ERROR: invalid input syntax for type json
Detail: Token "a" is invalid.
Where: JSON data, line 1: a
```
It's somehow resurrected the value of 'a' that was well-eliminated prior to
this where clause. So what gives? Why does the join cause it to apply the
last where clause (which should happen logically last) too early?
PostgreSQL is trying to be helpful by re-arranging things to execute in the most efficient way possible. It cannot adapt those optimizations on the fly in response to data. Since your model has issues that make this optimization fail you need to make it so the optimization cannot be applied. Adding “offset 0” to the subquery should accomplish this.
- Moving it to a "with".
It would have in prior versions. I think 12 removed the optimization fence that used to be in place here.
The best solution is not to have column content formats vary, though I get that is not always possible.
David J.