Thread: Array casting in where : unexpected behavior

Array casting in where : unexpected behavior

From
Achilleas Mantzios
Date:
Good Morning list,
we have a strange situation here, manifested in the following queries :
Postgresql version : 9.3.10

dynacom=# select * FROM (select vals from fb_forms_defs fd, fb_reports_dets rd where fd.id=rd.fdid and
fd.dbtag='observer')as qry;  vals
 
--------- {7078} {13916} {7078} {2054} {7078} {13916} {2054} {13916} {2054} {8844} {13916} {13916} {2054} {13916}
{13916}{13916} {2054} {2054} {13916} {13916} {13916} {13916} {13916} {13916} {13916}
 
(25 rows)

dynacom=#

dynacom=# select * FROM (select vals from fb_forms_defs fd, fb_reports_dets rd where fd.id=rd.fdid and
fd.dbtag='observer')as qry WHERE vals[1]::int=8844;
 
ERROR:  invalid input syntax for integer: "19/10/2015"
dynacom=#

dynacom=# select vals[1]::int FROM (select vals from fb_forms_defs fd, fb_reports_dets rd where fd.id=rd.fdid and
fd.dbtag='observer')as qry; vals
 
-------  7078 13916  7078  2054  7078 13916  2054 13916  2054  8844 13916 13916  2054 13916 13916 13916  2054  2054
1391613916 13916 13916 13916 13916 13916
 
(25 rows)

dynacom=#

dynacom=# select vals1 FROM (select vals[1]::int as vals1 from fb_forms_defs fd, fb_reports_dets rd where fd.id=rd.fdid
andfd.dbtag='observer') as qry; vals1
 
-------  7078 13916  7078  2054  7078 13916  2054 13916  2054  8844 13916 13916  2054 13916 13916 13916  2054  2054
1391613916 13916 13916 13916 13916 13916
 
(25 rows)

dynacom=# select vals1 FROM (select vals[1]::int as vals1 from fb_forms_defs fd, fb_reports_dets rd where fd.id=rd.fdid
andfd.dbtag='observer') as qry WHERE vals1=8844;
 
ERROR:  invalid input syntax for integer: "19/10/2015"
dynacom=#

^^^ is this normal? Isn't vals1 guaranteed to be integer since its the type defined in the subselect ?

However this seems to work :

dynacom=# select * FROM (select vals from fb_forms_defs fd, fb_reports_dets rd where fd.id=rd.fdid and
fd.dbtag='observer')as qry WHERE vals[1] ~ E'^\\d+$' AND vals[1]::int=8844;  vals
 
-------- {8844}
(1 row)

dynacom=#

-- 
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt




Re: Array casting in where : unexpected behavior

From
"David G. Johnston"
Date:
On Wed, May 25, 2016 at 2:42 AM, Achilleas Mantzios <achill@matrix.gatewaynet.com> wrote:
dynacom=# select * FROM (select vals from fb_forms_defs fd, fb_reports_dets rd where fd.id=rd.fdid and fd.dbtag='observer') as qry WHERE vals[1]::int=8844;
ERROR:  invalid input syntax for integer: "19/10/2015"
dynacom=#

​There order of evaluation in this query does not have to match the physical order.

If you need to ensure the join and 'observer' filters are applied first you have two options.

Make "qry" into a CTE.  This, at least presently, introduces an optimization fence.

Add "OFFSET 0" to the qry subquery: SELECT * FROM (SELECT ... FROM ... OFFSET 0) AS qry WHERE ...​

​That too introduces an optimization fence.

You could (I think) also do something like.

WHERE CASE WHEN vals[1] !~ '^\d+$' THEN false ELSE vals[1]::int = ​8844 END

or, more obviously, since vals is a text array, WHERE vals[1] = '8844'


dynacom=# select vals1 FROM (select vals[1]::int as vals1 from fb_forms_defs fd, fb_reports_dets rd where fd.id=rd.fdid and fd.dbtag='observer') as qry WHERE vals1=8844;
ERROR:  invalid input syntax for integer: "19/10/2015"
dynacom=#

^^^ is this normal? Isn't vals1 guaranteed to be integer since its the type defined in the subselect ?


This gets rewritten to:  "WHERE vals[1]::int = 8844" and pushed down and so is susceptible to failing.

However this seems to work :

dynacom=# select * FROM (select vals from fb_forms_defs fd, fb_reports_dets rd where fd.id=rd.fdid and fd.dbtag='observer') as qry WHERE vals[1] ~ E'^\\d+$' AND vals[1]::int=8844;
  vals
--------
 {8844}
(1 row)

​This works just by chance.​
 

​David J.​

Re: Array casting in where : unexpected behavior

From
Achilleas Mantzios
Date:
Hello David,

On 25/05/2016 15:29, David G. Johnston wrote:
On Wed, May 25, 2016 at 2:42 AM, Achilleas Mantzios <achill@matrix.gatewaynet.com> wrote:
dynacom=# select * FROM (select vals from fb_forms_defs fd, fb_reports_dets rd where fd.id=rd.fdid and fd.dbtag='observer') as qry WHERE vals[1]::int=8844;
ERROR:  invalid input syntax for integer: "19/10/2015"
dynacom=#

​There order of evaluation in this query does not have to match the physical order.

If you need to ensure the join and 'observer' filters are applied first you have two options.

Make "qry" into a CTE.  This, at least presently, introduces an optimization fence.

Add "OFFSET 0" to the qry subquery: SELECT * FROM (SELECT ... FROM ... OFFSET 0) AS qry WHERE ...​

​That too introduces an optimization fence.

You could (I think) also do something like.

WHERE CASE WHEN vals[1] !~ '^\d+$' THEN false ELSE vals[1]::int = ​8844 END

Thank you a lot for your thorough explanation, the above might be the best solution.
or, more obviously, since vals is a text array, WHERE vals[1] = '8844'


dynacom=# select vals1 FROM (select vals[1]::int as vals1 from fb_forms_defs fd, fb_reports_dets rd where fd.id=rd.fdid and fd.dbtag='observer') as qry WHERE vals1=8844;
ERROR:  invalid input syntax for integer: "19/10/2015"
dynacom=#

^^^ is this normal? Isn't vals1 guaranteed to be integer since its the type defined in the subselect ?


This gets rewritten to:  "WHERE vals[1]::int = 8844" and pushed down and so is susceptible to failing.
I understand, but still this does not look conceptually right.

However this seems to work :

dynacom=# select * FROM (select vals from fb_forms_defs fd, fb_reports_dets rd where fd.id=rd.fdid and fd.dbtag='observer') as qry WHERE vals[1] ~ E'^\\d+$' AND vals[1]::int=8844;
  vals
--------
 {8844}
(1 row)

​ This works just by chance.​
 
So I guess we must introduce an optimization fence by the means of a CTE or the use OFFSET. Since we must replicate this query on a 8.3 (well 100s of them actually) as well, I guess OFFSET is the way to go.

​David J.​



-- 
Achilleas Mantzios
IT DEV Lead
IT DEPT
Dynacom Tankers Mgmt