Re: BUG #18399: Query plan optimization results in runtime error when hoisting cast from inside subquery - Mailing list pgsql-bugs

From Tom Lane
Subject Re: BUG #18399: Query plan optimization results in runtime error when hoisting cast from inside subquery
Date
Msg-id 3877106.1710863246@sss.pgh.pa.us
Whole thread Raw
In response to BUG #18399: Query plan optimization results in runtime error when hoisting cast from inside subquery  (PG Bug reporting form <noreply@postgresql.org>)
Responses Re: BUG #18399: Query plan optimization results in runtime error when hoisting cast from inside subquery  (Sawyer Knoblich <scknoblich@gmail.com>)
List pgsql-bugs
PG Bug reporting form <noreply@postgresql.org> writes:
> My expectation was that the query should behave as if the CTE (or subquery
> if I had used that instead) has fully executed before moving on to the main
> query and this behavior breaks that assumption (which may or may not be
> correct). I attempted to search the docs for anything related to this,
> either CTE/subquery execution order guarantees or hoisting of casts, but I
> couldn't find anything about it. If this turns out not to a bug and is just
> expected behavior with existing documentation could you please link me to
> the relevant page for future reference?

You need to materialize the CTE to be certain that it won't be
intermingled with the calling query:

=# with raw_data(id, value) as (values (1, '1900'), (2, '2100'), (3, 'abc')),
     integers as materialized (select value::integer
                  from raw_data
                  where id in (select id from raw_data where value ~ '^\d+$'))
select count(*)
from integers
where value < 2000;
 count 
-------
     1
(1 row)

See
https://www.postgresql.org/docs/current/sql-select.html#SQL-WITH
(last few paras of that sub-section).

The real bottom line here is that the cast to integer is presumed
to be side-effect-free, which isn't true if you're working with
data where it could throw errors.  My advice is to rethink your
data model: this sort of EAV approach with inconsistent data
types in the "same" column just does not play nice with SQL
optimizers, so it'll cause you ongoing heartburn.  If you think
you're too far down the road for that, consider replacing the
cast with some non-error-throwing user-defined function,
about like

  case when x ~ '^\d+$' then x::integer else null::integer end

            regards, tom lane



pgsql-bugs by date:

Previous
From: PG Bug reporting form
Date:
Subject: BUG #18400: logging_collector does not collect messages from postmaster
Next
From: Masahiko Sawada
Date:
Subject: Re: Potential data loss due to race condition during logical replication slot creation