Re: BUG #16251: ::text type casting of a constant breaks query performance - Mailing list pgsql-bugs

From Pavel Stehule
Subject Re: BUG #16251: ::text type casting of a constant breaks query performance
Date
Msg-id CAFj8pRC_a9xy1x3WNsqpEOy9+Lz0HjKjhqNE1iVn0y=Wg9npBQ@mail.gmail.com
Whole thread Raw
In response to Re: BUG #16251: ::text type casting of a constant breaks query performance  (Pavel Stehule <pavel.stehule@gmail.com>)
List pgsql-bugs


so 8. 2. 2020 v 8:09 odesílatel Pavel Stehule <pavel.stehule@gmail.com> napsal:


so 8. 2. 2020 v 7:44 odesílatel PG Bug reporting form <noreply@postgresql.org> napsal:
The following bug has been logged on the website:

Bug reference:      16251
Logged by:          Dima Pavlov
Email address:      imyfess@gmail.com
PostgreSQL version: 12.1
Operating system:   Windows 10
Description:       

Test table and indexes (PostgreSQL 12.1):

CREATE TABLE t (dt timestamp with time zone);
CREATE INDEX ind ON t USING btree (dt);

INSERT
INTO t(dt)
    SELECT
    (
        timestamp '2020-01-01 00:00:00' +
        random() * (
            timestamp '2020-02-29 00:00:00' -
            timestamp '2020-01-01 00:00:00'
        )
    )
    FROM generate_series(1, 10000)

-------------------------------------

In the first query, everything is ok, appropriate index "ind" is used:

explain (analyze, buffers)
SELECT *
FROM t
WHERE
   ('2020-02-08')::date IS NULL
   OR
   dt > '2020-02-08'
ORDER BY dt
LIMIT 1

"Limit  (cost=0.29..0.37 rows=1 width=8) (actual time=0.186..0.188 rows=1
loops=1)"
"  Buffers: shared hit=3"
"  ->  Index Only Scan using ind on t  (cost=0.29..303.75 rows=3627 width=8)
(actual time=0.184..0.184 rows=1 loops=1)"
"        Index Cond: (dt > '2020-02-08 00:00:00+05'::timestamp with time
zone)"
"        Heap Fetches: 1"
"        Buffers: shared hit=3"
"Planning Time: 2.365 ms"
"Execution Time: 0.239 ms"

-----------------------------------------------

With '::text' type casting of '2020-02-08' (which is already text) query
permofance is very low

explain (analyze, buffers)
SELECT *
FROM t
WHERE
   ('2020-02-08'::text)::date IS NULL
   OR
   dt > '2020-02-08'
ORDER BY dt
LIMIT 1

"Limit  (cost=0.29..0.44 rows=1 width=8) (actual time=45.306..45.307 rows=1
loops=1)"
"  Buffers: shared hit=6232"
"  ->  Index Only Scan using ind on t  (cost=0.29..561.28 rows=3658 width=8)
(actual time=45.304..45.304 rows=1 loops=1)"
"        Filter: ((('2020-02-08'::cstring)::date IS NULL) OR (dt >
'2020-02-08 00:00:00+05'::timestamp with time zone))"
"        Rows Removed by Filter: 6367"
"        Heap Fetches: 6368"
"        Buffers: shared hit=6232"
"Planning Time: 0.348 ms"
"Execution Time: 45.343 ms"

This is not a bug, but just feature.

The '2020-02-08' is not text type - it is 'unknown' type - and then is just directly transformed to date. I think so cast from text to date is not maybe immutable, and it can stops some optimizations.


postgres=# explain analyze select * from foo where '2020-02-01'::date is null or a > '2020-01-01';
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                      QUERY PLAN                                                      │
╞══════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
│ Index Only Scan using foo_a_idx on foo  (cost=0.42..9.12 rows=40 width=8) (actual time=0.083..0.125 rows=37 loops=1) │
│   Index Cond: (a > '2020-01-01 00:00:00+01'::timestamp with time zone)                                               │
│   Heap Fetches: 37                                                                                                   │
│ Planning Time: 0.279 ms                                                                                              │
│ Execution Time: 0.189 ms                                                                                             │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(5 rows)

postgres=# explain analyze select * from foo where '2020-02-01'::text::date is null or a > '2020-01-01';
┌─────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                   QUERY PLAN                                                    │
╞═════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
│ Seq Scan on foo  (cost=0.00..2193.00 rows=540 width=8) (actual time=0.050..71.671 rows=37 loops=1)              │
│   Filter: ((('2020-02-01'::cstring)::date IS NULL) OR (a > '2020-01-01 00:00:00+01'::timestamp with time zone)) │
│   Rows Removed by Filter: 99963                                                                                 │
│ Planning Time: 0.299 ms                                                                                         │
│ Execution Time: 71.714 ms                                                                                       │
└─────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(5 rows)


The problem is in casting from Date to Timestamptz - the related function "timestamptz" is marked as "stable" - and then probably the all expression is marked as "stable", what block to use this expression as index condition.

I was wrong - problem is in probably cast from text to date.


I don't know the context, but the expression "'2020-02-01'::date is null or a > '2020-01-01'" has not too much sense, so just don't do this. Postgres is not too smart and don't try to reduce some useless part of expressions.

More times Postgres is more sensitive on data types due special rules of casting.


Or if you need it, then use UNION and separate this expression to two independent expressions

postgres=# explain analyze select * from foo where '2020-02-01'::text::date is null union select * from foo where  a > '2020-01-01';
┌────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│                                                               QUERY PLAN                                                               │
╞════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════════╡
│ HashAggregate  (cost=3202.82..4203.22 rows=100040 width=8) (actual time=0.175..1.635 rows=37 loops=1)                                  │
│   Group Key: foo.a                                                                                                                     │
│   ->  Append  (cost=0.01..2952.72 rows=100040 width=8) (actual time=0.061..0.120 rows=37 loops=1)                                      │
│         ->  Result  (cost=0.01..1443.01 rows=100000 width=8) (actual time=0.016..0.017 rows=0 loops=1)                                 │
│               One-Time Filter: (('2020-02-01'::cstring)::date IS NULL)                                                                 │
│               ->  Seq Scan on foo  (cost=0.01..1443.01 rows=100000 width=8) (never executed)                                           │
│         ->  Index Only Scan using foo_a_idx on foo foo_1  (cost=0.42..9.12 rows=40 width=8) (actual time=0.043..0.090 rows=37 loops=1) │
│               Index Cond: (a > '2020-01-01 00:00:00+01'::timestamp with time zone)                                                     │
│               Heap Fetches: 37                                                                                                         │
│ Planning Time: 0.437 ms                                                                                                                │
│ Execution Time: 6.690 ms                                                                                                               │
└────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────┘
(11 rows)



Regards

Pavel

pgsql-bugs by date:

Previous
From: Pavel Stehule
Date:
Subject: Re: BUG #16251: ::text type casting of a constant breaks query performance
Next
From: Marco Lechner privat
Date:
Subject: Re: BUG #16245: proj63 from rhel-7 repo depends on sqlite33, but isnot available