Thread: View performance with implicit cast

View performance with implicit cast

From
Tomasz Szypowski
Date:

Hello

 

I have got an example, in which PostgreSQL could speed up:

 

First, create and populate structures:

 

create table test1 (date_1 timestamp);

insert into test1

select generate_series( '2020-01-01', '2025-01-01', interval '1 minute');

CREATE INDEX idx_test1_date_1 on test1(date_1);

 

create table test2 (date_2 date);

insert into test2

select generate_series( '1900-01-01', '2200-01-01', interval '1 day');

CREATE INDEX idx_test2_date_2 on test2(date_2);

 

Then create a view:

create or replace view v_test as

select date_1 from test1

union all

select date_2 from test2

 

and now run the query:

explain analyze

select * from  v_test

where (date_1>='2024-12-09') and (date_1<'2025-01-01');

 

You would see:

Subquery Scan on v_test  (cost=0.00..95710.28 rows=13702 width=8) (actual time=440.998..472.586 rows=33143 loops=1)

  Filter: ((v_test.date_1 >= '2024-12-09 00:00:00'::timestamp without time zone) AND (v_test.date_1 < '2025-01-01 00:00:00'::timestamp without time zone))

  Rows Removed by Filter: 2707311

  ->  Append  (cost=0.00..54603.47 rows=2740454 width=8) (actual time=0.038..332.601 rows=2740454 loops=1)

        ->  Seq Scan on test1  (cost=0.00..37950.81 rows=2630881 width=8) (actual time=0.035..165.235 rows=2630881 loops=1)

        ->  Subquery Scan on "*SELECT* 2"  (cost=0.00..2950.39 rows=109573 width=8) (actual time=0.025..16.285 rows=109573 loops=1)

              ->  Seq Scan on test2  (cost=0.00..1580.73 rows=109573 width=4) (actual time=0.022..4.974 rows=109573 loops=1)

Planning Time: 0.263 ms

Execution Time: 473.598 ms

 

I can imagine, that test2 cannot be searched by index due to the implicite cast of the column date -> timestamp, but why the query doesn`t use index idx_test1_date_1? Furthermore it could use as well index idx_test2_date_2

 

Now let`s modify the view`s definition only to add explicit cast:

create or replace view v_test as

select date_1 from test1

union all

select date_2::timestamp from test2

 

Run the query again:

explain analyze

select * from  v_test

where (date_1>='2024-12-09') and (date_1<'2025-01-01');

 

Append  (cost=0.43..3812.87 rows=31903 width=8) (actual time=0.024..17.287 rows=33143 loops=1)

  ->  Index Only Scan using idx_test1_date_1 on test1  (cost=0.43..975.53 rows=31355 width=8) (actual time=0.023..5.500 rows=33120 loops=1)

        Index Cond: ((date_1 >= '2024-12-09 00:00:00'::timestamp without time zone) AND (date_1 < '2025-01-01 00:00:00'::timestamp without time zone))

        Heap Fetches: 0

  ->  Seq Scan on test2  (cost=0.00..2677.83 rows=548 width=8) (actual time=3.744..8.610 rows=23 loops=1)

        Filter: (((date_2)::timestamp without time zone >= '2024-12-09 00:00:00'::timestamp without time zone) AND ((date_2)::timestamp without time zone < '2025-01-01 00:00:00'::timestamp without time zone))

        Rows Removed by Filter: 109550

Planning Time: 0.230 ms

Execution Time: 18.937 ms

 

So

  1. Why the index is being used only when the column types are the same?
  2. Why th other indexes are not being used?

 

Maybe you can help me to rewrite the query

 

Regards

Thomas Szypowski

Re: View performance with implicit cast

From
Tom Lane
Date:
Tomasz Szypowski <tomasz.szypowski@asseco.pl> writes:
> I have got an example, in which PostgreSQL could speed up:

The reason why the first version of the view doesn't behave well
is that it's not optimized into an "append relation", because
is_simple_union_all() doesn't think that's safe:

/*
 * is_simple_union_all
 *      Check a subquery to see if it's a simple UNION ALL.
 *
 * We require all the setops to be UNION ALL (no mixing) and there can't be
 * any datatype coercions involved, ie, all the leaf queries must emit the
 * same datatypes.
 */

Perhaps this could be improved, but it's a lot easier just to add
the cast yourself.

            regards, tom lane



RE: View performance with implicit cast

From
Tomasz Szypowski
Date:
Thanks

So how can i create the view, with different date data types (date, timestamp), so that both indexes would be involved.
Isee only the solution with index with casted value (create index ... on ...(cast(... as date))), but maybe it is a
bettersolution? 

What`s more i found this strange behaviour:

CREATE VIEW test AS SELECT * FROM foo

CREATE INDEX test ON foo(id)

Now if i execute: DROP INDEX test, I receive:
ERROR:  "test" is not an index
HINT:  Use DROP VIEW to remove a view.

After executing DROP VIEW test, I can drop this index

Regards
Thomas Szypowski


-----Original Message-----
From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Saturday, January 4, 2025 7:37 PM
To: Tomasz Szypowski <tomasz.szypowski@asseco.pl>
Cc: pgsql-sql@lists.postgresql.org
Subject: Re: View performance with implicit cast

[Nie otrzymujesz cz?sto wiadomo?ci e-mail z tgl@sss.pgh.pa.us. Dowiedz si?, dlaczego jest to wa?ne, na stronie
https://aka.ms/LearnAboutSenderIdentification] 

Tomasz Szypowski <tomasz.szypowski@asseco.pl> writes:
> I have got an example, in which PostgreSQL could speed up:

The reason why the first version of the view doesn't behave well is that it's not optimized into an "append relation",
because
is_simple_union_all() doesn't think that's safe:

/*
 * is_simple_union_all
 *        Check a subquery to see if it's a simple UNION ALL.
 *
 * We require all the setops to be UNION ALL (no mixing) and there can't be
 * any datatype coercions involved, ie, all the leaf queries must emit the
 * same datatypes.
 */

Perhaps this could be improved, but it's a lot easier just to add the cast yourself.

                        regards, tom lane



RE: View performance with implicit cast

From
"Zornoza Sanchez, Jose Blas"
Date:
Hello, in this case both index and view have the same name (test), try a different one...

-----Mensaje original-----
De: Tomasz Szypowski <tomasz.szypowski@asseco.pl>
Enviado el: domingo, 5 de enero de 2025 23:45
Para: Tom Lane <tgl@sss.pgh.pa.us>
CC: pgsql-sql@lists.postgresql.org
Asunto: RE: View performance with implicit cast

Thanks

So how can i create the view, with different date data types (date, timestamp), so that both indexes would be involved.
Isee only the solution with index with casted value (create index ... on ...(cast(... as date))), but maybe it is a
bettersolution?
 

What`s more i found this strange behaviour:

CREATE VIEW test AS SELECT * FROM foo

CREATE INDEX test ON foo(id)

Now if i execute: DROP INDEX test, I receive:
ERROR:  "test" is not an index
HINT:  Use DROP VIEW to remove a view.

After executing DROP VIEW test, I can drop this index

Regards
Thomas Szypowski


-----Original Message-----
From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Saturday, January 4, 2025 7:37 PM
To: Tomasz Szypowski <tomasz.szypowski@asseco.pl>
Cc: pgsql-sql@lists.postgresql.org
Subject: Re: View performance with implicit cast

[Nie otrzymujesz cz?sto wiadomo?ci e-mail z tgl@sss.pgh.pa.us. Dowiedz si?, dlaczego jest to wa?ne, na stronie
https://aka.ms/LearnAboutSenderIdentification]
 

Tomasz Szypowski <tomasz.szypowski@asseco.pl> writes:
> I have got an example, in which PostgreSQL could speed up:

The reason why the first version of the view doesn't behave well is that it's not optimized into an "append relation",
because
is_simple_union_all() doesn't think that's safe:

/*
 * is_simple_union_all
 *        Check a subquery to see if it's a simple UNION ALL.
 *
 * We require all the setops to be UNION ALL (no mixing) and there can't be
 * any datatype coercions involved, ie, all the leaf queries must emit the
 * same datatypes.
 */

Perhaps this could be improved, but it's a lot easier just to add the cast yourself.

                        regards, tom lane



________________________________

Este correo electrónico y, en su caso, cualquier fichero anexo al mismo, contiene información de carácter confidencial
exclusivamentedirigida a su destinatario o destinatarios. Si no es vd. el destinatario indicado, queda notificado que
lalectura, utilización, divulgación y/o copia sin autorización está prohibida en virtud de la legislación vigente. En
elcaso de haber recibido este correo electrónico por error, se ruega notificar inmediatamente esta circunstancia
mediantereenvío a la dirección electrónica del remitente.
 
Evite imprimir este mensaje si no es estrictamente necesario.

This email and any file attached to it (when applicable) contain(s) confidential information that is exclusively
addressedto its recipient(s). If you are not the indicated recipient, you are informed that reading, using,
disseminatingand/or copying it without authorisation is forbidden in accordance with the legislation in effect. If you
havereceived this email by mistake, please immediately notify the sender of the situation by resending it to their
emailaddress.
 
Avoid printing this message if it is not absolutely necessary.

Re: View performance with implicit cast

From
Tom Lane
Date:
"Zornoza Sanchez, Jose Blas" <jbzornoza@sia.es> writes:
> Hello, in this case both index and view have the same name (test), try a different one...

Yeah.  If you try the example as-presented it fails immediately:

postgres=# create table foo (id int);
CREATE TABLE
postgres=# CREATE VIEW test AS SELECT * FROM foo;
CREATE VIEW
postgres=# CREATE INDEX test ON foo(id);
ERROR:  relation "test" already exists

because you can't put a view named test and an index named test into
the same schema.  (They share the namespace of tables.)  What I think
the OP might have done is something similar to

postgres=# create schema s1;
CREATE SCHEMA
postgres=# create schema s2;
CREATE SCHEMA
postgres=# set search_path to s1, s2;
SET
postgres=# create table s2.foo (id int);
CREATE TABLE
postgres=# CREATE VIEW test AS SELECT * FROM foo;
CREATE VIEW
postgres=# CREATE INDEX test ON foo(id);
CREATE INDEX
postgres=# DROP INDEX test;
ERROR:  "test" is not an index
HINT:  Use DROP VIEW to remove a view.

View test is in schema s1, because that's the default creation schema
with this search_path setting.  But index test is in s2, because
indexes are always put in the same schema as their parent table.
So the CREATE INDEX doesn't fail.  But then the DROP searches the
search_path, and the first "test" it finds is the view s1.test,
so it complains.

            regards, tom lane



RE: View performance with implicit cast

From
Tomasz Szypowski
Date:
Thanks for the explanation, but what about the reported problem.
How can I force the view to use both indexes?


-----Original Message-----
From: Tom Lane <tgl@sss.pgh.pa.us>
Sent: Tuesday, January 7, 2025 3:52 PM
To: Zornoza Sanchez, Jose Blas <jbzornoza@sia.es>
Cc: Tomasz Szypowski <tomasz.szypowski@asseco.pl>; pgsql-sql@lists.postgresql.org
Subject: Re: View performance with implicit cast

"Zornoza Sanchez, Jose Blas" <jbzornoza@sia.es> writes:
> Hello, in this case both index and view have the same name (test), try a different one...

Yeah.  If you try the example as-presented it fails immediately:

postgres=# create table foo (id int);
CREATE TABLE
postgres=# CREATE VIEW test AS SELECT * FROM foo; CREATE VIEW postgres=# CREATE INDEX test ON foo(id);
ERROR:  relation "test" already exists

because you can't put a view named test and an index named test into the same schema.  (They share the namespace of
tables.) What I think the OP might have done is something similar to 

postgres=# create schema s1;
CREATE SCHEMA
postgres=# create schema s2;
CREATE SCHEMA
postgres=# set search_path to s1, s2;
SET
postgres=# create table s2.foo (id int); CREATE TABLE postgres=# CREATE VIEW test AS SELECT * FROM foo; CREATE VIEW
postgres=#CREATE INDEX test ON foo(id); CREATE INDEX postgres=# DROP INDEX test; 
ERROR:  "test" is not an index
HINT:  Use DROP VIEW to remove a view.

View test is in schema s1, because that's the default creation schema with this search_path setting.  But index test is
ins2, because indexes are always put in the same schema as their parent table. 
So the CREATE INDEX doesn't fail.  But then the DROP searches the search_path, and the first "test" it finds is the
views1.test, so it complains. 

            regards, tom lane



Re: View performance with implicit cast

From
Tom Lane
Date:
Tomasz Szypowski <tomasz.szypowski@asseco.pl> writes:
> Thanks for the explanation, but what about the reported problem.
> How can I force the view to use both indexes?

You can't, because the indexes are not on the same expressions
appearing in the view.  Your outer WHERE clause constrains those
expressions, not the underlying table columns.

If it's impractical to make the underlying tables share the same
column type, you would need to do something like

    create view v_test as
    select date_1 from test1
    union all
    select date_2::timestamp from test2;

    create index on test2 ((date_2::timestamp));

Consider the extra index as your penance for not having thought
harder about data type choices to begin with.

            regards, tom lane