Query much slower from php than psql or dbeaver - Mailing list pgsql-general

From Ekaterina Amez
Subject Query much slower from php than psql or dbeaver
Date
Msg-id CAFijohhPt+KAb+HPfWhkYvhNWhFGE3iCD_sn_88AsfUFyZ8ECQ@mail.gmail.com
Whole thread Raw
Responses Re: Query much slower from php than psql or dbeaver  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-general
Hi,

After receiving an Unknown Address error with pgsql-php@postgresql.org I've discovered this mailing list is catalogued as Inactive, so I'm sending my question to this list.

I've made a php cli script that downloads a file from FTP, loads it in a table and compares against the same table in 2 different databases that are connected to the main one with pgsql foreign data wrapper (so there are 3 DB involved via linked servers).

I've tested the query with psql and DBeaver and it takes only milliseconds: it returns 39 records and now there's only 16000 records on the table but I've tested it with <100K. When I've tested my php script the same query takes 14 minutes to return (more or less). I've checked there are no blocking processes while runnig this query, I've analyzed the table, dropped and recreated, added indexes... and every time I run my script it takes those 14 minutes.

The script opens the connection and after that makes all of the operations: some checks, load the file with COPY sentence, move data with some additional columns to final table, and after this and without dropping the connection it runs this query. After this long query, script iterates over the returned records to make some other queries for every record. And finally there are 2 more queries, that makes the reverse search: the slow query checks diferences or non-existing records from file data in the 2 linked databases, and the las 2 queries check non-existing records from their tables in the file. Everything goes smooth except this long-lasting query.

This is the query:
select t1.aaaa as maindb_aaaa, t1.bbb as maindb_bbb, t1.ccccc as maindb_ccccc, t1.timestamp_create as maindb_create,
    t1.timestamp_closed as maindb_close, t1.ddddddddd as maindb_ddddddddd, null::text as db1_sth,
    t2.eeeeeeee as db1_eeeeeeee, t2.ffffffff as db1_ffffffff, null::text as db2_sth,
    t3.eeeeeeee as db2_eeeeeeee, t3.ffffffff as db2_ffffffff
from table1 t1
left join database1_fdw.table2 t2 on t1.aaaa = t2.btatpd_aaaa
    and t1.file = 'file_name.csv'
    and t2.btatpd_fecha = '20220119120000'
    and substring(t1.bbb from 1 for 3) in (<some_values>)
left join database2_fdw.table2 t3 on t1.aaaa = t3.btatpd_aaaa
    and t1.fichero_origen = 'file_name.csv'
    and t3.btatpd_fecha = '20220119120000'
    and substring(t1.bbb from 1 for 3) in (<some_different_values)
where t1.ccccc = 'ACTIVE'
and t1.fichero_origen = 'file_name.csv'
and ((t2.eeeeeeee is null and t3.eeeeeeee is null)
    or
    (t2.eeeeeeee is not null and t1.ddddddddd <> t2.ffffffff)
    or
    (t3.eeeeeeee is not null and t1.ddddddddd <> t3.ffffffff)
    )
order by t1.bbb nulls last;

I'm using PHP 5.3.3, with PG 9.6 over CentOS 6.8. I'm connecting to PG with pg_connect and pg_query because here nobody is using PDO. And I have to warn you that I'm not a PHP expert. Also I'm aware how old these versions are, I'm trying to finish things to start with server migration.

This is the explain plan, in case it gives more info:
Sort  (cost=5345.39..5385.35 rows=15986 width=212) (actual time=112.315..112.318 rows=39 loops=1)
  Output: t1.aaaa, t1.bbb, t1.ccccc, t1.timestamp_create, t1.timestamp_closed, t1.ddddddddd, NULL::text, t2.eeeeeeee, t2.ffffffff, NULL::text, t3.eeeeeeee, t3.ffffffff
  Sort Key: t1.bbb
  Sort Method: quicksort  Memory: 29kB
  Buffers: shared hit=1255
  ->  Hash Left Join  (cost=237.54..2587.70 rows=15986 width=212) (actual time=81.272..112.248 rows=39 loops=1)
        Output: t1.aaaa, t1.bbb, t1.ccccc, t1.timestamp_create, t1.timestamp_closed, t1.ddddddddd, NULL::text, t2.eeeeeeee, t2.ffffffff, NULL::text, t3.eeeeeeee, t3.ffffffff
        Hash Cond: ((t1.aaaa)::text = (t3.btatpd_aaaa)::text)
        Join Filter: (((t1.file)::text = 'file_name.csv'::text) AND ("substring"((t1.bbb)::text, 1, 3) = ANY ('{<some_values>}'::text[])))
        Rows Removed by Join Filter: 1
        Filter: (((t2.eeeeeeee IS NULL) AND (t3.eeeeeeee IS NULL)) OR ((t2.eeeeeeee IS NOT NULL) AND (t1.ddddddddd <> t2.ffffffff)) OR ((t3.eeeeeeee IS NOT NULL) AND (t1.ddddddddd <> t3.ffffffff)))
        Rows Removed by Filter: 15794
        Buffers: shared hit=1252
        ->  Hash Left Join  (cost=118.77..2408.88 rows=15988 width=149) (actual time=71.890..101.261 rows=15820 loops=1)
              Output: t1.aaaa, t1.bbb, t1.ccccc, t1.timestamp_create, t1.timestamp_closed, t1.ddddddddd, t1.file, t2.eeeeeeee, t2.ffffffff
              Hash Cond: ((t1.aaaa)::text = (t2.btatpd_aaaa)::text)
              Join Filter: (((t1.file)::text = 'file_name.csv'::text) AND ("substring"((t1.bbb)::text, 1, 3) = ANY ('{<some_different_values>}'::text[])))
              Buffers: shared hit=1252
              ->  Seq Scan on public.table1 t1  (cost=0.00..2230.06 rows=15988 width=103) (actual time=0.176..19.882 rows=15817 loops=1)
                    Output: t1.id, t1.aaaa, t1.bbb, t1.ccccc, t1.timestamp_create, t1.timestamp_closed, t1.sbd_bundle_id, t1.ddddddddd, t1.file, t1.fecha_carga
                    Filter: (((t1.file)::text = 'file_name.csv'::text) AND ((t1.ccccc)::text = 'ACTIVE'::text))
                    Rows Removed by Filter: 49387
                    Buffers: shared hit=1252
              ->  Hash  (cost=118.73..118.73 rows=3 width=94) (actual time=71.699..71.699 rows=14244 loops=1)
                    Output: t2.eeeeeeee, t2.ffffffff, t2.btatpd_aaaa
                    Buckets: 16384 (originally 1024)  Batches: 1 (originally 1)  Memory Usage: 1074kB
                    ->  Foreign Scan on database1_fdw.table2 t2  (cost=100.00..118.73 rows=3 width=94) (actual time=3.961..67.271 rows=14244 loops=1)
                          Output: t2.eeeeeeee, t2.ffffffff, t2.btatpd_aaaa
                          Remote SQL: SELECT eeeeeeee, btatpd_aaaa, ffffffff FROM public.table2 WHERE ((btatpd_fecha = '20220119120000'::text))
        ->  Hash  (cost=118.73..118.73 rows=3 width=94) (actual time=6.512..6.513 rows=3051 loops=1)
              Output: t3.eeeeeeee, t3.ffffffff, t3.btatpd_aaaa
              Buckets: 4096 (originally 1024)  Batches: 1 (originally 1)  Memory Usage: 235kB
              ->  Foreign Scan on database2_fdw.table2 t3  (cost=100.00..118.73 rows=3 width=94) (actual time=1.494..5.857 rows=3051 loops=1)
                    Output: t3.eeeeeeee, t3.ffffffff, t3.btatpd_aaaa
                    Remote SQL: SELECT eeeeeeee, btatpd_aaaa, ffffffff FROM public.table2 WHERE ((btatpd_fecha = '20220119120000'::text))
Planning time: 2.137 ms
Execution time: 123.077 ms


Anyone can tell me why is this happening and if is there a solution to this?

Thank you for your time.

--

Ekaterina Amez González

ekaterina.amez@zunibal.com

 

 

ZUNIBAL | Idorsolo 1, 48160-Derio, Spain

Tel: +34 944 977 010 | Fax: +34 944 522 81| www.zunibal.com

 

Advertencia Legal. El contenido de este mensaje y de toda la documentación anexa es confidencial y va dirigido únicamente al destinatario del mismo. Si usted no fuera el destinatario le solicitamos que nos informe y no comunique su contenido a terceros, procediendo a su destrucción. / Legal advice. This message contains confidential information for the exclusive use of the recipient. Any unauthorised disclosure, use or dissemination, either whole or partial, is prohibited. If you are not the intended recipient of the message, please notify the sender as soon as possible. En cumplimiento de la Ley Orgánica 15/1999 de Protección de Datos de Carácter Personal, le informamos que los Datos Personales de usted que están en nuestra Base de Datos recabados con su consentimiento, forman parte de un fichero automatizado registrado en la Agencia Española de Protección de Datos. Estos datos sólo serán utilizados para realizar una correcta gestión de nuestra relación comercial. Si lo desea podrá ejercitar en todo momento los derechos de acceso, cancelación u oposición, remitiendo un correo electrónico a esta dirección. / According to the Law 15/1999 of Personal Data Protectionwe inform your data are included in our data base with your approval are registered in the spanish Agency of Data ProtectionThese data are only used for our usual business relationship. If you want you are in the right of accessing or cancelling them, just sending an e-mail to this address.

 

Antes de imprimir este mensajeasegúrese de que es necesario hacerlo. / Before printing this email, assess if it is really needed.

pgsql-general by date:

Previous
From: Tom Lane
Date:
Subject: Re: Cannot find hstore operator
Next
From: Avi Weinberg
Date:
Subject: RE: Multiple SELECT statements Using One WITH statement