Re: Help trying to tune query that executes 40x slower than in SqlServer - Mailing list pgsql-performance
From | Hugo Ferreira |
---|---|
Subject | Re: Help trying to tune query that executes 40x slower than in SqlServer |
Date | |
Msg-id | 4e8efcf5050309040836b06cc5@mail.gmail.com Whole thread Raw |
In response to | Re: Help trying to tune query that executes 40x slower than in SqlServer (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: Help trying to tune query that executes 40x slower than in SqlServer
|
List | pgsql-performance |
Hi, Well, I think the problem is far more complex than just joins reordering... I've restrucutred the query so that it won't use any explicit joins.Instead it now has a series of 'in (select ...)' and 'not exists (select ...)'. This actually got faster... sometimes!!! select 1, 10000168, C.contxt_id, C.contxt_elmt_ix, null, null, null, null, null, null, 1 from CONTXT as P, CONTXT_ELMT as C, MRS_REPLICATION_OUT as S, MRS_TRANSACTION as TRANS where S.age=0 and S.trans_id=1 and S.trans_id = TRANS.trans_id and S.ent_id = 10000029 and (P.contxt_id=C.contxt_id) and (P.contxt_id = S.pk1) and (C.owner_id not in (select non_repl_data_owner_id from NON_REPL_DATA_OWNER)) AND (C.owner_id not in (select repl_data_owner_id from REPL_DATA_OWNER_RSDNC where rsdnc_node_id = TRANS.recv_node_id)) AND (not exists (select pk1 from MRS_REPLICATION_OUT where trans_id=1 and ent_id=10000168 and C.contxt_id = pk1 AND C.contxt_elmt_ix = pk2)) AND (not exists (select pk1 from MRS_TRANSACTION RED_TRANS, MRS_REPLICATION_OUT RED_OUT where RED_TRANS.cat_code = 'OUT' and RED_TRANS.trans_type in ('X01', 'X02') and RED_TRANS.trans_id=RED_OUT.trans_id and RED_TRANS.prov_node_id=TRANS.prov_node_id and RED_TRANS.recv_node_id=TRANS.recv_node_id and RED_OUT.ent_id=10000168 and C.contxt_id = pk1 AND C.contxt_elmt_ix = pk2)) For example... I run the query, it takes 122seconds. Then I delete the target tables, vacuum the database, re-run it again: 9s. But if I run vacuum several times, and then run, it takes again 122seconds. If I stop this 122seconds query, say, at second 3 and then run it again, it will only take 9s. It simply doesn't make sense. Also, explain analyse will give me diferent plans each time I run it... Unfortunately, this is rendering PostgreSQL unusable for our goals. Any ideas? By the way, I got the following indexes over MRS_REPLICATION_OUT which seems to speed up things: CREATE INDEX ix_mrs_replication_out_all ON mrs_replication_out USING btree (ent_id, age, trans_id); CREATE INDEX ix_mrs_replication_pks ON mrs_replication_out USING btree (trans_id, ent_id, pk1, pk2, pk3, pk4, pk5, pk6, pk7); Note: pk2... pk7 are nullable columns. trans_id is the least variant column. pk1 is the most variant column. Most of the times, the execution plan includes an 'index scan' over the first index (ix_mrs_replication_out_all), followed by a filter with columns from the second index (trans_id, ent_id, pk1, pk2, pk3, pk4, pk5, pk6, pk7), though the 'age' column is not used... Any guess why?? Thanks in advance, Hugo Ferreira > It is possible but complicated to determine that reordering outer joins > is safe in some cases. We don't currently have such logic in PG. It > may be that SQL Server does have that capability and that's why it's > finding a much better plan ... but for now you have to do that by hand > in PG. -- GPG Fingerprint: B0D7 1249 447D F5BB 22C5 5B9B 078C 2615 504B 7B85
pgsql-performance by date: