Thread: Query too slow with "not in" condition
Dear Friends,
I have loaded the backup from a live database in a test system. Both run 8.3.5 versions. The plan for a query varies in these systems.
Test System
A. PostgreSQL 8.3.5 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20061115 (prerelease) (SUSE Linux)
B. explain select * from stock_transaction_detail_106 where transaction_id not in (select transaction_id from transaction_value);
Seq Scan on stock_transaction_detail_106 (cost=1829.78..2867.74 rows=16478 width=128)
Filter: (NOT (hashed subplan))
SubPlan
-> Seq Scan on transaction_value (cost=0.00..1598.02 rows=92702 width=4)
The query takes about 300 ms to run.
Production System
1. PostgreSQL 8.3.5 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.2.1 (SUSE Linux)
2. explain select * from stock_transaction_detail_106 where transaction_id not in (select transaction_id from transaction_value);
Seq Scan on stock_transaction_detail_106 (cost=2153.95..25245478.39 rows=17064 width=122)
Filter: (NOT (subplan))
SubPlan
-> Materialize (cost=2153.95..3401.01 rows=92905 width=4)
-> Seq Scan on transaction_value (cost=0.00..1743.05 rows=92905 width=4)
Here the query did not return any results after 1hour.
In both the computers same query with in condition runs fast (520 ms and 290 ms respectively)
Please help me to resolve this issue. (One configuration difference between these machines are pg_hba.conf file. In production machine it is password enabled. In test machine it is trust mode.)
Thanks and regards,
Ma Sivakumar
மா சிவகுமார்
எல்லோரும் எல்லாமும் பெற வேண்டும்
http://masivakumar.blogspot.com
I have loaded the backup from a live database in a test system. Both run 8.3.5 versions. The plan for a query varies in these systems.
Test System
A. PostgreSQL 8.3.5 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 20061115 (prerelease) (SUSE Linux)
B. explain select * from stock_transaction_detail_106 where transaction_id not in (select transaction_id from transaction_value);
Seq Scan on stock_transaction_detail_106 (cost=1829.78..2867.74 rows=16478 width=128)
Filter: (NOT (hashed subplan))
SubPlan
-> Seq Scan on transaction_value (cost=0.00..1598.02 rows=92702 width=4)
The query takes about 300 ms to run.
Production System
1. PostgreSQL 8.3.5 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.2.1 (SUSE Linux)
2. explain select * from stock_transaction_detail_106 where transaction_id not in (select transaction_id from transaction_value);
Seq Scan on stock_transaction_detail_106 (cost=2153.95..25245478.39 rows=17064 width=122)
Filter: (NOT (subplan))
SubPlan
-> Materialize (cost=2153.95..3401.01 rows=92905 width=4)
-> Seq Scan on transaction_value (cost=0.00..1743.05 rows=92905 width=4)
Here the query did not return any results after 1hour.
In both the computers same query with in condition runs fast (520 ms and 290 ms respectively)
Please help me to resolve this issue. (One configuration difference between these machines are pg_hba.conf file. In production machine it is password enabled. In test machine it is trust mode.)
Thanks and regards,
Ma Sivakumar
மா சிவகுமார்
எல்லோரும் எல்லாமும் பெற வேண்டும்
http://masivakumar.blogspot.com
> I have loaded the backup from a live database in a test system. Both run > 8.3.5 versions. The plan for a query varies in these systems. > Test System > A. PostgreSQL 8.3.5 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.1.2 > 20061115 (prerelease) (SUSE Linux) > B. explain select * from stock_transaction_detail_106 where transaction_id > not in (select transaction_id from transaction_value); > Seq Scan on stock_transaction_detail_106 (cost=1829.78..2867.74 rows=16478 width=128) > Filter: (NOT (hashed subplan)) > SubPlan > -> Seq Scan on transaction_value (cost=0.00..1598.02 rows=92702 width=4) > The query takes about 300 ms to run. > Production System > 1. PostgreSQL 8.3.5 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.2.1 (SUSE Linux) > 2. explain select * from stock_transaction_detail_106 where transaction_id not in (select transaction_id from transaction_value); > Seq Scan on stock_transaction_detail_106 (cost=2153.95..25245478.39 rows=17064 width=122) > Filter: (NOT (subplan)) > SubPlan > -> Materialize (cost=2153.95..3401.01 rows=92905 width=4) > -> Seq Scan on transaction_value (cost=0.00..1743.05 rows=92905 width=4) > Here the query did not return any results after 1hour. > In both the computers same query with in condition runs fast (520 ms and 290 ms respectively) > Please help me to resolve this issue. (One configuration difference > between these machines are pg_hba.conf file. In production machine it is > password enabled. In test machine it is trust mode.) You might find this page interesting: http://www.depesz.com/index.php/2008/08/13/nulls-vs-not-in/ I assume workmem, effective_cache_size and random_page_cost are all the same in the 2 postgresql.conf? Do you get the same after you ANALYZE the database? If you read the page above you might realise NOT IN is not really what you want. Maybe NOT EXISTS or a LEFT OUTER JOIN ... WHERE transaction_id IS NULL ? David.
"David Rowley" <dgrowley@gmail.com> writes: > I assume workmem, effective_cache_size and random_page_cost are all the same > in the 2 postgresql.conf? Indeed, work_mem is probably the problem. The critical difference between the two plans seems to be that the first one is using a "hashed subplan" and the second one isn't. Assuming the same datatypes in both databases, the only reason not to use a hashed subplan is if the hashtable is estimated not to fit in work_mem. regards, tom lane
On Sun, Nov 30, 2008 at 3:28 PM, David Rowley <dgrowley@gmail.com> wrote:
Thanks David. Another issue I was faced with was exactly what the link you provided discusses.
Best regards,
Ma Sivakumar
மா சிவகுமார்
எல்லோரும் எல்லாமும் பெற வேண்டும்
http://masivakumar.blogspot.com
http://www.depesz.com/index.php/2008/08/13/nulls-vs-not-in/You might find this page interesting:
Thanks David. Another issue I was faced with was exactly what the link you provided discusses.
Best regards,
Ma Sivakumar
மா சிவகுமார்
எல்லோரும் எல்லாமும் பெற வேண்டும்
http://masivakumar.blogspot.com