Re: Query too slow with "not in" condition - Mailing list pgsql-general

From David Rowley
Subject Re: Query too slow with "not in" condition
Date
Msg-id 185DAC59E2914010B0E0A54A71638A68@amd64
Whole thread Raw
In response to Query too slow with "not in" condition  ("சிவகுமார் மா"<masivakumar@gmail.com>)
Responses Re: Query too slow with "not in" condition
Re: Query too slow with "not in" condition
List pgsql-general
> I have loaded the backup from a live database in a test system. Both run
> 8.3.5 versions. The plan for a query varies in these systems.

> Test System
> A. PostgreSQL 8.3.5 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.1.2
> 20061115 (prerelease) (SUSE Linux)

> B. explain select * from stock_transaction_detail_106 where transaction_id
> not in (select transaction_id from transaction_value);
> Seq Scan on stock_transaction_detail_106  (cost=1829.78..2867.74
rows=16478 width=128)
>   Filter: (NOT (hashed subplan))
>   SubPlan
>     ->  Seq Scan on transaction_value  (cost=0.00..1598.02 rows=92702
width=4)

> The query takes about 300 ms to run.

> Production System

> 1. PostgreSQL 8.3.5 on i686-pc-linux-gnu, compiled by GCC gcc (GCC) 4.2.1
(SUSE Linux)

> 2. explain select * from stock_transaction_detail_106 where transaction_id
not in (select transaction_id from transaction_value);
>  Seq Scan on stock_transaction_detail_106  (cost=2153.95..25245478.39
rows=17064 width=122)
>   Filter: (NOT (subplan))
>   SubPlan
>     ->  Materialize  (cost=2153.95..3401.01 rows=92905 width=4)
>           ->  Seq Scan on transaction_value  (cost=0.00..1743.05
rows=92905 width=4)

> Here the query did not return any results after 1hour.

> In both the computers same query with in condition runs fast (520 ms and
290 ms respectively)

> Please help me to resolve this issue. (One configuration difference
> between these machines are pg_hba.conf file. In production machine it is
> password enabled. In test machine it is trust mode.)


You might find this page interesting:
http://www.depesz.com/index.php/2008/08/13/nulls-vs-not-in/

I assume workmem, effective_cache_size and random_page_cost are all the same
in the 2 postgresql.conf?

Do you get the same after you ANALYZE the database?

If you read the page above you might realise NOT IN is not really what you
want. Maybe NOT EXISTS or a LEFT OUTER JOIN ... WHERE transaction_id IS NULL
?

David.



pgsql-general by date:

Previous
From: "சிவகுமார் மா"
Date:
Subject: Query too slow with "not in" condition
Next
From: "Andrus"
Date:
Subject: Re: db backup script in gentoo