Re: Simple machine-killing query! - Mailing list pgsql-performance

From	Josh Berkus
Subject	Re: Simple machine-killing query!
Date	October 21, 2004 21:15:09
Msg-id	200410211014.39916.josh@agliodbs.com
In response to	Simple machine-killing query! (Victor Ciurus <vikcious@gmail.com>)
List	pgsql-performance

Victor,

> [explain] select * from BIGMA where string not in (select * from DIRTY);
>                                QUERY PLAN
> ------------------------------------------------------------------------
>  Seq Scan on bigma  (cost=0.00..24582291.25 rows=500 width=145)
>    Filter: (NOT (subplan))
>    SubPlan
>      ->  Seq Scan on dirty  (cost=0.00..42904.63 rows=2503963 width=82)

This is what you call an "evil query".   I'm not surprised it takes forever;
you're telling the database "Compare every value in 2.7 million rows of text
against 2.5 million rows of text and give me those that don't match."   There
is simply no way, on ANY RDBMS, for this query to execute and not eat all of
your RAM and CPU for a long time.

You're simply going to have to give the query plenty of memory (raise
shared_buffers, and set sort_mem to about 2GB) and turn the computer over to
the task until it's done. And, for the sake of sanity, when you find the
200,000 rows that don't match, flag them so that you don't have to do this
again.
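
A minimal sketch of that flagging step, run here on toy data with Python's
sqlite3 module standing in for PostgreSQL so that it is self-contained. The
flag column not_in_dirty, the helper flag_unmatched, the sample rows, and the
assumption that DIRTY has a single column also named string are all
hypothetical; none of them come from the original tables.

```python
import sqlite3


def flag_unmatched(conn):
    """Run the expensive NOT IN comparison once and persist the result.

    Assumes tables bigma(string) and dirty(string); the flag column
    not_in_dirty is a hypothetical name chosen for this sketch.
    """
    cur = conn.cursor()
    # Add a flag column that defaults to "matched" (0).
    cur.execute(
        "ALTER TABLE bigma ADD COLUMN not_in_dirty INTEGER NOT NULL DEFAULT 0"
    )
    # The same NOT IN comparison as in the thread, used once to set the flag.
    cur.execute(
        "UPDATE bigma SET not_in_dirty = 1 "
        "WHERE string NOT IN (SELECT string FROM dirty)"
    )
    flagged = cur.rowcount  # number of rows the UPDATE modified
    conn.commit()
    return flagged


# Toy stand-ins for the real multi-million-row tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE bigma (string TEXT)")
conn.execute("CREATE TABLE dirty (string TEXT)")
conn.executemany(
    "INSERT INTO bigma VALUES (?)", [("a",), ("b",), ("c",), ("d",)]
)
conn.executemany("INSERT INTO dirty VALUES (?)", [("b",), ("d",), ("x",)])

flagged = flag_unmatched(conn)

# Later lookups read the stored flag instead of repeating the comparison.
unmatched = [
    row[0]
    for row in conn.execute(
        "SELECT string FROM bigma WHERE not_in_dirty = 1 ORDER BY string"
    )
]
print(flagged, unmatched)
```

The UPDATE still pays the full cost of the comparison once; the point is only
that later queries can filter on the stored flag rather than rescan DIRTY.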

--
Josh Berkus
Aglio Database Solutions
San Francisco
