Expensive query - Mailing list pgsql-general

From Christophe Pettus
Subject Expensive query
Date
Msg-id 4.1.19981030114400.00a1a810@exchange.postdirect.com
Whole thread Raw
Responses Re: [GENERAL] Expensive query
List pgsql-general
I have two tables, structured as:

users:

    id        varchar(70) not null unique

events:

    userid        varchar(70) not null,
    code        char(10) not null,
    when        datetime not null

The query I need to perform answers the question, "Which users do NOT
have a particular event (selected by code), and which do not have ANY
event for the last day?"  The query I use is:

    select id from users
       where
          id not in (
             select unique id from events
             where code = 'some code'
          )
          and
          id not in (
             select unique id from events
             where age('now',when) <= '1 day'
          );

This query is *very* expensive.  With 10,000 users and 40,000 events,
it can take up to 20 minutes (!) to execute, and the postgres process
grows to 40 megabytes (!!) of memory (on a Pentium II system running
BSDI).  This seems surprising, since each of the subqueries only needs
to be evaluated once, rather than once per row of users.  Is there a
way to reformulate this query to make it less expensive?

pgsql-general by date:

Previous
From: Mike Meyer
Date:
Subject: Transaction aborted?
Next
From: "Michael A. Koerber SR"
Date:
Subject: Re: [GENERAL] Creating web images from postgres data