Home > mailing lists

Re: [GENERAL] Expensive query - Mailing list pgsql-general

From	Dustin Sallings
Subject	Re: [GENERAL] Expensive query
Date	October 30, 1998 17:13:30
Msg-id	Pine.SGI.3.95.981030140928.28572A-100000@bleu.west.spy.net Whole thread Raw
In response to	Expensive query (Christophe Pettus <pettus@postdirect.com>)
Responses	Re: [GENERAL] Expensive query
List	pgsql-general

Tree view

On Fri, 30 Oct 1998, Christophe Pettus wrote:

    That users table looks kinda useless.  I'd recommend changing id
to name and making id an int, then putting your events users in as ints.
Currently, it doesn't do anything at all, and you might as well be doing
this query:

    select distinct id from events where code = 'Whatever'
        and age('now',when) <= ' 1 day';

    I would also recommend not making code a char(10).  If you put an
index on code and an index on date, you should be able to get your results
pretty quick.  If you change the userid and the code to integers and do a
three table join, it might be faster, but it would use *much* less disk
space.

# I have two tables, structured as:
#
# users:
#
#     id        varchar(70) not null unique
#
# events:
#
#     userid        varchar(70) not null,
#     code        char(10) not null,
#     when        datetime not null
#
# The query I need to perform answers the question, "Which users do NOT
# have a particular event (selected by code), and which do not have ANY
# event for the last day?"  The query I use is:
#
#     select id from users
#        where
#           id not in (
#              select unique id from events
#              where code = 'some code'
#           )
#           and
#           id not in (
#              select unique id from events
#              where age('now',when) <= '1 day'
#           );
#
# This query is *very* expensive.  With 10,000 users and 40,000 events,
# it can take up to 20 minutes (!) to execute, and the postgres process
# grows to 40 megabytes (!!) of memory (on a Pentium II system running
# BSDI).  This seems surprising, since each of the subqueries only needs
# to be evaluated once, rather than once per row of users.  Is there a
# way to reformulate this query to make it less expensive?
#
#

--
SA, beyond.com           My girlfriend asked me which one I like better.
pub  1024/3CAE01D5 1994/11/03 Dustin Sallings <dustin@spy.net>
|    Key fingerprint =  87 02 57 08 02 D0 DA D6  C8 0F 3E 65 51 98 D8 BE
L_______________________ I hope the answer won't upset her. ____________

pgsql-general by date:

From: "Michael A. Koerber SR"
Date: 30 October 1998, 16:49:24
Subject: Re: [GENERAL] Creating web images from postgres data

From: Adam Miller
Date: 30 October 1998, 17:25:16
Subject: thrashing like a beached shark

Re: [GENERAL] Expensive query - Mailing list pgsql-general

Previous

Next