Re: long running query running too long - Mailing list pgsql-performance

From Todd Fulton
Subject Re: long running query running too long
Date
Msg-id 00b401c3f596$78f92800$326aa8c0@juarez
Whole thread Raw
In response to Re: long running query running too long  (PC Drew <drewpc@ibsncentral.com>)
Responses Re: long running query running too long  (PC Drew <drewpc@ibsncentral.com>)
List pgsql-performance
Hey!  I think I have appropriate indexes, but might now.  You're
absolutely right on my join -- spk_tgplog has the 8.5 million rows,
spk_tgp around 2400.  I'll try the sub-select.  Here is the output you
asked for:

spank_prod=# \d spk_tgp;
                                             Table "spk_tgp"
     Column     |           Type           |
Modifiers
----------------+--------------------------+----------------------------
---------------------------------
 tgpid          | bigint                   | not null
 directoryname  | character varying(64)    | not null
 directoryurl   | character varying(1028)  | not null
 submiturl      | character varying(1028)  |
 submitdate     | date                     |
 acceptdate     | date                     |
 templateid     | character varying(64)    | not null
 reciprocalcode | character varying(2056)  |
 notes          | character varying(2056)  |
 createdate     | timestamp with time zone | not null default
('now'::text)::timestamp(6) with time zone
 modifydate     | timestamp with time zone | not null default
('now'::text)::timestamp(6) with time zone
 requested      | integer                  |
 hostid         | integer                  | default 1
Indexes: idx_spk_tgp_tgpid
Primary key: pk_spk_tgp

spank_prod=# \d idx_spk_tgp_tgpid
       Index "idx_spk_tgp_tgpid"
    Column     |         Type
---------------+-----------------------
 tgpid         | bigint
 directoryname | character varying(64)
btree

spank_prod=# \d spk_tgplog;
                                           Table "spk_tgplog"
    Column     |           Type           |
Modifiers
---------------+--------------------------+-----------------------------
--------------------------------
 remoteaddress | character varying(32)    | not null
 tgpid         | bigint                   | not null
 referer       | character varying(256)   |
 createdate    | timestamp with time zone | not null default
('now'::text)::timestamp(6) with time zone
Indexes: idx_spk_tgplog_createdate,
         idx_spk_tgplog_tgpid

spank_prod=# \d idx_spk_tgplog_createdate
   Index "idx_spk_tgplog_createdate"
   Column   |           Type
------------+--------------------------
 createdate | timestamp with time zone
btree

spank_prod=# \d idx_spk_tgplog_tgpid
Index "idx_spk_tgplog_tgpid"
 Column |  Type
--------+--------
 tgpid  | bigint
btree



Todd



-----Original Message-----
From: pgsql-performance-owner@postgresql.org
[mailto:pgsql-performance-owner@postgresql.org] On Behalf Of PC Drew
Sent: Tuesday, February 17, 2004 12:05 PM
To: Todd Fulton
Cc: pgsql-performance@postgresql.org
Subject: Re: [PERFORM] long running query running too long


On Feb 17, 2004, at 10:06 AM, Todd Fulton wrote:
>
>
> I’ve got a table with about 8 million rows and growing.  I must run
> reports daily off this table, and another smaller one.  Typical query
> – joins, groupings and aggregates included.  This certain report takes

> about 10 minutes on average and is getting longer.  I’ve created all
> the indices I think are necessary.
>
>

What indexes have you created?  The query is not using any indexes, so
there might be a problem there.  Can you disable seqscans temporarily
to test this?

>
> prod=# explain analyze SELECT t.tgpid, t.directoryname, t.templateid,
> count(*) AS requested FROM (spk_tgp t JOIN spk_tgplog l ON ((t.tgpid =

> l.tgpid))) GROUP BY t.tgpid, t.directoryname, t.templateid;


Can you please send the results of the following commands:

psql=# \d spk_tgp

and

psql=# \d spk_tgplog


You might also want to try using a sub-query instead of a join.  I'm
assuming that the spk_tgplog table has a lot of rows and spk_tgp has
very few rows.  It might make sense to try something like this:

EXPLAIN ANALYZE
SELECT t.tgpid, t.directoryname, t.templateid, r.requested
FROM (SELECT tgpid, count(*) AS requested FROM spk_tgplog GROUP BY
tgpid) r, spk_tgp t
WHERE r.tgpid = t.tgpid;

--
PC Drew


---------------------------(end of broadcast)---------------------------
TIP 1: subscribe and unsubscribe commands go to majordomo@postgresql.org


pgsql-performance by date:

Previous
From: PC Drew
Date:
Subject: Re: long running query running too long
Next
From: Mike Glover
Date:
Subject: RAID or manual split?