Thread: Very slow queries - please help.

Very slow queries - please help.

From
"Bealach-na Bo"
Date:
Hi Folks,

I'm new to Postgresql.

I'm having great difficulties getting the performance I had hoped for
from Postgresql 8.0. The typical query below takes ~20 minutes !!

I hope an expert out there will tell me what I'm doing wrong - I hope
*I* am doing something wrong.

Hardware
--------
Single processor, Intel Xeon 3.06 GHz machine running Red Hat
Ent. 4. with 1.5 GB of RAM.

The machine is dedicated to running Postgresql 8.0 and Apache/mod_perl
etc. The database is being accessed for report generation via a web
form. The web server talks to Pg over TCP/IP (I know, that I don't
need to do this if they are all on the same machine, but I have good
reasons for this and don't suspect that this is where my problems are
- I have the same poor performance when running from psql on the
server.)

Database
--------
Very simple, not fully normalized set of two tables. The first table,
very small (2000 lines of 4 cols with very few chars and integers in
in col). The other quite a bit larger (500000 lines with 15
cols. with the largest fields ~ 256 chars)

Typical query
------------

SELECT n.name
FROM node n
WHERE n.name
LIKE '56x%'
AND n.type='H'
AND n.usage='TEST'
AND n.node_id
NOT IN
(select n.node_id
FROM job_log j
INNER JOIN node n
ON j.node_id = n.node_id
WHERE n.name
LIKE '56x%'
AND n.type='H'
AND n.usage='TEST'
AND j.job_name = 'COPY FILES'
AND j.job_start >= '2005-11-14 00:00:00'
AND (j.job_stop <= '2005-11-22 09:31:10' OR j.job_stop IS NULL))
ORDER BY n.name


The node table is the small table and the job_log table is the large
table.


I've tried all the basic things that I found in the documentation like
VACUUM ANALYZE, EXPLAIN etc., but I suspect there is something
terribly wrong with what I'm doing and these measures will not shave
off 19 min and 50 seconds off the query time.

Any help and comments would be very much appreciated.


Bealach



Re: Very slow queries - please help.

From
Claus Guttesen
Date:
> Typical query
> ------------
>
> SELECT n.name
> FROM node n
> WHERE n.name
> LIKE '56x%'
> AND n.type='H'
> AND n.usage='TEST'
> AND n.node_id
> NOT IN
> (select n.node_id
> FROM job_log j
> INNER JOIN node n
> ON j.node_id = n.node_id
> WHERE n.name
> LIKE '56x%'
> AND n.type='H'
> AND n.usage='TEST'
> AND j.job_name = 'COPY FILES'
> AND j.job_start >= '2005-11-14 00:00:00'
> AND (j.job_stop <= '2005-11-22 09:31:10' OR j.job_stop IS NULL))
> ORDER BY n.name

Do you have any indexes?

regards
Claus

Re: Very slow queries - please help.

From
"Bealach-na Bo"
Date:
Hi,

Thanks for your comments. I've explicitly made any indexes, but the
default ones are:



user@10.0.0.2.dbdev=> \di
                   List of relations
Schema  |      Name       | Type  |  Owner  |  Table
---------+-----------------+-------+---------+---------
user | job_log_id_pkey | index | user | job_log
user | node_id_pkey    | index | user | node
user | node_name_key   | index | user | node
(3 rows)



I'm also sending the EXPLAIN outputs.





      explain SELECT n.name,n.type,
             n.usage, j.status,
             j.job_start,j.job_stop,
             j.nfiles_in_job,j.job_name
      FROM job_log j
      INNER JOIN node n
      ON j.node_id = n.node_id
      WHERE n.name
      LIKE '56x%'
      AND n.type = 'K'
      AND n.usage = 'LIVE'
      AND j.job_name = 'COPY FILES'
      AND j.job_start >= '2005-11-14 00:00:00'
      AND (j.job_stop <= '2005-11-14 05:00:00' OR j.job_stop IS NULL)
      ORDER BY n.name;



                                  QUERY PLAN

--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Nested Loop  (cost=0.00..75753.31 rows=1 width=461)
   Join Filter: ("inner".node_id = "outer".node_id)
   ->  Index Scan using node_name_key on node n  (cost=0.00..307.75 rows=1
width=181)
         Filter: ((name ~~ '56x%'::text) AND ("type" = 'K'::bpchar) AND
("usage" = 'LIVE'::bpchar))
   ->  Seq Scan on job_log j  (cost=0.00..75445.54 rows=1 width=288)
         Filter: ((job_name = 'COPY FILES'::bpchar) AND (job_start >=
'2005-11-14 00:00:00'::timestamp without time zone) AND ((job_stop <=
'2005-11-14 05:00:00'::timestamp without time zone) OR (job_stop IS NULL)))
(6 rows)


      explain SELECT n.name, n.type, n.usage
      FROM node n
      WHERE n.name
      LIKE '56x%'
      AND n.type  = 'K'
      AND n.usage = 'LIVE'
      AND n.node_id
      NOT IN
      (SELECT n.node_id
      FROM job_log j
      INNER JOIN node n
      ON j.node_id = n.node_id
      WHERE n.name
      LIKE '56x%'
      AND n.type  = 'K'
      AND n.usage = 'LIVE'
      AND j.job_name = 'COPY FILES'
      AND j.job_start >= '2005-11-14 00:00:00'
      AND (j.job_stop <= '2005-11-14 05:00:00' OR j.job_stop IS NULL))
      ORDER BY n.name;







                                      QUERY PLAN

----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
Index Scan using node_name_key on node n  (cost=75451.55..75764.94 rows=1
width=177)
   Filter: ((name ~~ '56x%'::text) AND ("type" = 'K'::bpchar) AND ("usage" =
'LIVE'::bpchar) AND (NOT (hashed subplan)))
   SubPlan
     ->  Nested Loop  (cost=0.00..75451.54 rows=1 width=4)
           ->  Seq Scan on job_log j  (cost=0.00..75445.54 rows=1 width=4)
                 Filter: ((job_name = 'COPY FILES'::bpchar) AND (job_start
 >= '2005-11-14 00:00:00'::timestamp without time zone) AND ((job_stop <=
'2005-11-14 05:00:00'::timestamp without time zone) OR (job_stop IS NULL)))
           ->  Index Scan using node_id_pkey on node n  (cost=0.00..5.99
rows=1 width=4)
                 Index Cond: ("outer".node_id = n.node_id)
                 Filter: ((name ~~ '56x%'::text) AND ("type" = 'K'::bpchar)
AND ("usage" = 'LIVE'::bpchar))


Yours,

Bealach


>From: Claus Guttesen <kometen@gmail.com>
>To: Bealach-na Bo <bealach_na_bo@hotmail.com>
>CC: pgsql-performance@postgresql.org
>Subject: Re: [PERFORM] Very slow queries - please help.
>Date: Thu, 24 Nov 2005 14:23:38 +0100
>
> > Typical query
> > ------------
> >
> > SELECT n.name
> > FROM node n
> > WHERE n.name
> > LIKE '56x%'
> > AND n.type='H'
> > AND n.usage='TEST'
> > AND n.node_id
> > NOT IN
> > (select n.node_id
> > FROM job_log j
> > INNER JOIN node n
> > ON j.node_id = n.node_id
> > WHERE n.name
> > LIKE '56x%'
> > AND n.type='H'
> > AND n.usage='TEST'
> > AND j.job_name = 'COPY FILES'
> > AND j.job_start >= '2005-11-14 00:00:00'
> > AND (j.job_stop <= '2005-11-22 09:31:10' OR j.job_stop IS NULL))
> > ORDER BY n.name
>
>Do you have any indexes?
>
>regards
>Claus



Re: Very slow queries - please help.

From
Guillaume Smet
Date:
Hi,

> I'm also sending the EXPLAIN outputs.

Please provide EXPLAIN ANALYZE outputs instead of EXPLAIN. You will have
more information.

Indexes on your tables are obviously missing. You should try to add:

CREATE INDEX idx_node_filter ON node(name, type, usage);
CREATE INDEX idx_job_log_filter ON job_log(job_name, job_start, job_stop);

I'm not so sure it's a good idea to add job_stop in this index as you
have an IS NULL in your query so I'm not sure it can be used. You should
try it anyway and remove it if not needed.

I added all your search fields in the indexes but it depends a lot on
the selectivity of your conditions. I don't know your data but I think
you understand the idea.

HTH

--
Guillaume

Re: Very slow queries - please help.

From
Tom Lane
Date:
"Bealach-na Bo" <bealach_na_bo@hotmail.com> writes:
> I'm having great difficulties getting the performance I had hoped for
> from Postgresql 8.0. The typical query below takes ~20 minutes !!

You need to show us the table definition (including indexes) and the
EXPLAIN ANALYZE results for the query.

It seems likely that the NOT IN is the source of your problems,
but it's hard to be sure without EXPLAIN results.

            regards, tom lane