Re: [GENERAL] Query Optimisation - Mailing list pgsql-performance

From Stephan Szabo
Subject Re: [GENERAL] Query Optimisation
Date
Msg-id 20030715090339.K95323-100000@megazone.bigpanda.com
Whole thread Raw
List pgsql-performance
[replying to -performance]

On Tue, 15 Jul 2003 psql-mail@freeuk.com wrote:

> Hi, I have the following query - is there anything i've missed or is it
> just slow?!

The fact that it underestimates the number of matching message rows by a
factor of about 4000 doesn't help. I'm not sure you're going to be able to
get a better estimate using message.header||message.body ILIKE '%chicken%'
(possibly using two ilikes with or might help but probably not enough).
Have you vacuum analyzed the two tables recently?  The seq scan cost on
message seems fairly low given what I would expect to be the size of that
table.

> I am currently in the process of setting up full text indexing as
> described in the techdocs.postgresql.org i guess this is the main way
> of speeding up searches through ~40GB of bulk text?

That's still probably the best way.

> EXPLAIN ANALYZE SELECT meta.msg_id, date, from_line, subject FROM
> message ,meta WHERE meta.date >= '15-06-2003 00:00:00' AND meta.date <=
> '26-06-2003 00:00:00' AND message.header||message.body ILIKE '%chicken%'
> AND meta.sys_id = message.sys_id ORDER BY meta.date DESC;
>   QUERY PLAN
>
> ------------------------------------------------------------------------
> ------------------------------------------------------------------------
> ------
> Sort  (cost=50.16..50.17 rows=1 width=120) (actual time=412333.65..
> 412333.76 rows=166 loops=1)
> Sort Key: meta.date
> ->  Nested Loop  (cost=0.00..50.15 rows=1 width=120) (actual time=
> 400713.41..412332.53 rows=166 loops=1)
> ->  Seq Scan on message  (cost=0.00..25.00 rows=5 width=8) (actual time=
> 58.18..410588.49 rows=20839 loops=1)
> Filter: ((header || body) ~~* '%chicken%'::text)
> ->  Index Scan using meta_pkey on meta  (cost=0.00..5.02 rows=1 width=
> 112) (actual time=0.07..0.07 rows=0 loops=20839)
> Index Cond: (meta.sys_id = "outer".sys_id)
> Filter: ((date >= '2003-06-15 00:00:00'::timestamp without time zone)
> AND (date <= '2003-06-26 00:00:00'::timestamp without time zone))
> Total runtime: 412334.08 msec
> (9 rows)
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
>                http://archives.postgresql.org
>


pgsql-performance by date:

Previous
From: "Shridhar Daithankar"
Date:
Subject: Re: Sanity check requested
Next
From: Vivek Khera
Date:
Subject: Re: Tunning FreeeBSD and PostgreSQL