Thread: Query Optimisation

Query Optimisation

From
psql-mail@freeuk.com
Date:
Hi, I have the following query - is there anything i've missed or is it
just slow?!

I have an index on meta.date that i thought might have been used but
isn't (I know it would only be a small performance increase in the
current plan).

meta.date is between 1999 and 2003. I think generally the most
efficient order to do things would be to extract all the messages
within the date range and then search over just them.

I am currently in the process of setting up full text indexing as
described in the techdocs.postgresql.org i guess this is the main way
of speeding up searches through ~40GB of bulk text?

Thanks!...
m

EXPLAIN ANALYZE SELECT meta.msg_id, date, from_line, subject FROM
message ,meta WHERE meta.date >= '15-06-2003 00:00:00' AND meta.date <=
'26-06-2003 00:00:00' AND message.header||message.body ILIKE '%chicken%'
AND meta.sys_id = message.sys_id ORDER BY meta.date DESC;
  QUERY PLAN

------------------------------------------------------------------------
------------------------------------------------------------------------
------
Sort  (cost=50.16..50.17 rows=1 width=120) (actual time=412333.65..
412333.76 rows=166 loops=1)
Sort Key: meta.date
->  Nested Loop  (cost=0.00..50.15 rows=1 width=120) (actual time=
400713.41..412332.53 rows=166 loops=1)
->  Seq Scan on message  (cost=0.00..25.00 rows=5 width=8) (actual time=
58.18..410588.49 rows=20839 loops=1)
Filter: ((header || body) ~~* '%chicken%'::text)
->  Index Scan using meta_pkey on meta  (cost=0.00..5.02 rows=1 width=
112) (actual time=0.07..0.07 rows=0 loops=20839)
Index Cond: (meta.sys_id = "outer".sys_id)
Filter: ((date >= '2003-06-15 00:00:00'::timestamp without time zone)
AND (date <= '2003-06-26 00:00:00'::timestamp without time zone))
Total runtime: 412334.08 msec
(9 rows)

Re: Query Optimisation

From
Jonathan Bartlett
Date:
Try explicitly casting the values into the appropriate date type.

Jon

On Tue, 15 Jul 2003 psql-mail@freeuk.com wrote:

> Hi, I have the following query - is there anything i've missed or is it
> just slow?!
>
> I have an index on meta.date that i thought might have been used but
> isn't (I know it would only be a small performance increase in the
> current plan).
>
> meta.date is between 1999 and 2003. I think generally the most
> efficient order to do things would be to extract all the messages
> within the date range and then search over just them.
>
> I am currently in the process of setting up full text indexing as
> described in the techdocs.postgresql.org i guess this is the main way
> of speeding up searches through ~40GB of bulk text?
>
> Thanks!...
> m
>
> EXPLAIN ANALYZE SELECT meta.msg_id, date, from_line, subject FROM
> message ,meta WHERE meta.date >= '15-06-2003 00:00:00' AND meta.date <=
> '26-06-2003 00:00:00' AND message.header||message.body ILIKE '%chicken%'
> AND meta.sys_id = message.sys_id ORDER BY meta.date DESC;
>   QUERY PLAN
>
> ------------------------------------------------------------------------
> ------------------------------------------------------------------------
> ------
> Sort  (cost=50.16..50.17 rows=1 width=120) (actual time=412333.65..
> 412333.76 rows=166 loops=1)
> Sort Key: meta.date
> ->  Nested Loop  (cost=0.00..50.15 rows=1 width=120) (actual time=
> 400713.41..412332.53 rows=166 loops=1)
> ->  Seq Scan on message  (cost=0.00..25.00 rows=5 width=8) (actual time=
> 58.18..410588.49 rows=20839 loops=1)
> Filter: ((header || body) ~~* '%chicken%'::text)
> ->  Index Scan using meta_pkey on meta  (cost=0.00..5.02 rows=1 width=
> 112) (actual time=0.07..0.07 rows=0 loops=20839)
> Index Cond: (meta.sys_id = "outer".sys_id)
> Filter: ((date >= '2003-06-15 00:00:00'::timestamp without time zone)
> AND (date <= '2003-06-26 00:00:00'::timestamp without time zone))
> Total runtime: 412334.08 msec
> (9 rows)
>
> ---------------------------(end of broadcast)---------------------------
> TIP 6: Have you searched our list archives?
>
>                http://archives.postgresql.org
>


Re: Query Optimisation

From
Tom Lane
Date:
psql-mail@freeuk.com writes:
> ->  Seq Scan on message  (cost=0.00..25.00 rows=5 width=8) (actual time=
> 58.18..410588.49 rows=20839 loops=1)
> Filter: ((header || body) ~~* '%chicken%'::text)

Estimated cost of a seqscan only 25?  Have you ever vacuumed or analyzed
that table?  The planner evidently thinks it is tiny ...

            regards, tom lane