Re: Query Analysis - Mailing list pgsql-performance

From
Subject Re: Query Analysis
Date
Msg-id 49187.216.80.95.13.1037895374.squirrel@www.l-i-e.com
Whole thread Raw
In response to Re: Query Analysis  (Josh Berkus <josh@agliodbs.com>)
List pgsql-performance
Since it's 7.1.3 I don't have the "ANALYZE" bit in EXPLAIN, but:

archive_beta=> explain SELECT DISTINCT *, 0 + (0 + 10 * (lower(title) like
'%einstein%') ::int + 10 * (lower(author_flattened) like '%einstein%')
::int + 30 * (lower(subject_flattened) like '%einstein%') ::int + 30 *
(lower(text) LIKE '%einstein%') ::int + 9 * (substring(lower(title), 1,
20) like '%einstein%') ::int + 25 * (substring(lower(text), 1, 20) LIKE
'%einstein%') ::int ) AS points FROM article WHERE TRUE AND (FALSE OR
(lower(title) like '%einstein%') OR (lower(author_flattened) like
'%einstein%') OR (lower(subject_flattened) like '%einstein%') OR
(lower(text) LIKE '%einstein%') ) ORDER BY points desc, volume, number,
article.article LIMIT 10, 0 ;
NOTICE:  QUERY PLAN:

Limit  (cost=1374.97..1375.02 rows=1 width=212)
  ->  Unique  (cost=1374.97..1375.02 rows=1 width=212)
        ->  Sort  (cost=1374.97..1374.97 rows=1 width=212)
              ->  Seq Scan on article  (cost=0.00..1374.96 rows=1 width=212)

EXPLAIN
archive_beta=> explain  SELECT *, 0 + 3 * ( title like '%Einstein%' )::int
+ 3 * ( author like  '%Einstein%' )::int + ( ( 1 + 1 * ( lower(text) like
'%einstein%' )::int )  + ( 0 + ( subject like '%Einstein%' )::int ) ) AS
points FROM article  WHERE TRUE AND title like '%Einstein%' AND author
like '%Einstein%' AND (  ( TRUE AND lower(text) like '%einstein%' ) OR (
FALSE OR subject like  '%Einstein%' ) ) ORDER BY points desc, volume,
number, article.article  LIMIT 10, 0;
NOTICE:  QUERY PLAN:

Limit  (cost=1243.48..1243.48 rows=1 width=212)
  ->  Sort  (cost=1243.48..1243.48 rows=1 width=212)
        ->  Seq Scan on article  (cost=0.00..1243.47 rows=1 width=212)

While the first one is higher, these two do not seem drastically different
to me -- Those numbers are accumulative, right?  So the top row is my
"final answer"  The extra Unique row doesn't seem to be adding
significantly to the numbers as far as EXPLAIN can tell...

And yet the queries are orders of magnitude apart in actual performance.

'Course, I don't claim to completely understand the output of EXPLAIN yet
either.

I also took out the DISTINCT in the first one, just to test.  It was
certainly "faster" but not nearly so much that it "caught up" to the other
query.

Thanks in advance for any help!




pgsql-performance by date:

Previous
From: David Pradier
Date:
Subject: Is there a system of cache in pgsql 7.3rc1 ?
Next
From: "Rajesh Kumar Mallah."
Date:
Subject: H/W RAID 5 on slower disks versus no raid on faster HDDs