Thread: Very slow query

Very slow query

From
"Chad Thompson"
Date:
I am looking for some ideas to speed up this query.
I need to look at the first 6 digits of a client's phone number and compare them with the npanxx number in the tiers table.
Then find out whether they are in Tier A, B, or C.
 
select count(*) as a
from call_results_fixed cr, tiers t
where cr.start_time between '07/22/2002 19:30' and '07/22/2002 21:30'
and cr.project_id = 11
and substring(cr.phonenum from 1 for 6) = t.npanxx
and t.tier = 'A' ;
 
Here are the results of explain analyze.

NOTICE:  QUERY PLAN:
 
Aggregate  (cost=38021.71..38021.71 rows=1 width=29) (actual time=42412.76..42412.76 rows=1 loops=1)
  ->  Nested Loop  (cost=0.00..38021.71 rows=1 width=29) (actual time=42412.74..42412.74 rows=0 loops=1)
        ->  Seq Scan on call_results_fixed  (cost=0.00..37488.47 rows=24 width=19) (actual time=42412.73..42412.73 rows=0 loops=1)
        ->  Seq Scan on tiers t  (cost=0.00..22.50 rows=5 width=10)
Total runtime: 42412.95 msec
 
EXPLAIN
 
It's the substring function that slows things down so much.  Is there a better way to compare these string values?
substr(cr.phonenum, 1, 6) is the same speed (and, according to the docs, basically the same function).
 
TIA
Chad

Re: Very slow query

From
Josh Berkus
Date:
Chad,

> It's the substring function that slows things down so much.  Is there a
> better way to compare these string values?
> substr(cr.phonenum, 1, 6) is the same speed (and, according to the docs,
> basically the same function).

Yeah.  You can index it:

CREATE FUNCTION phonenum_substr ( VARCHAR ) RETURNS VARCHAR AS '
    SELECT substr($1, 1, 6);
' LANGUAGE 'sql' WITH (ISCACHABLE, ISSTRICT);
-- This wrapper lets you build a functional index on the substring; the
-- straight substr(col, 1, 6) call is not indexable as of Postgres 7.2.1.

Then:
CREATE INDEX idx_cr_phonenum_substr ON call_results_fixed (phonenum_substr(phonenum));

This should speed things up.
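
For the planner to actually use that index, the query itself has to call the
same function in the join condition; your query would then look roughly like
this (same query, just the substring() call swapped for the new function):

select count(*) as a
from call_results_fixed cr, tiers t
where cr.start_time between '07/22/2002 19:30' and '07/22/2002 21:30'
and cr.project_id = 11
and phonenum_substr(cr.phonenum) = t.npanxx
and t.tier = 'A' ;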



--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


Re: Very slow query

From
Ron Johnson
Date:
On Tue, 2002-07-30 at 16:03, Josh Berkus wrote:
> Chad,
>
> > It's the substring function that slows things down so much.  Is there a
> > better way to compare these string values?
> > substr(cr.phonenum, 1, 6) is the same speed (and, according to the docs,
> > basically the same function).
>
> Yeah.  You can index it:
>
> CREATE FUNCTION phonenum_substr ( VARCHAR ) RETURNS VARCHAR AS '
>     SELECT substr($1, 1, 6);
> ' LANGUAGE 'sql' WITH (ISCACHABLE, ISSTRICT);
> -- This wrapper lets you build a functional index on the substring; the
> -- straight substr(col, 1, 6) call is not indexable as of Postgres 7.2.1.
>
> Then:
> CREATE INDEX idx_cr_phonenum_substr ON call_results_fixed (phonenum_substr(phonenum));
>
> This should speed things up.

A couple of months ago, I asked a similar question, when I saw
that a COUNT(*) was doing a sequential scan even though it had a perfect
supporting index.  This is the paraphrased answer:
The aggregate functions module is a complicated black box that
the developers are scared to look at and break.  Besides, I
hardly ever use COUNT(*), so you don't need it either.

Horrified by the fact that aggregates don't use indexes, I've
stopped my experiments with Postgres, since reporting/summarizing
is what we were looking to Postgres for, anyway.

--
+-----------------------------------------------------------------+
| Ron Johnson, Jr.        Home: ron.l.johnson@cox.net             |
| Jefferson, LA  USA                                              |
|                                                                 |
| "The greatest dangers to liberty lurk in insidious encroachment |
|  by men of zeal, well-meaning, but without understanding."      |
|   Justice Louis Brandeis, dissenting, Olmstead v US (1928)      |
+-----------------------------------------------------------------+


Re: Very slow query

From
Josh Berkus
Date:
Ron,

> A couple of months ago, I asked a similar question, when I saw
> that that a COUNT(*) was scanning, even though it had a perfect
> supporting index.  This is the paraphrased answer:
> The aggregate functions module is a complicated black box that
> the developers are scared to look at and break.  Besides, I
> hardly ever use COUNT(*), so you don't need it either.

That's the price we pay for having the ability to create custom aggregates.
Since it's possible for me in Postgres to make an aggregate called
comma_cat(varchar) that concatenates a varchar column into a comma-delimited
list -- for example -- the planner cannot optimize for what goes on inside the
aggregate.
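
Just to illustrate (the names here are made up for the example, not anything
built in), a minimal version of such an aggregate could be declared like this
in 7.2:

CREATE FUNCTION comma_cat_state ( VARCHAR, VARCHAR ) RETURNS VARCHAR AS '
    SELECT $1 || '','' || $2;
' LANGUAGE 'sql' WITH (ISSTRICT);
-- Strict transition function plus a null initial state means the first
-- non-null value simply becomes the running state, so no leading comma.

CREATE AGGREGATE comma_cat (
    BASETYPE = VARCHAR,
    SFUNC    = comma_cat_state,
    STYPE    = VARCHAR
);

-- e.g.  SELECT comma_cat(phonenum) FROM call_results_fixed WHERE project_id = 11;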

The hackers list has discussed the possibility of writing planner optimizations
just for the built-in aggregates where index use is reasonable (COUNT, MIN,
MAX).  However, nobody who cares enough about the issue has stepped up to the
plate to offer their code.  And it would require *extensive* testing.

It's a priority thing.  Sometimes I'm annoyed that MAX() is slow in Postgres.
More often I'm annoyed that MSSQL does not support custom aggregates.

And, this doesn't affect my advice to Chad.  Indexing phonenum_substr *will*
speed up the query by speeding up the JOIN portion.

--
-Josh Berkus
 Aglio Database Solutions
 San Francisco


Re: Very slow query

From
"Henshall, Stuart - WCP"
Date:

I thought there was also a problem with count(*) in that it would have to scan the tuples anyway to check for visibility. Not 100% sure on that.

I believe the best way to do SELECT max() is:
SELECT * FROM tbl ORDER BY mx_field DESC LIMIT 1;
This does of course have the problem of not being standards compliant.
I'm not sure how optimised this is, so tell me if I'm talking gibberish :)
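
For that trick to beat a full scan there would presumably need to be a btree
index on the column; a minimal sketch, using the placeholder names from the
query above:

CREATE INDEX idx_tbl_mx_field ON tbl (mx_field);
-- With this in place, the planner can answer
--   SELECT * FROM tbl ORDER BY mx_field DESC LIMIT 1;
-- with a backwards index scan rather than a seq scan plus sort.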
- Stuart
