performance inconsistency - Mailing list pgsql-sql

From Phuong Ma
Subject performance inconsistency
Date
Msg-id 3ACB9C4B.1071C2F1@commandprompt.com
Whole thread Raw
Responses Re: performance inconsistency
List pgsql-sql
Good day,

We're running a week-old CVS snapshot of PostgreSQL 7.1, and I'm not
sure
if this performance inconsistency is specific to it, or if this is just
something in PostgreSQL in general, but it seems kind of odd, and I
could
use some help here. ;)

I have run two queries in a table full of invoices where the only
difference
between the queries is the where clause:

SELECT COUNT(DISTINCT l."ORDER_NUM") AS line_orders,
        SUM(l."GROSS_PROFIT" + "REPLACEMENT_COST") AS grand_total,
        SUM(l."LIST_PRICE" * l."SHIPPED_QUANTITY") AS
grand_total_alternative,
        SUM(l."GROSS_PROFIT") AS grand_total_profit
 FROM   cu_invoice_li1_tbl l INNER JOIN cu_invoice_03_tbl h
   ON   l."ORDER_NUM"=h."ORDER_NUM"

WHERE substr("ORDER_NUM", 1, 1) != 'W'; # (Orders NOT prefixed with W.)

OR...

WHERE substr("ORDER_NUM", 1, 1) = 'I';  # (Orders prefixed with I.)

The first query returns more rows than the second (because it includes
orders prefixed with I, C, F, etc), whereas the second only returns
orders
prefixed with I. It seems that the logic of identifying a single
character
as *not* being a single value would be identical or at least similar in
efficiency to the logic of a single character *being* being a single
value.

However, the first query runs in about 10-15 seconds, and the second
query
ran for over 40 minutes before I cancelled it.

Now, I can easily use a series of "!=" statements to get it down so that
it's only the 'I' orders (which runs even faster, at about 5-6
seconds!),
but can anyone explain this to me? :)

Are "!=" substring evaluations inherently faster by an obscene order of
magnitude, or is something really wrong here?



J.

pgsql-sql by date:

Previous
From: Cedar Cox
Date:
Subject: Re: max( bool )?
Next
From: Tom Lane
Date:
Subject: Re: max( bool )?