Huge speed penalty using <>TRUE instead of =FALSE - Mailing list pgsql-bugs

From Jan-Ivar Mellingen
Subject Huge speed penalty using <>TRUE instead of =FALSE
Date
Msg-id 4A603279.3030205@alreg.no
Whole thread Raw
Responses Re: Huge speed penalty using <>TRUE instead of =FALSE  (Peter Eisentraut <peter_e@gmx.net>)
Re: Huge speed penalty using <>TRUE instead of =FALSE  (Mikael Krantz <mk@zigamorph.se>)
Re: Huge speed penalty using <>TRUE instead of =FALSE  (Tom Lane <tgl@sss.pgh.pa.us>)
Re: Huge speed penalty using <>TRUE instead of =FALSE  (Jan-Ivar Mellingen <jan-ivar.mellingen@alreg.no>)
List pgsql-bugs
One of our customers discovered that by replacing <>TRUE with =FALSE in
a query of a table containing 750.000 records reduced the query time
from about 12 seconds to about 60 milliseconds!

The problematic query looks like this:
SELECT * FROM AlarmLogg WHERE Logg_avsluttet <> TRUE AND
Alarm_status='X' ORDER BY ID DESC

If it is changed to this it works as expected:
SELECT * FROM AlarmLogg WHERE Logg_avsluttet = FALSE AND
Alarm_status='X' ORDER BY ID DESC

After investigation (on a smaller dataset on my own database) I found
that the query was resulting in a sequential scan:

"explain analyze SELECT * FROM AlarmLogg WHERE Logg_avsluttet <> TRUE
AND Alarm_status='X' ORDER BY ID DESC"
"Sort  (cost=49936.96..49936.96 rows=1 width=405) (actual
time=837.793..837.793 rows=0 loops=1)"
"  Sort Key: id"
"  Sort Method:  quicksort  Memory: 17kB"
"  ->  Seq Scan on alarmlogg  (cost=0.00..49936.95 rows=1 width=405)
(actual time=837.782..837.782 rows=0 loops=1)"
"        Filter: ((logg_avsluttet <> true) AND ((alarm_status)::text =
'X'::text))"
"Total runtime: 837.896 ms"

The modified query gave this result:
"explain analyze SELECT * FROM AlarmLogg WHERE Logg_avsluttet = FALSE
AND Alarm_status='X' ORDER BY ID DESC"
"Sort  (cost=8.36..8.37 rows=1 width=405) (actual time=0.032..0.032
rows=0 loops=1)"
"  Sort Key: id"
"  Sort Method:  quicksort  Memory: 17kB"
"  ->  Index Scan using i_alarmlogg_logg_avsluttet on alarmlogg
(cost=0.00..8.35 rows=1 width=405) (actual time=0.024..0.024 rows=0
loops=1)"
"        Index Cond: (logg_avsluttet = false)"
"        Filter: ((NOT logg_avsluttet) AND ((alarm_status)::text =
'X'::text))"
"Total runtime: 0.123 ms"

This is a dramatical difference, but I cannot understand why. In my head
"<>TRUE" should behave exactly the same as "=FALSE". This looks like a
bug to me, or am I overlooking something?

This was verified on PostgreSQL 8.3.7, both on Windows Xp and Ubuntu 8.10.

Some relevant details from the table definition:
CREATE TABLE alarmlogg
(
   id serial NOT NULL,
   alarm_status character varying(1) DEFAULT ''::character varying,
   logg_avsluttet boolean DEFAULT false,
   ...
   CONSTRAINT alarmlogg_pkey PRIMARY KEY (id)
)

CREATE INDEX i_alarmlogg_alarm_status
  ON alarmlogg
  USING btree
  (alarm_status);

CREATE INDEX i_alarmlogg_logg_avsluttet
  ON alarmlogg
  USING btree
  (logg_avsluttet);

Regards,
Jan-Ivar Mellingen
Securinet AS

pgsql-bugs by date:

Previous
From: Heikki Linnakangas
Date:
Subject: Re: BUG #4926: too few pathkeys for mergeclauses
Next
From: Frank van Vugt
Date:
Subject: Re: bug or simply not enough stack space?