OR vs UNION vs UNION ALL - Mailing list pgsql-general

From Etienne Champetier
Subject OR vs UNION vs UNION ALL
Date
Msg-id CAOdf3gqf9Hhxk3g6qqkV6SkQEpM2JiARY45W0ATQ=JdqvCntLA@mail.gmail.com
Whole thread Raw
Responses Re: OR vs UNION vs UNION ALL
List pgsql-general
Hi,

I've reworked a bit an application to make 1 query (with many OR) instead of thousands (2900 in this bench),
but was a bit disappointed with the perf. Here follow a quick bench and 2 questions at the end:

Here is my table
CREATE TABLE properties
(
  item_id text NOT NULL,
  calendar_id uuid NOT NULL,
  clef text,
  valeur text,
  recurrence_id bigint,
  CONSTRAINT fk_props_id FOREIGN KEY (item_id, calendar_id, recurrence_id)
      REFERENCES cal_item (itemid, calendar_id, recurrence_id) MATCH SIMPLE
      ON UPDATE CASCADE ON DELETE CASCADE
)


Here are 3 versions of the "same" query, there is ~2900 conditions (OR, UNION, UNION ALL), returning ~9000 lines (same exact number each time)
(i've ran the tests multiple time, using postgresql 9.3.5 on rhel 6, on an idle physical server)

1)
SELECT clef,valeur,ITEM_ID,CALENDAR_ID,RECURRENCE_ID FROM PROPERTIES WHERE
 (ITEM_ID=((E'1263d4ac-d8a3-44d7-b971-5eee947ff36a')::text) AND CALENDAR_ID=((E'f9532857-ca68-497b-b2a9-162507110e55')::uuid) AND RECURRENCE_ID=((0)::int8)) OR
 (ITEM_ID=((E'9a1e40bb-0280-4e79-ae13-ee57d614fdef')::text) AND CALENDAR_ID=((E'f9532857-ca68-497b-b2a9-162507110e55')::uuid) AND RECURRENCE_ID=((0)::int8)) OR
.....

total time: 7.6s
plan:
Index Cond,
Bitmap index scan on fk_props_id,
BitmapOr,
Recheck Cond,
Bitmap Heap Scan on properties

2)
SELECT clef,valeur,ITEM_ID,CALENDAR_ID,RECURRENCE_ID FROM PROPERTIES WHERE (ITEM_ID=((E'1263d4ac-d8a3-44d7-b971-5eee947ff36a')::text) AND CALENDAR_ID=((E'f9532857-ca68-497b-b2a9-162507110e55')::uuid) AND RECURRENCE_ID=((0)::int8)) UNION
SELECT clef,valeur,ITEM_ID,CALENDAR_ID,RECURRENCE_ID FROM PROPERTIES WHERE (ITEM_ID=((E'9a1e40bb-0280-4e79-ae13-ee57d614fdef')::text) AND CALENDAR_ID=((E'f9532857-ca68-497b-b2a9-162507110e55')::uuid) AND RECURRENCE_ID=((0)::int8)) UNION
...

total time: 1s
plan:
Index Cond,
Index Scan using fki_props_id on properties properties_XXX,
Append,
HashAggregate

3)
SELECT clef,valeur,ITEM_ID,CALENDAR_ID,RECURRENCE_ID FROM PROPERTIES WHERE (ITEM_ID=((E'1263d4ac-d8a3-44d7-b971-5eee947ff36a')::text) AND CALENDAR_ID=((E'f9532857-ca68-497b-b2a9-162507110e55')::uuid) AND RECURRENCE_ID=((0)::int8)) UNION ALL
SELECT clef,valeur,ITEM_ID,CALENDAR_ID,RECURRENCE_ID FROM PROPERTIES WHERE (ITEM_ID=((E'9a1e40bb-0280-4e79-ae13-ee57d614fdef')::text) AND CALENDAR_ID=((E'f9532857-ca68-497b-b2a9-162507110e55')::uuid) AND RECURRENCE_ID=((0)::int8)) UNION ALL
...

total time: 1.2s
plan:
Index Cond,
Index Scan using fki_props_id on properties properties_XXX,
Append


Two questions:
1) Is it normal to have such a big difference between OR and UNION and should i always prefer UNION when possible?

2) How can UNION ALL be slower than UNION, it's not doing the HashAggregate but is 0.2s slower?


Thanks in advance
Etienne

pgsql-general by date:

Previous
From: Daniel Begin
Date:
Subject: Re: Advise on memory usage limitation by PostgreSQL on Windows
Next
From: Jeff Dik
Date:
Subject: Re: The curious case of two inserts, a shrinking xmax, and a ShareLock on transaction