BUG #8334: Merge Join drops records - Mailing list pgsql-bugs

From euclid@yandex-team.ru
Subject BUG #8334: Merge Join drops records
Date
Msg-id E1V2ODG-00026n-Kk@wrigleys.postgresql.org
Whole thread Raw
Responses Re: BUG #8334: Merge Join drops records
List pgsql-bugs
The following bug has been logged on the website:

Bug reference:      8334
Logged by:          Euclid Nikiforov
Email address:      euclid@yandex-team.ru
PostgreSQL version: 9.2.0
Operating system:   Linux 3.8.0-11-na
Description:

Hi all


Recently investigating problem in our backend query result, I have found
strange behaviour.


The exact query was:
SELECT r.*, state FROM revision.object_revision r JOIN revision.commit ON id
IN (32397,32403,32404) AND commit_id=id WHERE ((commit_id=32397 AND
(object_id=279638691 OR object_id=279638692)) OR (commit_id=32403 AND
object_id IN (279638118,279638128,279638970)) OR (commit_id=32404 AND
object_id=279638966));


It expected to return 6 rows, but returned only 3 when enable_mergejoin is
'on'
EXPLAIN ANALYZE shows Merge Join at final stage
while with Nested Loop (set enable_mergejoin=off;) everything is fine.


Since data is very huge, here is small synthetic way to reproduce the
behaviuor
In actual environment we don't set any restrictions on planner.
If you manipulate enable_mergejoin or enable_sort, you will see how results
differ.
Seems that merge join fails because of unsorted result from inner loops


Regards
Euclid
---------EXAMPLE--------
SET enable_mergejoin=on;
SET enable_nestloop=off;
SET enable_hashjoin=off;
SET enable_seqscan=off;
SET enable_sort=off;


CREATE TABLE testcase.c
(
  id bigint NOT NULL UNIQUE,
  s bigint,
  CONSTRAINT cpk PRIMARY KEY (id)
);


CREATE TABLE testcase.r
(
  c_id bigint NOT NULL,
  o_id bigint NOT NULL,
  CONSTRAINT rpk PRIMARY KEY (c_id, o_id)
);


CREATE INDEX rpkc ON testcase.r USING btree (c_id);

DELETE FROM testcase.c;
INSERT INTO testcase.c values(1,0);
INSERT INTO testcase.c values(2,0);
INSERT INTO testcase.c values(3,0);


DELETE FROM testcase.r;
INSERT INTO testcase.r values(1,1);
INSERT INTO testcase.r values(2,2);
INSERT INTO testcase.r values(2,3);
INSERT INTO testcase.r values(3,4);
INSERT INTO testcase.r values(3,5);
INSERT INTO testcase.r values(3,6);


SELECT s, r.* FROM testcase.r r JOIN testcase.c ON id IN (1, 2, 3) AND
id=r.c_id
WHERE ((c_id=2 AND (o_id=2 OR o_id=3)) OR
(c_id=3 AND o_id in (4,5,6)) OR (c_id=1 AND o_id=1))

pgsql-bugs by date:

Previous
From: Tom Lane
Date:
Subject: Re: BUG #8271: Configure warning: sys/ucred.h: present but cannot be compiled
Next
From: Pavan Deolasee
Date:
Subject: Re: BUG #8334: Merge Join drops records