Re: Wrong rows selected with view - Mailing list pgsql-general

From Tom Lane
Subject Re: Wrong rows selected with view
Date
Msg-id 22055.1132202061@sss.pgh.pa.us
Whole thread Raw
In response to Re: Wrong rows selected with view  (Bill Moseley <moseley@hank.org>)
Responses Re: Wrong rows selected with view  (Bill Moseley <moseley@hank.org>)
List pgsql-general
Bill Moseley <moseley@hank.org> writes:
> That ORDER BY is added by PG -- it's not part of my view when I define
> it.  I assume PG adds that so it can do the DISTINCT ON.

Well, then you're even further from following the protocol for DISTINCT
ON.  You *must* provide an ORDER BY to get reliable results from it.

> Still, I don't have any duplicate class.id rows in this select that I
> can see.  class.id 1243 and 1244 are not the same, yet PG is
> (sometimes) throwing out one of them.  Are you saying that somehow PG
> thinks they are the same class.id and is thus removing one?

No, I'm saying that the underlying data (the join result before applying
DISTINCT ON) looks like this:

bill=# select  "class".id, person.id AS person_id
bill-#    FROM "class", "location", region, person, instructors
bill-#   WHERE "class"."location" = "location".id AND "class".id = instructors."class"
bill-# AND instructors.person = person.id AND "location".region = region.id
bill-#   ORDER BY "class".id;
  id  | person_id
------+-----------
    1 |        49
    2 |        27
    3 |        19
    4 |        82
    5 |        12
 ...
 1238 |        61
 1238 |        60
 1239 |        40
 1240 |        67
 1241 |        11
 1243 |        26
 1243 |        84
 1244 |        26
 1244 |        84
(1311 rows)

The DISTINCT ON will take just one of the two rows with id = 1243, and
just one of the rows with id = 1244, and *it is effectively random which
one gets picked*.  So when you then select rows with person_id = 84, you
may or may not see these rows in the end result.

> The reason I'm using DISTINCT ON is because the "class_list" view is
> suppose to just return a list of unique classes, and a class might have
> more than one instructor which would result in extra rows -- as shown
> here:

Exactly.  So your view is going to return the class id along with a
randomly selected one of the instructor ids.  It seems to me that
filtering this result on instructor id is perhaps a bit ill-advised,
even if you fix the view so that the chosen instructor id isn't so
random (eg, you could fix it to display the lowest-numbered instructor
id for the particular class).  Even then, are you searching for the
instructor id that the view happens to show for that class, or some
other one?

            regards, tom lane

pgsql-general by date:

Previous
From: Justin Hawkins
Date:
Subject: Re: Trouble with recursive trigger
Next
From: "A. Kretschmer"
Date:
Subject: Re: absolute or relative updates