Thread: Wrong rows selected with view

Wrong rows selected with view

From
Bill Moseley
Date:
PostgreSQL 7.4.8 on i486-pc-linux-gnu, compiled by GCC cc (GCC) 4.0.2 20050816 (prerelease) (Debian 4.0.1-5)

Hopefully this is something simple -- I assume it's a problem with my
SQL.  But it looks really weird to me at this late hour.

I have some tables for managing workshops, and I have a view (below).
I noticed for a given select it was returning the wrong number of
rows.  I expected seven rows back, but only got six.

I rewrote the view with just the joins and it returned the correct number
of rows.  So I started the brute force method of removing one thing at
a time in the view to see what would make it start returning the
correct number of rows.  That just confused me more.

Below if I comment out *any* single column that's marked "-- this" then
I get the correct number of rows.  If I comment out any rows marked
"-- nope" then there's no change (get six rows returned).  But, if I
start commenting out more than one "-- nope" then I get seven rows.

Can someone point out my error?


create view bar AS
    SELECT DISTINCT ON (class.id)  -- this
               class.id AS id,
               class.name AS name, -- this
               class.class_time AS class_time, -- this
               class.begin_reg_time AS begin_reg_time, -- this

               (CASE WHEN class.register_cutoff_time > class.class_time -- this
                    THEN class.register_cutoff_time
                    ELSE class.class_time
                END) AS register_stop_time,

               class.location AS location, -- nope
               class.workshop AS workshop, -- nope
               class.review_mode AS review_mode, -- nope
               class.workshop_group AS workshop_group, -- nope
               location.name AS location_name, -- this
               location.address AS address, -- this
               location.city AS city, -- nope
               location.state AS state, -- this
               location.zip AS zip, -- nope

               region.id AS region, -- nope
               region.name AS region_name, -- nope
               region.sort_order AS region_sort,  -- nope

               person.id AS person_id,
               UPPER( person.last_name || person.first_name ) AS instructor, -- this

               class.cutoff_message AS cutoff_message,    -- this
               class.full_message AS full_message,      -- this
               class.wait_description AS wait_description  -- this

          FROM class, location, region, person, instructors

         WHERE class.location           = location.id   -- join with location
               AND class.id             = instructors.class   -- join the instructors
               AND instructors.person   = person.id     -- join the person(s)
               AND location.region      = region.id;    -- join the location to a region
select id from bar where person_id = 84;
drop view bar;

  id
------
  727
  739
  804
  813
  867
 1244
(6 rows

Comment out one column:


               -- class.full_message AS full_message,      -- this
               class.wait_description AS wait_description  -- this

          FROM class, location, region, person, instructors

         WHERE class.location           = location.id   -- join with location
               AND class.id             = instructors.class   -- join the instructors
               AND instructors.person   = person.id     -- join the person(s)
               AND location.region      = region.id;    -- join the location to a region
select id from bar where person_id = 84;
drop view bar;

  id
------
  727
  739
  804
  813
  867
 1243
 1244
(7 rows)

It's always class.id 1243 that doesn't show up.  Not sure this helps,
but:


moseley@bumby:~$ echo '\x \\ select * from class where id = 1243' | psql ws2 > 1243
moseley@bumby:~$ echo '\x \\ select * from class where id = 1244' | psql ws2 > 1244
moseley@bumby:~$ diff -U 0 1243 1244
--- 1243        2005-11-15 20:16:26.619412721 -0800
+++ 1244        2005-11-15 20:16:30.438646443 -0800
@@ -3 +3 @@
-id                      | 1243
+id                      | 1244
@@ -8,4 +8,4 @@
-class_time              | 2005-12-12 07:00:00-08
-class_end_time          | 2005-12-12 14:00:00-08
-class_size              | 55
-begin_reg_time          | 2005-11-15 17:36:00-08
+class_time              | 2005-12-25 07:15:00-08
+class_end_time          | 2005-12-25 11:00:00-08
+class_size              | 33
+begin_reg_time          |
@@ -27 +27 @@
-register_cutoff_time    | 2005-12-11 19:00:00-08
+register_cutoff_time    | 2005-12-24 19:15:00-08



Thanks,


--
Bill "stabbing in the dark" Moseley
moseley@hank.org


Re: Wrong rows selected with view

From
Andreas Seltenreich
Date:
Bill Moseley writes:

> PostgreSQL 7.4.8 on i486-pc-linux-gnu, compiled by GCC cc (GCC) 4.0.2 20050816 (prerelease) (Debian 4.0.1-5)
>
> Hopefully this is something simple -- I assume it's a problem with my
> SQL.  But it looks really weird to me at this late hour.
>
> I have some tables for managing workshops, and I have a view (below).
> I noticed for a given select it was returning the wrong number of
> rows.  I expected seven rows back, but only got six.
>
> I rewrote the view with just the joins and it returned the correct number
> of rows.  So I started the brute force method of removing one thing at
> a time in the view to see what would make it start returning the
> correct number of rows.  That just confused me more.

How does the query plan change when you make those changes? If it only
occurs if a certain index is used, it might be corrupt (=> REINDEX).

regards,
Andreas

Re: Wrong rows selected with view

From
Bill Moseley
Date:
On Wed, Nov 16, 2005 at 03:42:19PM +0100, Andreas Seltenreich wrote:
> How does the query plan change when you make those changes? If it only
> occurs if a certain index is used, it might be corrupt (=> REINDEX).

I did a "reindex database ws2;" and no change.

I'm not very good at reading the query plans.  For one thing, they
always send me off on some tangent wondering why it's doing a Seq Scan
instead of a index scan. ;)

The first plan below returns the correct number of rows, the second plan does
not.  These are after I did the reindex, btw.


ws2=> explain select id from bar where person_id = 84;

 Subquery Scan bar  (cost=1225.81..1243.32 rows=6 width=4)
   Filter: (person_id = 84)
   ->  Unique  (cost=1225.81..1230.82 rows=1000 width=334)
         ->  Sort  (cost=1225.81..1228.31 rows=1003 width=334)
               Sort Key: "class".id
               ->  Hash Join  (cost=802.15..1175.81 rows=1003 width=334)
                     Hash Cond: ("outer".person = "inner".id)
                     ->  Hash Join  (cost=67.50..203.81 rows=1003 width=315)
                           Hash Cond: ("outer".region = "inner".id)
                           ->  Hash Join  (cost=45.00..163.77 rows=1002 width=279)
                                 Hash Cond: ("outer"."location" = "inner".id)
                                 ->  Hash Join  (cost=22.50..118.74 rows=1001 width=141)
                                       Hash Cond: ("outer"."class" = "inner".id)
                                       ->  Seq Scan on instructors  (cost=0.00..20.00 rows=1000 width=8)
                                       ->  Hash  (cost=20.00..20.00 rows=1000 width=137)
                                             ->  Seq Scan on "class"  (cost=0.00..20.00 rows=1000 width=137)
                                 ->  Hash  (cost=20.00..20.00 rows=1000 width=142)
                                       ->  Seq Scan on "location"  (cost=0.00..20.00 rows=1000 width=142)
                           ->  Hash  (cost=20.00..20.00 rows=1000 width=40)
                                 ->  Seq Scan on region  (cost=0.00..20.00 rows=1000 width=40)
                     ->  Hash  (cost=649.12..649.12 rows=10212 width=23)
                           ->  Seq Scan on person  (cost=0.00..649.12 rows=10212 width=23)
(22 rows)


This returns one row less and the only change to the view is this
commented out column:

    -- class.full_message AS full_message,      -- this


ws2=> explain select id from bar where person_id = 84;


 Subquery Scan bar  (cost=1222.54..1240.05 rows=6 width=4)
   Filter: (person_id = 84)
   ->  Unique  (cost=1222.54..1227.55 rows=1000 width=366)
         ->  Sort  (cost=1222.54..1225.05 rows=1003 width=366)
               Sort Key: "class".id
               ->  Hash Join  (cost=779.65..1172.54 rows=1003 width=366)
                     Hash Cond: ("outer".person = "inner".id)
                     ->  Hash Join  (cost=45.00..204.14 rows=1003 width=347)
                           Hash Cond: ("outer".region = "inner".id)
                           ->  Hash Join  (cost=22.50..164.10 rows=1002 width=311)
                                 Hash Cond: ("outer"."location" = "inner".id)
                                 ->  Merge Join  (cost=0.00..119.06 rows=1001 width=173)
                                       Merge Cond: ("outer".id = "inner"."class")
                                       ->  Index Scan using class_pkey on "class"  (cost=0.00..52.00 rows=1000
width=169)
                                       ->  Index Scan using instructors_class_index on instructors  (cost=0.00..52.00
rows=1000width=8) 
                                 ->  Hash  (cost=20.00..20.00 rows=1000 width=142)
                                       ->  Seq Scan on "location"  (cost=0.00..20.00 rows=1000 width=142)
                           ->  Hash  (cost=20.00..20.00 rows=1000 width=40)
                                 ->  Seq Scan on region  (cost=0.00..20.00 rows=1000 width=40)
                     ->  Hash  (cost=649.12..649.12 rows=10212 width=23)
                           ->  Seq Scan on person  (cost=0.00..649.12 rows=10212 width=23)
(21 rows)


--
Bill Moseley
moseley@hank.org


Re: Wrong rows selected with view

From
Tom Lane
Date:
Bill Moseley <moseley@hank.org> writes:
> The first plan below returns the correct number of rows, the second plan does
> not.  These are after I did the reindex, btw.

Bizarre.  What are the datatypes of the columns being joined on?  If
they're string types, what's your database locale and encoding?

            regards, tom lane

Re: Wrong rows selected with view

From
Bill Moseley
Date:
On Wed, Nov 16, 2005 at 10:53:21AM -0500, Tom Lane wrote:
> Bill Moseley <moseley@hank.org> writes:
> > The first plan below returns the correct number of rows, the second plan does
> > not.  These are after I did the reindex, btw.
>
> Bizarre.  What are the datatypes of the columns being joined on?  If
> they're string types, what's your database locale and encoding?

The primary keys are all SERIAL, and the FKs are integer.  Nothing too
odd.

The odd thing is the row that is not returned is basically a clone of
another row -- which is why I diff'ed them in my first posting.

BTW, this might be obvious, but the reason I'm doing DISTINCT ON
class.id is that the instructors table is a link table and a class can
have more than one instructor.  I only want a list of classes, not one
per instructor (which could duplicate them).



I'm still a novice with Pg, so I assume this is what you are asking
(although none of my joins are on text fields).

ws2=> SHOW LC_CTYPE;
-[ RECORD 1 ]---
lc_ctype | en_US

ws2=> SHOW SERVER_ENCODING;
-[ RECORD 1 ]---+-------
server_encoding | LATIN1

So my joins are:

         WHERE class.location           = location.id   -- join with location
               AND class.id             = instructors.class   -- join the instructors
               AND instructors.person   = person.id     -- join the person(s)
               AND location.region      = region.id;    -- join the location to a region

And the .id are all SERIAL integer and the FKs are all integer.

Trying to avoid sending too much unnecessary data to the list, but
here's a sample of the tables:


ws2=> \d region
                             Table "public.region"
   Column   |  Type   |                       Modifiers
------------+---------+--------------------------------------------------------
 id         | integer | not null default nextval('public.region_id_seq'::text)
 active     | boolean | not null default true
 sort_order | integer | not null default 1
 name       | text    | not null
Indexes:
    "region_pkey" primary key, btree (id)
    "region_name_key" unique, btree (nam


ws2=> \d instructors
  Table "public.instructors"
 Column |  Type   | Modifiers
--------+---------+-----------
 person | integer | not null
 class  | integer | not null
Indexes:
    "instructors_pkey" primary key, btree (person, "class")
    "instructors_class_index" btree ("class")
    "instructors_person_index" btree (person)
Foreign-key constraints:
    "$1" FOREIGN KEY (person) REFERENCES person(id)
    "$2" FOREIGN KEY ("class") REFERENCES "class"(id)




ws2=> \d class
                                             Table "public.class"
         Column          |            Type             |                       Modifiers
-------------------------+-----------------------------+-------------------------------------------------------
 id                      | integer                     | not null default nextval('public.class_id_seq'::text)
 name                    | text                        | not null
 old_id                  | integer                     |
 location                | integer                     | not null
 workshop                | integer                     | not null
 class_time              | timestamp(0) with time zone | not null
 class_end_time          | timestamp(0) with time zone | not null
 class_size              | integer                     | not null
 begin_reg_time          | timestamp(0) with time zone |
 class_list_sent_time    | timestamp(0) with time zone |
 class_list_sent_email   | text                        |
 reminder_sent_time      | timestamp(0) with time zone |
 ride_list_sent_time     | timestamp(0) with time zone |
 html_description        | text                        | not null
 short_description       | text                        |
 special_instructions    | text                        |
 on_hold_message         | text                        |
 review_mode             | boolean                     | not null default false
 workshop_group          | integer                     | not null
 distance_ed             | boolean                     | not null default false
 contract_class          | boolean                     | not null default false
 online_evaluation       | boolean                     | not null default true
 price_scheme            | integer                     | not null
 duration                | text                        |
 register_cutoff_time    | timestamp(0) with time zone | not null
 cutoff_message          | text                        |
 full_message            | text                        |
 wait_list_size          | integer                     |
 wait_description        | text                        |
 wait_instructions       | text                        |
 wait_email_instructions | text                        |
 cancel_late_hours       | integer                     |
 cancel_cutoff_hours     | integer                     |
 cancel_email            | text                        |
 send_confirmation       | boolean                     | not null default true
 confirmed_change_notify | text                        |
 send_class_list_email   | text                        |
 send_class_hours        | integer                     |
 ride_list_hours         | integer                     |
 reminder_hours          | integer                     |
 notify_email            | text                        |
Indexes:
    "class_pkey" primary key, btree (id)
    "class_old_id_key" unique, btree (old_id)
    "class_class_time_index" btree (class_time)
    "class_old_id_index" btree (old_id)
    "class_workshop_index" btree (workshop)
Foreign-key constraints:
    "$1" FOREIGN KEY ("location") REFERENCES "location"(id)
    "$2" FOREIGN KEY (workshop) REFERENCES workshop(id)
    "$3" FOREIGN KEY (workshop_group) REFERENCES workshop_group(id)
    "$4" FOREIGN KEY (price_scheme) REFERENCES pricing(id)





ws2=> \d location
                                  Table "public.location"
      Column      |     Type     |                        Modifiers
------------------+--------------+----------------------------------------------------------
 id               | integer      | not null default nextval('public.location_id_seq'::text)
 old_id           | integer      |
 active           | boolean      | not null default true
 name             | text         | not null
 class_size       | integer      | not null
 num_workstations | integer      |
 time_zone        | integer      | not null
 directions       | text         | not null
 region           | integer      | not null
 phone            | text         |
 address          | text         |
 city             | text         |
 zip              | text         |
 state            | character(2) |
 map_link         | text         |
Indexes:
    "location_pkey" primary key, btree (id)
    "location_name_key" unique, btree (name)
    "location_old_id_key" unique, btree (old_id)
    "location_old_id" btree (old_id)
Foreign-key constraints:
    "$1" FOREIGN KEY (time_zone) REFERENCES timezone(id)
    "$2" FOREIGN KEY (region) REFERENCES region(id)
    "$3" FOREIGN KEY (state) REFERENCES geo_state(state)




--
Bill Moseley
moseley@hank.org


Re: Wrong rows selected with view

From
Tom Lane
Date:
Bill Moseley <moseley@hank.org> writes:
> On Wed, Nov 16, 2005 at 10:53:21AM -0500, Tom Lane wrote:
>> Bizarre.  What are the datatypes of the columns being joined on?  If
>> they're string types, what's your database locale and encoding?

> The primary keys are all SERIAL, and the FKs are integer.  Nothing too
> odd.

Well, that eliminates the theory I had, which was that string comparison
was messing up because of incompatible locale/encoding choices.

I think you may have found a PG bug.  Can you duplicate the misbehavior
if you dump the data and reload it into a fresh database?  (Note you'll
probably need to re-ANALYZE to get back to the same query plans.)  If
so, would you be willing to send me the dump off-list?  If the data is
not too sensitive, this'd probably be easier than trying to find a
smaller test case.

            regards, tom lane

Re: Wrong rows selected with view

From
Tom Lane
Date:
Bill Moseley <moseley@hank.org> writes:
> [ strange behavior ]

Oh, duh, it's not a PG bug: the problem is that the view is
underspecified.  You have

SELECT DISTINCT ON (class.id)
    ... a bunch of stuff ...
    FROM ... a bunch of tables ...
    ORDER BY class.id;

The difficulty with this is that DISTINCT ON will take the first row in
each group with the same class.id.  And since you're only sorting by
class.id, "the first row" is ill-defined.  I'm not sure why qsort's
behavior seems to depend on the width of the rows, but there's no doubt
that it's sorting different rows to the front of each group depending
on which view you use.

To get stable results from this view, what you need to do is add enough
ORDER BY conditions to make sure you are getting a consistent "first
row" in each group.  Adding the primary keys of each of the tables would
be enough, though it might be overkill.

It could also be that you don't want to be using DISTINCT ON at all;
have you thought through exactly what this view ought to produce for
each class.id?

            regards, tom lane

Re: Wrong rows selected with view

From
Bill Moseley
Date:
On Wed, Nov 16, 2005 at 07:48:06PM -0500, Tom Lane wrote:
> Bill Moseley <moseley@hank.org> writes:
> > [ strange behavior ]
>
> Oh, duh, it's not a PG bug: the problem is that the view is
> underspecified.  You have
>
> SELECT DISTINCT ON (class.id)
>     ... a bunch of stuff ...
>     FROM ... a bunch of tables ...
>     ORDER BY class.id;
>
> The difficulty with this is that DISTINCT ON will take the first row in
> each group with the same class.id.  And since you're only sorting by
> class.id, "the first row" is ill-defined.

Sorry, but I fear I'm missing something.


That ORDER BY is added by PG -- it's not part of my view when I define
it.  I assume PG adds that so it can do the DISTINCT ON.

Still, I don't have any duplicate class.id rows in this select that I
can see.  class.id 1243 and 1244 are not the same, yet PG is
(sometimes) throwing out one of them.  Are you saying that somehow PG
thinks they are the same class.id and is thus removing one?


I'm asking for a list of all classes taught by instructor 84.


ws2=> select * from instructors where person = 84 order by class;
 person | class
--------+-------
     84 |   727
     84 |   739
     84 |   804
     84 |   813
     84 |   867
     84 |  1243
     84 |  1244
(7 rows)


The reason I'm using DISTINCT ON is because the "class_list" view is
suppose to just return a list of unique classes, and a class might have
more than one instructor which would result in extra rows -- as shown
here:

ws2=> select * from instructors where class in (select class from instructors where person = 84);
 person | class
--------+-------
     84 |   727
     84 |   739
     84 |   804
     84 |   813
     84 |   867
     84 |  1243
     26 |  1243
     84 |  1244
     26 |  1244
(9 rows)

So when I don't want duplicates:

ws2=> select distinct on (class) * from instructors where class in (select class from instructors where person = 84);
 person | class
--------+-------
     84 |   727
     84 |   739
     84 |   804
     84 |   813
     84 |   867
     84 |  1243
     84 |  1244
(7 rows)

I don't care in this case about the DISTINCT ON throwing out the
duplicates -- I just care about distinct classes, not that all the
instructors are included in this select.




And even if I throw in all my other joins I get the same thing:

ws2=>        SELECT DISTINCT ON (class.id)
ws2->             class.id AS id
ws2->
ws2->           FROM class, location, region, person, instructors
ws2->
ws2->          WHERE class.location           = location.id   -- join with location
ws2->                AND class.id             = instructors.class   -- join the instructors
ws2->                AND instructors.person   = person.id     -- join the person(s)
ws2->                AND location.region      = region.id   -- join the location to a region
ws2->                AND person.id = 84;
  id
------
  727
  739
  804
  813
  867
 1243
 1244
(7 rows)


> I'm not sure why qsort's
> behavior seems to depend on the width of the rows, but there's no doubt
> that it's sorting different rows to the front of each group depending
> on which view you use.

I just don't see what groups there are, though in this case.


> It could also be that you don't want to be using DISTINCT ON at all;
> have you thought through exactly what this view ought to produce for
> each class.id?

Yes, I think so.  A list of columns related to it, with the exception
of when there's duplicate instructors I want one of those duplicates
thrown out (and I don't care which one).

When I do a query that generates duplicate class.id's such as
when a class has more than one instructor:

ws2=>         select  class.id AS class_id,
ws2->                 person.id AS person_id
ws2->
ws2->           FROM class, location, region, person, instructors
ws2->
ws2->          WHERE class.location           = location.id   -- join with location
ws2->                AND class.id             = instructors.class   -- join the instructors
ws2->                AND instructors.person   = person.id     -- join the person(s)
ws2->                AND location.region      = region.id   -- join the location to a region
ws2->                AND class_time > now();
 class_id | person_id
----------+-----------
      561 |        95
      614 |        95
      747 |       111
      762 |       111
      772 |       111
      883 |        13
      924 |        26
      935 |        26
      945 |        26
     1243 |        84
     1243 |        26
     1244 |        84
     1244 |        26
(13 rows)


You can see some classes are listed twice, so using distinct on gets
just my list of unique classes:


ws2=>         SELECT DISTINCT ON (class.id)
ws2->                 class.id AS class_id,
ws2->                 person.id AS person_id
ws2->
ws2->           FROM class, location, region, person, instructors
ws2->
ws2->          WHERE class.location           = location.id   -- join with location
ws2->                AND class.id             = instructors.class   -- join the instructors
ws2->                AND instructors.person   = person.id     -- join the person(s)
ws2->                AND location.region      = region.id   -- join the location to a region
ws2->                AND class_time > now();
 class_id | person_id
----------+-----------
      561 |        95
      614 |        95
      747 |       111
      762 |       111
      772 |       111
      883 |        13
      924 |        26
      935 |        26
      945 |        26
     1243 |        84
     1244 |        84
(11 rows)


All the view does is include more columns in the result set, and that
seems to change the number of results.  Here's the view doing the
previous query based on "class_time":

ws2=> select id, person_id from class_list where class_time > now();
  id  | person_id
------+-----------
  561 |        95
  614 |        95
  747 |       111
  762 |       111
  772 |       111
  883 |        13
  924 |        26
  935 |        26
  945 |        26
 1243 |        26 << PG selected the other instructor this time
 1244 |        84
(11 rows)




FYI - For those following along, none of those queries show the actual
problem I'm having with the view.  Which is the view of the above join
is returning the wrong number of rows -- 1243 is missing.

Again, selecting by person_id:

ws2=> select id from class_list where person_id = 84;
  id
------
  727
  739
  804
  813
  867
 1244
(6 rows)


I have a feeling this is going to be a "doh!" when I finally see it...



--
Bill Moseley
moseley@hank.org


Re: Wrong rows selected with view

From
Tom Lane
Date:
Bill Moseley <moseley@hank.org> writes:
> That ORDER BY is added by PG -- it's not part of my view when I define
> it.  I assume PG adds that so it can do the DISTINCT ON.

Well, then you're even further from following the protocol for DISTINCT
ON.  You *must* provide an ORDER BY to get reliable results from it.

> Still, I don't have any duplicate class.id rows in this select that I
> can see.  class.id 1243 and 1244 are not the same, yet PG is
> (sometimes) throwing out one of them.  Are you saying that somehow PG
> thinks they are the same class.id and is thus removing one?

No, I'm saying that the underlying data (the join result before applying
DISTINCT ON) looks like this:

bill=# select  "class".id, person.id AS person_id
bill-#    FROM "class", "location", region, person, instructors
bill-#   WHERE "class"."location" = "location".id AND "class".id = instructors."class"
bill-# AND instructors.person = person.id AND "location".region = region.id
bill-#   ORDER BY "class".id;
  id  | person_id
------+-----------
    1 |        49
    2 |        27
    3 |        19
    4 |        82
    5 |        12
 ...
 1238 |        61
 1238 |        60
 1239 |        40
 1240 |        67
 1241 |        11
 1243 |        26
 1243 |        84
 1244 |        26
 1244 |        84
(1311 rows)

The DISTINCT ON will take just one of the two rows with id = 1243, and
just one of the rows with id = 1244, and *it is effectively random which
one gets picked*.  So when you then select rows with person_id = 84, you
may or may not see these rows in the end result.

> The reason I'm using DISTINCT ON is because the "class_list" view is
> suppose to just return a list of unique classes, and a class might have
> more than one instructor which would result in extra rows -- as shown
> here:

Exactly.  So your view is going to return the class id along with a
randomly selected one of the instructor ids.  It seems to me that
filtering this result on instructor id is perhaps a bit ill-advised,
even if you fix the view so that the chosen instructor id isn't so
random (eg, you could fix it to display the lowest-numbered instructor
id for the particular class).  Even then, are you searching for the
instructor id that the view happens to show for that class, or some
other one?

            regards, tom lane

Re: Wrong rows selected with view

From
Bill Moseley
Date:
On Wed, Nov 16, 2005 at 11:34:21PM -0500, Tom Lane wrote:
> No, I'm saying that the underlying data (the join result before applying
> DISTINCT ON) looks like this:
>
> bill=# select  "class".id, person.id AS person_id
> bill-#    FROM "class", "location", region, person, instructors
> bill-#   WHERE "class"."location" = "location".id AND "class".id = instructors."class"
> bill-# AND instructors.person = person.id AND "location".region = region.id
> bill-#   ORDER BY "class".id;
>   id  | person_id
> ------+-----------
>     1 |        49
>     2 |        27
>     3 |        19
>     4 |        82
>     5 |        12
>  ...
>  1238 |        61
>  1238 |        60
>  1239 |        40
>  1240 |        67
>  1241 |        11
>  1243 |        26
>  1243 |        84
>  1244 |        26
>  1244 |        84
> (1311 rows)
>
> The DISTINCT ON will take just one of the two rows with id = 1243, and
> just one of the rows with id = 1244, and *it is effectively random which
> one gets picked*.  So when you then select rows with person_id = 84, you
> may or may not see these rows in the end result.

Yikes!  The problem is *when* DISTINCT ON happens, right?

And, Tom,  you actually explained this to me on the list back on Aug 25th,
but that's when I was using the view in a different way.  You noted
that the order was "unpredictable" but at that time it didn't matter
which row was selected to me.

    http://archives.postgresql.org/pgsql-general/2005-08/msg01291.php

This current problem was due to my assumption of how PG executes the
query:

My assumption was that the select would first do the joins (including
limit by class.id = 84) *then* weed out the duplicate class.ids.

But if PG is first doing the the joins on all the tables (before
limiting by class.id = 84) and then weeding out the duplicate
class.ids, and then finally limiting by class.id = 84 then I can see
where I might end up wit the missing row.

Frankly, I expected the first to happen because it would use an index
to select just the records of class.id = 84, then do the joins on
that small set of records.  Didn't seem likely that the database would
join all the records first and then limit by class.id.  Seems like the
hard way to do the query.  But the query planner works in strange and
mysterious ways. ;)


Does that also explain why PG was sometimes returning the "correct"
number of rows?  Depending on which of the two query plans above
were used?

> Exactly.  So your view is going to return the class id along with a
> randomly selected one of the instructor ids.  It seems to me that
> filtering this result on instructor id is perhaps a bit ill-advised,
> even if you fix the view so that the chosen instructor id isn't so
> random (eg, you could fix it to display the lowest-numbered instructor
> id for the particular class).  Even then, are you searching for the
> instructor id that the view happens to show for that class, or some
> other one?

Well, clearly, my "one-size-fits-all view" doesn't work in this case.
I just need another view without distinct when limiting by instructor.

It was that red-herring of removing a seemingly random column from the
view that made it hard to see what was really happening.

Thanks very much for all your time.


--
Bill Moseley
moseley@hank.org