Thread: BUG #2237: SELECT optimizer drops everything improperly

BUG #2237: SELECT optimizer drops everything improperly

From
"Alexis Wilke"
Date:
The following bug has been logged online:

Bug reference:      2237
Logged by:          Alexis Wilke
Email address:      alexis@m2osw.com
PostgreSQL version: 8.0.1
Operating system:   Linux (RH9)
Description:        SELECT optimizer drops everything improperly
Details:

Hi guys,

It looks like it could have been fixed by Tom in 8.1.1

* Fix incorrect optimizations of outer-join conditions (Tom)

But I cannot upgrade just right now to test whether it would work.

The following is a very simple set of SQL commands.

Create a new database, then do:

  psql -f <name>.sql <database name>

where <name>.sql is the following SQL saved in a file named <name>.sql and
<database name> is that empty database.

Note that it is written so you can run it multiple times, which means I DROP
tables and the first time you get some warnings which is fine.

Hope this helps.

Best regards,
Alexis




--
-- Sample script to create several tables and show that
-- a select does not behave as it should.
--

-- Assuming a database named empty_table_bug exists
-- and we are connected to it somehow.

-- In case you want to try several times, generates warnings...
DROP TABLE phpbb_users;
DROP TABLE phpbb_topics;
DROP TABLE phpbb_forums_watch;
DROP TABLE phpbb_topics_watch;

-- Found the bug hacking phpBB so tables are in link
-- with what phpBB uses.

CREATE TABLE phpbb_users (user_id int, user_name text);

CREATE TABLE phpbb_topics (topic_id int, user_id int, forum_id int,
topic_title text);

CREATE TABLE phpbb_forums_watch (forum_id int, user_id int);

CREATE TABLE phpbb_topics_watch (topic_id int, user_id int);

INSERT INTO phpbb_users (user_id, user_name) VALUES (1, 'alexis');
INSERT INTO phpbb_users (user_id, user_name) VALUES (2, 'john');

INSERT INTO phpbb_topics (topic_id, user_id, forum_id, topic_title)
        VALUES (1, 1, 1, 'Misc');

INSERT INTO phpbb_forums_watch (forum_id, user_id)
        VALUES (1, 1);

-- This did not happen yet and therefore phpbb_topics_watch is empty
-- INSERT INTO phpbb_topics_watch (topic_id, user_id) VALUES (1, 1);

-- This is what I need to work, it works with this simple SELECT
SELECT 'The next SELECT finds 1 row. Perfect!' AS message;
SELECT u.user_id, u.user_name, t.topic_title
    FROM phpbb_users u, phpbb_forums_watch fw, phpbb_topics t
    WHERE
        t.topic_id = 1 -- some variable id
        AND fw.forum_id = t.forum_id
        AND fw.user_id = u.user_id;

-- In this select, it detects that the phpbb_topics_watch is
-- empty and thus ignores the WHERE clause thinking since that
-- table is empty the SELECT will be empty
SELECT 'The next SELECT finds 0 row. It should find the same row!' AS
message;
SELECT u.user_id, u.user_name, t.topic_title
    FROM phpbb_users u, phpbb_forums_watch fw, phpbb_topics t,
phpbb_topics_watch tw
    WHERE
        (t.topic_id = 1 -- some variable id
        AND fw.forum_id = t.forum_id
        AND fw.user_id = u.user_id)
    OR
        (tw.topic_id = 1
        AND u.user_id = tw.user_id
        AND t.topic_id = 1);

-- insert a dummy value...
INSERT INTO phpbb_topics_watch (topic_id, user_id)
        VALUES (0, 0);

-- now it works!
SELECT 'Make the phpbb_topics_watch table ''non-empty'' and it works' AS
message;
SELECT u.user_id, u.user_name, t.topic_title
    FROM phpbb_users u, phpbb_forums_watch fw, phpbb_topics t,
phpbb_topics_watch tw
    WHERE
        (t.topic_id = 1 -- some variable id
        AND fw.forum_id = t.forum_id
        AND fw.user_id = u.user_id)
    OR
        (tw.topic_id = 1
        AND u.user_id = tw.user_id
        AND t.topic_id = 1);

Re: BUG #2237: SELECT optimizer drops everything improperly

From
Tom Lane
Date:
"Alexis Wilke" <alexis@m2osw.com> writes:
> -- In this select, it detects that the phpbb_topics_watch is
> -- empty and thus ignores the WHERE clause thinking since that
> -- table is empty the SELECT will be empty
> SELECT 'The next SELECT finds 0 row. It should find the same row!' AS
> message;
> SELECT u.user_id, u.user_name, t.topic_title
>     FROM phpbb_users u, phpbb_forums_watch fw, phpbb_topics t,
> phpbb_topics_watch tw
>     WHERE
>         (t.topic_id = 1 -- some variable id
>         AND fw.forum_id = t.forum_id
>         AND fw.user_id = u.user_id)
>     OR
>         (tw.topic_id = 1
>         AND u.user_id = tw.user_id
>         AND t.topic_id = 1);

I see no bug here.  This SELECT is defined to return the rows in the
cartesian product of the four FROM tables that satisfy the WHERE
condition.  Since one of the tables is empty, so is the cartesian
product.

Perhaps you meant to use a LEFT JOIN?

            regards, tom lane

Re: BUG #2237: SELECT optimizer drops everything improperly

From
alexis@m2osw.com
Date:
Hi Tom,

Yes. It looks like that works properly. And I'm replying because I had a
hard time to write the actual SQL command! I put it below.

It looks like your page about the joins would need a few examples. I tried
different syntax and they didn't work (thought they are valid for MySQL or
Oracle).

For instance, the first three tables, I would think I can define them like
this: (phpbb_users u, phpbb_forums_watch fw, phpbb_topics t) LEFT JOIN ...
but postgres doesn't like it. I had to use CROSS JOIN instead.

One other thing, which is certainly a bit harder right now, when I forget
to put the ON clause, the parser gives me an error saying "I don't
understand that WHERE". It would be much more helpful to have a message
such as "I don't like the WHERE here because I expected an ON statement."
This is certainly trickier to fix however. 8-)


SELECT u.user_id, u.user_name, t.topic_title
        FROM
                phpbb_users u CROSS JOIN phpbb_forums_watch fw CROSS JOIN
phpbb_topics t LEFT JOIN phpbb_topics_watch tw
                ON u.user_id = tw.user_id
        WHERE
                (t.topic_id = 1 -- some variable id
                AND fw.forum_id = t.forum_id
                AND fw.user_id = u.user_id)
        OR
                (tw.topic_id = 1
                AND u.user_id = tw.user_id
                AND t.topic_id = 1)
        ;


Thank you very much for your hints.

Alexis


On Sat, 4 Feb 2006, Tom Lane wrote:

> "Alexis Wilke" <alexis@m2osw.com> writes:
> > -- In this select, it detects that the phpbb_topics_watch is
> > -- empty and thus ignores the WHERE clause thinking since that
> > -- table is empty the SELECT will be empty
> > SELECT 'The next SELECT finds 0 row. It should find the same row!' AS
> > message;
> > SELECT u.user_id, u.user_name, t.topic_title
> >     FROM phpbb_users u, phpbb_forums_watch fw, phpbb_topics t,
> > phpbb_topics_watch tw
> >     WHERE
> >         (t.topic_id = 1 -- some variable id
> >         AND fw.forum_id = t.forum_id
> >         AND fw.user_id = u.user_id)
> >     OR
> >         (tw.topic_id = 1
> >         AND u.user_id = tw.user_id
> >         AND t.topic_id = 1);
>
> I see no bug here.  This SELECT is defined to return the rows in the
> cartesian product of the four FROM tables that satisfy the WHERE
> condition.  Since one of the tables is empty, so is the cartesian
> product.
>
> Perhaps you meant to use a LEFT JOIN?
>
>             regards, tom lane
>