Thread: table aliasing problem with 6.5...

table aliasing problem with 6.5...

From
Howie
Date:
odd problem in postgres 6.5.  

basically, if you use aliased tables _and_ use the full table name,
postgres goes nuts.  ive verified this on two separate databases
with the same postgres installation. (PostgreSQL 6.5.0 on
i686-pc-linux-gnu, compiled by egcc)

pics=> explain
pics-> SELECT pd.picnum, pd.picname, d.dirid, d.diskid, pd.tstamp,
pd.mdsum, pd.fsize,
pics-> d.dirname FROM picdata pd,dirs d WHERE picdata.dirid=dirs.dirid;
NOTICE:  QUERY PLAN:

Hash Join  (cost=35573920.00 rows=1073741849 width=68) ->  Nested Loop  (cost=19621602.00 rows=483339288 width=64)
->  Nested Loop  (cost=143382.61 rows=3528024 width=60)             ->  Seq Scan on picdata pd  (cost=1205.82
rows=25752width=40)             ->  Seq Scan on dirs d  (cost=5.52 rows=137 width=20)       ->  Seq Scan on dirs
(cost=5.52rows=137 width=4) ->  Hash  (cost=1205.82 rows=25752 width=4)       ->  Seq Scan on picdata  (cost=1205.82
rows=25752width=4)
 

picdata contains 25752 rows.
dirs contains 137 rows.

483339288 is a cartesian join across (seemingly) pd, d, and dirs.
1073741849 doesnt seem to be anything that i can figure out.
3528024 is from a cartesian join across pd and d ( or picdata and dirs )

but changing 'dirs.dirid' to 'd.dirid' and 'picdata.dirid' to 'pd.dirid' (
using the alised tables ) fixes the excessive row problem:

pics=> explain
pics-> SELECT pd.picnum, pd.picname, d.dirid, d.diskid, pd.tstamp,
pd.mdsum, pd.fsize,
pics-> d.dirname FROM picdata pd,dirs d WHERE pd.dirid=d.dirid;
NOTICE:  QUERY PLAN:

Nested Loop  (cost=1202.40 rows=25752 width=64) ->  Seq Scan on dirs d  (cost=5.52 rows=137 width=20) ->  Index Scan
usingpd_dirid_idx on picdata pd  (cost=8.74 rows=25752 width=44)
 

with another database:

ircbot=> explain 
ircbot-> SELECT u.usernum, u.nick, u.botnick, u.passwd, um.hostmask
ircbot-> FROM users u, usermasks um
ircbot-> WHERE users.usernum=usermasks.usernum;
NOTICE:  QUERY PLAN:

Hash Join  (cost=50951.80 rows=3932364 width=60) ->  Nested Loop  (cost=30348.58 rows=623971 width=56)       ->  Nested
Loop (cost=444.54 rows=9313 width=52)             ->  Seq Scan on users u  (cost=3.21 rows=67 width=40)             ->
SeqScan on usermasks um  (cost=6.59 rows=139 width=12)       ->  Seq Scan on users  (cost=3.21 rows=67 width=4) ->
Hash (cost=6.59 rows=139 width=4)       ->  Seq Scan on usermasks  (cost=6.59 rows=139 width=4)
 

ircbot=> explain 
ircbot-> SELECT u.usernum, u.nick, u.botnick, u.passwd, um.hostmask
ircbot-> FROM users u, usermasks um
ircbot-> WHERE u.usernum=um.usernum;
NOTICE:  QUERY PLAN:

Hash Join  (cost=17.60 rows=423 width=56) ->  Seq Scan on usermasks um  (cost=6.59 rows=139 width=16) ->  Hash
(cost=3.21rows=67 width=40)       ->  Seq Scan on users u  (cost=3.21 rows=67 width=40)
 

so, questions:

(1) why is it trying to do a cartesian join when not using the aliased   tables?
(2) is this 'bug' present in 6.5.1 ?
(3) why am i still awake at 7.30am EST on a saturday?

---
Howie <caffeine@toodarkpark.org>   URL: http://www.toodarkpark.org    
"The distance between insanity and genius is measured only by success."




Re: [SQL] table aliasing problem with 6.5...

From
Tom Lane
Date:
Howie <caffeine@toodarkpark.org> writes:
> (1) why is it trying to do a cartesian join when not using the aliased
>     tables?

That's what it's supposed to do.  When you provide an alias for a table
name in FROM, then as far as the rest of that query is concerned, that
alias *is* the name of the table --- it has no other.  When you refer
to the original table name in the WHERE clause, that's taken as creating
a separate table reference that's implicitly added to FROM.  Your query
is a four-way join with only one join having a restriction clause :-(

The alias behavior is necessary in order to handle self-joins properly,
for example to find married couples:   SELECT * FROM person, person other WHERE person.spouse = other.spouse;
This would be ambiguous if "person" were exposed by the second FROM clause.
SQL92 requires it to work this way:
        <table reference> ::=               <table name> [ [ AS ] <correlation name>                   [ <left paren>
<derivedcolumn list> <right paren> ] ]
 
...
        1) A <correlation name> immediately contained in a <table refer-           ence> TR is exposed by TR. A <table
name>immediately contained           in a <table reference> TR is exposed by TR if and only if TR           does not
specifya <correlation name>.
 


I think that implicitly adding a table to FROM is a Postgres extension
not found in SQL92 --- we probably really ought to reject such a query
with an error, since this behavior seems to be surprising...
        regards, tom lane


Re: [SQL] table aliasing problem with 6.5...

From
Bruce Momjian
Date:
> I think that implicitly adding a table to FROM is a Postgres extension
> not found in SQL92 --- we probably really ought to reject such a query
> with an error, since this behavior seems to be surprising...

Or perhaps print a message to the user?

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [SQL] table aliasing problem with 6.5...

From
Bruce Momjian
Date:
> Howie <caffeine@toodarkpark.org> writes:
> > (1) why is it trying to do a cartesian join when not using the aliased
> >     tables?
> 
> That's what it's supposed to do.  When you provide an alias for a table
> name in FROM, then as far as the rest of that query is concerned, that
> alias *is* the name of the table --- it has no other.  When you refer
> to the original table name in the WHERE clause, that's taken as creating
> a separate table reference that's implicitly added to FROM.  Your query
> is a four-way join with only one join having a restriction clause :-(
> 
> The alias behavior is necessary in order to handle self-joins properly,
> for example to find married couples:
>     SELECT * FROM person, person other WHERE person.spouse = other.spouse;
> This would be ambiguous if "person" were exposed by the second FROM clause.
> SQL92 requires it to work this way:
> 
>          <table reference> ::=
>                 <table name> [ [ AS ] <correlation name>
>                     [ <left paren> <derived column list> <right paren> ] ]
> 
>     ...
> 
>          1) A <correlation name> immediately contained in a <table refer-
>             ence> TR is exposed by TR. A <table name> immediately contained
>             in a <table reference> TR is exposed by TR if and only if TR
>             does not specify a <correlation name>.
> 
> 
> I think that implicitly adding a table to FROM is a Postgres extension
> not found in SQL92 --- we probably really ought to reject such a query
> with an error, since this behavior seems to be surprising...

OK, I have added an elog(NOTICE) for this behavior:test=> select * from pg_language;lanname
|lanispl|lanpltrusted|lanplcallfoid|lancompiler  --------+-------+------------+-------------+--------------internal|f
  |f           |            0|n/a           lisp    |f      |f           |            0|/usr/ucb/lisztC       |f
|f          |            0|/bin/cc       sql     |f      |f           |            0|postgres      (4 rows)test=>
selectpg_language.*;     
 
-->    NOTICE:  Auto-creating query reference to table pg_languagelanname
|lanispl|lanpltrusted|lanplcallfoid|lancompiler  --------+-------+------------+-------------+--------------internal|f
  |f           |            0|n/a           lisp    |f      |f           |            0|/usr/ucb/lisztC       |f
|f          |            0|/bin/cc       sql     |f      |f           |            0|postgres      (4 rows)
 


I am interested in comments about my wording, and this behavour.

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [SQL] table aliasing problem with 6.5...

From
Tom Lane
Date:
Bruce Momjian <maillist@candle.pha.pa.us> writes:
> OK, I have added an elog(NOTICE) for this behavior:
>     test=> select * from pg_language;
>     lanname |lanispl|lanpltrusted|lanplcallfoid|lancompiler   
>     --------+-------+------------+-------------+--------------
>     internal|f      |f           |            0|n/a           
>     lisp    |f      |f           |            0|/usr/ucb/liszt
>     C       |f      |f           |            0|/bin/cc       
>     sql     |f      |f           |            0|postgres      
>     (4 rows)
>     test=> select pg_language.*;     
--> NOTICE:  Auto-creating query reference to table pg_language
>     lanname |lanispl|lanpltrusted|lanplcallfoid|lancompiler   
>     --------+-------+------------+-------------+--------------
>     internal|f      |f           |            0|n/a           
>     lisp    |f      |f           |            0|/usr/ucb/liszt
>     C       |f      |f           |            0|/bin/cc       
>     sql     |f      |f           |            0|postgres      
>     (4 rows)

> I am interested in comments about my wording, and this behavour.

I think this is a fine idea, but that wording might not be much help
to novices.  Maybe "Adding missing FROM-clause entry for pg_language",
or some such?

In a situation where you've got subselects, it may not be immediately
obvious which FROM list the entry got added to.  I can't think of any
simple way of identifying that, however :-(
        regards, tom lane


Re: [SQL] table aliasing problem with 6.5...

From
Bruce Momjian
Date:
> Bruce Momjian <maillist@candle.pha.pa.us> writes:
> > OK, I have added an elog(NOTICE) for this behavior:
>     
> >     test=> select * from pg_language;
> >     lanname |lanispl|lanpltrusted|lanplcallfoid|lancompiler   
> >     --------+-------+------------+-------------+--------------
> >     internal|f      |f           |            0|n/a           
> >     lisp    |f      |f           |            0|/usr/ucb/liszt
> >     C       |f      |f           |            0|/bin/cc       
> >     sql     |f      |f           |            0|postgres      
> >     (4 rows)
>     
> >     test=> select pg_language.*;     
> --> NOTICE:  Auto-creating query reference to table pg_language
> >     lanname |lanispl|lanpltrusted|lanplcallfoid|lancompiler   
> >     --------+-------+------------+-------------+--------------
> >     internal|f      |f           |            0|n/a           
> >     lisp    |f      |f           |            0|/usr/ucb/liszt
> >     C       |f      |f           |            0|/bin/cc       
> >     sql     |f      |f           |            0|postgres      
> >     (4 rows)
> 
> > I am interested in comments about my wording, and this behavour.
> 
> I think this is a fine idea, but that wording might not be much help
> to novices.  Maybe "Adding missing FROM-clause entry for pg_language",
> or some such?

Yes, I like that much better.  Let me make the change.

> In a situation where you've got subselects, it may not be immediately
> obvious which FROM list the entry got added to.  I can't think of any
> simple way of identifying that, however :-(


--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [SQL] table aliasing problem with 6.5...

From
Bruce Momjian
Date:
> > I am interested in comments about my wording, and this behavour.
> 
> I think this is a fine idea, but that wording might not be much help
> to novices.  Maybe "Adding missing FROM-clause entry for pg_language",
> or some such?

Wording change to your version.

> 
> In a situation where you've got subselects, it may not be immediately
> obvious which FROM list the entry got added to.  I can't think of any
> simple way of identifying that, however :-(

Not sure how to handle that either.  I could print the subquery level
number, or I could say "in subquery" or even go fancy and do "in
sub-sub-query" depending on the number of levels down.

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [SQL] table aliasing problem with 6.5...

From
Tom Lane
Date:
Bruce Momjian <maillist@candle.pha.pa.us> writes:
>> In a situation where you've got subselects, it may not be immediately
>> obvious which FROM list the entry got added to.  I can't think of any
>> simple way of identifying that, however :-(

> Not sure how to handle that either.  I could print the subquery level
> number, or I could say "in subquery" or even go fancy and do "in
> sub-sub-query" depending on the number of levels down.

That seems like a good idea; it won't help tell the difference between
two subqueries at the same level, but in a lot of practical cases it
would tell you what you needed to know, and it won't confuse a novice.

I like the "... in subquery", "... in sub-subquery", etc wording.

BTW, Jan was complaining that a number of the regress tests now "fail"
because they provoke this message.  Should we just update the expected
outputs, or should we change the tests not to use the feature?
        regards, tom lane


Re: [SQL] table aliasing problem with 6.5...

From
Bruce Momjian
Date:
> Bruce Momjian <maillist@candle.pha.pa.us> writes:
> >> In a situation where you've got subselects, it may not be immediately
> >> obvious which FROM list the entry got added to.  I can't think of any
> >> simple way of identifying that, however :-(
> 
> > Not sure how to handle that either.  I could print the subquery level
> > number, or I could say "in subquery" or even go fancy and do "in
> > sub-sub-query" depending on the number of levels down.
> 
> That seems like a good idea; it won't help tell the difference between
> two subqueries at the same level, but in a lot of practical cases it
> would tell you what you needed to know, and it won't confuse a novice.
> 
> I like the "... in subquery", "... in sub-subquery", etc wording.
> 
> BTW, Jan was complaining that a number of the regress tests now "fail"
> because they provoke this message.  Should we just update the expected
> outputs, or should we change the tests not to use the feature?

I asked Thomas to regenerate the expected output.  I have too many other
differences to patch the files.  I can disable it for the time being if
Jan wants me to.

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [SQL] table aliasing problem with 6.5...

From
Bruce Momjian
Date:
> Bruce Momjian <maillist@candle.pha.pa.us> writes:
> >> In a situation where you've got subselects, it may not be immediately
> >> obvious which FROM list the entry got added to.  I can't think of any
> >> simple way of identifying that, however :-(
> 
> > Not sure how to handle that either.  I could print the subquery level
> > number, or I could say "in subquery" or even go fancy and do "in
> > sub-sub-query" depending on the number of levels down.
> 
> That seems like a good idea; it won't help tell the difference between
> two subqueries at the same level, but in a lot of practical cases it
> would tell you what you needed to know, and it won't confuse a novice.
> 
> I like the "... in subquery", "... in sub-subquery", etc wording.
> 
> BTW, Jan was complaining that a number of the regress tests now "fail"
> because they provoke this message.  Should we just update the expected
> outputs, or should we change the tests not to use the feature?

I just looked at it, and I woild be required to loop up through the
pstate->parentParseState counting how many times I can continue going
up, and creating an elog string on the fly to print.  Doesn't seem worth
it, does it?

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026
 


Re: [SQL] table aliasing problem with 6.5...

From
Tom Lane
Date:
Bruce Momjian <maillist@candle.pha.pa.us> writes:
> I just looked at it, and I woild be required to loop up through the
> pstate->parentParseState counting how many times I can continue going
> up, and creating an elog string on the fly to print.  Doesn't seem worth
> it, does it?

It's probably worth saying "in subquery" or not, but if you don't feel
like determining the exact depth then never mind the sub-subquery bit...
        regards, tom lane


Re: [SQL] table aliasing problem with 6.5...

From
Bruce Momjian
Date:
> Bruce Momjian <maillist@candle.pha.pa.us> writes:
> > I just looked at it, and I woild be required to loop up through the
> > pstate->parentParseState counting how many times I can continue going
> > up, and creating an elog string on the fly to print.  Doesn't seem worth
> > it, does it?
> 
> It's probably worth saying "in subquery" or not, but if you don't feel
> like determining the exact depth then never mind the sub-subquery bit...
> 
>             regards, tom lane
> 

Good idea.  I have:

---------------------------------------------------------------------------

test=> select pg_language.*;
NOTICE:  Adding missing FROM-clause entry for table pg_language
lanname |lanispl|lanpltrusted|lanplcallfoid|lancompiler   
--------+-------+------------+-------------+--------------
internal|f      |f           |            0|n/a           
lisp    |f      |f           |            0|/usr/ucb/liszt
C       |f      |f           |            0|/bin/cc       
sql     |f      |f           |            0|postgres      
(4 rows)

test=> select * from pg_language a where lanname in (select pg_language.lanname);
NOTICE:  Adding missing FROM-clause entry in subquery for table pg_language
lanname |lanispl|lanpltrusted|lanplcallfoid|lancompiler   
--------+-------+------------+-------------+--------------
internal|f      |f           |            0|n/a           
lisp    |f      |f           |            0|/usr/ucb/liszt
C       |f      |f           |            0|/bin/cc       
sql     |f      |f           |            0|postgres      
(4 rows)

--  Bruce Momjian                        |  http://www.op.net/~candle maillist@candle.pha.pa.us            |  (610)
853-3000+  If your life is a hard drive,     |  830 Blythe Avenue +  Christ can be your backup.        |  Drexel Hill,
Pennsylvania19026