Thread: prob with aggregate and group by - returns multiples
I have a group by that groups some, but not all, identical rows. Here are the details: I can reproduce the problem using one single table, details below; Table = hits +----------------------------------+----------------------------------+----- --+ | Field | Type | Length| +----------------------------------+----------------------------------+----- --+ | userid | varchar | 12 | | dat | date | 4 | | tim | time | 8 | | ipa | int4 | 4 | | ipb | int4 | 4 | | ipc | int4 | 4 | | ipd | int4 | 4 | | site | varchar | 50 | +----------------------------------+----------------------------------+----- --+ I want a report showing how many occurrences of "site" there are for each distinct "site". There is a lot of data, in there, so I'll look at a particular example of the problem. When I run this query: select count(site), site from hitsgroup by site; The output contains lines like the following. Note that these are all consecutive in the output. 2|xlink.zdnet.com 2|xlink.zdnet.com 1|xlink.zdnet.com 2|xlink.zdnet.com 2|xlink.zdnet.com 2|xlink.zdnet.com 3|xlink.zdnet.com 2|xlink.zdnet.com 3|xlink.zdnet.com 2|xlink.zdnet.com 1|xlink.zdnet.com Suspecting that there were differences in each xlink.zdnet.com, I counted just them: web=> select count (*) from hits where site='xlink.zdnet.com'; count ----- 22 (1 row) So, all 22 xlink.zdnet.com are selected above, but they have not grouped in the previous one. Any ideas why? On a lesser note: I tried "select * into temp from hits" as per the doco, but it barfs. Looks like the syntax has changed. Any current "railway" diagrams for the commands anywhere? Also, where can I find out about other environment settings like PGDATESTYLE? I know about that one, but what others are there? I'm trying to port from Oracle. Any guides? There are many differences in the SQL. ************************************************************** The information contained in this E-Mail is confidential and is intended only for the use of the addressee(s). If you receive this E-Mail in error, any use, distribution or copying of this E-Mail is not permitted. You are requested to forward unwanted E-Mail and address any problems to the MIM Holdings Limited Help Desk. E-Mail: helpdesk@mim.com.au or phone: Australia 07 3833 8042. **************************************************************
George Dau <gedau@isa.mim.com.au> writes: > select count(site), site from hits group by site; > The output contains lines like the following. Note that these are all > consecutive in the output. > 2|xlink.zdnet.com > 2|xlink.zdnet.com > 1|xlink.zdnet.com > 2|xlink.zdnet.com > 2|xlink.zdnet.com > 2|xlink.zdnet.com > 3|xlink.zdnet.com > 2|xlink.zdnet.com > 3|xlink.zdnet.com > 2|xlink.zdnet.com > 1|xlink.zdnet.com > Suspecting that there were differences in each xlink.zdnet.com, I counted > just them: > web=> select count (*) from hits where site='xlink.zdnet.com'; > count > ----- > 22 > (1 row) > So, all 22 xlink.zdnet.com are selected above, but they have not grouped in > the previous one. Any ideas why? Wow, that is bizarre. My first thought was that you had varying numbers of trailing blanks in the "site" values, but your second example seems to disprove that theory. What Postgres version are you running, and on what platform? Can you generate a self-contained example (a script that demonstrates the error from a standing start)? I suspect you may be hitting a platform- specific porting problem, but it's just speculation unless we have a self-contained test case to try on other machines. > On a lesser note: I tried "select * into temp from hits" as per the doco, > but it barfs. Postgres thinks that TEMP is a keyword, so it won't take it as a table name unless you put quotes around it. If we have doco examples that use TEMP as a table name, they need to be changed --- do you recall where you saw that, exactly? regards, tom lane
Tom Lane writes: > > On a lesser note: I tried "select * into temp from hits" as per the doco, > > but it barfs. > > Postgres thinks that TEMP is a keyword, so it won't take it as a table > name unless you put quotes around it. This is really an unfortunate case where someone should have read the SQL standard before putting in a feature. The SQL keyword is TEMPORARY, and TEMP is really a popular name for a dummy table. I tried making TEMP a ColId but it croaks on this syntactic contruct: SELECT xxx INTO [TEMP] [TABLE] tablename which is incidentally not SQL either. If someone is interested in allowing 'temp' as an identifier, there doesn't seem to be a good way without requiring the TABLE keyword above. Would that be worth it for 7.0 maybe? Of course the documentation should be changed to TEMPORARY as well in various places. -- Peter Eisentraut Sernanders väg 10:115 peter_e@gmx.net 75262 Uppsala http://yi.org/peter-e/ Sweden
Re: [HACKERS] Re: [SQL] prob with aggregate and group by - returns multiples
From
Thomas Lockhart
Date:
> > Postgres thinks that TEMP is a keyword, so it won't take it as a table > > name unless you put quotes around it. > This is really an unfortunate case where someone should have read the SQL > standard before putting in a feature. The SQL keyword is TEMPORARY, and > TEMP is really a popular name for a dummy table. So why not yank TEMP and require TEMPORARY? Saving an extra 5 characters of typing is not a good enough reason to keep it imho, and if the SQL92 standard requires a particular form why bother extending it? A major release is a good time to adjust syntax to promote compliance... - Thomas -- Thomas Lockhart lockhart@alumni.caltech.edu South Pasadena, California
Thomas Lockhart <lockhart@alumni.caltech.edu> writes: >> This is really an unfortunate case where someone should have read the SQL >> standard before putting in a feature. The SQL keyword is TEMPORARY, and >> TEMP is really a popular name for a dummy table. > So why not yank TEMP and require TEMPORARY? Probably we ought to stop to ask why TEMP is in there to begin with; perhaps for compatibility with some other RDBMS? If not, I'd vote for pulling it out. That's a heck of a poor word to reserve. regards, tom lane
Re: [HACKERS] Re: [SQL] prob with aggregate and group by - returns multiplesh
From
Bruce Momjian
Date:
> Thomas Lockhart <lockhart@alumni.caltech.edu> writes: > >> This is really an unfortunate case where someone should have read the SQL > >> standard before putting in a feature. The SQL keyword is TEMPORARY, and > >> TEMP is really a popular name for a dummy table. > > > So why not yank TEMP and require TEMPORARY? > > Probably we ought to stop to ask why TEMP is in there to begin with; > perhaps for compatibility with some other RDBMS? It was me. Informix uses it. > > If not, I'd vote for pulling it out. That's a heck of a poor word to > reserve. I am afraid of lots of user complaints, even if we had not already used TEMP. -- Bruce Momjian | http://www.op.net/~candle pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Re: [HACKERS] Re: [SQL] prob with aggregate and group by - returns multiplesh
From
Thomas Lockhart
Date:
> > If not, I'd vote for pulling it out. That's a heck of a poor word to > > reserve. > I am afraid of lots of user complaints, even if we had not already used > TEMP. OK, but we've already got "user complaints" about TEMP being a reserved word, so that part seems to balance out. There is apparently no basis in published standards for TEMP being a reserved word. And btw it is not currently documented as a reserved word in syntax.sgml... - Thomas -- Thomas Lockhart lockhart@alumni.caltech.edu South Pasadena, California
Re: [HACKERS] Re: [SQL] prob with aggregate and group by - returns multiples
From
Peter Eisentraut
Date:
On Tue, 29 Feb 2000, Thomas Lockhart wrote: > > > Postgres thinks that TEMP is a keyword, so it won't take it as a table > > > name unless you put quotes around it. > > This is really an unfortunate case where someone should have read the SQL > > standard before putting in a feature. The SQL keyword is TEMPORARY, and > > TEMP is really a popular name for a dummy table. > > So why not yank TEMP and require TEMPORARY? Saving an extra 5 > characters of typing is not a good enough reason to keep it imho, and > if the SQL92 standard requires a particular form why bother extending > it? > > A major release is a good time to adjust syntax to promote > compliance... I've been (lightly) bashed in the past for proposing such things (see END/ABORT) but I'm with you. I think that TEMP may be far too wide-spread by now, though. -- Peter Eisentraut Sernanders vaeg 10:115 peter_e@gmx.net 75262 Uppsala http://yi.org/peter-e/ Sweden
Re: [HACKERS] Re: [SQL] prob with aggregate and group by - returnsmultiples
From
Thomas Lockhart
Date:
> > A major release is a good time to adjust syntax to promote > > compliance... > I've been (lightly) bashed in the past for proposing such things (see > END/ABORT) but I'm with you. I think that TEMP may be far too wide-spread > by now, though. Well, imho the TEMP issue is not identical to END/ABORT. For TEMP, we are unnecessarily restricting the space of possible identifiers, eliminating a common and obvious name. The fix is trivial, and the affected parties are *only* those who use temporary tables and who chose to *not* use SQL92 syntax, which was always available. - Thomas -- Thomas Lockhart lockhart@alumni.caltech.edu South Pasadena, California
At 06:48 AM 2/29/00 +0000, Thomas Lockhart wrote: >> > If not, I'd vote for pulling it out. That's a heck of a poor word to >> > reserve. >> I am afraid of lots of user complaints, even if we had not already used >> TEMP. > >OK, but we've already got "user complaints" about TEMP being a >reserved word, so that part seems to balance out. There is apparently >no basis in published standards for TEMP being a reserved word. And >btw it is not currently documented as a reserved word in >syntax.sgml... I vote for the SQL92 TEMPORARY. Let's not add a keyword that is non-standard just because one or another commercial database makes use of it, unless there's some real functionality to be gained that's not covered by the standard. TEMP is covered in SQL92 by TEMPORARY. As an example of when adopting a construct from another commercial database makes sense to me, SEQUENCE and SERIAL are both convenient means of generating unique keys that have no equivalent in the standard. - Don Baccus, Portland OR <dhogaza@pacifier.com> Nature photos, on-line guides, Pacific Northwest Rare Bird Alert Serviceand other goodies at http://donb.photo.net.
>>>> If not, I'd vote for pulling it out. That's a heck of a poor word to >>>> reserve. >> I am afraid of lots of user complaints, even if we had not already used >> TEMP. > OK, but we've already got "user complaints" about TEMP being a > reserved word, so that part seems to balance out. There is apparently > no basis in published standards for TEMP being a reserved word. And > btw it is not currently documented as a reserved word in > syntax.sgml... The real problem is not that we accept TEMP as a synonym for TEMPORARY; it is that we treat TEMP as a reserved word. What are the chances that we could make it a member of the ColId list? I am thinking that "... INTO TEMP temp" is *not* ambiguous given one token lookahead... regards, tom lane
Don Baccus <dhogaza@pacifier.com> writes: > I vote for the SQL92 TEMPORARY. Let's not add a keyword that is non-standard > just because one or another commercial database makes use of it, You're missing the point: we are not talking about *adding* a keyword, we're talking about *removing* one that we've already supported for a year or so. That changes matters considerably, IMHO. I have in fact been able to make a conflict-free grammar in which TEMP is accepted but not reserved. It requires a certain amount of redundancy in the productions (see below), but I think this is a worthwhile tradeoff for not breaking existing user code. Shall I commit this? regards, tom lane /** Redundancy here is needed to avoid shift/reduce conflicts,* since TEMP is not a reserved word. See also OptTemp.** Theresult is a cons cell (not a true list!) containing* a boolean and a table name.*/ OptTempTableName: TEMPORARY opt_table relation_name { $$ = lcons(makeInteger(TRUE), (List *) $3); } | TEMP opt_table relation_name { $$ = lcons(makeInteger(TRUE), (List *) $3); } | LOCAL TEMPORARYopt_table relation_name { $$ = lcons(makeInteger(TRUE), (List *) $4); } | LOCAL TEMP opt_tablerelation_name { $$ = lcons(makeInteger(TRUE), (List *) $4); } | GLOBAL TEMPORARY opt_tablerelation_name { elog(ERROR, "GLOBAL TEMPORARY TABLE is not currently supported"); $$ = lcons(makeInteger(TRUE), (List *) $4); } | GLOBAL TEMP opt_tablerelation_name { elog(ERROR, "GLOBAL TEMPORARY TABLE is not currently supported"); $$ = lcons(makeInteger(TRUE), (List *) $4); } | TABLE relation_name { $$ = lcons(makeInteger(FALSE), (List *) $2); } | relation_name { $$= lcons(makeInteger(FALSE), (List *) $1); } ;
Re: [HACKERS] Re: [SQL] prob with aggregate and group by - returns multiplesh
From
Peter Eisentraut
Date:
On Tue, 29 Feb 2000, Don Baccus wrote: > I vote for the SQL92 TEMPORARY. Let's not add a keyword that is non-standard > just because one or another commercial database makes use of it, unless > there's some real functionality to be gained that's not covered by the > standard. The difference is that TEMP is already a keyword since 6.5 and we're considering removing it. -- Peter Eisentraut Sernanders väg 10:115 peter_e@gmx.net 75262 Uppsala http://yi.org/peter-e/ Sweden
Re: [HACKERS] Re: [SQL] prob with aggregate and group by - returns multiplesh
From
Bruce Momjian
Date:
> > > If not, I'd vote for pulling it out. That's a heck of a poor word to > > > reserve. > > I am afraid of lots of user complaints, even if we had not already used > > TEMP. > > OK, but we've already got "user complaints" about TEMP being a > reserved word, so that part seems to balance out. There is apparently > no basis in published standards for TEMP being a reserved word. And > btw it is not currently documented as a reserved word in > syntax.sgml... OK, I certainly didn't look at the standard to when I implemented TEMP tables. In fact, I was surprised it worked considering it is just a hack on the cache code. Let's forget I made a mistake, and consider how many people are going to think they should use TEMP and how many TEMPORARY. I personally would guess TEMP and never TEMPORARY. I wonder if others would too. So are we willing to field questions from people trying to use TEMP tables and trying TEMP and not TEMPORARY. I realize the restriction on a field called TEMP, but we don't get those very often. How many people are going to guess TEMP and not TEMPORARY? Of course, as a Unix guy, I may have guessed TMP too. :-) -- Bruce Momjian | http://www.op.net/~candle pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Re: [HACKERS] Re: [SQL] prob with aggregate and group by - returns multiplesh
From
Bruce Momjian
Date:
Looks good to me. > Don Baccus <dhogaza@pacifier.com> writes: > > I vote for the SQL92 TEMPORARY. Let's not add a keyword that is non-standard > > just because one or another commercial database makes use of it, > > You're missing the point: we are not talking about *adding* a keyword, > we're talking about *removing* one that we've already supported for > a year or so. That changes matters considerably, IMHO. > > I have in fact been able to make a conflict-free grammar in which TEMP > is accepted but not reserved. It requires a certain amount of > redundancy in the productions (see below), but I think this is a > worthwhile tradeoff for not breaking existing user code. > > Shall I commit this? > > regards, tom lane > > > /* > * Redundancy here is needed to avoid shift/reduce conflicts, > * since TEMP is not a reserved word. See also OptTemp. > * > * The result is a cons cell (not a true list!) containing > * a boolean and a table name. > */ > OptTempTableName: TEMPORARY opt_table relation_name > { $$ = lcons(makeInteger(TRUE), (List *) $3); } > | TEMP opt_table relation_name > { $$ = lcons(makeInteger(TRUE), (List *) $3); } > | LOCAL TEMPORARY opt_table relation_name > { $$ = lcons(makeInteger(TRUE), (List *) $4); } > | LOCAL TEMP opt_table relation_name > { $$ = lcons(makeInteger(TRUE), (List *) $4); } > | GLOBAL TEMPORARY opt_table relation_name > { > elog(ERROR, "GLOBAL TEMPORARY TABLE is not currently supported"); > $$ = lcons(makeInteger(TRUE), (List *) $4); > } > | GLOBAL TEMP opt_table relation_name > { > elog(ERROR, "GLOBAL TEMPORARY TABLE is not currently supported"); > $$ = lcons(makeInteger(TRUE), (List *) $4); > } > | TABLE relation_name > { $$ = lcons(makeInteger(FALSE), (List *) $2); } > | relation_name > { $$ = lcons(makeInteger(FALSE), (List *) $1); } > ; > -- Bruce Momjian | http://www.op.net/~candle pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Re: [HACKERS] Re: [SQL] prob with aggregate and group by - returns multiplesh
From
Christopher Sawtell
Date:
On Wed, 01 Mar 2000, Tom Lane wrote: > Don Baccus <dhogaza@pacifier.com> writes: > > I vote for the SQL92 TEMPORARY. Let's not add a keyword that is non-standard > > just because one or another commercial database makes use of it, > > You're missing the point: we are not talking about *adding* a keyword, > we're talking about *removing* one that we've already supported for > a year or so. That changes matters considerably, IMHO. > > I have in fact been able to make a conflict-free grammar in which TEMP > is accepted but not reserved. It requires a certain amount of > redundancy in the productions (see below), but I think this is a > worthwhile tradeoff for not breaking existing user code. > > Shall I commit this? Is there not also the possibility of making this a configure-time option? ./configure --temporary-table=TEMPORARY # or TEMP as you wish Default to TEMPORARY in accord with SQL 92. -- Sincerely etc., NAME Christopher Sawtell - Support Engineer - iOpen Technologies Ltd.CELL PHONE 021 257 4451ICQ UIN 45863470EMAIL chris @ iopen . co . nz, csawtell @ xtra . co . nzCNOTES ftp://ftp.funet.fi/pub/languages/C/tutorials/sawtell_C.tar.gz ---->>> Please refrain from using HTML attachments in e-mails to me. <<<----
Re: [HACKERS] Re: [SQL] prob with aggregate and group by - returns multiplesh
From
Thomas Lockhart
Date:
> So are we willing to field questions from people trying to use TEMP > tables and trying TEMP and not TEMPORARY. I realize the restriction on > a field called TEMP, but we don't get those very often. How many people > are going to guess TEMP and not TEMPORARY? Well, that's one reason to stick to a published standard. No guessing required since one could look it up ;) - Thomas -- Thomas Lockhart lockhart@alumni.caltech.edu South Pasadena, California
Thomas Lockhart <lockhart@alumni.caltech.edu> writes: >> So are we willing to field questions from people trying to use TEMP >> tables and trying TEMP and not TEMPORARY. I realize the restriction on >> a field called TEMP, but we don't get those very often. How many people >> are going to guess TEMP and not TEMPORARY? > Well, that's one reason to stick to a published standard. No guessing > required since one could look it up ;) As of an hour ago, neither one is a reserved word ;-) regards, tom lane
Re: [HACKERS] Re: [SQL] prob with aggregate and group by - returns multiplesh
From
Bruce Momjian
Date:
> > So are we willing to field questions from people trying to use TEMP > > tables and trying TEMP and not TEMPORARY. I realize the restriction on > > a field called TEMP, but we don't get those very often. How many people > > are going to guess TEMP and not TEMPORARY? > > Well, that's one reason to stick to a published standard. No guessing > required since one could look it up ;) If you survive reading the standard. :-) -- Bruce Momjian | http://www.op.net/~candle pgman@candle.pha.pa.us | (610) 853-3000+ If your life is a hard drive, | 830 Blythe Avenue + Christ can be your backup. | Drexel Hill, Pennsylvania19026
Re: [HACKERS] Re: [SQL] prob with aggregate and group by - returns multiplesh
From
Thomas Lockhart
Date:
> As of an hour ago, neither one is a reserved word ;-) Darn, I was just warming up to a good argument :)) - Thomas -- Thomas Lockhart lockhart@alumni.caltech.edu South Pasadena, California
Re: [HACKERS] Re: [SQL] prob with aggregate and group by - returns multiplesh
From
"Moray McConnachie"
Date:
----- Original Message ----- From: Bruce Momjian <pgman@candle.pha.pa.us> To: Thomas Lockhart <lockhart@alumni.caltech.edu> Cc: Tom Lane <tgl@sss.pgh.pa.us>; 'pgsql-sql@postgresql.org' <pgsql-sql@postgreSQL.org>; PostgreSQL Development <pgsql-hackers@postgreSQL.org> Sent: Wednesday, March 01, 2000 6:18 AM Subject: Re: [HACKERS] Re: [SQL] prob with aggregate and group by - returns multiplesh > > > So are we willing to field questions from people trying to use TEMP > > > tables and trying TEMP and not TEMPORARY. I realize the restriction on > > > a field called TEMP, but we don't get those very often. How many people > > > are going to guess TEMP and not TEMPORARY? > > > > Well, that's one reason to stick to a published standard. No guessing > > required since one could look it up ;) > > If you survive reading the standard. :-) If you can even *find* the standard :-> (is there an on-line source - all links I've ever seen 404ed...?)