Re: pg, mysql comparison with "group by" clause - Mailing list pgsql-sql

From Scott Marlowe
Subject Re: pg, mysql comparison with "group by" clause
Date
Msg-id 1129228672.29961.214.camel@state.g2switchworks.com
Whole thread Raw
In response to Re: pg, mysql comparison with "group by" clause  (Greg Stark <gsstark@mit.edu>)
Responses Re: pg, mysql comparison with "group by" clause  (Tom Lane <tgl@sss.pgh.pa.us>)
List pgsql-sql
On Thu, 2005-10-13 at 13:26, Greg Stark wrote:
> Scott Marlowe <smarlowe@g2switchworks.com> writes:
> 
> > Sorry, but it's worse than that.  It is quite possible that two people
> > could run this query at the same time and get different data from the
> > same set and the same point in time.  That shouldn't happen accidentally
> > in SQL, you should know it's coming.
> 
> I'm pretty unsympathetic to the "we should make a language less powerful and
> more awkward because someone might use it wrong" argument.

I'm in favor of getting the right answer.  If my database can't do that,
then it's not worth the bits to download it.  But I'm funny that way.

> > > In standard SQL you have to
> > > write GROUP BY ... and list every single column you need from the master
> > > table. Forcing the database to do a lot of redundant comparisons and sort on
> > > uselessly long keys where in fact you only really need it to sort and group by
> > > the primary key.
> > 
> > But again, you're getting whatever row the database feels like giving
> > you.  A use of a simple, stupid aggregate like an any() aggregate would
> > be fine here, and wouldn't require a lot of overhead, and would meet the
> > SQL spec.
> 
> Great, so I have a user table with, oh, say, 40 columns. And I want to return
> all those columns plus their current account balance in a single query.
> 
> The syntax under discussion would be:
> 
> select user.*, sum(money) from user join user_money using (user_id) group by user_id

> You would prefer:
> 
> select user_id, 
>        any(username) as username, any(firstname) as firstname, 
>        any(lastname) as lastname, any(address) as address,
>        any(city) as city, any(street) as street, any(phone) as phone,
>        any(last_update) as last_update, any(last_login) as last_login,
>        any(referrer_id) as referrer_id, any(register_date) as register_date,
>        ...
>        sum(money) as balance,
>        count(money) as num_txns
>   from user join user_money using (user_id) group by user_id

I's select the SINGLE entries from a child table that matched the parent
id and add the sum(money) to it.  Then, there'd be no need for aggregate
functions on those fields, or inaccurate / possibly random data.

But I'm funny that way.

> Having a safeties is fine but when I have to disengage the safety for every
> single column it starts to get more than a little annoying. 
> 
> Note that you cannot write the above as a subquery since there are two
> aggregates. You could write it as a join against a view but don't expect to
> get the same plans from Postgres for that.

I'd just write is a big join.  Again, getting the right answer is
important to me.


pgsql-sql by date:

Previous
From: "Andy"
Date:
Subject: Re: Strange join...maybe some improvements???
Next
From: Andrew Sullivan
Date:
Subject: Re: pg, mysql comparison with "group by" clause