Thread: Problem with group by command

Problem with group by command

From
"Jessica Ord"
Date:
Dear PGSQL-BUG group,

I would be grateful if any of you could provide me any information that you
may have.

I have written a perl program which runs fine on a machine that was using
PostgreSQL 6.4.2.  I setup another machine which was running PostgreSQL
6.5.3 bundled by the Linux 6.2.  I used the copy command to migrate most of
the tables on the new machine and re-run the same program.

The same error message keeps on appear said that:
Invalid use of aggregates or non-group column in target list.

The SQL statement used is:
select username, user_index, date, status from subscribers, cellnet_mesg_log
where status != 3 and date >= '25/10/2000' and date < '01/11/2000' and
cellnet_mesg_log.user_index = subscribers.oid group by username;

The 'group by' works in the new machine only if the selected data contains
username which is so weird.  I wonder whether some of the attributes might
have been changed since I did not restore the same tables from pg_dumpall
command.
If that is the case, I wonder if I could manually compare and reset the
attributes.
Unfortunately, I couldn't find this information from your web page.

The tables structure are as below.  The cellnet_mesg_log's user_index is map
with the oid of the subscribers table.

Table    = subscribers
+----------------------------------+----------------------------+
|Field                             |Type                | Length|
+----------------------------------+--------------------+-------+
| username                         | varchar()          |    32 |
| acct_no                          | int4               |     4 |
| acct_type                        | int4               |     4 |
| build_factor                     | int4               |     4 |
+----------------------------------+-----------------------------


Table    = cellnet_mesg_log
+----------------------------------+----------------------------------+-----
--+
|              Field               |              Type                |
Length|
+----------------------------------+----------------------------------+-----
--+
| user_index                       | oid                              |
4 |
| date                             | datetime                         |
8 |
| sender                           | varchar()                        |
64 |
| part                             | int4                             |
4 |
| part_total                       | int4                             |
4 |
| status                           | int4                             |
4 |
| mesg_id                          | varchar()                        |
64 |
| message                          | varchar()                        |
160 |
+----------------------------------+----------------------------------+-----
--+

Hope to hear from any of you soon.  Thank you for your kind attention.


regards,
Jessica Ord

Re: Problem with group by command

From
Stephan Szabo
Date:
On Fri, 27 Oct 2000, Jessica Ord wrote:

> I would be grateful if any of you could provide me any information that you
> may have.
>
> I have written a perl program which runs fine on a machine that was using
> PostgreSQL 6.4.2.  I setup another machine which was running PostgreSQL
> 6.5.3 bundled by the Linux 6.2.  I used the copy command to migrate most of
> the tables on the new machine and re-run the same program.

You'll probably be best off getting the 7.0.x rpms and using those instead
of 6.5.x (see below).

> The same error message keeps on appear said that:
> Invalid use of aggregates or non-group column in target list.
>
> The SQL statement used is:
> select username, user_index, date, status from subscribers, cellnet_mesg_log
> where status != 3 and date >= '25/10/2000' and date < '01/11/2000' and
> cellnet_mesg_log.user_index = subscribers.oid group by username;

Well, I don't believe that's a legal SQL statement.  All columns in the
select list of a group by query must either be grouped or inside a set
value function (min, max, etc...).  I think old versions of postgres got
this wrong, so you may be seeing fallout from that.
Fundamentally, *which* value of date and status would you want to give
if there was more than one for a given username?  If there's only one,
using group by is useless, if there's more than one, the query gives
indeterminate results.

Under 7.0.x you can use distinct on and order by to give repeatable
results (if you order by all the fields, and use distinct on (username)
I believe you'd get the first/last row by the ordering criteria that
matched each username)

Re: Problem with group by command

From
Tom Lane
Date:
"Jessica Ord" <jo@uk.xo.com> writes:
> I have written a perl program which runs fine on a machine that was using
> PostgreSQL 6.4.2.  I setup another machine which was running PostgreSQL
> 6.5.3 bundled by the Linux 6.2.

> The same error message keeps on appear said that:
> Invalid use of aggregates or non-group column in target list.

> select username, user_index, date, status from subscribers, cellnet_mesg_log
> where status != 3 and date >= '25/10/2000' and date < '01/11/2000' and
> cellnet_mesg_log.user_index = subscribers.oid group by username;

This statement is not valid SQL, but 6.4.* failed to detect that, and
instead gave you some randomly-chosen result.  6.5 is pickier.

You need to revise the query to conform to SQL specs: the SELECT output
list can't refer to columns that aren't grouped by GROUP BY, except as
arguments of an aggregate function like MIN() or MAX().  The reason is
that GROUP BY implies that only one output tuple will be produced per
distinct value of the group-by column(s).  There might be more than one
input tuple, with different values of user_index, date, etc, for each
value of username --- in which case, what output do you expect to get?
It's not well-defined.

            regards, tom lane