Thread: Group by problem!

Group by problem!

From
shahrzad khorrami
Date:

hi all,




Column |         Type          |                     Modifiers                    
--------+-----------------------+---------------------------------------------------
 id     | integer               | not null default nextval('test_id_seq'::regclass)
 f1     | character varying(32) |
 f3     | character varying(32) |
 f4     | character varying(32) |
 f5     | character varying(32) |
 f6     | character varying(32) |
 f7     | character varying(32) |
 f8     | character varying(32) |
 f9     | character varying(32) |
 f11    | character varying(32) |
 f12    | character varying(32) |
 f13    | character varying(32) |
 f14    | character varying(32) |
 f2     | character varying(32) |
 f10    | character varying(32) |

512 | 432350221818600,355801020050525 | A  | 43.28 | N  | -80.07 | E  | 08 | 0.05 | 1381.0 | 48.28 | 1/4042 | 24/0 | 082234 | 121109
513 | 432350221818600,355801020050525 | A  | 43.28 | N  | -80.07 | E  | 08 | 0.05 | 1381.0 | 48.28 | 1/4042 | 24/0 | 082234 | 121109
521 | 432350221818600,355801020050524 | A  | 43.28 | N  | -80.07 | E  | 08 | 0.05 | 1381.0 | 48.28 | 1/4042 | 24/0 | 192213 | 121109
523 | 432350221818600,355801020050524 | A  | 43.28 | N  | -80.07 | E  | 08 | 0.05 | 1381.0 | 48.28 | 1/4042 | 24/0 | 092213 | 121109
577 | 432350221818600,355801020050525 | A  | 43.28 | N  | -80.07 | E  | 08 | 0.05 | 1381.0 | 48.28 | 1/4042 | 24/0 | 082234 | 121109
...

I want to have a query that find the records of each client that are in ('')  with the maximum f2(time) and f10(date) and if they were max and we have two similar records, it return a record with higher id
(it means for each client that are in ('','','',....) give us one record that have maximum date and time)

according to the above data, I want this result:
577 | 432350221818600,355801020050525 | A  | 43.28 | N  | -80.07 | E  | 08 | 0.05 | 1381.0 | 48.28 | 1/4042 | 24/0 | 082234 | 121109
521 | 432350221818600,355801020050524 | A  | 43.28 | N  | -80.07 | E  | 08 | 0.05 | 1381.0 | 48.28 | 1/4042 | 24/0 | 192213 | 121109

----------------------------------------------------------------------------------------------------------------------
what Do I...>>


select * from test where id in( SELECT id FROM test
WHERE f1 in  ('432350221818600,355801020050524','432350221818600,355801020050525')
GROUP BY f1
HAVING MAX(f10::int)>1 and MAX(f2::int)>1);

but this query raised an error:

ERROR:  column "test.id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: select * from test where id in( SELECT id FROM test

but if I change Group BY f1,id , the result will be wrong , what can I do? or if I MAX(id).... result is wrong to :
577 | 432350221818600,355801020050525 | A  | 43.28 | N  | -80.07 | E  | 08 | 0.05 | 1381.0 | 48.28 | 1/4042 | 24/0 | 082234 | 121109
523 | 432350221818600,355801020050524 | A  | 43.28 | N  | -80.07 | E  | 08 | 0.05 | 1381.0 | 48.28 | 1/4042 | 24/0 | 092213 | 121109


Thanks in advace

--
Shahrzad Khorrami

Re: Group by problem!

From
Grzegorz Jaśkiewicz
Date:


On Wed, Nov 4, 2009 at 9:11 AM, shahrzad khorrami <shahrzad.khorrami@gmail.com> wrote:

hi all,




Column |         Type          |                     Modifiers                    
--------+-----------------------+---------------------------------------------------
 id     | integer               | not null default nextval('test_id_seq'::regclass)
 f1     | character varying(32) |
 f3     | character varying(32) |
 f4     | character varying(32) |
 f5     | character varying(32) |
 f6     | character varying(32) |
 f7     | character varying(32) |
 f8     | character varying(32) |
 f9     | character varying(32) |
 f11    | character varying(32) |
 f12    | character varying(32) |
 f13    | character varying(32) |
 f14    | character varying(32) |
 f2     | character varying(32) |
 f10    | character varying(32) |

512 | 432350221818600,355801020050525 | A  | 43.28 | N  | -80.07 | E  | 08 | 0.05 | 1381.0 | 48.28 | 1/4042 | 24/0 | 082234 | 121109
513 | 432350221818600,355801020050525 | A  | 43.28 | N  | -80.07 | E  | 08 | 0.05 | 1381.0 | 48.28 | 1/4042 | 24/0 | 082234 | 121109
521 | 432350221818600,355801020050524 | A  | 43.28 | N  | -80.07 | E  | 08 | 0.05 | 1381.0 | 48.28 | 1/4042 | 24/0 | 192213 | 121109
523 | 432350221818600,355801020050524 | A  | 43.28 | N  | -80.07 | E  | 08 | 0.05 | 1381.0 | 48.28 | 1/4042 | 24/0 | 092213 | 121109
577 | 432350221818600,355801020050525 | A  | 43.28 | N  | -80.07 | E  | 08 | 0.05 | 1381.0 | 48.28 | 1/4042 | 24/0 | 082234 | 121109
...

select * from test where id in( SELECT id FROM test
WHERE f1 in  ('432350221818600,355801020050524','432350221818600,355801020050525')
GROUP BY f1
HAVING MAX(f10::int)>1 and MAX(f2::int)>1);

but this query raised an error:

ERROR:  column "test.id" must appear in the GROUP BY clause or be used in an aggregate function
LINE 1: select * from test where id in( SELECT id FROM test


it is quite obvious, isn't it ?
btw, is that db a joke ?
You obviously don't expect that to be fast, ever.

--
GJ

Re: Group by problem!

From
Sam Mason
Date:
On Wed, Nov 04, 2009 at 12:41:25PM +0330, shahrzad khorrami wrote:
>  Column |         Type          |
>  id     | integer               | not null default
>  f1     | character varying(32) |
>  f3     | character varying(32) |
>  f4     | character varying(32) |
>  f5     | character varying(32) |
>  f6     | character varying(32) |
>  f7     | character varying(32) |
>  f8     | character varying(32) |
>  f9     | character varying(32) |
>  f11    | character varying(32) |
>  f12    | character varying(32) |
>  f13    | character varying(32) |
>  f14    | character varying(32) |
>  f2     | character varying(32) |
>  f10    | character varying(32) |

I'd highly recommend giving these columns more human readable names and
appropriate data types.  You suggest that f2 is a time and f10 a date,
with their values being "082234" and "121109" respectively.  PG will be
able to help you *much* more if you do this.  For example, I'd combine
those two fields into a "timestamp" field[1] and I'd guess its value would
be:

  2009-11-12 08:22:34

there's a function called to_timestamp[2] that would help a lot here.
For example, you can run:

  ALTER TABLE test ADD COLUMN datetime TIMESTAMP;

to add in a new column called "datetime" (I'm sure you can think of a
better name, for example creationtime or entrydate or similar).  Then
you need to give it values:

  UPDATE test SET datetime = to_timestamp(f10||f2,'ddmmyyhh24miss');

This will combine the opaque (to the database) text fields into
something that will give PG something to get its hands on.  I'd
similarly store the numeric fields in appropriate types (i.e. numeric,
integer or float8).  Note that || appends two strings (i.e. text or
your varchar fields) onto the the end of each other and its behavior is
documented in [3].

Once you've done that we'll have more of a chance of helping you! :)

--
  Sam  http://samason.me.uk/

 [1] http://www.postgresql.org/docs/current/static/datatype-datetime.html
 [2] http://www.postgresql.org/docs/current/static/functions-formatting.html
 [3] http://www.postgresql.org/docs/current/static/functions-string.html