Thread: Trying to find miss and mister of the last month with highest rating

Trying to find miss and mister of the last month with highest rating

From
Alexander Farber
Date:
Hello,

at my website users can rate each other:

# select id, nice, last_rated from pref_rep where nice=true
  order by last_rated desc limit 7;
           id           | nice |         last_rated
------------------------+------+----------------------------
 OK152565298368         | t    | 2011-07-07 14:26:38.325716
 OK452217781481         | t    | 2011-07-07 14:26:10.831353
 OK524802920494         | t    | 2011-07-07 14:25:28.961652
 OK348972427664         | t    | 2011-07-07 14:25:17.214928
 DE11873                | t    | 2011-07-07 14:25:05.303104
 OK335285460379         | t    | 2011-07-07 14:24:39.062652
 OK353639875983         | t    | 2011-07-07 14:23:33.811986

And I know their gender:

# select id, female from pref_users limit 7;
       id       | female
----------------+--------
 OK351636836012 | f
 OK366097485338 | f
 OK251293359874 | t
 OK7848446207   | f
 OK335478250992 | t
 OK355400714550 | f
 OK146955222542 | t

I'm trying to construct 2 queries -
one to find the female user with
highest count of ratings for the last month
(not just for the last 30 days -
and this condition is already killing me)
and the same for non-female users.

Any help please? SQL is so hard sometimes.

Regards
Alex,

using PostgreSQL 8.4.8 / CentOS 5.6

Re: Trying to find miss and mister of the last month with highest rating

From
"David Johnston"
Date:
-----Original Message-----
From: pgsql-general-owner@postgresql.org
[mailto:pgsql-general-owner@postgresql.org] On Behalf Of Alexander Farber
Sent: Thursday, July 07, 2011 9:54 AM
To: pgsql-general
Subject: [GENERAL] Trying to find miss and mister of the last month with
highest rating

# select id, nice, last_rated from pref_rep where nice=true
  order by last_rated desc limit 7;
           id           | nice |         last_rated
------------------------+------+----------------------------
 OK152565298368         | t    | 2011-07-07 14:26:38.325716
 OK452217781481         | t    | 2011-07-07 14:26:10.831353
 OK524802920494         | t    | 2011-07-07 14:25:28.961652
 OK348972427664         | t    | 2011-07-07 14:25:17.214928
 DE11873                | t    | 2011-07-07 14:25:05.303104
 OK335285460379         | t    | 2011-07-07 14:24:39.062652
 OK353639875983         | t    | 2011-07-07 14:23:33.811986

And I know their gender:

# select id, female from pref_users limit 7;
       id       | female
----------------+--------
 OK351636836012 | f
 OK366097485338 | f

I'm trying to construct 2 queries -
one to find the female user with
highest count of ratings for the last month (not just for the last 30 days -
and this condition is already killing me) and the same for non-female users.

Any help please? SQL is so hard sometimes.


>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>

For the dates you basically need to figure out the correct year, month and
day values to represent the prior month using the current month as a base
(then build a "date string" and cast it to an actual date).  Hint; the last
day of the prior month is one day before the first day of the current month.
Use a WITH or sub-query to select only ratings between the dates while
joining the gender table.  You can also perform your "COUNT(*) at this level
and group by ID, Gender.

In the main query try to use the "RANK()" window function with an ORDER BY
on the "count" field and partitioned by "gender".  You can probably put this
in the HAVING clause and check for "(RANK(*) OVER ...) = 1"

Not totally sure on the syntax and don't have time to load up some test data
and try different permutations but this should at least get you headed in
the right direction if no-one else comes along and provides a more detailed
explanation.

David J.





Re: Trying to find miss and mister of the last month with highest rating

From
Alexander Farber
Date:
Hello,

I will do 2 queries -
one for female users (to find "the miss of last month)
and one for males (the "mister of last month").

Here I can fetch all females rated nicely in June:

# select r.id, nice, r.last_rated
from pref_rep r, pref_users u
where r.nice=true and
to_char(current_timestamp - interval '1 month', 'IYYY-MM') =
to_char(r.last_rated, 'IYYY-MM') and
u.female=true and
r.id=u.id;
 OK475705800909         | t    | 2011-06-15 09:34:29.527786
 DE8890                 | t    | 2011-06-21 14:27:12.442744
 OK332253578018         | t    | 2011-06-01 01:13:06.767902
 OK147226095421         | t    | 2011-06-21 11:01:58.151309
 VK56919399             | t    | 2011-06-25 10:47:52.057593
 VK4123791              | t    | 2011-06-17 22:44:38.763625
 OK259892905389         | t    | 2011-06-04 20:12:43.54472
 MR13003057189952933403 | t    | 2011-06-13 21:38:16.935786

Do you think it's a good query?
(takes few seconds here)

What to do next to pick up
the person having most ratings?

(if there are several persons
having same amount - then I'd like
to pick 1 random - my website is so
obscure, that the users won't notice :-)

And I understand that it would be
most effective to run this query just
once on the 1st of the month,
but I'm too lazy to maintain the
cache files/data, so I want to re-run
query every time the script (actually
going to be a Drupal 7.4 block) runs
(I've switched the hourly block caching on).

Thank you
Alex

Re: Trying to find miss and mister of the last month with highest rating

From
Alexander Farber
Date:
Do you think this query is good?
(or is it allocating loads
of strings for the month comparisons?)

# select r.id, count(r.id)
from pref_rep r, pref_users u
where r.nice=true and
to_char(current_timestamp - interval '1 month', 'IYYY-MM') =
to_char(r.last_rated, 'IYYY-MM') and
u.female=true and
r.id=u.id
group by r.id
order by count desc
limit 7;
       id       | count
----------------+-------
 OK348033534186 |    49
 OK145143239265 |    46
 OK4087658302   |    41
 DE11370        |    36
 DE11467        |    36
 OK351488505084 |    35
 OK524565727413 |    33
(7 rows)

(I'll just change "limit 7" to "limit 1"
above to pick the "miss of ls month")

And why can't I add u.name, u.avatar
to fetch all the info I need in 1 pass?

# select r.id, count(r.id), u.name, u.avatar, u.city
from pref_rep r, pref_users u
where r.nice=true and
to_char(current_timestamp - interval '1 month', 'IYYY-MM') =
to_char(r.last_rated, 'IYYY-MM') and
u.female=true and
r.id=u.id
group by r.id
order by count desc
limit 7;
ERROR:  column "u.*" must appear in the GROUP BY clause or be used in
an aggregate function
LINE 1: select r.id, count(r.id), u.name, u.avatar, u.city
                                  ^

Is there a way to workaround it?

Regards
Alex

Re: Trying to find miss and mister of the last month with highest rating

From
Alexander Farber
Date:
This seems to work, but I wonder
if my query for "the miss of the last month"
could be improved

# select r.id, count(r.id), u.first_name, u.avatar, u.city
from pref_rep r, pref_users u
where r.nice=true and
to_char(current_timestamp - interval '1 month', 'IYYY-MM') =
to_char(r.last_rated, 'IYYY-MM') and
u.female=true and
r.id=u.id
group by r.id , u.first_name, u.avatar, u.city
order by count desc
limit 1;
       id       | count | first_name |                                 avatar
             |   city
----------------+-------+------------+-----------------------------------------------------------
-------------+-----------
 OK348033534186 |    49 | Елена      |
http://i398.odnoklassniki.ru/getImage?photoId=194373317258
&photoType=0 | Хабаровск
(1 row)

(I'm sorry, I'm probably asking same
questions again and again and
not even not noticing it.
SQL is a tough language for me)

Should I maybe better use
date_trunc( 'month', now() ) - '1 MONTH'::INTERVAL
instead of comparing to_char() results?

Thank you
Alex

Re: Trying to find miss and mister of the last month with highest rating

From
"David Johnston"
Date:
-----Original Message-----

And why can't I add u.name, u.avatar
to fetch all the info I need in 1 pass?

# select r.id, count(r.id), u.name, u.avatar, u.city from pref_rep r,
pref_users u where r.nice=true and to_char(current_timestamp - interval '1
month', 'IYYY-MM') = to_char(r.last_rated, 'IYYY-MM') and u.female=true and
r.id=u.id group by r.id order by count desc limit 7;
ERROR:  column "u.*" must appear in the GROUP BY clause or be used in an
aggregate function LINE 1: select r.id, count(r.id), u.name, u.avatar,
u.city
                                  ^

Is there a way to workaround it?

>>>>>>>>>>>>>>>>>>>>

I believe you need to put 'name' in quotes ( like u."name" )

The fact that the warning indicates "u.*" where you didn't use "u.*"
anywhere in your literal syntax means that PostgreSQL is interpreting
something funny.  Trial and error should have narrowed down the options if
you didn't catch that "name" is so common as to likely be utilized by the
database.

Going from memory here...

David J.