Thread: Getting statistics from tables
Hi all: Let's say I have a table that keeps track of the pages that an user has seen in my website, something like: Users | Pages ----------------------------- pepe | index.html johnsen | about.html paco | about.html paco | index.html pepe | download.html It's easy to see the last 5, or 10, or 15 pages that an user has seen, or the last users that have viewed a page. But how can I find out the 5 most seen pages, or the 5 most frequent users? My first thought (to find the most viewed pages) was to: 1) "SELECT DISTINCT pages FROM <table>" 2) For each page, "SELECT count(users) FROM <table> WHERE pages='<page>'" 3) Find out the 5 pages that have returned the highest counts. At which point I imagined the perfomance problems that all the above would cause in a dynamic page (PHP) and said to myself "ew!!". Is there any other way? Any built-in function in Postgres that can do this, perhaps? (I used pageviews just as an example; what I have is a number of similar situations where I need to find out this kind of data, and in all of them said data is updated dynamically and needs to be displayed in real time). Paulo Jan. DDnet.
> Users | Pages > ----------------------------- > pepe | index.html > johnsen | about.html > paco | about.html > paco | index.html > pepe | download.html > > > It's easy to see the last 5, or 10, or 15 pages that an user has seen, > or the last users that have viewed a page. But how can I find out the 5 > most seen pages It's an easier problem than you think. =) select pages, count(*) from table group by pages order by count(*) desc limit 5;
How about: SELECT pages FROM <table> GROUP BY pages ORDER BY count(*) DESC LIMIT 5; and SELECT users FROM <table> GROUP BY users ORDER BY count(*) DESC LIMIT 5; Make sure the table is index on pages and on users. JLL Paulo Jan wrote: > > Hi all: > > Let's say I have a table that keeps track of the pages that an user has > seen in my website, something like: > > Users | Pages > ----------------------------- > pepe | index.html > johnsen | about.html > paco | about.html > paco | index.html > pepe | download.html > > It's easy to see the last 5, or 10, or 15 pages that an user has seen, > or the last users that have viewed a page. But how can I find out the 5 > most seen pages, or the 5 most frequent users? My first thought (to find > the most viewed pages) was to: > > 1) "SELECT DISTINCT pages FROM <table>" > 2) For each page, "SELECT count(users) FROM <table> WHERE > pages='<page>'" > 3) Find out the 5 pages that have returned the highest counts. > > At which point I imagined the perfomance problems that all the above > would cause in a dynamic page (PHP) and said to myself "ew!!". > Is there any other way? Any built-in function in Postgres that can do > this, perhaps? > (I used pageviews just as an example; what I have is a number of > similar situations where I need to find out this kind of data, and in > all of them said data is updated dynamically and needs to be displayed > in real time). > > Paulo Jan. > DDnet. > > ---------------------------(end of broadcast)--------------------------- > TIP 4: Don't 'kill -9' the postmaster
Paulo, Try something like: SELECT Users, count(*) AS User_Count FROM Table GROUP BY Users to see the number of pages/user, or SELECT Pages, count(*) AS Page_Count FROM Table GROUP BY Pages to see the number of users/page. You can add a limit as well, if you only want the top x listed; SELECT Users, count(*) AS User_Count FROM Table GROUP BY Users ORDER BY count(*) DESC LIMIT 5; Thanks, Peter Darley -----Original Message----- From: pgsql-general-owner@postgresql.org [mailto:pgsql-general-owner@postgresql.org]On Behalf Of Paulo Jan Sent: Tuesday, April 09, 2002 10:01 AM To: pgsql-general@postgresql.org Subject: [GENERAL] Getting statistics from tables Hi all: Let's say I have a table that keeps track of the pages that an user has seen in my website, something like: Users | Pages ----------------------------- pepe | index.html johnsen | about.html paco | about.html paco | index.html pepe | download.html It's easy to see the last 5, or 10, or 15 pages that an user has seen, or the last users that have viewed a page. But how can I find out the 5 most seen pages, or the 5 most frequent users? My first thought (to find the most viewed pages) was to: 1) "SELECT DISTINCT pages FROM <table>" 2) For each page, "SELECT count(users) FROM <table> WHERE pages='<page>'" 3) Find out the 5 pages that have returned the highest counts. At which point I imagined the perfomance problems that all the above would cause in a dynamic page (PHP) and said to myself "ew!!". Is there any other way? Any built-in function in Postgres that can do this, perhaps? (I used pageviews just as an example; what I have is a number of similar situations where I need to find out this kind of data, and in all of them said data is updated dynamically and needs to be displayed in real time). Paulo Jan. DDnet. ---------------------------(end of broadcast)--------------------------- TIP 4: Don't 'kill -9' the postmaster