Thread: Database design advice
Hi all,

I'm writing a property rental web application, and one of the metrics I wish to track is the number of page views per month for each property.

I originally envisaged a table with a column for each month (one row per property), in which the relevant column would be incremented each time a property is viewed, depending on the month. But this raises questions as to the best way to maintain the table (i.e., create a new month column each month). I'd prefer to keep all logic in the database, so would rather not use a cron job to do this.

I then thought of using a trigger to check whether the relevant column existed before attempting to increment it, but that check would only be useful once a month and would just add overhead to every update thereafter. Should I just create a table with ten years' worth of columns? Is there a different alternative entirely?

None of the above strikes me as architecturally sound, so any advice from someone more seasoned in database design would be gratefully received.

Kind Regards,

Neil.
On Fri, October 21, 2005 11:47 am, Neil Saunders said:
> I originally envisaged a table with a column for each month (one row
> per property), for which the relevant column would be incremented each
> time a property is viewed, depending on the month. [...]
> Is there a different alternative entirely?

Are you tracking anything else per page/property view?

My first thought would be to have a separate table of 'view' information: when, what property, and whatever else you want to track. Then, if you want to know how many times a particular page/property was viewed, you just do a select on that table for all the records related to it during that time period and count the rows.

As a bonus, you aren't limited to any particular time interval, and can add fields in the future.

Daniel T. Staal
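[A minimal sketch of the kind of per-view table Daniel describes; the table name, columns, and example values are illustrative assumptions, not anything specified in his post.]

-- Hypothetical raw table: one row per page view.
create table property_views (
    property_id  integer                   not null,
    viewed_at    timestamp with time zone  not null default now()
    -- plus whatever else you want to track (referrer, visitor id, ...)
);

-- Page views for one property in a given month: just count the rows.
select count(*)
from property_views
where property_id = 42
  and viewed_at >= '2005-10-01'
  and viewed_at <  '2005-11-01';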
--- "Daniel T. Staal" <DStaal@usa.net> wrote:
> My first thought would be to have a separate table of 'view' information:
> When, what property, and whatever else you want to track. [...]

I'd just add that you could take the table that Daniel discusses and use it to populate a second table with addresses and monthly hits. Each address and month would be a new record in the table (as opposed to a new column).

Offhand, I see a link table connecting a table of months to a table of addresses and a table of years - but I'm not sure this would be the optimal design. This would probably only be necessary if you got to the point where your query on the main table became time consuming - and you hope to have that problem! ;-)

Good luck.
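[A minimal sketch of the simpler "one row per property per month" summary table suggested here; all names are illustrative assumptions, not from the post.]

-- Hypothetical rollup table: one row per property per month,
-- rather than one column per month.
create table property_monthly_hits (
    property_id  integer  not null,
    view_month   date     not null,   -- first day of the month
    hits         integer  not null default 0,
    primary key (property_id, view_month)
);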
I've been using MySQL for about the past year and am trying to convert to PostgreSQL now, but I don't know the simplest way to import my table data into Postgres.

There are many free tools (phpMyAdmin, SQLyog, MySQL Query Browser, etc.) for MySQL that let you import Excel, CSV, text, XML, and many other formats into the database, but I can't find anything other than the EMS Data Import tool for Postgres. Are there any free tools for Postgres that I can use for uploading data into the database?

As I am very new to Postgres, I would also appreciate correction if I am overlooking something simple or just going about populating my database the wrong way.

Thanks!
--- Wes Williams <wes_williams@fcbonline.net> wrote:
> Are there any free tools for Postgres that I can use for
> uploading data into the database? [...]

Wes, there is definitely a way to do it. I think it is through the command line, though. I had some data in an OpenOffice.org Calc spreadsheet and I imported it into a table via Cygwin (which means the command line was involved). I'm trying to research my old posts that addressed this.

Update: save your data to CSV format and then use the COPY command (COPY ... FROM, specifically) to get the job done. I had a problem with the encoding being wrong in the default CSV export from OOo; however, when I exported as ASCII/US, the COPY FROM command worked great.

Good luck.
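[A minimal sketch of the COPY approach described above; the table name, column list, and file path are assumptions for illustration only.]

-- Hypothetical target table matching the CSV's columns.
create table listings (
    id       integer,
    address  text,
    price    numeric
);

-- Server-side import; the file must be readable by the database server.
copy listings (id, address, price)
from '/tmp/listings.csv'
with csv header;

-- Or, from psql as a regular user (client-side file):
-- \copy listings (id, address, price) from 'listings.csv' with csv header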
On Oct 22, 2005, at 0:47, Neil Saunders wrote:
> I originally envisaged a table with a column for each month (one row
> per property), for which the relevant column would be incremented each
> time a property is viewed, depending on the month. [...]

I strongly suggest *not* making a column for each month, but rather a row for each month--or even a row for each page view, for greater granularity. Databases are a great way of storing raw information and transforming it into different kinds of summaries, one of which could be a cross tab of properties and the hits per month. In this case your raw data is tracking when a page view occurred, so you'd want a table that captures this information: property and timestamp (or date). Take a look at tablefunc in contrib for how to generate cross tabs. I've found them very helpful for generating this kind of summary (though in my case it's been orders per month).

Depending on your performance needs, you may want to generate an interim table that pre-calculates your totals per month. For example, if your property view tracking table is

create table property_views (
    property_id integer not null references properties (property_id)
  , view_timestamp timestamp(0) with time zone not null
) without oids;

you'd have a table

create table property_views_per_month as
select property_id
  , date_trunc('month', view_timestamp) as view_month
  , count(property_id) as view_count
from property_views
group by property_id, view_month;

You'd then generate your crosstab from the property_views_per_month table. (Of course, you'd have to drop and recreate or otherwise update this table periodically, as it doesn't capture up-to-date data.) This is an optimization step, however, so unless you find that you need the data faster, you can just use a view and generate the crosstab from that view, e.g.,

create view property_views_per_month_view as
select property_id
  , date_trunc('month', view_timestamp) as view_month
  , count(property_id) as view_count
from property_views
group by property_id, view_month;

Hope this helps!

Michael Glaesemann
grzm myrealbox com
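[A minimal sketch of the crosstab step Michael refers to, assuming the tablefunc module from contrib is installed; the hard-coded three-month output column list is purely illustrative.]

-- One row per property, one column per month. With the single-argument
-- form of crosstab, values are filled left to right in the order
-- returned by the source query.
select *
from crosstab(
    'select property_id, view_month, view_count
     from property_views_per_month
     order by property_id, view_month'
) as hits_per_month(
    property_id integer
  , month_1 bigint
  , month_2 bigint
  , month_3 bigint
);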
Apologies for the delayed reply - thank you all for the advice.

I'll get on to installing crosstab tonight. I really like the idea of dumping each page view to a table and creating a view to present that data, but since this data will be used to render a graph every time a user logs in, I've got this feeling that it will prove a bottleneck if (when!!) the site becomes successful and the query starts to slow to a crawl. Of course my fears are totally without substance, but I'd rather spend more time thinking about the design now than undoing a bad design mistake in the future.

Thanks once again for your input!

Kind Regards,

Neil.

On 10/21/05, Michael Glaesemann <grzm@myrealbox.com> wrote:
> I strongly suggest *not* making a column for each month, but rather a
> row for each month--or even a row for each page view, for greater
> granularity. [...]
> Depending on your performance needs, you may want to generate an
> interim table that pre-calculates your totals per month. [...]
Following the advice given in this thread, I thought I'd just post back what I came up with, for the benefit of interested parties.

After a bit of searching I found a little program called pgLogd (http://www.digitalstratum.com/pglogd/), which is a lightweight daemon that saves Apache access logs to a PostgreSQL table.

In order to generate statistics for property views (a single page, passed a GET variable), I will use an intermediate table, as suggested by Daniel, which will be updated via a script cronned to run every night. With appropriate indexes on the date field in the primary log table, speed shouldn't be significantly affected as the table grows. This also has the advantage of retaining all the raw data to allow for wider offline analysis, to assist with site trend analysis etc. The 'filtered' data table will stay lean and mean, containing only the number of hits per property per month.

Kind Regards,

Neil Saunders.

On 10/25/05, Neil Saunders <n.j.saunders@gmail.com> wrote:
> I'll get on to installing crosstab tonight. I really like the idea of
> dumping each page view to a table and creating a view to present that
> data, but since this data will be used to render a graph every time a
> user logs in, I've got this feeling that it will prove a bottleneck
> if (when!!) the site becomes successful and the query starts to slow
> to a crawl. [...]
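[A minimal sketch of the kind of nightly rollup the cronned script might run, reusing the illustrative property_monthly_hits table sketched earlier in the thread. The access_log table and its columns are assumptions standing in for whatever pgLogd actually writes, not its real schema; in practice the property id would be parsed out of the logged request URL.]

-- Hypothetical log table written by the logging daemon:
-- create table access_log (property_id integer, request_time timestamp with time zone);

-- Re-aggregate the previous calendar month into the lean summary table.
delete from property_monthly_hits
where view_month = date_trunc('month', now() - interval '1 month')::date;

insert into property_monthly_hits (property_id, view_month, hits)
select property_id
     , date_trunc('month', request_time)::date
     , count(*)
from access_log
where request_time >= date_trunc('month', now() - interval '1 month')
  and request_time <  date_trunc('month', now())
group by 1, 2;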