On Mar 3, 2005, at 8:34 AM, Dracula 007 wrote:
>
>> It looks like you are going to always do a sequential scan on the
>> tables, as you always look at the entire table(s). How often do you
>> do the query as compared to the load on the database? If you do the
>> query often relative to the load, could you keep a table of counts
>> something like:
>> create table summarize_use (
>> action_date date,
>> sessions_count int,
>> visitors_count int,
>> isp_count int)
>> and then use triggers on the sessions and actions tables to increment
>> the various counts in the summarize_use table based on the action_date
>> and session_date date parts? The summarize_use table would then look
>> like:
>
> I can't exactly predict how often these queries will be executed -
> this is an example of one of many different queries that are executed
> over these two tables (in these two tables the main part of statistics
> of our web application is stored). And for every query there can be
> different parameter values etc.
>
> It definitely will be less than 1% of all queries run on the server,
> but the problem is that running it consumes most of the CPU, so all
> the other queries have to wait and time out. And we don't want the
> visitors on the web to wait ...
>
> As a temporary solution we build all the possible results once a week
> (every Monday at 00:00) and store them in separate tables, so it's
> pretty fast to view, but it takes about 3 hours to rebuild all the
> possible stats (total, by month, by day, by week) for the different
> views (pages, articles, visitors, etc.). I still hope I'll be able to
> speed it up somehow.
>
> The solution using triggers looks nice, I'll try that and it
> probably will work, but I can't predict how complicated it will be to
> log all the interesting stats.
>
> t.v.
>
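For what it's worth, the trigger idea quoted above might be sketched
roughly as below in plpgsql. This is only a sketch for one counter: the
table and column names (sessions, session_date) are my guesses, not
taken from your schema, and you'd need a similar trigger per counter.

```sql
-- Sketch only: "sessions" and "session_date" are assumed names.
-- Bumps the per-day session count in summarize_use on each insert.
CREATE OR REPLACE FUNCTION bump_session_count() RETURNS trigger AS $$
BEGIN
    UPDATE summarize_use
       SET sessions_count = sessions_count + 1
     WHERE action_date = NEW.session_date::date;
    IF NOT FOUND THEN
        -- First event for that day: create the row.
        INSERT INTO summarize_use
            (action_date, sessions_count, visitors_count, isp_count)
        VALUES (NEW.session_date::date, 1, 0, 0);
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER sessions_count_trig
    AFTER INSERT ON sessions
    FOR EACH ROW EXECUTE PROCEDURE bump_session_count();
```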
If your group by is at its finest grain only daily, then the most you
will run these queries is daily, correct? Could you try some of your
queries doing:
select BLAH,BLAH,BLAH from BLAH where action_year=2005 and action_day=3
and action_month=3
after building indices on those columns? If this is fast (which it very
well may be), then you can simply run that set of queries daily and
insert the results into your "big stats" table, as presumably all other
data in the "big stats" table is static once the date has passed. That
would save you writing the triggers, which could be complicated from
what you are saying....
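A rough sketch of what I mean, with assumed names (actions, visitor_id,
big_stats) standing in for the BLAHs above:

```sql
-- Sketch: one multicolumn index covers the year/month/day filter.
-- "actions", "visitor_id" and "big_stats" are assumed names.
CREATE INDEX actions_ymd_idx
    ON actions (action_year, action_month, action_day);

-- Run once per day, rolling yesterday's rows into the stats table.
INSERT INTO big_stats (action_date, page_views, visitors)
SELECT date '2005-03-03', count(*), count(DISTINCT visitor_id)
  FROM actions
 WHERE action_year = 2005 AND action_month = 3 AND action_day = 3;
```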
Sean