Re: SQL (Venn diagram type of logic) - Mailing list pgsql-sql

From Aaron Bono
Subject Re: SQL (Venn diagram type of logic)
Date
Msg-id bf05e51c0607121237m2feafe04ue699bd2841c0026a@mail.gmail.com
Whole thread Raw
In response to Re: SQL (Venn diagram type of logic)  ("Aaron Bono" <postgresql@aranya.com>)
List pgsql-sql
On 7/11/06, Vinnie Ma <nyubambam@yahoo.com> wrote:
Hello Aaron,

Thank you for the help.  My apologies for the direct
email but i couldn't figure out how to reply to the
thread on the postresql site.

 Just doing a reply to all should send it to the mailing list but the from address must match the email address you registered on the list.

Once i get the purchases seperated by categories, I
plan to track the month on month status of each
account.

For example, in may 2006, based on her previous
purchaes, customer1 was in category A but in june
2006, she bought something in category B.  This would
make her in bucket 2.

bucket 1: A
bucket 2: A&B
bucket 3: A&C
bucket 4: B
bucket 5: B&C
bucket 6: C
bucket 7: A&B&C

(these are the combinations i was thinking about when
i had the venn diagram in my head.)

To follow with the example above:

In May, she would be counted as part of bucket 1 and
her total spending (life to may) will be part of
running total for bucket 1

In June,  she would be counted as part of bucket 2 and
her total spending (life to june) will be part of
running total for bucket 2

In the end, we will have count(customers),
sum(all_purchases) for each of the 7 buckets for each
month.  From a business perpective, we hope this
exercise will show the general migration patterns of
customers from one bucket to another over time.

I figured out a way to do it but it is no where near
efficient.

i added three boolean fields to the account table to
indicate catA, catB, and catC.  i update the table 3
times for each of the categories, turning on the
boolean fields where applicable.

then i select the count, and sum fields from an inner
join on account and purchase tables for each of the 7
buckets.  it works, but i would manually do it for
each month.

any thoughts would be most appreciated.  thanks and
please have a great day.

What I have done for situations like this is to first create a report schema.  Then I have a daily process run that updates reporting tables every night.  These reporting tables keep full history and we only update the latest information.

The main reason for doing it this way is to:
  1. Remove the need to worry about performance of the query to build the report - it is done once a day during low activity on the server
  2. Segment permissions so people have rights to run reports but not dig through the main database (the tech savy analysts tend to write bad queries and run them on production)
  3. Keep history without having to run the report on everything (my refreshes update this month and last month - this provides enough overlap that I don't have to worry about changing months, leap year, etc.)

Hope this helps!

==================================================================
   Aaron Bono
   Aranya Software Technologies, Inc.
   http://www.aranya.com
==================================================================

pgsql-sql by date:

Previous
From: "Aaron Bono"
Date:
Subject: Re: Avoiding RI failure with INSERT ... SELECT FROM
Next
From: "Aaron Bono"
Date:
Subject: Re: Can function results be used in WHERE?