Re: Summing & Grouping in a Hierarchical Structure - Mailing list pgsql-sql

From Don Parris
Subject Re: Summing & Grouping in a Hierarchical Structure
Date
Msg-id CAJ-7yo=6dSoydFCEJo_9LSdUuoA1s68SFSwxzbwGh0oAKk0whw@mail.gmail.com
Whole thread Raw
In response to Re: Summing & Grouping in a Hierarchical Structure  (Don Parris <parrisdc@gmail.com>)
Responses Re: Summing & Grouping in a Hierarchical Structure
List pgsql-sql
Ok, I managed to accomplish my goal with the ltree:

test_ltree=> SELECT subltree(path,0,2), sum(trans_amt) FROM testcat GROUP BY subltree;
      subltree      |  sum  
--------------------+--------
 TOP.Groceries      | 325.00
 TOP.Transportation | 180.00
(2 rows)


It took a bit of experimenting with the various functions of ltree, but I got what I wanted.

What I did was keep the ltree path column in the same table as the line-item amount, but this requires building the entire path statement in every row.  That's ok for a quick test to figure out how ltree works, but I think I do need to keep the category structure in a separate table:

Category (category_id, path)

Anyway, thanks for suggesting ltree.


On Fri, Feb 22, 2013 at 7:15 PM, Don Parris <parrisdc@gmail.com> wrote:
Hi Misa,

I decided to try out ltree, and have made some progress with it.  If I understand correctly how to use it, I simply insert the 'path' column into my table, using ltree as the data type.  That eliminates the need for a category table, if I understand correctly.  I just need to ensure the category path is correct for each line item in the transaction details table.

However, I have difficulty figuring out how to sum the amounts as I would like:
test_ltree=> SELECT path, SUM(trans_amt) FROM testcat WHERE path ~ 'TOP.*{2}' GROUP BY path;
 path | sum
------+-----
(0 rows)


This works fine:
test_ltree=> SELECT path, trans_amt FROM testcat;
                  path                   | trans_amt
-----------------------------------------+-----------
 TOP.Transportation.Auto.Fuel            |     50.00
 TOP.Transportation.Auto.Maintenance     |     30.00
 TOP.Transportation.Auto.Fuel            |     25.00
 TOP.Transportation.Bicycle.Gear         |     40.00
 TOP.Transportation.Bicycle.Gear         |     20.00
 TOP.Transportation.Fares.Bus            |     10.00
 TOP.Transportation.Fares.Train          |      5.00
 TOP.Groceries.Food.Beverages            |     30.00
 TOP.Groceries.Food.Fruit_Veggies        |     40.00
 TOP.Groceries.Food.Meat_Fish            |     80.00
 TOP.Groceries.Food.Grains_Cereals       |     30.00
 TOP.Groceries.Beverages.Alcohol.Beer    |     25.00
 TOP.Groceries.Beverages.Alcohol.Spirits |     10.00
 TOP.Groceries.Beverages.Alcohol.Wine    |     50.00
 TOP.Groceries.Beverages.Juice           |     45.00
 TOP.Groceries.Beverages.Other           |     15.00
(16 rows)


But I can do that in my sleep with any regular query.  This also works great (sum the trans_amt column at level 3 (not counting "TOP"):
test_ltree=> SELECT path, SUM(trans_amt) FROM testcat WHERE path ~ 'TOP.*{3}' GROUP BY path;
                path                 |  sum 
-------------------------------------+-------
 TOP.Groceries.Beverages.Juice       | 45.00
 TOP.Groceries.Beverages.Other       | 15.00
 TOP.Groceries.Food.Beverages        | 30.00
 TOP.Groceries.Food.Fruit_Veggies    | 40.00
 TOP.Groceries.Food.Grains_Cereals   | 30.00
 TOP.Groceries.Food.Meat_Fish        | 80.00
 TOP.Transportation.Auto.Fuel        | 75.00
 TOP.Transportation.Auto.Maintenance | 30.00
 TOP.Transportation.Bicycle.Gear     | 60.00
 TOP.Transportation.Fares.Bus        | 10.00
 TOP.Transportation.Fares.Train      |  5.00
(11 rows)

So if I want to see:
TOP.Groceries        | 240.00
TOP.Transportation | 180.00

How do I get this?  Can you help?

I am running PostgreSQL 9.1 on Kubuntu 12.10, in case that makes any difference.



On Fri, Feb 22, 2013 at 4:00 AM, Misa Simic <misa.simic@gmail.com> wrote:
Hi,


Have you considered maybe ltree datatype?


I think it solves a lot of problems in topic....

Kind regards,

Misa


On Friday, February 15, 2013, Don Parris wrote:
Hi all,

I posted to this list some time ago about working with a hierarchical category structure.   I had great difficulty with my problem and gave up for a time.  I recently returned to it and resolved a big part of it.  I have one step left to go, but at least I have solved this part.

Here is the original thread (or one of them):
http://www.postgresql.org/message-id/CAJ-7yonw4_qDCp-ZNYwEkR2jdLKeL8nfGc+-TLLSW=Rmo1VkbA@mail.gmail.com

Here is my recent blog post about how I managed to show my expenses summed and grouped by a mid-level category:
http://dcparris.net/2013/02/13/hierarchical-categories-rdbms/


Specifically, I wanted to sum and group expenses according to categories, not just at the bottom tier, but at higher tiers, so as to show more summarized information.  A CEO primarily wants to know the sum total for all the business units, yet have the ability to drill down to more detailed levels if something is unusually high or low.  In my case, I could see the details, but not the summary.  Well now I can summarize by what I refer to as the 2nd-level categories.

Anyway, I hope this helps someone, as I have come to appreciate - and I mean really appreciate - the challenge of working with hierarchical structures in a 2-dimensional RDBMS.  If anyone sees something I should explain better or in more depth, please let me know.

Regards,
Don
--
D.C. Parris, FMP, Linux+, ESL Certificate
Minister, Security/FM Coordinator, Free Software Advocate
GPG Key ID: F5E179BE



--
D.C. Parris, FMP, Linux+, ESL Certificate
Minister, Security/FM Coordinator, Free Software Advocate
GPG Key ID: F5E179BE



--
D.C. Parris, FMP, Linux+, ESL Certificate
Minister, Security/FM Coordinator, Free Software Advocate
GPG Key ID: F5E179BE

pgsql-sql by date:

Previous
From: Don Parris
Date:
Subject: Re: Summing & Grouping in a Hierarchical Structure
Next
From: Bryan L Nuse
Date:
Subject: Re: Summing & Grouping in a Hierarchical Structure