Thread: Summing & Grouping in a Hierarchical Structure
Here is the original thread (or one of them):
http://www.postgresql.org/message-id/CAJ-7yonw4_qDCp-ZNYwEkR2jdLKeL8nfGc+-TLLSW=Rmo1VkbA@mail.gmail.com
http://dcparris.net/2013/02/13/hierarchical-categories-rdbms/
Specifically, I wanted to sum and group expenses according to categories, not just at the bottom tier, but at higher tiers, so as to show more summarized information. A CEO primarily wants to know the sum total for all the business units, yet have the ability to drill down to more detailed levels if something is unusually high or low. In my case, I could see the details, but not the summary. Well now I can summarize by what I refer to as the 2nd-level categories.
D.C. Parris, FMP, Linux+, ESL Certificate
Minister, Security/FM Coordinator, Free Software Advocate
I would try a recursive query to determine the category structure and aggregate as you go. I had a similar problem with a hierarchical structure for an organization structure. Another thing you might try is to create a separate CTE for each category and then aggregate the individual CTEs.
From: pgsql-sql-owner@postgresql.org [mailto:pgsql-sql-owner@postgresql.org] On Behalf Of Don Parris
Sent: Thursday, February 14, 2013 7:58 PM
To: pgsql-sql@postgresql.org
Subject: [SQL] Summing & Grouping in a Hierarchical Structure
Hi all,
I posted to this list some time ago about working with a hierarchical category structure. I had great difficulty with my problem and gave up for a time. I recently returned to it and resolved a big part of it. I have one step left to go, but at least I have solved this part.
Here is the original thread (or one of them):
http://www.postgresql.org/message-id/CAJ-7yonw4_qDCp-ZNYwEkR2jdLKeL8nfGc+-TLLSW=Rmo1VkbA@mail.gmail.com
Here is my recent blog post about how I managed to show my expenses summed and grouped by a mid-level category:
http://dcparris.net/2013/02/13/hierarchical-categories-rdbms/
Specifically, I wanted to sum and group expenses according to categories, not just at the bottom tier, but at higher tiers, so as to show more summarized information. A CEO primarily wants to know the sum total for all the business units, yet have the ability to drill down to more detailed levels if something is unusually high or low. In my case, I could see the details, but not the summary. Well now I can summarize by what I refer to as the 2nd-level categories.
Anyway, I hope this helps someone, as I have come to appreciate - and I mean really appreciate - the challenge of working with hierarchical structures in a 2-dimensional RDBMS. If anyone sees something I should explain better or in more depth, please let me know.
Regards,
Don
--
D.C. Parris, FMP, Linux+, ESL Certificate
Minister, Security/FM Coordinator, Free Software Advocate
GPG Key ID: F5E179BE
> From: pgsql-sql-owner@postgresql.org [mailto:pgsql-sql-owner@postgresql.org] On Behalf Of Don Parris > Sent: Thursday, February 14, 2013 8:58 PM > To: pgsql-sql@postgresql.org > Subject: [SQL] Summing & Grouping in a Hierarchical Structure > > Hi all, > I posted to this list some time ago about working with a hierarchical category structure. I had great difficulty withmy problem and gave up for a time. > I recently returned to it and resolved a big part of it. I have one step left to go, but at least I have solved this part. > > Here is the original thread (or one of them): > http://www.postgresql.org/message-id/CAJ-7yonw4_qDCp-ZNYwEkR2jdLKeL8nfGc+-TLLSW=Rmo1VkbA@mail.gmail.com > > > Here is my recent blog post about how I managed to show my expenses summed and grouped by a mid-level category: > http://dcparris.net/2013/02/13/hierarchical-categories-rdbms/ > > > Specifically, I wanted to sum and group expenses according to categories, not just at the bottom tier, but at higher tiers,so as to show more summarized information. > A CEO primarily wants to know the sum total for all the business units, yet have the ability to drill down to more detailedlevels if something is unusually high or low. > In my case, I could see the details, but not the summary. Well now I can summarize by what I refer to as the 2nd-levelcategories. > Anyway, I hope this helps someone, as I have come to appreciate - and I mean really appreciate - the challenge of workingwith hierarchical structures in a 2-dimensional RDBMS. > If anyone sees something I should explain better or in more depth, please let me know. > > Regards, > Don > -- > D.C. Parris, FMP, Linux+, ESL Certificate > Minister, Security/FM Coordinator, Free Software Advocate > http://dcparris.net/ > GPG Key ID: F5E179BE My two cents would be to actually use a different tool for the job of presenting this data. I'd have used a pivot tablein Microsoft Excel. Not sure what your environment or requirements are but pivot tables are widely used in business,easy to share, can be formatted, and give the user the ability to drill down and navigate to the data they wantto see. I'd set up a query to pull the raw data you need with all of the categories and associated data you need. Then bring thatdata to Excel to present and summarize it. Mike
I would try a recursive query to determine the category structure and aggregate as you go. I had a similar problem with a hierarchical structure for an organization structure. Another thing you might try is to create a separate CTE for each category and then aggregate the individual CTEs.
From: pgsql-sql-owner@postgresql.org [mailto:pgsql-sql-owner@postgresql.org] On Behalf Of Don Parris
Sent: Thursday, February 14, 2013 7:58 PM
To: pgsql-sql@postgresql.org
Subject: [SQL] Summing & Grouping in a Hierarchical Structure
Hi all,
I posted to this list some time ago about working with a hierarchical category structure. I had great difficulty with my problem and gave up for a time. I recently returned to it and resolved a big part of it. I have one step left to go, but at least I have solved this part.
Here is the original thread (or one of them):
http://www.postgresql.org/message-id/CAJ-7yonw4_qDCp-ZNYwEkR2jdLKeL8nfGc+-TLLSW=Rmo1VkbA@mail.gmail.com
Here is my recent blog post about how I managed to show my expenses summed and grouped by a mid-level category:
http://dcparris.net/2013/02/13/hierarchical-categories-rdbms/
Specifically, I wanted to sum and group expenses according to categories, not just at the bottom tier, but at higher tiers, so as to show more summarized information. A CEO primarily wants to know the sum total for all the business units, yet have the ability to drill down to more detailed levels if something is unusually high or low. In my case, I could see the details, but not the summary. Well now I can summarize by what I refer to as the 2nd-level categories.Anyway, I hope this helps someone, as I have come to appreciate - and I mean really appreciate - the challenge of working with hierarchical structures in a 2-dimensional RDBMS. If anyone sees something I should explain better or in more depth, please let me know.
Regards,
Don
--
D.C. Parris, FMP, Linux+, ESL Certificate
Minister, Security/FM Coordinator, Free Software AdvocateGPG Key ID: F5E179BE
--
D.C. Parris, FMP, Linux+, ESL Certificate
Minister, Security/FM Coordinator, Free Software Advocate
> From: pgsql-sql-owner@postgresql.org [mailto:pgsql-sql-owner@postgresql.org] On Behalf Of Don Parris
> Sent: Thursday, February 14, 2013 8:58 PM
> To: pgsql-sql@postgresql.org
> Subject: [SQL] Summing & Grouping in a Hierarchical Structure
>My two cents would be to actually use a different tool for the job of presenting this data. I'd have used a pivot table in Microsoft Excel. Not sure what your environment or requirements are but pivot tables are widely used in business, easy to share, can be formatted, and give the user the ability to drill down and navigate to the data they want to see.> Hi all,
> I posted to this list some time ago about working with a hierarchical category structure. I had great difficulty with my problem and gave up for a time.
> I recently returned to it and resolved a big part of it. I have one step left to go, but at least I have solved this part.
>
> Here is the original thread (or one of them):
> http://www.postgresql.org/message-id/CAJ-7yonw4_qDCp-ZNYwEkR2jdLKeL8nfGc+-TLLSW=Rmo1VkbA@mail.gmail.com
>
>
> Here is my recent blog post about how I managed to show my expenses summed and grouped by a mid-level category:
> http://dcparris.net/2013/02/13/hierarchical-categories-rdbms/
>
>
> Specifically, I wanted to sum and group expenses according to categories, not just at the bottom tier, but at higher tiers, so as to show more summarized information.
> A CEO primarily wants to know the sum total for all the business units, yet have the ability to drill down to more detailed levels if something is unusually high or low.
> In my case, I could see the details, but not the summary. Well now I can summarize by what I refer to as the 2nd-level categories.
> Anyway, I hope this helps someone, as I have come to appreciate - and I mean really appreciate - the challenge of working with hierarchical structures in a 2-dimensional RDBMS.
> If anyone sees something I should explain better or in more depth, please let me know.
>
> Regards,
> Don
> --
> D.C. Parris, FMP, Linux+, ESL Certificate
> Minister, Security/FM Coordinator, Free Software Advocate
> http://dcparris.net/
> GPG Key ID: F5E179BE
I'd set up a query to pull the raw data you need with all of the categories and associated data you need. Then bring that data to Excel to present and summarize it.
Mike
--
D.C. Parris, FMP, Linux+, ESL Certificate
Minister, Security/FM Coordinator, Free Software Advocate
I would use the recursive CTE to gather the hierarchical portion of the data you need and then join that CTE to another table or CTE with the other data you need. I had a situation like this at my job were organization info was in a hierarchal table and I needed to join it to two other tables. I created a CTE with the combined data from the non-hierarchical tables and left joined it to the recursive CTE.
If you’re having trouble with this, I suggest looking into CTEs and the different types of joins.
From: pgsql-sql-owner@postgresql.org [mailto:pgsql-sql-owner@postgresql.org] On Behalf Of Don Parris
Sent: Thursday, February 21, 2013 4:38 AM
To: pgsql-sql@postgresql.org
Subject: Re: [SQL] Summing & Grouping in a Hierarchical Structure
Hi Alexander,
I appreciate you taking time to reply to my post. I like the idea of the WITH RECURSIVE query, but... The two examples in the link you offered are not so helpful to me. For example, the initial WITH query shown uses a single table, and I wander how that might apply in my case, where the relevant information is actually found in two tables, one of them a recursive table.
The second example, which applies the WITH RECURSIVE clause, is even less so. I wonder if there is a good tutorial somewhere on this that shows some other examples? That might help me catch on a little better. I'll search for that today.
On Thu, Feb 14, 2013 at 11:30 PM, Alexander Gataric <gataric@usa.net> wrote:
I would try a recursive query to determine the category structure and aggregate as you go. I had a similar problem with a hierarchical structure for an organization structure. Another thing you might try is to create a separate CTE for each category and then aggregate the individual CTEs.
From: pgsql-sql-owner@postgresql.org [mailto:pgsql-sql-owner@postgresql.org] On Behalf Of Don Parris
Sent: Thursday, February 14, 2013 7:58 PM
To: pgsql-sql@postgresql.org
Subject: [SQL] Summing & Grouping in a Hierarchical Structure
Hi all,
I posted to this list some time ago about working with a hierarchical category structure. I had great difficulty with my problem and gave up for a time. I recently returned to it and resolved a big part of it. I have one step left to go, but at least I have solved this part.
Here is the original thread (or one of them):
http://www.postgresql.org/message-id/CAJ-7yonw4_qDCp-ZNYwEkR2jdLKeL8nfGc+-TLLSW=Rmo1VkbA@mail.gmail.com
Here is my recent blog post about how I managed to show my expenses summed and grouped by a mid-level category:
http://dcparris.net/2013/02/13/hierarchical-categories-rdbms/
Specifically, I wanted to sum and group expenses according to categories, not just at the bottom tier, but at higher tiers, so as to show more summarized information. A CEO primarily wants to know the sum total for all the business units, yet have the ability to drill down to more detailed levels if something is unusually high or low. In my case, I could see the details, but not the summary. Well now I can summarize by what I refer to as the 2nd-level categories.
Anyway, I hope this helps someone, as I have come to appreciate - and I mean really appreciate - the challenge of working with hierarchical structures in a 2-dimensional RDBMS. If anyone sees something I should explain better or in more depth, please let me know.
Regards,
Don
--
D.C. Parris, FMP, Linux+, ESL Certificate
Minister, Security/FM Coordinator, Free Software Advocate
GPG Key ID: F5E179BE
--
D.C. Parris, FMP, Linux+, ESL Certificate
Minister, Security/FM Coordinator, Free Software Advocate
GPG Key ID: F5E179BE
On Friday, February 15, 2013, Don Parris wrote:
Hi all,I posted to this list some time ago about working with a hierarchical category structure. I had great difficulty with my problem and gave up for a time. I recently returned to it and resolved a big part of it. I have one step left to go, but at least I have solved this part.
Here is the original thread (or one of them):
http://www.postgresql.org/message-id/CAJ-7yonw4_qDCp-ZNYwEkR2jdLKeL8nfGc+-TLLSW=Rmo1VkbA@mail.gmail.comHere is my recent blog post about how I managed to show my expenses summed and grouped by a mid-level category:
http://dcparris.net/2013/02/13/hierarchical-categories-rdbms/
Specifically, I wanted to sum and group expenses according to categories, not just at the bottom tier, but at higher tiers, so as to show more summarized information. A CEO primarily wants to know the sum total for all the business units, yet have the ability to drill down to more detailed levels if something is unusually high or low. In my case, I could see the details, but not the summary. Well now I can summarize by what I refer to as the 2nd-level categories.Anyway, I hope this helps someone, as I have come to appreciate - and I mean really appreciate - the challenge of working with hierarchical structures in a 2-dimensional RDBMS. If anyone sees something I should explain better or in more depth, please let me know.Regards,Don--
D.C. Parris, FMP, Linux+, ESL Certificate
Minister, Security/FM Coordinator, Free Software AdvocateGPG Key ID: F5E179BE
However, I have difficulty figuring out how to sum the amounts as I would like:
test_ltree=> SELECT path, SUM(trans_amt) FROM testcat WHERE path ~ 'TOP.*{2}' GROUP BY path;
path | sum
------+-----
(0 rows)
This works fine:
test_ltree=> SELECT path, trans_amt FROM testcat;
path | trans_amt
-----------------------------------------+-----------
TOP.Transportation.Auto.Fuel | 50.00
TOP.Transportation.Auto.Maintenance | 30.00
TOP.Transportation.Auto.Fuel | 25.00
TOP.Transportation.Bicycle.Gear | 40.00
TOP.Transportation.Bicycle.Gear | 20.00
TOP.Transportation.Fares.Bus | 10.00
TOP.Transportation.Fares.Train | 5.00
TOP.Groceries.Food.Beverages | 30.00
TOP.Groceries.Food.Fruit_Veggies | 40.00
TOP.Groceries.Food.Meat_Fish | 80.00
TOP.Groceries.Food.Grains_Cereals | 30.00
TOP.Groceries.Beverages.Alcohol.Beer | 25.00
TOP.Groceries.Beverages.Alcohol.Spirits | 10.00
TOP.Groceries.Beverages.Alcohol.Wine | 50.00
TOP.Groceries.Beverages.Juice | 45.00
TOP.Groceries.Beverages.Other | 15.00
(16 rows)
test_ltree=> SELECT path, SUM(trans_amt) FROM testcat WHERE path ~ 'TOP.*{3}' GROUP BY path;
path | sum
-------------------------------------+-------
TOP.Groceries.Beverages.Juice | 45.00
TOP.Groceries.Beverages.Other | 15.00
TOP.Groceries.Food.Beverages | 30.00
TOP.Groceries.Food.Fruit_Veggies | 40.00
TOP.Groceries.Food.Grains_Cereals | 30.00
TOP.Groceries.Food.Meat_Fish | 80.00
TOP.Transportation.Auto.Fuel | 75.00
TOP.Transportation.Auto.Maintenance | 30.00
TOP.Transportation.Bicycle.Gear | 60.00
TOP.Transportation.Fares.Bus | 10.00
TOP.Transportation.Fares.Train | 5.00
(11 rows)
Hi,Have you considered maybe ltree datatype?I think it solves a lot of problems in topic....Kind regards,Misa
On Friday, February 15, 2013, Don Parris wrote:Hi all,I posted to this list some time ago about working with a hierarchical category structure. I had great difficulty with my problem and gave up for a time. I recently returned to it and resolved a big part of it. I have one step left to go, but at least I have solved this part.
Here is the original thread (or one of them):
http://www.postgresql.org/message-id/CAJ-7yonw4_qDCp-ZNYwEkR2jdLKeL8nfGc+-TLLSW=Rmo1VkbA@mail.gmail.comHere is my recent blog post about how I managed to show my expenses summed and grouped by a mid-level category:
http://dcparris.net/2013/02/13/hierarchical-categories-rdbms/
Specifically, I wanted to sum and group expenses according to categories, not just at the bottom tier, but at higher tiers, so as to show more summarized information. A CEO primarily wants to know the sum total for all the business units, yet have the ability to drill down to more detailed levels if something is unusually high or low. In my case, I could see the details, but not the summary. Well now I can summarize by what I refer to as the 2nd-level categories.Anyway, I hope this helps someone, as I have come to appreciate - and I mean really appreciate - the challenge of working with hierarchical structures in a 2-dimensional RDBMS. If anyone sees something I should explain better or in more depth, please let me know.Regards,Don--
D.C. Parris, FMP, Linux+, ESL Certificate
Minister, Security/FM Coordinator, Free Software AdvocateGPG Key ID: F5E179BE
--
D.C. Parris, FMP, Linux+, ESL Certificate
Minister, Security/FM Coordinator, Free Software Advocate
test_ltree=> SELECT subltree(path,0,2), sum(trans_amt) FROM testcat GROUP BY subltree;
subltree | sum
--------------------+--------
TOP.Groceries | 325.00
TOP.Transportation | 180.00
(2 rows)
It took a bit of experimenting with the various functions of ltree, but I got what I wanted.
I am running PostgreSQL 9.1 on Kubuntu 12.10, in case that makes any difference.How do I get this? Can you help?TOP.Transportation | 180.00TOP.Groceries | 240.00So if I want to see:But I can do that in my sleep with any regular query. This also works great (sum the trans_amt column at level 3 (not counting "TOP"):Hi Misa,I decided to try out ltree, and have made some progress with it. If I understand correctly how to use it, I simply insert the 'path' column into my table, using ltree as the data type. That eliminates the need for a category table, if I understand correctly. I just need to ensure the category path is correct for each line item in the transaction details table.
However, I have difficulty figuring out how to sum the amounts as I would like:
test_ltree=> SELECT path, SUM(trans_amt) FROM testcat WHERE path ~ 'TOP.*{2}' GROUP BY path;
path | sum
------+-----
(0 rows)
This works fine:
test_ltree=> SELECT path, trans_amt FROM testcat;
path | trans_amt
-----------------------------------------+-----------
TOP.Transportation.Auto.Fuel | 50.00
TOP.Transportation.Auto.Maintenance | 30.00
TOP.Transportation.Auto.Fuel | 25.00
TOP.Transportation.Bicycle.Gear | 40.00
TOP.Transportation.Bicycle.Gear | 20.00
TOP.Transportation.Fares.Bus | 10.00
TOP.Transportation.Fares.Train | 5.00
TOP.Groceries.Food.Beverages | 30.00
TOP.Groceries.Food.Fruit_Veggies | 40.00
TOP.Groceries.Food.Meat_Fish | 80.00
TOP.Groceries.Food.Grains_Cereals | 30.00
TOP.Groceries.Beverages.Alcohol.Beer | 25.00
TOP.Groceries.Beverages.Alcohol.Spirits | 10.00
TOP.Groceries.Beverages.Alcohol.Wine | 50.00
TOP.Groceries.Beverages.Juice | 45.00
TOP.Groceries.Beverages.Other | 15.00
(16 rows)
test_ltree=> SELECT path, SUM(trans_amt) FROM testcat WHERE path ~ 'TOP.*{3}' GROUP BY path;
path | sum
-------------------------------------+-------
TOP.Groceries.Beverages.Juice | 45.00
TOP.Groceries.Beverages.Other | 15.00
TOP.Groceries.Food.Beverages | 30.00
TOP.Groceries.Food.Fruit_Veggies | 40.00
TOP.Groceries.Food.Grains_Cereals | 30.00
TOP.Groceries.Food.Meat_Fish | 80.00
TOP.Transportation.Auto.Fuel | 75.00
TOP.Transportation.Auto.Maintenance | 30.00
TOP.Transportation.Bicycle.Gear | 60.00
TOP.Transportation.Fares.Bus | 10.00
TOP.Transportation.Fares.Train | 5.00
(11 rows)On Fri, Feb 22, 2013 at 4:00 AM, Misa Simic <misa.simic@gmail.com> wrote:Hi,Have you considered maybe ltree datatype?I think it solves a lot of problems in topic....Kind regards,Misa
On Friday, February 15, 2013, Don Parris wrote:Hi all,I posted to this list some time ago about working with a hierarchical category structure. I had great difficulty with my problem and gave up for a time. I recently returned to it and resolved a big part of it. I have one step left to go, but at least I have solved this part.
Here is the original thread (or one of them):
http://www.postgresql.org/message-id/CAJ-7yonw4_qDCp-ZNYwEkR2jdLKeL8nfGc+-TLLSW=Rmo1VkbA@mail.gmail.comHere is my recent blog post about how I managed to show my expenses summed and grouped by a mid-level category:
http://dcparris.net/2013/02/13/hierarchical-categories-rdbms/
Specifically, I wanted to sum and group expenses according to categories, not just at the bottom tier, but at higher tiers, so as to show more summarized information. A CEO primarily wants to know the sum total for all the business units, yet have the ability to drill down to more detailed levels if something is unusually high or low. In my case, I could see the details, but not the summary. Well now I can summarize by what I refer to as the 2nd-level categories.Anyway, I hope this helps someone, as I have come to appreciate - and I mean really appreciate - the challenge of working with hierarchical structures in a 2-dimensional RDBMS. If anyone sees something I should explain better or in more depth, please let me know.Regards,Don--
D.C. Parris, FMP, Linux+, ESL Certificate
Minister, Security/FM Coordinator, Free Software AdvocateGPG Key ID: F5E179BE
--
D.C. Parris, FMP, Linux+, ESL Certificate
Minister, Security/FM Coordinator, Free Software AdvocateGPG Key ID: F5E179BE
--
D.C. Parris, FMP, Linux+, ESL Certificate
Minister, Security/FM Coordinator, Free Software Advocate
TOP.Transportation | 180.00TOP.Groceries | 240.00So if I want to see:
This works fine:
test_ltree=> SELECT path, trans_amt FROM testcat;
path | trans_amt
-----------------------------------------+-----------
TOP.Transportation.Auto.Fuel | 50.00
TOP.Transportation.Auto.Maintenance | 30.00
TOP.Transportation.Auto.Fuel | 25.00
TOP.Transportation.Bicycle.Gear | 40.00
TOP.Transportation.Bicycle.Gear | 20.00
TOP.Transportation.Fares.Bus | 10.00
TOP.Transportation.Fares.Train | 5.00
TOP.Groceries.Food.Beverages | 30.00
TOP.Groceries.Food.Fruit_Veggies | 40.00
TOP.Groceries.Food.Meat_Fish | 80.00
TOP.Groceries.Food.Grains_Cereals | 30.00
TOP.Groceries.Beverages.Alcohol.Beer | 25.00
TOP.Groceries.Beverages.Alcohol.Spirits | 10.00
TOP.Groceries.Beverages.Alcohol.Wine | 50.00
TOP.Groceries.Beverages.Juice | 45.00
TOP.Groceries.Beverages.Other | 15.00
(16 rows)
On Saturday, February 23, 2013, Bryan L Nuse wrote:
TOP.Transportation | 180.00TOP.Groceries | 240.00So if I want to see:
This works fine:
test_ltree=> SELECT path, trans_amt FROM testcat;
path | trans_amt
-----------------------------------------+-----------
TOP.Transportation.Auto.Fuel | 50.00
TOP.Transportation.Auto.Maintenance | 30.00
TOP.Transportation.Auto.Fuel | 25.00
TOP.Transportation.Bicycle.Gear | 40.00
TOP.Transportation.Bicycle.Gear | 20.00
TOP.Transportation.Fares.Bus | 10.00
TOP.Transportation.Fares.Train | 5.00
TOP.Groceries.Food.Beverages | 30.00
TOP.Groceries.Food.Fruit_Veggies | 40.00
TOP.Groceries.Food.Meat_Fish | 80.00
TOP.Groceries.Food.Grains_Cereals | 30.00
TOP.Groceries.Beverages.Alcohol.Beer | 25.00
TOP.Groceries.Beverages.Alcohol.Spirits | 10.00
TOP.Groceries.Beverages.Alcohol.Wine | 50.00
TOP.Groceries.Beverages.Juice | 45.00
TOP.Groceries.Beverages.Other | 15.00
(16 rows)Hello Don,Perhaps I am missing something about what your constraints are, or what you're trying to achieve, but is there any reason you could not use a series of joined tables indicating parent-child relationships? The following example follows that in your previous posts. Note that this approach (as given) will not work if branches stemming from the same node are different lengths. That is, if you have costs associated with "Transportation.Bicycle.Gear", you could not also have a category "Transportation.Bicycle.Gear.Chain_ring". (To add the latter category, you'd have to put costs from the former under something like "Transportation.Bicycle.Gear.General" -- or modify the approach.) However, lengthening the "Alcohol" branches, e.g., by tacking on a level5 table would be easy. Notice that level3 and level4 are not true look-up tables, since they may contain duplicate cat values.If I'm off base, by all means specify just how.Regards,Bryan--------------------------------------------------CREATE TABLE level1 (cat text PRIMARY KEY);CREATE TABLE level2 (cat text PRIMARY KEY,parent text REFERENCES level1(cat));CREATE TABLE level3 (cat text,parent text REFERENCES level2(cat),cost numeric(6,2));CREATE TABLE level4 (cat text,parent text,cost numeric(6,2));INSERT INTO level1VALUES ('Transportation'),('Groceries');INSERT INTO level2VALUES ('Auto', 'Transportation'),('Bicycle', 'Transportation'),('Fares', 'Transportation'),('Food', 'Groceries'),('Beverages', 'Groceries');INSERT INTO level3VALUES ('Fuel', 'Auto', 50.00),('Maintenance', 'Auto', 30.00),('Fuel', 'Auto', 25.00),('Gear', 'Bicycle', 40.00),('Gear', 'Bicycle', 20.00),('Bus', 'Fares', 10.00),('Train', 'Fares', 5.00),('Beverages', 'Food', 30.00),('Fruit_Veg', 'Food', 40.00),('Meat_Fish', 'Food', 80.00),('Grains_Cereals', 'Food', 30.00),('Alcohol', 'Beverages', NULL),('Juice', 'Beverages', 45.00),('Other', 'Beverages', 15.00);INSERT INTO level4VALUES ('Beer', 'Alcohol', 25.00),('Spirits', 'Alcohol', 10.00),('Wine', 'Alcohol', 50.00);CREATE VIEW all_cats AS (SELECT a.cat AS level4,b.cat AS level3,c.cat AS level2,d.cat AS level1,CASE WHEN a.cost IS NULL THEN 0WHEN a.cost IS NOT NULL THEN a.costEND+ CASE WHEN b.cost IS NULL THEN 0WHEN b.cost IS NOT NULL THEN b.costEND AS costFROM level4 aFULL JOINlevel3 bON (a.parent = b.cat)FULL JOINlevel2 cON (b.parent = c.cat)FULL JOINlevel1 dON (c.parent = d.cat)ORDER BY level1, level2, level3, level4);SELECT * FROM all_cats;level4 | level3 | level2 | level1 | cost---------+----------------+-----------+----------------+-------Beer | Alcohol | Beverages | Groceries | 25.00Spirits | Alcohol | Beverages | Groceries | 10.00Wine | Alcohol | Beverages | Groceries | 50.00| Juice | Beverages | Groceries | 45.00| Other | Beverages | Groceries | 15.00| Beverages | Food | Groceries | 30.00| Fruit_Veg | Food | Groceries | 40.00| Grains_Cereals | Food | Groceries | 30.00| Meat_Fish | Food | Groceries | 80.00| Fuel | Auto | Transportation | 50.00| Fuel | Auto | Transportation | 25.00| Maintenance | Auto | Transportation | 30.00| Gear | Bicycle | Transportation | 20.00| Gear | Bicycle | Transportation | 40.00| Bus | Fares | Transportation | 10.00| Train | Fares | Transportation | 5.00(16 rows)SELECT level1,count(cost) AS num_branches,sum(cost) AS total_costFROM all_catsGROUP BY level1ORDER BY level1;level1 | num_branches | total_cost----------------+--------------+------------Groceries | 9 | 325.00Transportation | 7 | 180.00(2 rows)
<SNIP>Hello Don,Perhaps I am missing something about what your constraints are, or what you're trying to achieve, but is there any reason you could not use a series of joined tables indicating parent-child relationships? The following example follows that in your previous posts. Note that this approach (as given) will not work if branches stemming from the same node are different lengths. That is, if you have costs associated with "Transportation.Bicycle.Gear", you could not also have a category "Transportation.Bicycle.Gear.Chain_ring". (To add the latter category, you'd have to put costs from the former under something like "Transportation.Bicycle.Gear.General" -- or modify the approach.) However, lengthening the "Alcohol" branches, e.g., by tacking on a level5 table would be easy. Notice that level3 and level4 are not true look-up tables, since they may contain duplicate cat values.If I'm off base, by all means specify just how.Regards,Bryan--------------------------------------------------CREATE TABLE level1 (cat text PRIMARY KEY);CREATE TABLE level2 (cat text PRIMARY KEY,parent text REFERENCES level1(cat));CREATE TABLE level3 (cat text,parent text REFERENCES level2(cat),cost numeric(6,2));CREATE TABLE level4 (cat text,parent text,cost numeric(6,2));INSERT INTO level1VALUES ('Transportation'),('Groceries');INSERT INTO level2VALUES ('Auto', 'Transportation'),('Bicycle', 'Transportation'),('Fares', 'Transportation'),('Food', 'Groceries'),('Beverages', 'Groceries');INSERT INTO level3VALUES ('Fuel', 'Auto', 50.00),('Maintenance', 'Auto', 30.00),('Fuel', 'Auto', 25.00),('Gear', 'Bicycle', 40.00),('Gear', 'Bicycle', 20.00),('Bus', 'Fares', 10.00),('Train', 'Fares', 5.00),('Beverages', 'Food', 30.00),('Fruit_Veg', 'Food', 40.00),('Meat_Fish', 'Food', 80.00),('Grains_Cereals', 'Food', 30.00),('Alcohol', 'Beverages', NULL),('Juice', 'Beverages', 45.00),('Other', 'Beverages', 15.00);INSERT INTO level4VALUES ('Beer', 'Alcohol', 25.00),('Spirits', 'Alcohol', 10.00),('Wine', 'Alcohol', 50.00);CREATE VIEW all_cats AS (SELECT a.cat AS level4,b.cat AS level3,c.cat AS level2,d.cat AS level1,CASE WHEN a.cost IS NULL THEN 0WHEN a.cost IS NOT NULL THEN a.costEND+ CASE WHEN b.cost IS NULL THEN 0WHEN b.cost IS NOT NULL THEN b.costEND AS costFROM level4 aFULL JOINlevel3 bON (a.parent = b.cat)FULL JOINlevel2 cON (b.parent = c.cat)FULL JOINlevel1 dON (c.parent = d.cat)ORDER BY level1, level2, level3, level4);SELECT * FROM all_cats;level4 | level3 | level2 | level1 | cost---------+----------------+-----------+----------------+-------Beer | Alcohol | Beverages | Groceries | 25.00Spirits | Alcohol | Beverages | Groceries | 10.00Wine | Alcohol | Beverages | Groceries | 50.00| Juice | Beverages | Groceries | 45.00| Other | Beverages | Groceries | 15.00| Beverages | Food | Groceries | 30.00| Fruit_Veg | Food | Groceries | 40.00| Grains_Cereals | Food | Groceries | 30.00| Meat_Fish | Food | Groceries | 80.00| Fuel | Auto | Transportation | 50.00| Fuel | Auto | Transportation | 25.00| Maintenance | Auto | Transportation | 30.00| Gear | Bicycle | Transportation | 20.00| Gear | Bicycle | Transportation | 40.00| Bus | Fares | Transportation | 10.00| Train | Fares | Transportation | 5.00(16 rows)SELECT level1,count(cost) AS num_branches,sum(cost) AS total_costFROM all_catsGROUP BY level1ORDER BY level1;level1 | num_branches | total_cost----------------+--------------+------------Groceries | 9 | 325.00Transportation | 7 | 180.00(2 rows)
--
D.C. Parris, FMP, Linux+, ESL Certificate
Minister, Security/FM Coordinator, Free Software Advocate
> That said, now that I have finally gotten the chance to try ltree, I think I like it a lot. Hello Don, Yes, after looking at ltree --which I had not done before-- I have to agree with Misa that it looks like the right solutionfor your problem. That is not to say that "brute force" SQL couldn't provide a workable arrangement; but ltree looksvery flexible, especially as it allows you to assign cost values to non-terminal nodes. If it were me, though, I'dstill make use of VIEWs to report results of the workhorse queries: staring at a list of items like "Transportation.Bicycle.Gear.Chain_ring"sounds like headache. That's a matter of taste, of course. Bryan
> That said, now that I have finally gotten the chance to try ltree, I think I like it a lot.Hello Don,
Yes, after looking at ltree --which I had not done before-- I have to agree with Misa that it looks like the right solution for your problem. That is not to say that "brute force" SQL couldn't provide a workable arrangement; but ltree looks very flexible, especially as it allows you to assign cost values to non-terminal nodes. If it were me, though, I'd still make use of VIEWs to report results of the workhorse queries: staring at a list of items like "Transportation.Bicycle.Gear.Chain_ring" sounds like headache. That's a matter of taste, of course.
Bryan
--
D.C. Parris, FMP, Linux+, ESL Certificate
Minister, Security/FM Coordinator, Free Software Advocate
Hi Don,Yes, its better to use it in category table...Bryan, how many levels there will be - we dont know...With one table - and ltree is solved all cases...To add new subcategory user just picks the parent category... So it is easy to add chain ring to gear... As category...In another transaction table is category_id, amount...Don already posted query for sum... In these case just category and transaction table should be joined sum amount, group by functions on lpath....(depending what is the goal...)Kind Regards,Misa
On Saturday, February 23, 2013, Bryan L Nuse wrote:TOP.Transportation | 180.00TOP.Groceries | 240.00So if I want to see:
This works fine:
test_ltree=> SELECT path, trans_amt FROM testcat;
path | trans_amt
-----------------------------------------+-----------
TOP.Transportation.Auto.Fuel | 50.00
TOP.Transportation.Auto.Maintenance | 30.00
TOP.Transportation.Auto.Fuel | 25.00
TOP.Transportation.Bicycle.Gear | 40.00
TOP.Transportation.Bicycle.Gear | 20.00
TOP.Transportation.Fares.Bus | 10.00
TOP.Transportation.Fares.Train | 5.00
TOP.Groceries.Food.Beverages | 30.00
TOP.Groceries.Food.Fruit_Veggies | 40.00
TOP.Groceries.Food.Meat_Fish | 80.00
TOP.Groceries.Food.Grains_Cereals | 30.00
TOP.Groceries.Beverages.Alcohol.Beer | 25.00
TOP.Groceries.Beverages.Alcohol.Spirits | 10.00
TOP.Groceries.Beverages.Alcohol.Wine | 50.00
TOP.Groceries.Beverages.Juice | 45.00
TOP.Groceries.Beverages.Other | 15.00
(16 rows)Hello Don,Perhaps I am missing something about what your constraints are, or what you're trying to achieve, but is there any reason you could not use a series of joined tables indicating parent-child relationships? The following example follows that in your previous posts. Note that this approach (as given) will not work if branches stemming from the same node are different lengths. That is, if you have costs associated with "Transportation.Bicycle.Gear", you could not also have a category "Transportation.Bicycle.Gear.Chain_ring". (To add the latter category, you'd have to put costs from the former under something like "Transportation.Bicycle.Gear.General" -- or modify the approach.) However, lengthening the "Alcohol" branches, e.g., by tacking on a level5 table would be easy. Notice that level3 and level4 are not true look-up tables, since they may contain duplicate cat values.If I'm off base, by all means specify just how.Regards,Bryan--------------------------------------------------CREATE TABLE level1 (cat text PRIMARY KEY);CREATE TABLE level2 (cat text PRIMARY KEY,parent text REFERENCES level1(cat));CREATE TABLE level3 (cat text,parent text REFERENCES level2(cat),cost numeric(6,2));CREATE TABLE level4 (cat text,parent text,cost numeric(6,2));INSERT INTO level1VALUES ('Transportation'),('Groceries');INSERT INTO level2VALUES ('Auto', 'Transportation'),('Bicycle', 'Transportation'),('Fares', 'Transportation'),('Food', 'Groceries'),('Beverages', 'Groceries');INSERT INTO level3VALUES ('Fuel', 'Auto', 50.00),('Maintenance', 'Auto', 30.00),('Fuel', 'Auto', 25.00),('Gear', 'Bicycle', 40.00),('Gear', 'Bicycle', 20.00),('Bus', 'Fares', 10.00),('Train', 'Fares', 5.00),('Beverages', 'Food', 30.00),('Fruit_Veg', 'Food', 40.00),('Meat_Fish', 'Food', 80.00),('Grains_Cereals', 'Food', 30.00),('Alcohol', 'Beverages', NULL),('Juice', 'Beverages', 45.00),('Other', 'Beverages', 15.00);INSERT INTO level4VALUES ('Beer', 'Alcohol', 25.00),('Spirits', 'Alcohol', 10.00),('Wine', 'Alcohol', 50.00);CREATE VIEW all_cats AS (SELECT a.cat AS level4,b.cat AS level3,c.cat AS level2,d.cat AS level1,CASE WHEN a.cost IS NULL THEN 0WHEN a.cost IS NOT NULL THEN a.costEND+ CASE WHEN b.cost IS NULL THEN 0WHEN b.cost IS NOT NULL THEN b.costEND AS costFROM level4 aFULL JOINlevel3 bON (a.parent = b.cat)FULL JOINlevel2 cON (b.parent = c.cat)FULL JOINlevel1 dON (c.parent = d.cat)ORDER BY level1, level2, level3, level4);SELECT * FROM all_cats;level4 | level3 | level2 | level1 | cost---------+----------------+-----------+----------------+-------Beer | Alcohol | Beverages | Groceries | 25.00Spirits | Alcohol | Beverages | Groceries | 10.00Wine | Alcohol | Beverages | Groceries | 50.00| Juice | Beverages | Groceries | 45.00| Other | Beverages | Groceries | 15.00| Beverages | Food | Groceries | 30.00| Fruit_Veg | Food | Groceries | 40.00| Grains_Cereals | Food | Groceries | 30.00| Meat_Fish | Food | Groceries | 80.00| Fuel | Auto | Transportation | 50.00| Fuel | Auto | Transportation | 25.00| Maintenance | Auto | Transportation | 30.00| Gear | Bicycle | Transportation | 20.00| Gear | Bicycle | Transportation | 40.00| Bus | Fares | Transportation | 10.00| Train | Fares | Transportation | 5.00(16 rows)SELECT level1,count(cost) AS num_branches,sum(cost) AS total_costFROM all_catsGROUP BY level1ORDER BY level1;level1 | num_branches | total_cost----------------+--------------+------------Groceries | 9 | 325.00Transportation | 7 | 180.00(2 rows)
--
D.C. Parris, FMP, Linux+, ESL Certificate
Minister, Security/FM Coordinator, Free Software Advocate
On Saturday, February 23, 2013, Don Parris wrote:
Or do the elements of the path have to use underscores and dashes?Misa,Is it possible to use spaces in the ltree path, like so: TOP.Groceries.Food.Herbs & SpicesOn Sat, Feb 23, 2013 at 7:19 AM, Misa Simic <misa.simic@gmail.com> wrote:--Hi Don,Yes, its better to use it in category table...Bryan, how many levels there will be - we dont know...With one table - and ltree is solved all cases...To add new subcategory user just picks the parent category... So it is easy to add chain ring to gear... As category...In another transaction table is category_id, amount...Don already posted query for sum... In these case just category and transaction table should be joined sum amount, group by functions on lpath....(depending what is the goal...)Kind Regards,Misa
On Saturday, February 23, 2013, Bryan L Nuse wrote:TOP.Transportation | 180.00TOP.Groceries | 240.00So if I want to see:
This works fine:
test_ltree=> SELECT path, trans_amt FROM testcat;
path | trans_amt
-----------------------------------------+-----------
TOP.Transportation.Auto.Fuel | 50.00
TOP.Transportation.Auto.Maintenance | 30.00
TOP.Transportation.Auto.Fuel | 25.00
TOP.Transportation.Bicycle.Gear | 40.00
TOP.Transportation.Bicycle.Gear | 20.00
TOP.Transportation.Fares.Bus | 10.00
TOP.Transportation.Fares.Train | 5.00
TOP.Groceries.Food.Beverages | 30.00
TOP.Groceries.Food.Fruit_Veggies | 40.00
TOP.Groceries.Food.Meat_Fish | 80.00
TOP.Groceries.Food.Grains_Cereals | 30.00
TOP.Groceries.Beverages.Alcohol.Beer | 25.00
TOP.Groceries.Beverages.Alcohol.Spirits | 10.00
TOP.Groceries.Beverages.Alcohol.Wine | 50.00
TOP.Groceries.Beverages.Juice | 45.00
TOP.Groceries.Beverages.Other | 15.00
(16 rows)Hello Don,Perhaps I am missing something about what your constraints are, or what you're trying to achieve, but is there any reason you could not use a series of joined tables indicating parent-child relationships? The following example follows that in your previous posts. Note that this approach (as given) will not work if branches stemming from the same node are different lengths. That is, if you have costs associated with "Transportation.Bicycle.Gear", you could not also have a category "Transportation.Bicycle.Gear.Chain_ring". (To add the latter category, you'd have to put costs from the former under something like "Transportation.Bicycle.Gear.General" -- or modify the approach.) However, lengthening the "Alcohol" branches, e.g., by tacking on a level5 table would be easy. Notice that level3 and level4 are not true look-up tables, since they may contain duplicate cat values.If I'm off base, by all means specify just how.Regards,Bryan--------------------------------------------------CREATE TABLE level1 (cat text PRIMARY KEY);CREATE TABLE level2 (cat text PRIMARY KEY,parent text REFERENCES level1(cat));
D.C. Parris, FMP, Linux+, ESL Certificate
Minister, Security/FM Coordinator, Free Software AdvocateGPG Key ID: F5E179BE
Quoth parrisdc@gmail.com (Don Parris): > > Is it possible to use spaces in the ltree path, like so: > TOP.Groceries.Food.Herbs & Spices > > Or do the elements of the path have to use underscores and dashes? From the docs: | A label is a sequence of alphanumeric characters and underscores (for | example, in C locale the characters A-Za-z0-9_ are allowed). Labels | must be less than 256 bytes long. [...] | | A label path is a sequence of zero or more labels separated by dots, | for example L1.L2.L3, representing a path from the root of a | hierarchical tree to a particular node. The length of a label path | must be less than 65Kb, but keeping it under 2Kb is preferable. If you need to store non-alphanumeric labels, one answer (as long as they aren't too long) would be to use URL-encoding, like TOP.Groceries.Food.Herbs_20_26_20Spices Of course, you would need to encode _ as well, and you would need to be sure the labels weren't going to come out too long. Another alternative would be to MD5 each label and use (say) the first 10 bytes of that MD5 in hex as the ltree label. (Annoyingly there's only one non-alphanumeric, so you can't use base64.) If you were going to do that you would need to consider the possibility of an attacker arranging a hash collision: I don't know where you're labels come from, so I don't know if this would be an issue. Ben
Quoth parrisdc@gmail.com (Don Parris):>From the docs:
> Is it possible to use spaces in the ltree path, like so:
> TOP.Groceries.Food.Herbs & Spices
>
> Or do the elements of the path have to use underscores and dashes?
| A label is a sequence of alphanumeric characters and underscores (for
| example, in C locale the characters A-Za-z0-9_ are allowed). Labels
| must be less than 256 bytes long. [...]
|
| A label path is a sequence of zero or more labels separated by dots,
| for example L1.L2.L3, representing a path from the root of a
| hierarchical tree to a particular node. The length of a label path
| must be less than 65Kb, but keeping it under 2Kb is preferable.
If you need to store non-alphanumeric labels, one answer (as long as
they aren't too long) would be to use URL-encoding, like
TOP.Groceries.Food.Herbs_20_26_20Spices
Of course, you would need to encode _ as well, and you would need to be
sure the labels weren't going to come out too long. Another alternative
would be to MD5 each label and use (say) the first 10 bytes of that MD5
in hex as the ltree label. (Annoyingly there's only one
non-alphanumeric, so you can't use base64.) If you were going to do that
you would need to consider the possibility of an attacker arranging a
hash collision: I don't know where you're labels come from, so I don't
know if this would be an issue.
Ben
--
D.C. Parris, FMP, Linux+, ESL Certificate
Minister, Security/FM Coordinator, Free Software Advocate