Thread: grouping a many to many relation set
Hi, I am having a problem grouping a many to many relationship with payments and receipts, where a payment can be for multiple receipts, and a receipt can have multiple payments. I got a list of records that are involved in such relations, but now I don't know how to group them so that all payments and rececipts belonging to the same group are properly grouped. Here's the list: bankbookdetid | receiptid ---------------+----------- 147 | 25 157 | 25 157 | 622 321 | 100 332 | 101 332 | 100 2156 | 573 2156 | 574 2156 | 575 1710 | 575 1710 | 576 I have already grouped them according to the way they should be grouped: bankbook payments and receipt amounts that are part of the same transaction (they are a subset of a large set of payments and receipts, most 1-1, 1-n and n-1, which are solved relatively easy). As you can see there are a few records that interconnect the payments and receipts: bankbookdetid | receiptid ---------------+----------- 157 | 25 332 | 100 2156 | 575 1710 | 575 I tried now for some time how a SQL statement could give a set grouped as you can see above, but I just don't seem to see it. Is there anyone around over here that had a similar situation and has found a solution? Should I try to do this in PL/SQL? Is there a solution for the problem anyway? -johan
Johan Henselmans wrote: > Hi, I am having a problem grouping a many to many relationship with > payments and receipts, where a payment can be for multiple receipts, and > a receipt can have multiple payments. I got a list of records that are > involved in such relations, but now I don't know how to group them so > that all payments and rececipts belonging to the same group are properly > grouped. Here's the list: > > > bankbookdetid | receiptid > ---------------+----------- > 147 | 25 > 157 | 25 > 157 | 622 > > 321 | 100 > 332 | 101 > 332 | 100 ... I think what's missing here is the explicit statement of which group these belong in. Without a value to sort/group by, there's nothing for your queries to "get a grip on". So - add a "group_id" column to the bank-book and receipt tables. Create a sequence to generate group id's on demand. Then you'll want a set of triggers that keeps the group details up to date. Of course, groups can shift as you add more records - particularly in the case of two groups merging when you add a "linking" row. Maybe someone smarter than me can come up with a non-procedural solution. Personally, I've got a nagging feeling that this sort of "connectedness" problem is NP, so scaling could be a problem for you. -- Richard Huxton Archonet Ltd
Richard Huxton wrote: > Johan Henselmans wrote: > >> Hi, I am having a problem grouping a many to many relationship with >> payments and receipts, where a payment can be for multiple receipts, >> and a receipt can have multiple payments. I got a list of records that >> are involved in such relations, but now I don't know how to group them >> so that all payments and rececipts belonging to the same group are >> properly grouped. Here's the list: >> >> >> bankbookdetid | receiptid >> ---------------+----------- >> 147 | 25 >> 157 | 25 >> 157 | 622 >> >> 321 | 100 >> 332 | 101 >> 332 | 100 > > ... > > I think what's missing here is the explicit statement of which group > these belong in. Without a value to sort/group by, there's nothing for > your queries to "get a grip on". > > So - add a "group_id" column to the bank-book and receipt tables. Create > a sequence to generate group id's on demand. > > Then you'll want a set of triggers that keeps the group details up to > date. Of course, groups can shift as you add more records - particularly > in the case of two groups merging when you add a "linking" row. > > Maybe someone smarter than me can come up with a non-procedural > solution. Personally, I've got a nagging feeling that this sort of > "connectedness" problem is NP, so scaling could be a problem for you. > > -- > Richard Huxton > Archonet Ltd > > ---------------------------(end of broadcast)--------------------------- > TIP 5: Have you checked our extensive FAQ? > > http://www.postgresql.org/docs/faqs/FAQ.html > Thanks for the reply. Adding a group_id column would defeat the whole purpose of the relational model. I do not want to add a grouping beforehand. The grouping should take place according to certain criteria, in this case: group all the records that have at least one of two attributes in common. I am surprised that I haven't found any reference to such a n:m grouping, while googling. All I found was a description of the problem on can get Johan.
On Wed, Dec 01, 2004 at 06:57:54AM +0100, Johan Henselmans wrote: > Richard Huxton wrote: > > > I think what's missing here is the explicit statement of which group > > these belong in. Without a value to sort/group by, there's nothing for > > your queries to "get a grip on". > > > > So - add a "group_id" column to the bank-book and receipt tables. Create > > a sequence to generate group id's on demand. > > Thanks for the reply. Adding a group_id column would defeat the whole > purpose of the relational model. I do not want to add a grouping > beforehand. How is an application going to know which records belong to which groups without a group ID? Or is a group ID acceptable as long as it's not part of the data, but rather generated by the query or function that does the grouping? > The grouping should take place according to certain criteria, in > this case: group all the records that have at least one of two > attributes in common. What about chains like this: bankbookdetid | receiptid ---------------+----------- 100 | 1 100 | 2 101 | 2 101 | 3 102 | 3 102 | 4 Should 1 be grouped with 2, 3, and 4 since 1 has an attribute in common with 2, 2 has an attribute in common with 3, and 3 has an attribute in common with 4? Or doesn't your model permit this situation? -- Michael Fuhr http://www.fuhr.org/~mfuhr/
On 2-dec-04, at 3:45, Michael Fuhr wrote: > On Wed, Dec 01, 2004 at 06:57:54AM +0100, Johan Henselmans wrote: >> Richard Huxton wrote: >> >>> I think what's missing here is the explicit statement of which group >>> these belong in. Without a value to sort/group by, there's nothing >>> for >>> your queries to "get a grip on". >>> >>> So - add a "group_id" column to the bank-book and receipt tables. >>> Create >>> a sequence to generate group id's on demand. >> >> Thanks for the reply. Adding a group_id column would defeat the whole >> purpose of the relational model. I do not want to add a grouping >> beforehand. > > How is an application going to know which records belong to which > groups without a group ID? Or is a group ID acceptable as long as > it's not part of the data, but rather generated by the query or > function that does the grouping? > >> The grouping should take place according to certain criteria, in >> this case: group all the records that have at least one of two >> attributes in common. > > What about chains like this: > > bankbookdetid | receiptid > ---------------+----------- > 100 | 1 > 100 | 2 > 101 | 2 > 101 | 3 > 102 | 3 > 102 | 4 > > Should 1 be grouped with 2, 3, and 4 since 1 has an attribute in > common with 2, 2 has an attribute in common with 3, and 3 has an > attribute in common with 4? Or doesn't your model permit this > situation? > > -- > Michael Fuhr > http://www.fuhr.org/~mfuhr/ > That is possible indeed. This was the original situation. The trick I use is that from that collection I select the records of which both attribues have a count > 1. But that leads to the situation that some attributes still appear several times in the resulting set. In the above case the resulting set would be 100 | 2 101 | 2 101 | 3 102 | 3 So the group would end up to be all the records that have 2 and 3 or 100, 101 and 102 as part. In the mean time, breaking my head about it, I think I have found a solution: from the resuling set I once again select the record attributes that have a common attribute count > 1 (In your example that would be bankbookdetid = 101, and receiptid 2 and 3). Then we reduce the resulting set to: 101 | 2 101 | 3 Now I can add a group id , which is the attribute of which the count is > 1 (in this case 101), to make sure we don't mixup groupid because bankbookdetid and receiptid might have the same number, so we add 3000000 to receiptid groups, and 400000000 to bankbookdetid groups. That is the final set, from where one could go back to the previous set (get all the records that have 101 or 2 and 3 in their respective attributes), that would result in 100 | 2 | 400000101 101 | 2 | 400000101 101 | 3 | 400000101 102 | 3 | 400000101 and from there one would go back to ask all the records that have a part in these records, so you could go back to ---------------+----------- 100 | 1 | 400000101 100 | 2 | 400000101 101 | 2 | 400000101 101 | 3 | 400000101 102 | 3 | 400000101 102 | 4 | 400000101 Now it is possible to group. What do you think of the idea? -johan