Hello,
I have a table filled from third party that basically has the following
structure:
link_id | feat_id | other | columns...
--------+---------+-------+-----------
1 | 2 | ...
2 | 5 | ...
2 | 23 | ...
3 | 5 | ...
3 | 23 | some | data
3 | 23 | other | data
5 | 23 | ...
9 | 23 | ...
This structure is fixed, and we can't change it, but we can create
We have about 37 million different link_ids, and 35 million feat_ids.
There are feat_ids that appear at several thousand link_ids, but a
link_id does not have more than a douzen feat_ids.
Now I need to group together all link_ids that have the same set of
feat_ids. In the example above, the sets would be (1), (2,3) and (5,9),
and the other columns would be run through some aggregate functions.
Currently, this is done via an external JAVA application, but I'm
looking for a way to express this via sql / plpgsql to ease deployment.
I could imagine some ugly code using ARRAY (not tried yet), but how
would you pack this problem? It seems that I'm just stuck in my thoughts
and miss the beauty way to solve it.
Thanks,
Markus
--
Markus Schaber | Logical Tracking&Tracing International AG
Dipl. Inf. | Software Development GIS
Fight against software patents in EU! www.ffii.org www.nosoftwarepatents.org