Re: max length of sql select statement (long!) - Mailing list pgsql-sql
From | markus brosch |
---|---|
Subject | Re: max length of sql select statement (long!) |
Date | |
Msg-id | 1058187439.5983.2.camel@sam Whole thread Raw |
In response to | Re: max length of sql select statement ? (markus brosch <brosch@gmx.de>) |
Responses |
Re: max length of sql select statement (long!)
|
List | pgsql-sql |
It's me again - with the easy and complex statement at the same time ;-) Who can help me?! As far, my solution is still unsatisfactory. I'll explain my full task this time, otherwise it is hard to explain! My tables: (redundant for speed reasons - doesn't matter - only requests!) =============================================================== table: goGraph primarykey(parent,child) index (B-tree) on parent column and on child column. parent | child -------------------------- GO:0000001 | GO:0000002 GO:0000001 | GO:0000003 GO:0000002 | GO:0000005 GO:0000002 | GO:0000006 GO:0000003 | GO:0000006 GO:0000003 | GO:0000007 ... | ... about 15000 elements more! This represents a direct acyclic graph: GO:0000001 / \ v v GO:0000002 GO:0000003 / \ / \ v v v v GO:0000005 GO:0000006 GO:0000007 =============================================================== table: binaryInteraction primarykey(baitProtein, preyProtein, gogoKey) index (B-tree) on gogokey. baitProtein | preyProtein | "baitGoId" | "preyGoId" | gogoKey -------------------------------------------------------------- 123 | 123 | GO:0000001 | GO:0000001 | 11 123 | 123 | GO:0000020 | GO:0000001 | 120 123 | 123 | GO:0000020 | GO:0000020 | 2020 123 | 345 | GO:0000001 | GO:0000001 | 11 123 | 345 | GO:0000020 | GO:0000001 | 120 ... | ... | ... | ... | ... up to several millions entries! =============================================================== So, what I have to do is: Someone choose 2 goIds. For these ids and all their children (information from goGraph) I want the count(*) of "fits" within the table binaryInteraction. That means, baitGoId and preyGoId must be either the two choosen goIds or one of the related children. I use the gogoKey for this task: I compute Integers (within Java) from the goIds and sort smallerGoId.concatenate(biggerGoId) - so I've unique keys for baitGo / preyGo pairs. One more problem: The goGraph (see table) is a graph ... and for that reason one node can have more parents. If I choose e.g. GO:000002 and GO:000003 from the example above and want to compute their children, I don't want to take GO:000006 into account (two parents in which I am intested in). That menas, whenever I ask for children of two nodes, I want a DISTINCT SET of children. Example how I am processing the data at the moment: User chosse GoId1: GO:0000005 and GO:0000008; Java: - get all childs of GO:000005 by jdbc from goGraph - collection "one" - get all childs of GO:000006 by jdbc from goGraph - collection "two" - compute intersection of the two collections - substract of each collection the intersection - result: distinct children of two goIds Now the binaryInteraction table came into our game ;-) I apply the one distinct collection "one" and collection "two" to gogoKey: ----------------------- 1. SELECT count(*) FROM (SELECT DISTINCT bait, prey FROM binaryInteraction WHERE gogoKey IN (gogokey1,gogokey2, gogokey3, ... ) ) AS foo; gogokeyX will be processed by java -> it generates for each go-go-combination the unique gogokey. Speed: medium Limit: only 10000 Elements within the IN statement can be processed by default. In postgres.conf I can change this limit, as Rod Taylor already posted. But in the manual there is a warning of BufferOverflow if the value it too high ... If I reach the limit, I get something like: java.sql.SQLException: ERROR: Expression too complex: nesting depth exceeds max_expr_depth = 10000 BTW: Why nested depth? Within IN I compute thousands of "OR"s, but why nested ... ??? ----------------------- 2. for all gogokeys I generate a temporary table and do a join with the binaryInteraction table. SELECT COUNT(*) FROM (SELECT DISTINCT binaryInteraction.bait binaryInteraction.prey FROM binaryInteraction, tempTable WHERE binaryInteraction.gogokey= tempTable.gogokey) AS temp) Speed: very slow, but no limit! ----------------------- After all, I want to compute and precalculate every possible goId-goId-combination ;-) So, each of these things I explained here, I've to do millions of time! I tried to find a recursive solution - impossible! Who can help? Maybe there is a way to compute it in one but more complex sql-statemnt. Me as far I am still new to sql I don't know the "tricks"! If there would be a way to calculate the intersection ... then maybe a solution is not that far ... Also I was thinking about a denormalized goGraph table: parent -> allChilds. Then I compute (within SQL?) the intersection between the 2 GoIds and then I apply it to binaryInteraction. Thanks for any ideas and thanks at least for reading that long story ;-) Cheers Markus PS: If you have any additional questions: please feel free to ask!