Re: optimizing a query over tree-like structure - Mailing list pgsql-sql
From | Oliveiros Cristina |
---|---|
Subject | Re: optimizing a query over tree-like structure |
Date | |
Msg-id | 083601c9231e$2b3b08d0$ec5a3d0a@marktestcr.marktest.pt Whole thread Raw |
In response to | optimizing a query over tree-like structure (az@svilendobrev.com) |
List | pgsql-sql |
Svil, Please advice me, You have values and one table for N,R,P,F and O and Z, right? And you have ownership which is a "catch-all" associative table between values and whatever other table, is this correct? You want to retrieve the values for a certain N, and all to all the other entities that belong to that N, is this correct? Fine, I have a few questions on your associative tablesmm_N2P -- An N can be associated to P in a many to many relationship? N Can have one or none P, am I right? It can have many Ps ? And a P can belong to many Ns ? mm_P2O, -- I understand an O can have many Positions, but a Positon can be in more than one O? is this a many to many relationship? mm_O2O, -- if it is a hierarchical tree do you need an associative table? An O can have many sub-departments but a sub-department (O) can just belong to a (super-)department, -- isn't it so?mm_N2Z -- this one was mistake of yours, it doesn't exist, right? You wrote that giant query by yourself or was it generated by some automated tool? (sqlalchemy , I have no idea :p ) Your ultimate goal is to retrieve the values associated with a certain Department( and all its sub-departments) which is associated with a certain Position, which values are also to obtain, such position being associated with a certain N, which values are also to obtain, is my understandin correct? Explain me the simplest case, where you have an N not associated with any R or P, just with an F. Which info is to be retrieved, exactly in this case ? Best, Oliveiros ----- Original Message ----- From: <az@svilendobrev.com> To: <pgsql-sql@postgresql.org> Sent: Tuesday, September 30, 2008 3:24 PM Subject: Re: [SQL] optimizing a query over tree-like structure > another idea i just got, to decrease the number of tables in one FROM, > is to represent the first-level ORs with unions. but i'm not sure how > exactly to do it: are these equivalent? > > FROM N,ownership,mm_N2R > where ( > N.dbid = ownership.N > OR > N.dbid = mm_N2R.left AND mm_N2R.right = ownership.R > OR > ... > ) AND N.obj = whatever-filter-by-N > > -------- > > FROM ( > (ownership join N on ownership.N = N.dbid) > union > (ownership join mm_N2R on ownership.R = mm_N2R.right join N on > mm_N2R.left = N.dbid ) > OR > ) ... > > and should i bundle the filtering-by-N/Employment in every of above > union-members? > > On Tuesday 30 September 2008 15:21:09 Oliveiros Cristina wrote: >> Hi, Svil >> >> I 'd like to first fully understand the background of your problem >> before figurin out if I can be of any help (or not). >> >> You have a tree, of which N is the root, is this correct? >> Then, what are the next sublevel? >> F, P and R? If so, why is R linked to a sibling (F) ? >> And the next one? >> O and Z? >> Is O connected to itself? >> And i am not understanding your concept of "shortcuts". Could you >> please explain ? >> What kind of tree do you have exactly? Binary? Trenary? >> The mm_* tables keep relations between nodes, I guess.... If so , >> the mm_N2Z one is empty, in this example, right?As there is no edge >> from N to Z (not direct). >> But what is the Nazn table? What records does it keep? >> And what is the ownership table? And the value? >> Could you tell which columns these tables have, at least the >> relevant ones for your problem ? >> >> Please kindly advice me on these issues. >> I am not very specialized in optimizing queries, but I see you have >> a lot of cartesian products on your FROM clause, which, from my own >> experience, I guess it has tendency to be slow... >> >> Best, >> >> Oliveiros >> >> ----- Original Message ----- >> From: <az@svilendobrev.com> >> To: <pgsql-sql@postgresql.org> >> Sent: Tuesday, September 30, 2008 9:32 AM >> Subject: [SQL] optimizing a query over tree-like structure >> >> > hi. >> > sorry for the vague syntax used below, but the query is huge so >> > i've tried to present it in simple terms. And sorry if i'm doing >> > obviously stupid things, i have lots of years programming behind >> > me but NO sql involved. >> > i have a somewhat tree-like structure of objects that link to >> > each other via many2many associations. it looks like: >> > (N is "root") >> > N links to R,P,F >> > R links to F >> > P links to O,F >> > O links to O,F #recursively >> > F links to Z >> > All links to F but the one in O are "shortcuts", to avoid looking >> > it up recursively. >> > each of these objects has some associated values (again >> > many2many, ownership). >> > >> > what i want is to get all the values related to a given N and its >> > sublevels, in one query. >> > >> > one variant of what i've invented so far is (~pseudocode, no >> > recursion on O): >> > >> > SELECT ownership.*, value.* >> > FROM Nazn, mm_N2P, mm_P2O, mm_O2O, mm_O2O AS mm_O2O1, mm_N2Z, >> > ownership JOIN value ON ownership.value = value.dbid >> > WHERE ( >> > N.dbid = ownership.N >> > OR >> > N.dbid = mm_N2R.left AND mm_N2R.right = ownership.R >> > OR >> > N.dbid = mm_N2P.left AND ( >> > mm_N2P.right = ownership.P >> > OR >> > mm_N2P.right = mm_P2O.left AND ( >> > mm_P2O.right = ownership.O >> > OR >> > mm_P2O.right = mm_O2O.left AND ( >> > mm_O2O.right = ownership.O >> > OR >> > mm_O2O.right = mm_O2O1.left AND >> > mm_O2O1.right = ownership.O >> > ))) >> > OR >> > Nazn.dbid = mm_N2F.left AND ( >> > mm_N2F.right = ownership.F >> > OR >> > mm_N2Z.right = ownership.Z >> > ) >> > ) AND ownership.value = value.dbid AND N.obj = >> > whatever-filter-by-N >> > >> > ---------------- >> > this scales very poor. >> > it uses the shortcut to F present in N. >> > for just 200 rows with related associations, it takes 4 seconds >> > to get result. >> > if i use the shortcut to F present in P, it takes 2 seconds - but >> > thats still inacceptable. >> > seems that the number or consequtive ORs on same level is killing >> > it. EXPLAIN gives nested loops all over. >> > What am i doing wrong here? >> > should i expand the A-to-B links of the sort >> > mm_N2P.right = mm_P2O.left >> > into >> > mm_N2P.right = P.dbid and P.dbid == mm_P2O.left ? >> > >> > the query is generated via sqlalchemy and a layer on top, so i >> > can tweak it any way required (and it has many other >> > sub/filterings which i've ommited for brevity - they dont make it >> > better/worse). >> > >> > any pointers of how such queries should be written are >> > appreciated - e.g. what is considered fine, what doable and what >> > is a no-no. >> > >> > thanks ahead >> > ciao >> > svil >> > >> > www.svilendobrev.com >> > dbcook.sf.net >> > >> > -- >> > Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) >> > To make changes to your subscription: >> > http://www.postgresql.org/mailpref/pgsql-sql > > > > -- > Sent via pgsql-sql mailing list (pgsql-sql@postgresql.org) > To make changes to your subscription: > http://www.postgresql.org/mailpref/pgsql-sql