Re: improvements to query with hierarchical elements - Mailing list pgsql-sql

From Steve Midgley
Subject Re: improvements to query with hierarchical elements
Date
Msg-id 20080122052133.65CF22E3881@postgresql.org
Whole thread Raw
In response to improvements to query with hierarchical elements  (Ryan Wallace <rywall@interchange.ubc.ca>)
List pgsql-sql
>Date: Sun, 20 Jan 2008 20:01:08 -0800
>From: Ryan Wallace <rywall@interchange.ubc.ca>
>To: pgsql-sql@postgresql.org
>Subject: improvements to query with hierarchical elements
>Message-ID: <002601c85be2$410aea30$c320be90$@ubc.ca>
>Greetings,
>
>I have a complex query which I am trying to figure out the most 
>efficient
>way of performing.
>
>My database is laid out as follows:
>items -have_many-> events -have_many-> event_locations -have_many->
>locations
>
>also rows in the location_links table link two locations together in a
>parent-child relationship and rows in the location_descendants table 
>provide
>a full list of the descendants of a
>particular location.
>
>I am trying to find all locations which both are direct children of a 
>given
>parent location, and are associated with at least one item in a 
>constrained
>subset of items.
>(eg. Find all states of the USA in which at least one wooden axe was 
>made.
>Also find the number of wooden axes made in each state.)
>
>I have developed the following query:
>
>SELECT  locations.*,
>         location_ids.item_count AS item_count
>FROM    locations
>         JOIN
>                 (SELECT immediate_descendants.ancestor_id AS id,
>                         COUNT(DISTINCT creation_events.item_id) AS
>item_count
>                 FROM    event_locations
>                         JOIN
>                                 (SELECT *
>                                 FROM    location_descendants
>                                 WHERE   ancestor_id IN
>                                         (SELECT child_id
>                                         FROM    location_links
>                                         WHERE   parent_id = *note 1*
>                                         )
>                                 ) AS immediate_descendants
>                         ON      event_locations.location_id =
>immediate_descendants.descendant_id
>                         JOIN
>                                 (SELECT *
>                                 FROM    events
>                                 WHERE   item_id IN (*note 2*) AND
>association = 'creation'
>                                 ) AS creation_events
>                         ON      event_locations.event_id =
>creation_events.id
>                 GROUP BY immediate_descendants.ancestor_id
>                 ) AS location_ids ON locations.id = location_ids.id
>
>*note 1* - the id of the parent location.
>*note 2* - the query which returns a list of constrained item ids
>
>This works but I am looking for any way to improve the performance of 
>the
>query (including changing the layout of the tables). Any ideas, 
>suggestions
>or general pointers would be greatly appreciated.
>
>Thanks very much,
>Ryan

Hi Ryan,

I have built some similar queries so I might be able to help you. But 
it's a little hard (for me) to dig into your query without a test set. 
Could you please post some create table and insert statements to give 
us a little test bed to run your query in? I realize that may be a fair 
bit of work for you but it would help me to give you some ideas.

Without seeing a more formal schema and being able to toy with it, I'm 
not sure I can give good advice. Others may have different opinions 
which I would welcome.

Sincerely,

Steve




pgsql-sql by date:

Previous
From: Christian Schröder
Date:
Subject: Re: (possible) bug with constraint exclusion
Next
From: silly_sad
Date:
Subject: currval() within one statement