On 05/17/2012 03:06 AM, Seref Arikan wrote:
I have a key value table in my Postgresql db, which represents hierarchical data through parent_feature_mapping column that points to id of feature_mapping_id column of the same table.
I need to select root nodes that has children which satisfy various conditions. The conditions may extend to children of children, so I'm trying to find roots of trees that contain paths that satisfy the given constraints.
An example is finding the trees where the root node has type 'COMPOSITION' and root node's archetypeNodeId attribute has value 'openEHR-EHR-COMPOSITION.discharge.v1' another constraint is root node having a child of type 'CONTENTITEM' that in turn has a child of type 'ITEMSTRUCTURE'
All nodes in a tree have the same payload Id. The fastest query that I could write so far is given below.
SELECT root.id from path_value as root
WHERE
root.rm_type_name = 'COMPOSITION'
AND
root.feature_name = 'composition'
AND
EXISTS (SELECT 1 from path_value as anodeid
WHERE
anodeId.parent_feature_mapping_id = root.feature_mapping_id
AND
anodeId.payload_id = root.payload_id
AND
anodeId.feature_name = 'archetypeNodeId'
AND
anodeId.val_string = 'openEHR-EHR-COMPOSITION.discharge.v1'
LIMIT 1
)
AND
EXISTS (SELECT 1 from path_value as node1
WHERE
node1.payload_id = root.payload_id
AND
node1.parent_feature_mapping_id = root.feature_mapping_id
AND
node1.feature_name = 'content'
AND
node1.rm_type_name = 'CONTENTITEM'
AND
EXISTS (SELECT 1 from path_value as node2
WHERE
node2.payload_id = node1.payload_id
AND
node2.parent_feature_mapping_id = node1.feature_mapping_id
AND
node2.rm_type_name = 'ITEMSTRUCTURE'
LIMIT 1)
LIMIT 1)
My question is: is this the best approach in terms of performance? This is an attempt to identify XML payloads that fit certain criteria. I have also considered using an ltree column that will contain the tree in a from that I can query as an alternative to sql based method, or I can use xpath queries on XML payload.
The create statement for my table is as follows:
CREATE TABLE public.path_value (
val_string TEXT,
feature_mapping_id INTEGER NOT NULL,
parent_feature_mapping_id INTEGER,
feature_name TEXT,
rm_type_name TEXT,
path INTEGER NOT NULL,
payload_id INTEGER NOT NULL,
id INTEGER NOT NULL,
ehr_id INTEGER,
CONSTRAINT path_value_pkey PRIMARY KEY(id)
) WITHOUT OIDS;
Best regards
Seref
Any other constraints or indexes on that table?