Thread: possible to create multivalued index from xpath() results in 8.3?
I've working with XML in Postgres 8.3 and am trying to find a way to create a text-based index from an XPath that returns multiple nodes. For example, if I have an XPath like /elem[@key="mykey"]/text() which might return a few text nodes like value1 value2 value3 I'd like 3 index values associated with this row's index key: {value1, value2, value3). I was trying to figure out a way to define an index like this but ran into a couple of issues: 1) The xpath() function returns an array of XML type, but in the above example the text nodes are joined together into a single xml result node, like {value1value2value3}. How can I get it to return 3 individual text nodes, so an array of 3 values instead of 1? 2) Even if I could have an xpath() result return an array with multiple values, like {value1,value2,value3} I wasn't able to define a GIN index against the xml[] type. Should this be possible? -- m@
"Matt Magoffin" <postgresql.org@msqr.us> writes: > 2) Even if I could have an xpath() result return an array with multiple > values, like {value1,value2,value3} I wasn't able to define a GIN index > against the xml[] type. Should this be possible? Dunno about your other questions, but the answer to this one is "no" --- AFAICS there is no indexing support of any kind for the xml type in 8.3. Try casting to text[] instead. regards, tom lane
> "Matt Magoffin" <postgresql.org@msqr.us> writes: >> 2) Even if I could have an xpath() result return an array with multiple >> values, like {value1,value2,value3} I wasn't able to define a GIN index >> against the xml[] type. Should this be possible? > > Dunno about your other questions, but the answer to this one is "no" > --- AFAICS there is no indexing support of any kind for the xml type > in 8.3. Try casting to text[] instead. I had tried that, but it does not actually work because of my first issue, of a way for the XPath to return 3 individual text nodes. If I return 3 elements, like /elem[@key="mykey"] => { <elem key="mykey">value1</elem>, <elem key="mykey">value2</elem>, <elem key="mykey">value3</elem> } and cast that to text[], I get 3 XML strings, including the <elem></elem>. I want only the element text content. Should the xpath() function return 3 individual text nodes like this: /elem[@key="mykey"]/text() => { value1, value2, value3 } rather than concatenating these into a single text node result? I also tried something like string(/elem[@key="mykey"]) but that throws an XPath error. It looks like the code is converting this to /x/string(/elem[@key="mykey"] internally, which is not a valid XPath. So if xpath() cannot return individual text node results, would a possible solution be to write a trigger function that generates a tsvector from the XML array, and then use text search to locate matches? -- m@
"Matt Magoffin" <postgresql.org@msqr.us> writes: > Should the xpath() function return 3 individual text nodes like this: > /elem[@key="mykey"]/text() => { > value1, > value2, > value3 > } > rather than concatenating these into a single text node result? AFAICT that's exactly what it does. regression=# select xpath('//foo[@key="mykey"]/text()', '<value>ABC<foo key="mykey">XYZ</foo></value><foo key="mykey">RST</foo><foo>DEF</foo>'); xpath ----------- {XYZ,RST} (1 row) regression=# Of course this is of type xml[], but you can cast to text[] and then index. regards, tom lane
> AFAICT that's exactly what it does. > > regression=# select xpath('//foo[@key="mykey"]/text()', '<value>ABC<foo > key="mykey">XYZ</foo></value><foo key="mykey">RST</foo><foo>DEF</foo>'); > xpath > ----------- > {XYZ,RST} > (1 row) > > regression=# > > Of course this is of type xml[], but you can cast to text[] and then > index. Ugh, you're right of course! Somehow I had this wrong. So I tried to create an index on the xml[] result by casting to text[] but I got the "function must be immutable" error. Is there any reason the xml[] to text[] cast is not immutable? I worked around it by writing a function like CREATE OR REPLACE FUNCTION xpath_to_text(xml_array xml[]) RETURNS text[] AS $BODY$ BEGIN RETURN xml_array::text[]; END; $BODY$ LANGUAGE 'plpgsql' IMMUTABLE; and wrapping my CREATE INDEX call with that, like: create index type_flag_idx on lead using gin ( (xpath_to_text(xpath('/element[@key="foo"]/text()', xml))) ); -- m@
"Matt Magoffin" <postgresql.org@msqr.us> writes: > Ugh, you're right of course! Somehow I had this wrong. So I tried to > create an index on the xml[] result by casting to text[] but I got the > "function must be immutable" error. Is there any reason the xml[] to > text[] cast is not immutable? Hmm ... I see that xmltotext() is marked 'stable' in pg_proc.h, but texttoxml() is marked 'immutable', which is at best inconsistent. It looks to me like they both depend on the GUC setting "xmloption", which would mean they should both be stable. Peter, is there a bug there? Also, is there a way to get rid of the GUC dependency so that there's a reasonably safe way to index XML values? regards, tom lane