Re: XML Index again - Mailing list pgsql-general

From Alban Hertroys
Subject Re: XML Index again
Date
Msg-id CE29DB4D-A865-4591-AB63-F5C6FEC81AA4@solfertje.student.utwente.nl
Whole thread Raw
In response to Re: XML Index again  (Chris Roffler <croffler@earthlink.net>)
Responses Re: XML Index again  (Chris Roffler <croffler@earthlink.net>)
List pgsql-general
On 8 Mar 2010, at 11:39, Chris Roffler wrote:

> Alban
>
> Thanks for your help, your suggestion worked.
>
> I need another xpath expression to find any Attribute with  Name ="xxxx" under the Attributes node. (not just if it
isin a specific position) 
> see   query below.

Your previous query wasn't about attributes in any specific position - it returned documents that contained more than
zeroattributes matching a given name. What are you trying to do this time? 

> How do I create an index for this xpath expression ?

You don't need to create another index (although one w/o the external_attributes column would probably be more
convenient);the index you have contains those names already. 

Just make sure you use the same expression you used to create the index to match the part in your xml and compare it to
thetext you're looking for. 

If you want to use indexes on your xml, then you'll need to stop putting the variable parts of your queries inside your
xpathexpressions - you make them unindexable that way. So move those [Name='xxxx']'s out of your xpath expressions.
Insteadhave the expressions result in the names so that you can compare them to the names stored in your index(es). 

It won't be as fast as looking for those names using xpath in an xml document, as every attribute name is a candidate
forcomparison now, but at least it's indexable. 

Alternatively you could try to build an index from the names contained in each xml document. Something like:

CREATE INDEX xml_attribute_names
    ON time_series
 USING btree (xpath('/AttributeList/Attributes/Attribute/Name/text()'));

This stores the array of all attribute names in an index. You can query for existence of specific attributes using the
ANYoperator on the resulting array (see archives and docs for how to do that). 

I believe (I've never needed to use arrays) the syntax is:

SELECT * FROM time_series WHERE 'xxxx' = ANY (xpath('/AttributeList/Attributes/Attribute/text()'));

It'll probably be faster than the previous solution, but use more disk space and memory.

>  SELECT * FROM time_series
>         WHERE array_upper((xpath('/AttributeList/Attributes/Attribute[Name="xxxxx"]',   external_attributes)),1) > 0

>
>
> On Sun, Mar 7, 2010 at 1:11 PM, Alban Hertroys <dalroi@solfertje.student.utwente.nl> wrote:
...
> You seem to want to test for the existence of nodes with a specific name, maybe this is what you're looking for?:
>
> SELECT id FROM time_series t1 WHERE EXISTS (
>        SELECT 1
>          FROM time_series t2
>         WHERE (xpath('/AttributeList/Attributes/Attribute/Name/text()', external_attributes))[1]::text =
('Attribute122021',external_attributes) 
>           AND t2.id = t1.id
> );

For clarity, if you would have an index on just that xpath expression - without the external_attributes column - this
querywould look simpler: 

SELECT id FROM time_series t1 WHERE EXISTS (
       SELECT 1
         FROM time_series t2
        WHERE xpath('/AttributeList/Attributes/Attribute/Name/text()')[1]::text = 'Attribute122021'
          AND t2.id = t1.id
);

Alban Hertroys

--
If you can't see the forest for the trees,
cut the trees and you'll see there is no forest.


!DSPAM:737,4b94df40296929445119198!



pgsql-general by date:

Previous
From: Chris Roffler
Date:
Subject: Re: XML Index again
Next
From: Alban Hertroys
Date:
Subject: Re: XML Index again