Re: resolved: WITH RECURSIVE: ARRAY[id] All column datatypes must be hashable - Mailing list pgsql-sql

From Will Furnass
Subject Re: resolved: WITH RECURSIVE: ARRAY[id] All column datatypes must be hashable
Date
Msg-id 1289125085332-3253813.post@n5.nabble.com
Whole thread Raw
In response to Re: resolved: WITH RECURSIVE: ARRAY[id] All column datatypes must be hashable  (Will Furnass <will@thearete.co.uk>)
List pgsql-sql
> 
>> 
>>> rawi <only4com@web.de> writes:
>>> The Error was caused because I used UNION in place of UNION ALL.
>>>
>>> I still don't understand why the ARRAY (path) could not be grouped...
>> 
>> Yeah, it's an unimplemented feature --- there's no hashing support for
>> arrays.  I hope to get that done for 8.5.  In the meantime you have
>> to use UNION ALL there.  If you really need to eliminate duplicate
>> rows, you can do that via DISTINCT in the outer query.
> 
> I'm trying to do a similar sort of thing for a network containing ~9000 
> edges.  I'm using a WITH RECURSIVE subquery, an array to track 
> visited edges and a test to see whether the id of the 'current' edge is
> already in that array, as per the examples in the PostgreSQL 9.0 docs.
> Initially my main query seemed to run indefinitely so I introduced a 
> LIMIT.  I then found that as the LIMIT was increased the number of 
> non-distinct edges returned by the query grew at a far greater rate 
> than the number of distinct edges (with LIMIT 50000 the number of 
> distinct edges returned is only 628).  Am I right in thinking that until 
> arrays can be hashed that this issue could well limit the size of the 
> networks that I can analyse, given a particular hardware config?  

Got around the problem by learning pg/plsql and writing a non-recursive
breadth-first graph traversal function.  It maintains an array of visited
nodes and a double-ended queue, implemented as an array, of encountered
nodes that require processing during the traversal.  Not as fast as a 'WITH
RECURSIVE' SQL-only traversal for graphs containing only a few nodes but
much, much more efficient for graphs containing many nodes and a
considerable number of back edges.

-- 
View this message in context:
http://postgresql.1045698.n5.nabble.com/WITH-RECURSIVE-ARRAY-id-All-column-datatypes-must-be-hashable-tp2154712p3253813.html
Sent from the PostgreSQL - sql mailing list archive at Nabble.com.


pgsql-sql by date:

Previous
From: Ralph Smith
Date:
Subject: Re: I'm stuck - I just can't get this small FUNCT to run!
Next
From: Andreas
Date:
Subject: unexpected ORDER BY