Re: Do we want a hashset type? - Mailing list pgsql-hackers

From Joel Jacobson
Subject Re: Do we want a hashset type?
Date
Msg-id 0f0be7fa-c8b0-4397-890f-ab021f9eb73d@app.fastmail.com
Whole thread Raw
In response to Re: Do we want a hashset type?  ("Joel Jacobson" <joel@compiler.org>)
Responses Re: Do we want a hashset type?
List pgsql-hackers
On Thu, Jun 1, 2023, at 09:02, Joel Jacobson wrote:
> Here is an example using a real anonymised social network.

I realised the "found" column is not necessary in this particular example,
since we only care about the friends at the exact depth level. Simplified query:

CREATE OR REPLACE VIEW friends_of_friends AS
WITH RECURSIVE friends_of_friends AS (
    SELECT 
        ARRAY[5867::bigint] AS current,
        0 AS depth
    UNION ALL
    SELECT
        new_current,
        friends_of_friends.depth + 1
    FROM
        friends_of_friends
    CROSS JOIN LATERAL (
        SELECT
            array_agg(DISTINCT edges.to_node) AS new_current
        FROM
            edges
        WHERE
            from_node = ANY(friends_of_friends.current)
    ) q
    WHERE
        friends_of_friends.depth < 3
)
SELECT
    depth,
    coalesce(array_length(current, 1), 0)
FROM
    friends_of_friends
WHERE
    depth = 3;
;

-- PostgreSQL 15.2:

EXPLAIN ANALYZE SELECT * FROM friends_of_friends;
                                                                            QUERY PLAN

-------------------------------------------------------------------------------------------------------------------------------------------------------------------
 CTE Scan on friends_of_friends  (cost=2687.88..2688.58 rows=1 width=8) (actual time=2076.362..2076.454 rows=1
loops=1)
   Filter: (depth = 3)
   Rows Removed by Filter: 3
   CTE friends_of_friends
     ->  Recursive Union  (cost=0.00..2687.88 rows=31 width=36) (actual time=0.008..2075.073 rows=4 loops=1)
           ->  Result  (cost=0.00..0.01 rows=1 width=36) (actual time=0.002..0.002 rows=1 loops=1)
           ->  Subquery Scan on "*SELECT* 2"  (cost=89.44..268.75 rows=3 width=36) (actual time=518.613..518.622 rows=1
loops=4)
                 ->  Nested Loop  (cost=89.44..268.64 rows=3 width=36) (actual time=515.523..515.523 rows=1 loops=4)
                       ->  WorkTable Scan on friends_of_friends friends_of_friends_1  (cost=0.00..0.22 rows=3 width=36)
(actualtime=0.001..0.001 rows=1 loops=4)
 
                             Filter: (depth < 3)
                             Rows Removed by Filter: 0
                       ->  Aggregate  (cost=89.44..89.45 rows=1 width=32) (actual time=687.356..687.356 rows=1
loops=3)
                             ->  Index Only Scan using edges_pkey on edges  (cost=0.56..83.96 rows=2191 width=4)
(actualtime=0.139..290.996 rows=3486910 loops=3)
 
                                   Index Cond: (from_node = ANY (friends_of_friends_1.current))
                                   Heap Fetches: 0
 Planning Time: 0.557 ms
 Execution Time: 2076.990 ms
(17 rows)



pgsql-hackers by date:

Previous
From: "Joel Jacobson"
Date:
Subject: Re: Do we want a hashset type?
Next
From: "Yu Shi (Fujitsu)"
Date:
Subject: RE: Support logical replication of DDLs