Re: Do we want a hashset type? - Mailing list pgsql-hackers

From Joel Jacobson
Subject Re: Do we want a hashset type?
Msg-id d8759507-7db8-4cae-b13e-21ae4e382b89@app.fastmail.com
In response to Re: Do we want a hashset type?  (Tomas Vondra <tomas.vondra@enterprisedb.com>)
Responses Re: Do we want a hashset type?
List pgsql-hackers
On Tue, Jun 6, 2023, at 13:20, Tomas Vondra wrote:
> it cuts the timing to about 50% on my laptop, so maybe it'll be ~300ms
> on your system. There's a bunch of opportunities for more improvements,
> as the hash table implementation is pretty naive/silly, the on-disk
> format is wasteful and so on.
>
> But before spending more time on that, it'd be interesting to know what
> would be a competitive timing. I mean, what would be "good enough"? What
> timings are achievable with graph databases?

Your hashset is now almost exactly as fast as the corresponding roaringbitmap query, +/- 1 ms on my machine.

I tested Neo4j and the results are surprising; it appears to be significantly *slower*.
However, I've probably misunderstood something; maybe I need to add an index.
Even so, it's interesting that it's apparently not fast "by default".

The query I tested:
MATCH (user:User {id: '5867'})-[:FRIENDS_WITH*3..3]->(fof)
RETURN COUNT(DISTINCT fof)
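
For reference, here is a minimal Python sketch of what that query computes: the number of distinct nodes at the end of a directed path of exactly three FRIENDS_WITH relationships. It assumes the edges come from a headerless CSV laid out like import/friendships.csv, and it mirrors Cypher's rule that a single path may not traverse the same relationship twice (nodes may repeat). The function names are made up for illustration, and this is not how Neo4j evaluates the query internally.

```python
import csv
from collections import defaultdict


def build_adjacency(rows):
    """Map start id -> list of (edge_id, end id).

    Each row is (start_id, end_id, type). Every row counts as a separate
    relationship, so duplicate rows become parallel edges.
    """
    adjacency = defaultdict(list)
    for edge_id, row in enumerate(rows):
        if not row:  # skip blank lines
            continue
        adjacency[row[0]].append((edge_id, row[1]))
    return adjacency


def load_adjacency(path):
    """Read a headerless CSV such as import/friendships.csv."""
    with open(path, newline="") as f:
        return build_adjacency(list(csv.reader(f)))


def count_distinct_at_depth(adjacency, start, depth=3):
    """Count distinct end nodes of directed paths with exactly `depth` edges.

    A path may revisit nodes but may not reuse an edge.
    """
    found = set()

    def walk(node, remaining, used):
        if remaining == 0:
            found.add(node)
            return
        for edge_id, nxt in adjacency.get(node, ()):
            if edge_id not in used:
                walk(nxt, remaining - 1, used | {edge_id})

    walk(start, depth, frozenset())
    return len(found)
```

With the file from this message, the call would be count_distinct_at_depth(load_adjacency("import/friendships.csv"), "5867"). The ids are kept as strings, matching the quoted '5867' in the query.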

Here is how I loaded the data into it:

% pwd
/Users/joel/Library/Application Support/Neo4j Desktop/Application/relate-data/dbmss/dbms-3837aa22-c830-4dcf-8668-ef8e302263c7

% head import/*
==> import/friendships.csv <==
1,13,FRIENDS_WITH
1,11,FRIENDS_WITH
1,6,FRIENDS_WITH
1,3,FRIENDS_WITH
1,4,FRIENDS_WITH
1,5,FRIENDS_WITH
1,15,FRIENDS_WITH
1,14,FRIENDS_WITH
1,7,FRIENDS_WITH
1,8,FRIENDS_WITH

==> import/friendships_header.csv <==
:START_ID(User),:END_ID(User),:TYPE

==> import/users.csv <==
1,User
2,User
3,User
4,User
5,User
6,User
7,User
8,User
9,User
10,User

==> import/users_header.csv <==
id:ID(User),:LABEL

% ./bin/neo4j-admin database import full --overwrite-destination \
    --nodes=User=import/users_header.csv,import/users.csv \
    --relationships=FRIENDS_WIDTH=import/friendships_header.csv,import/friendships.csv \
    neo4j
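
For anyone wanting to reproduce the import, the four files can be generated from an edge list with a few lines of Python. This is only a sketch: it assumes the friendships are already available as (start, end) pairs in memory, which is not something described in this message, and write_import_files is a made-up helper name. The header lines and the row layout are copied from the head output above.

```python
import csv
from pathlib import Path


def write_import_files(edges, out_dir):
    """Write users/friendships CSVs plus their header files into out_dir.

    `edges` is an iterable of (start_id, end_id) pairs (an assumed input
    format). Returns the sorted list of user ids that was written.
    """
    edges = list(edges)
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # Header files, as shown in the head output above.
    (out / "users_header.csv").write_text("id:ID(User),:LABEL\n")
    (out / "friendships_header.csv").write_text(
        ":START_ID(User),:END_ID(User),:TYPE\n"
    )

    # Every id that appears on either side of an edge becomes a User node.
    user_ids = sorted({node for edge in edges for node in edge})
    with open(out / "users.csv", "w", newline="") as f:
        writer = csv.writer(f, lineterminator="\n")
        writer.writerows((uid, "User") for uid in user_ids)

    with open(out / "friendships.csv", "w", newline="") as f:
        writer = csv.writer(f, lineterminator="\n")
        writer.writerows((a, b, "FRIENDS_WITH") for a, b in edges)

    return user_ids
```

Calling write_import_files(edges, "import") from the dbms directory would produce the layout that the neo4j-admin command above reads.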
 

/Joel


