An inverted index using roaring bitmaps - Mailing list pgsql-hackers
From | Chinmay Kanchi |
---|---|
Subject | An inverted index using roaring bitmaps |
Date | |
Msg-id | CAJqPDh9o=oxRP=xZhbCzzAs4Vr5hNUkYLMspGee+=mgbqG3tAg@mail.gmail.com |
Responses | Re: An inverted index using roaring bitmaps |
List | pgsql-hackers |
I've been doing some preliminary prep work to see how an inverted index using roaring bitmaps (https://roaringbitmap.org/) would perform. I'm presenting some early work using SQL code with the roaring bitmap Postgres extension (https://github.com/ChenHuajun/pg_roaringbitmap) to simulate a hypothetical index using this approach.
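For anyone who wants to reproduce this: once the extension's shared library is built and installed on the server (the usual make && make install), enabling it is a one-liner. This is just the standard setup step, not part of the experiment itself:

```sql
-- Enable the roaringbitmap type and functions
-- (assumes pg_roaringbitmap is already installed on the server).
create extension roaringbitmap;
```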
I'd like to solicit feedback from the community to see if this is something worth pursuing or if there are potential issues that I'm not aware of (or if this approach has been considered in the past and discarded for whatever reason). For context, my experience with Postgres is primarily as a user and I'm not at all familiar with the code base, so please be gentle :).
That said, here's a quick and dirty demo:
I have a table "cities"
Table "public.cities"
Column | Type | Collation | Nullable | Default
---------+------+-----------+----------+---------
city | text | | |
country | text | | |
Indexes:
"cities_country_idx" btree (country)
select count(*) from cities;
count
--------
139739
(1 row)
And just some sample rows:
select * from cities order by random() limit 10;
city | country
------------------------+------------------------------
Alcalá de la Selva | Spain
Bekirhan | Turkey
Ceggia | Italy
Châtillon-en-Vendelais | France
Hohenfelde | Germany
Boedo | Argentina
Saint-Vith | Belgium
Gaggenau | Germany
Lake Ozark | United States
Igunga | Tanzania, United Republic of
(10 rows)
Since a bitmap requires you to convert your inputs into integers, I created a function as a hack to convert our TIDs to integers. It's ugly as hell, but it serves. 2048 is 2^11; according to the GIN index source code, 11 bits is a safe upper bound for MaxHeapTuplesPerPage, so a (block, offset) pair packs into a single integer without collisions.
create function ctid_to_int(ctid tid) returns integer as $$
select (ctid::text::point)[0] * 2048 + (ctid::text::point)[1];
$$
language sql returns null on null input;
And the reverse:
create function int_to_ctid(i integer) returns tid as $$
select point(i/2048, i%2048)::text::tid;
$$
language sql returns null on null input;
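As a sanity check that the two functions really are inverses, a round trip through the integer encoding should reproduce the original TID (illustrative query only, assuming both functions above are defined):

```sql
-- Every row's ctid should survive the integer round trip unchanged.
select ctid, int_to_ctid(ctid_to_int(ctid)) as round_trip
from cities
limit 5;
```

A mismatching row here would mean the 2^11 offset assumption doesn't hold for this table.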
In addition, I created a table "cities_rb" to roughly represent an "index" on the country column:
create table cities_rb as (select country, roaringbitmap(array_agg(ctid_to_int(ctid))::text) idx from cities group by country);
Table "public.cities_rb"
Column | Type | Collation | Nullable | Default
---------+---------------+-----------+----------+---------
country | text | | |
idx | roaringbitmap | | |
Now for the fun stuff - to simulate the "index" I will be running some queries against the cities_rb table using bitmap aggregations and comparing them to functionally the same queries using the BTree index on cities.
explain analyze select ctid from cities where country = 'Japan';
QUERY PLAN
----------------------------------------------------------------------------------------------------------------------------------
Bitmap Heap Scan on cities (cost=18.77..971.83 rows=1351 width=6) (actual time=0.041..0.187 rows=1322 loops=1)
Recheck Cond: (country = 'Japan'::text)
Heap Blocks: exact=65
-> Bitmap Index Scan on cities_country_idx (cost=0.00..18.43 rows=1351 width=0) (actual time=0.031..0.031 rows=1322 loops=1)
Index Cond: (country = 'Japan'::text)
Planning Time: 0.055 ms
Execution Time: 0.233 ms
(7 rows)
explain analyze select rb_to_array(idx) from cities_rb where country = 'Japan';
QUERY PLAN
-----------------------------------------------------------------------------------------------------
Seq Scan on cities_rb (cost=0.00..14.88 rows=1 width=32) (actual time=0.050..0.067 rows=1 loops=1)
Filter: (country = 'Japan'::text)
Rows Removed by Filter: 229
Planning Time: 0.033 ms
Execution Time: 0.076 ms
(5 rows)
explain analyze select count(*) from cities where country = 'Japan';
QUERY PLAN
---------------------------------------------------------------------------------------------------------------------------------------------
Aggregate (cost=35.31..35.32 rows=1 width=8) (actual time=0.151..0.151 rows=1 loops=1)
-> Index Only Scan using cities_country_idx on cities (cost=0.29..31.94 rows=1351 width=0) (actual time=0.026..0.103 rows=1322 loops=1)
Index Cond: (country = 'Japan'::text)
Heap Fetches: 0
Planning Time: 0.056 ms
Execution Time: 0.180 ms
(6 rows)
explain analyze select rb_cardinality(idx) from cities_rb where country = 'Japan';
QUERY PLAN
----------------------------------------------------------------------------------------------------
Seq Scan on cities_rb (cost=0.00..14.88 rows=1 width=8) (actual time=0.037..0.053 rows=1 loops=1)
Filter: (country = 'Japan'::text)
Rows Removed by Filter: 229
Planning Time: 0.037 ms
Execution Time: 0.063 ms
(5 rows)
explain analyze select country, count(*) from cities group by country;
QUERY PLAN
-------------------------------------------------------------------------------------------------------------------
HashAggregate (cost=2990.09..2992.22 rows=214 width=17) (actual time=34.054..34.076 rows=230 loops=1)
Group Key: country
Batches: 1 Memory Usage: 77kB
-> Seq Scan on cities (cost=0.00..2291.39 rows=139739 width=9) (actual time=0.005..8.552 rows=139739 loops=1)
Planning Time: 0.051 ms
Execution Time: 34.103 ms
(6 rows)
explain analyze select country, rb_cardinality(idx) from cities_rb;
QUERY PLAN
---------------------------------------------------------------------------------------------------------
Seq Scan on cities_rb (cost=0.00..14.88 rows=230 width=19) (actual time=0.008..0.184 rows=230 loops=1)
Planning Time: 0.030 ms
Execution Time: 0.200 ms
(3 rows)
The simulated index in this case is outrageously fast, up to ~150x on the GROUP BY.
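The part I find most promising, though I haven't benchmarked it yet, is multi-key queries: roaring bitmaps can be ANDed and ORed together very cheaply. A sketch of what that could look like, assuming the extension's rb_or_agg aggregate:

```sql
-- Rows matching country in ('Japan', 'Spain'): OR the per-country
-- bitmaps together and take the cardinality, with no heap access at all.
select rb_cardinality(rb_or_agg(idx))
from cities_rb
where country in ('Japan', 'Spain');
```

The same shape with rb_and_agg would cover conjunctions across bitmaps from several such "indexes", which is where inverted indexes usually shine.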
Cheers,
Chinmay