[ANN] PGroonga 1.0.0 - Make PostgreSQL fast full text search platform for all languages - Mailing list pgsql-announce

From Kouhei Sutou
Subject [ANN] PGroonga 1.0.0 - Make PostgreSQL fast full text search platform for all languages
Msg-id 20151030.121817.1960714424927057566.kou@clear-code.com
List pgsql-announce
Hi,

PGroonga 1.0.0 has been released!
It's the first major release!

  http://groonga.org/en/blog/2015/10/29/pgroonga-1.0.0.html

### About PGroonga

PGroonga is a PostgreSQL extension that makes PostgreSQL a
fast full text search platform for all languages!

There are some PostgreSQL extensions that improve the full
text search features of PostgreSQL, such as pg_trgm(*1) and
pg_bigm(*2).

(*1) http://www.postgresql.org/docs/current/static/pgtrgm.html
(*2) http://pgbigm.osdn.jp/index_en.html

pg_trgm doesn't support languages that use non-alphanumeric
characters, such as Japanese and Chinese.

pg_bigm supports languages that use non-alphanumeric
characters, but it's slow.

PGroonga supports all languages, provides rich full text
search related features and is very fast, because it uses
Groonga(*3), a full-fledged full text search engine, as its
backend.

(*3) http://groonga.org/

For example, PGroonga is a few times faster than pg_bigm. In
some cases, PGroonga is more than 10 times faster than pg_bigm.

Here are benchmark results comparing PGroonga and
pg_bigm. They use Japanese Wikipedia data.

Here is a benchmark result for creating an index:

Extension  | Index creation time
-----------|--------------------
PGroonga   |    25m 37s
pg_bigm    | 5h 56m 15s

In this case, PGroonga is about 14 times faster than pg_bigm.
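
The ratio can be checked with plain PostgreSQL interval arithmetic. This is only a sanity check of the two times in the table above; the `speedup` column name is made up for illustration:

```sql
-- Convert both index creation times to seconds and divide.
-- The ::numeric casts keep round(value, 1) working on PostgreSQL
-- versions where EXTRACT returns double precision.
SELECT round(
         EXTRACT(EPOCH FROM INTERVAL '5 hours 56 minutes 15 seconds')::numeric
         / EXTRACT(EPOCH FROM INTERVAL '25 minutes 37 seconds')::numeric,
         1
       ) AS speedup;
```

The times are 21375 and 1537 seconds, so the result is roughly 13.9.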

Here is a benchmark result for full text search:

Search keywords             | N hits   | PGroonga | pg_bigm
----------------------------|----------|----------|---------
"PostgreSQL" or "MySQL"     | 368      | *0.030s* |  0.107s
"database" in Japanese      | 17172    | *0.121s* |  1.224s
"TV animation" in Japanese  | 22885    | *0.179s* |  2.472s
"Japan" in Japanese         | 625792   |  0.646s  | *0.556s*

In the "Japan" in Japanese case, pg_bigm is a bit faster(*4)
than PGroonga. But PGroonga is 3 to 14 times faster than
pg_bigm in the other cases. The results show that PGroonga
provides consistently fast full text search for all
keywords.

(*4) pg_bigm performs full text search faster against
keywords that have 2 or fewer characters than against
keywords that have 3 or more characters. "Japan" in Japanese
is a keyword that has 2 characters.
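
The per-keyword ratios can be recomputed from the search times in the table with plain PostgreSQL. This is a small sketch; the column names `pgroonga_s`, `pg_bigm_s` and `pg_bigm_over_pgroonga` are hypothetical labels, not part of either extension:

```sql
-- Search times in seconds, copied from the benchmark table.
-- A ratio above 1 means PGroonga was faster for that keyword.
SELECT keyword,
       round(pg_bigm_s / pgroonga_s, 1) AS pg_bigm_over_pgroonga
FROM (VALUES
  ('"PostgreSQL" or "MySQL"',    0.030, 0.107),
  ('"database" in Japanese',     0.121, 1.224),
  ('"TV animation" in Japanese', 0.179, 2.472),
  ('"Japan" in Japanese',        0.646, 0.556)
) AS t(keyword, pgroonga_s, pg_bigm_s);
```

The first three ratios come out at roughly 3.6, 10.1 and 13.8, and the last one at roughly 0.9.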


PGroonga also supports JSON search. You can use each value
in a condition. You can also perform full text search against
all texts in JSON. No other extension, such as JsQuery(*5),
provides a full text search feature against JSON.

(*5) https://github.com/postgrespro/jsquery
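
As a rough sketch of both kinds of JSON search, here is an example with a hypothetical `logs` table. It assumes the `jsonb` operators described in the PGroonga documentation: `@>` for conditions on values, and `@@` with a script such as `string @ "..."` for full text search. That script syntax is an assumption here and is not part of this announcement, so please check the reference for your version:

```sql
-- Assumes CREATE EXTENSION pgroonga; has already been run.
-- Hypothetical table and data, for illustration only.
CREATE TABLE logs (record jsonb);
INSERT INTO logs VALUES
  ('{"host": "www.example.com", "message": "Server is started."}'),
  ('{"host": "mail.example.net", "message": "Mail queue is empty."}');

CREATE INDEX pgroonga_logs_index ON logs USING pgroonga (record);

-- Use a value as a condition (standard jsonb containment operator).
SELECT record FROM logs
 WHERE record @> '{"host": "www.example.com"}'::jsonb;

-- Full text search against all texts in the JSON.
-- Assumption: the 'string @ "..."' script syntax from the PGroonga docs.
SELECT record FROM logs
 WHERE record @@ 'string @ "server"';
```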

### Usage

You can use PGroonga without full text search knowledge. You
just create an index and put a condition into WHERE:

  CREATE INDEX index_name ON table USING pgroonga (column);

  SELECT * FROM table WHERE column @@ 'PostgreSQL';
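
For a more complete picture, here is a minimal sketch of the same steps from start to finish. The `memos` table, its columns and the index name are hypothetical, and the extension must already be installed on the server:

```sql
-- Enable the extension in the current database.
CREATE EXTENSION pgroonga;

-- Hypothetical table and data, for illustration only.
CREATE TABLE memos (
  id integer,
  content text
);
INSERT INTO memos VALUES
  (1, 'PostgreSQL is a relational database management system.'),
  (2, 'Groonga is a fast full text search engine.'),
  (3, 'PGroonga is a PostgreSQL extension that uses Groonga.');

CREATE INDEX pgroonga_content_index ON memos USING pgroonga (content);

-- On a tiny table the planner may prefer a sequential scan,
-- so disable it for this session to exercise the index.
SET enable_seqscan = off;

SELECT * FROM memos WHERE content @@ 'PostgreSQL';
```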

You can also use PGroonga through LIKE. PGroonga can perform
LIKE with its index, and LIKE with a PGroonga index is faster
than LIKE without an index. It means that you can improve
performance without changing an application that uses SQL
such as the following:

  SELECT * FROM table WHERE column LIKE '%PostgreSQL%';
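
One way to see whether a LIKE query uses the index is to look at its plan with EXPLAIN. This is a sketch with a hypothetical `articles` table and index name; the exact plan output depends on your PostgreSQL and PGroonga versions, so none is shown here:

```sql
-- Assumes CREATE EXTENSION pgroonga; has already been run.
-- Hypothetical table and index, for illustration only.
CREATE TABLE articles (
  id integer,
  body text
);
INSERT INTO articles VALUES
  (1, 'PostgreSQL supports extensions.'),
  (2, 'MySQL is another database.');

CREATE INDEX pgroonga_body_index ON articles USING pgroonga (body);

-- Discourage sequential scans for this session so that the planner
-- picks the index even on a tiny table.
SET enable_seqscan = off;

-- If the index is used, the plan should name pgroonga_body_index.
EXPLAIN SELECT * FROM articles WHERE body LIKE '%PostgreSQL%';
```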

Are you interested in PGroonga? Please install it(*6) and try
the tutorial(*7). You can learn about all PGroonga features
there.

(*6) http://pgroonga.github.io/install/
(*7) http://pgroonga.github.io/tutorial/

You can install PGroonga easily because it provides
packages for major platforms. There are also binaries for
Windows.


Thanks,
--
kou

