Hi Robert,
> I think that solving the problems around using a dictionary is going
> to be really hard. Can we see some evidence that the results will be
> worth it?
With the latest patch I've shared, using a Kaggle dataset of Nintendo-related
tweets[1], we leveraged PostgreSQL's acquire_sample_rows function to quickly
gather just 1,000 sample rows for a specific attribute out of 104,695 rows.
These raw samples were passed to Zstd's dictionary training API, which
generated a custom dictionary. That dictionary was then used directly to
compress the documents, resulting in roughly 62% space savings:
```
test=# \dt+
                                          List of tables
 Schema |      Name      | Type  |  Owner   | Persistence | Access method |  Size  | Description
--------+----------------+-------+----------+-------------+---------------+--------+-------------
 public | lz4            | table | nikhilkv | permanent   | heap          | 297 MB |
 public | pglz           | table | nikhilkv | permanent   | heap          | 259 MB |
 public | zstd_with_dict | table | nikhilkv | permanent   | heap          | 114 MB |
 public | zstd_wo_dict   | table | nikhilkv | permanent   | heap          | 210 MB |
(4 rows)
```
We've observed similarly strong results with dictionaries on other datasets
as well.
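
For anyone who wants to poke at the Zstd side of this flow outside the patch,
here's a minimal standalone sketch of the two steps involved (training a
dictionary from sampled values, then compressing a document with it). It is
not code from the patch; the sample data, buffer handling, and dictionary
size are illustrative only.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#include <zstd.h>
#include <zdict.h>

#define DICT_CAPACITY (110 * 1024)   /* target dictionary size, ~110 KB */

int
main(void)
{
    /*
     * Stand-ins for the sampled attribute values. Real dictionary training
     * needs many samples (the patch feeds ~1,000 sampled rows); three are
     * shown here only to keep the sketch short.
     */
    const char *samples[] = {
        "Just played the new Zelda, absolutely loved it #Nintendo",
        "Nintendo Switch sales keep climbing this quarter",
        "Mario Kart night with friends again #Nintendo",
    };
    unsigned    nb_samples = sizeof(samples) / sizeof(samples[0]);
    size_t      sample_sizes[3];
    size_t      total = 0;

    /* Concatenate the samples into one buffer and record each sample's size. */
    for (unsigned i = 0; i < nb_samples; i++)
    {
        sample_sizes[i] = strlen(samples[i]);
        total += sample_sizes[i];
    }
    char *sample_buf = malloc(total);
    size_t off = 0;
    for (unsigned i = 0; i < nb_samples; i++)
    {
        memcpy(sample_buf + off, samples[i], sample_sizes[i]);
        off += sample_sizes[i];
    }

    /* Train a custom dictionary from the concatenated samples. */
    void   *dict = malloc(DICT_CAPACITY);
    size_t  dict_size = ZDICT_trainFromBuffer(dict, DICT_CAPACITY,
                                              sample_buf, sample_sizes,
                                              nb_samples);
    if (ZDICT_isError(dict_size))
    {
        fprintf(stderr, "dictionary training failed: %s\n",
                ZDICT_getErrorName(dict_size));
        return 1;
    }

    /* Compress one document using the trained dictionary. */
    const char *doc = "Nintendo announced a new Direct for next week";
    size_t      bound = ZSTD_compressBound(strlen(doc));
    void       *dst = malloc(bound);
    ZSTD_CCtx  *cctx = ZSTD_createCCtx();
    size_t      csize = ZSTD_compress_usingDict(cctx, dst, bound,
                                                doc, strlen(doc),
                                                dict, dict_size,
                                                ZSTD_CLEVEL_DEFAULT);
    if (ZSTD_isError(csize))
    {
        fprintf(stderr, "compression failed: %s\n", ZSTD_getErrorName(csize));
        return 1;
    }

    printf("original: %zu bytes, compressed with dict: %zu bytes\n",
           strlen(doc), csize);

    ZSTD_freeCCtx(cctx);
    free(dst);
    free(dict);
    free(sample_buf);
    return 0;
}
```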
[1] https://www.kaggle.com/code/dcalambas/nintendo-tweets-analysis/data
---
Nikhil Veldanda