Optimizing pglz compressor - Mailing list pgsql-hackers

From Heikki Linnakangas
Subject Optimizing pglz compressor
Date
Msg-id 5130C914.8080106@vmware.com
List pgsql-hackers
I spotted some low-hanging fruit in the pglz compression routine. It
uses a hash table to keep track of string prefixes it has seen:

#define PGLZ_HISTORY_LISTS        8192    /* must be power of 2 */
#define PGLZ_HISTORY_SIZE        4096

/* ----------
 * Statically allocated work arrays for history
 * ----------
 */
static PGLZ_HistEntry *hist_start[PGLZ_HISTORY_LISTS];
static PGLZ_HistEntry hist_entries[PGLZ_HISTORY_SIZE];

The hist_start array has to be zeroed at every invocation of
pglz_compress(). That's relatively expensive when compressing small
values: on a 64-bit machine, the array is 8192 * 8 bytes = 64 kB.

The pointers in hist_start point to entries in hist_entries. If we
replace the pointers with int16 indexes into hist_entries, we can shrink
the hist_start array to 8192 * 2 bytes = 16 kB.
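
To illustrate, the declarations could look something like this. This is
only a sketch, not the actual patch: the INVALID_ENTRY sentinel and the
choice to leave entry 0 unused are just one way to encode an "empty
bucket" now that we no longer have NULL pointers for that purpose:

/*
 * Sketch only: buckets become int16 indexes into hist_entries.
 * Index 0 is reserved as "no entry", so hist_entries[0] is unused
 * and a live entry i is found at hist_entries[i].
 */
#define INVALID_ENTRY        0

static int16 hist_start[PGLZ_HISTORY_LISTS];        /* 8192 * 2 = 16 kB */
static PGLZ_HistEntry hist_entries[PGLZ_HISTORY_SIZE + 1];

With that, the per-call reset of the hash table only has to clear 16 kB
(memset(hist_start, 0, sizeof(hist_start))) instead of 64 kB.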

Also, in PGLZ_HistEntry:

typedef struct PGLZ_HistEntry
{
    struct PGLZ_HistEntry *next;    /* links for my hash key's list */
    struct PGLZ_HistEntry *prev;
    int            hindex;            /* my current hash key */
    const char *pos;            /* my input position */
} PGLZ_HistEntry;

we can replace next and prev with int16 indexes into the hist_entries
array, halving the size of that struct from 32 to 16 bytes, and thus
shrinking the hist_entries array from 32 * 4096 = 128 kB to 64 kB. I'm
not sure how significant that is - the hist_entries array doesn't need
to be zeroed out like hist_start does - but it might buy us something in
CPU cache efficiency.
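
Again just a sketch of what I have in mind (field layout as I imagine
it; int16 links are enough since there are only 4096 entries, and the
list ends use the same INVALID_ENTRY convention as above):

/*
 * Sketch only: next/prev become int16 indexes into hist_entries[].
 * 2 + 2 + 4 bytes plus an 8-byte pointer packs into 16 bytes on a
 * 64-bit machine, down from 32.
 */
typedef struct PGLZ_HistEntry
{
    int16        next;        /* index of next entry in my hash key's list */
    int16        prev;        /* index of previous entry */
    int          hindex;      /* my current hash key */
    const char  *pos;         /* my input position */
} PGLZ_HistEntry;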

I ran some performance tests with this:

      testname      | unpatched | patched
-------------------+-----------+---------
  5k text           |   1.55933 | 1.57337
  512b random       |   6.70911 | 5.41420
  100k of same byte |   1.92319 | 1.94880
  2k random         |   4.29494 | 3.88782
  100k random       |   1.10400 | 1.07982
  512b text         |   9.26736 | 7.55828
(6 rows)

The datum used in the "5k text" test was the first 5k from the
"doc/src/sgml/func.sgml" file in the source tree, and "512b text" was
the first 512 bytes from the same file. The data in the random tests was
completely random, and in the "100k of same byte" test, a single byte
was repeated 100000 times (i.e. it compresses really well).

Each test inserts rows into a temporary table. The table has 10 bytea
columns, and the same value is inserted in every column. The number of
rows inserted was scaled so that each test inserts roughly 500 MB of
data. Each test was repeated 20 times, and the values listed above are
the minimum runtimes in seconds.

I spotted this while looking at Amit's WAL update delta encoding patch.
My earlier suggestion to just use the pglz compressor for the delta
encoding didn't work too well because the pglz compressor was too
expensive, especially for small values. This patch might help with that.

In summary, this seems like a pretty clear win for short values, and a
wash for long values. Not surprising, as this greatly lowers the startup
cost of pglz_compress(). We're past feature freeze, but how would people
feel about committing this?

- Heikki

