Re: pgsql: Optimize pglz compressor for small inputs. - Mailing list pgsql-hackers

From Stephen Frost
Subject Re: pgsql: Optimize pglz compressor for small inputs.
Date
Msg-id 20130714171212.GD2511@tamriel.snowman.net
Responses Re: pgsql: Optimize pglz compressor for small inputs.
List pgsql-hackers
Heikki,

* Heikki Linnakangas (heikki.linnakangas@iki.fi) wrote:
> This patch alleviates that in two ways. First, instead of storing pointers
> in the hash table, store 16-bit indexes into the hist_entries array. That
> slashes the size of the hash table to 1/2 or 1/4 of the original, depending
> on the pointer width. Secondly, adjust the size of the hash table based on
> input size. For very small inputs, you don't need a large hash table to
> avoid collisions.
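
To make the two points in the quoted text concrete, here is a minimal sketch of what they amount to. It is not the code from the patch: the function names and the size thresholds are made up for illustration, and only the idea (16-bit indexes instead of pointers, fewer buckets for small inputs) comes from the description above.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical helper: pick a power-of-two bucket count from the input
 * length.  The thresholds are illustrative assumptions, not the values
 * used by the patch.
 */
int
choose_hash_size(int input_len)
{
    if (input_len < 128)
        return 512;
    if (input_len < 1024)
        return 2048;
    return 8192;
}

/* Bytes needed when each bucket stores a pointer to a history entry. */
size_t
table_bytes_with_pointers(int nbuckets)
{
    return (size_t) nbuckets * sizeof(void *);
}

/* Bytes needed when each bucket stores a 16-bit index into the array. */
size_t
table_bytes_with_indexes(int nbuckets)
{
    return (size_t) nbuckets * sizeof(uint16_t);
}
```

With 8-byte pointers the index table is a quarter of the size of the pointer table, and with 4-byte pointers it is half, which matches the "1/2 or 1/4" figure above.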
The Coverity scanner has a bit of an issue with this patch which, at least at first blush, looks like a valid concern.

While the change in pg_lzcompress.c:pglz_find_match() to loop on:

    while (hent != INVALID_ENTRY_PTR)
    {
        const char *ip = input;
        const char *hp = hent->pos;

looks good, and INVALID_ENTRY_PTR is the address of the first entry in the array (and can't be NULL), towards the end of the loop we do:

    hent = hent->next;
    if (hent)
        ...

Should we really be checking for 'hent != INVALID_ENTRY_PTR' here? If not, and hent really can end up as NULL, then we're going to segfault on the next loop iteration due to the unchecked 'hent->pos' early in the loop. If hent can never be NULL, then we probably don't need this check at all.

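
For illustration, here is a small self-contained sketch of the pattern I have in mind, where the chain is terminated by a sentinel entry rather than NULL and every test inside the loop uses that same sentinel. The struct layout and the count_candidates() helper are hypothetical and only loosely modelled on the history entries; this is not the actual pglz code.

```c
#include <stddef.h>

/* Assumption for this sketch: slot 0 of the array is the "no entry" marker. */
#define INVALID_ENTRY 0

typedef struct HistEntry
{
    struct HistEntry *next;     /* last entry points at the sentinel, never NULL */
    const char *pos;            /* position in the input this entry refers to */
} HistEntry;

/*
 * Walk one hash chain starting at index 'head' and count its entries.
 * The loop condition is the only termination test that is needed: since
 * 'next' is never NULL, an extra "if (hent)" inside the loop would
 * always be true and would only hide the real invariant.
 */
int
count_candidates(HistEntry *entries, int head)
{
    HistEntry  *invalid = &entries[INVALID_ENTRY];
    HistEntry  *hent = &entries[head];
    int         n = 0;

    while (hent != invalid)
    {
        n++;                    /* a real matcher would compare hent->pos here */
        hent = hent->next;
    }
    return n;
}
```

If the chain could in fact contain NULL, the loop condition itself would have to test for it as well; mixing the two conventions is what trips up the scanner.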
     Thanks,
    Stephen
