Hello
I did a brief review of bloom contrib and I don't think I like it much.
Here are some issues I believe should be fixed before committing it to
PostgreSQL.
1) Most of the code is not commented. Every procedure should at least
have a breif description of what it does, what arguments it receives
and what it returns. Same for structures and enums.
2) There are only 3 Asserts. For sure there are much more invariants in
this contrib.
3) I don't like this code (blinsert.c):
```
typedef struct
{ BloomState blstate; MemoryContext tmpCtx; char data[BLCKSZ]; int64 count;
} BloomBuildState;
/* ... */
/** (Re)initialize cached page in BloomBuildState.*/
static void
initCachedPage(BloomBuildState *buildstate)
{ memset(buildstate->data, 0, BLCKSZ); BloomInitPage(buildstate->data, 0); buildstate->count = 0;
}
```
It looks wrong because 1) blkstate and tmpCtx are left uninitialized 2)
as we discussed recently [1] you should avoid leaving "holes" with
uninitialized data in structures. Please fix this or provide a comment
that describes why things are done here the way they are done.
Perhaps there are also other places like this that I missed.
4) I don't think I like such a code either:
```
/* blutuls.c */
static BloomOptions *
makeDefaultBloomOptions(BloomOptions *opts)
{ int i;
if (!opts) opts = palloc0(sizeof(BloomOptions));
/* ... */
/* see also blvacuum.c and other places I could miss */
```
Sometimes we create a new zero-initialized structure and sometimes we
use a provided one filled with some data. If I'll use this procedure
multiple times in a loop memory will leak. Thus sometimes we need
pfree() returned value manually and sometimes we don't. Such a code is
hard to reason about. You could do it much simpler choosing only one
thing to do --- either allocate a new structure or use a provided one.
5) Code is not pgindent'ed and has many trailing spaces.
[1] http://www.postgresql.org/message-id/56EFF347.20500@anastigmatix.net
--
Best regards,
Aleksander Alekseev
http://eax.me/