Re: HASH_BLOBS hazards (was Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions) - Mailing list pgsql-hackers
From | Tom Lane |
---|---|
Subject | Re: HASH_BLOBS hazards (was Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions) |
Date | |
Msg-id | 1073314.1607983297@sss.pgh.pa.us Whole thread Raw |
In response to | Re: HASH_BLOBS hazards (was Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions) (Tom Lane <tgl@sss.pgh.pa.us>) |
Responses |
Re: HASH_BLOBS hazards (was Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions)
|
List | pgsql-hackers |
I wrote: > There are a couple of other API oddities that maybe we should think > about while we're here: > * Should we just have a blanket insistence that all callers supply > HASH_ELEM? The default sizes that dynahash.c uses without that are > undocumented and basically useless. We're already asserting that > in the HASH_BLOBS path, which is the majority use-case, and this > patch now asserts it for HASH_STRINGS too. Here's a follow-up patch for that part, which also tries to respond a bit to Heikki's complaint about skimpy documentation. While at it, I const-ified the HASHCTL argument, since there's no need for hash_create to modify that. regards, tom lane diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c index 07cae638df..49f21b77bb 100644 --- a/src/backend/utils/hash/dynahash.c +++ b/src/backend/utils/hash/dynahash.c @@ -317,11 +317,20 @@ string_compare(const char *key1, const char *key2, Size keysize) * *info: additional table parameters, as indicated by flags * flags: bitmask indicating which parameters to take from *info * - * The flags value must include exactly one of HASH_STRINGS, HASH_BLOBS, + * The flags value *must* include HASH_ELEM. (Formerly, this was nominally + * optional, but the default keysize and entrysize values were useless.) + * The flags value must also include exactly one of HASH_STRINGS, HASH_BLOBS, * or HASH_FUNCTION, to define the key hashing semantics (C strings, * binary blobs, or custom, respectively). Callers specifying a custom * hash function will likely also want to use HASH_COMPARE, and perhaps * also HASH_KEYCOPY, to control key comparison and copying. + * Another often-used flag is HASH_CONTEXT, to allocate the hash table + * under info->hcxt rather than under TopMemoryContext; the default + * behavior is only suitable for session-lifespan hash tables. + * Other flags bits are special-purpose and seldom used. + * + * Fields in *info are read only when the associated flags bit is set. + * It is not necessary to initialize other fields of *info. * * Note: for a shared-memory hashtable, nelem needs to be a pretty good * estimate, since we can't expand the table on the fly. But an unshared @@ -330,11 +339,19 @@ string_compare(const char *key1, const char *key2, Size keysize) * large nelem will penalize hash_seq_search speed without buying much. */ HTAB * -hash_create(const char *tabname, long nelem, HASHCTL *info, int flags) +hash_create(const char *tabname, long nelem, const HASHCTL *info, int flags) { HTAB *hashp; HASHHDR *hctl; + /* + * Hash tables now allocate space for key and data, but you have to say + * how much space to allocate. + */ + Assert(flags & HASH_ELEM); + Assert(info->keysize > 0); + Assert(info->entrysize >= info->keysize); + /* * For shared hash tables, we have a local hash header (HTAB struct) that * we allocate in TopMemoryContext; all else is in shared memory. @@ -385,7 +402,6 @@ hash_create(const char *tabname, long nelem, HASHCTL *info, int flags) { Assert(!(flags & HASH_STRINGS)); /* We can optimize hashing for common key sizes */ - Assert(flags & HASH_ELEM); if (info->keysize == sizeof(uint32)) hashp->hash = uint32_hash; else @@ -402,7 +418,6 @@ hash_create(const char *tabname, long nelem, HASHCTL *info, int flags) * it's almost certainly an integer or pointer not a string. */ Assert(flags & HASH_STRINGS); - Assert(flags & HASH_ELEM); Assert(info->keysize > 8); hashp->hash = string_hash; @@ -529,16 +544,9 @@ hash_create(const char *tabname, long nelem, HASHCTL *info, int flags) hctl->dsize = info->dsize; } - /* - * hash table now allocates space for key and data but you have to say how - * much space to allocate - */ - if (flags & HASH_ELEM) - { - Assert(info->entrysize >= info->keysize); - hctl->keysize = info->keysize; - hctl->entrysize = info->entrysize; - } + /* remember the entry sizes, too */ + hctl->keysize = info->keysize; + hctl->entrysize = info->entrysize; /* make local copies of heavily-used constant fields */ hashp->keysize = hctl->keysize; @@ -617,10 +625,6 @@ hdefault(HTAB *hashp) hctl->dsize = DEF_DIRSIZE; hctl->nsegs = 0; - /* rather pointless defaults for key & entry size */ - hctl->keysize = sizeof(char *); - hctl->entrysize = 2 * sizeof(char *); - hctl->num_partitions = 0; /* not partitioned */ /* table has no fixed maximum size */ diff --git a/src/include/utils/hsearch.h b/src/include/utils/hsearch.h index 666ad33567..c3daaae92b 100644 --- a/src/include/utils/hsearch.h +++ b/src/include/utils/hsearch.h @@ -124,7 +124,7 @@ typedef struct * one of these. */ extern HTAB *hash_create(const char *tabname, long nelem, - HASHCTL *info, int flags); + const HASHCTL *info, int flags); extern void hash_destroy(HTAB *hashp); extern void hash_stats(const char *where, HTAB *hashp); extern void *hash_search(HTAB *hashp, const void *keyPtr, HASHACTION action,
pgsql-hackers by date: