Re: HASH_BLOBS hazards (was Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions) - Mailing list pgsql-hackers
| From | Tom Lane |
|---|---|
| Subject | Re: HASH_BLOBS hazards (was Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions) |
| Date | |
| Msg-id | 1073314.1607983297@sss.pgh.pa.us Whole thread Raw |
| In response to | Re: HASH_BLOBS hazards (was Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions) (Tom Lane <tgl@sss.pgh.pa.us>) |
| Responses |
Re: HASH_BLOBS hazards (was Re: PATCH: logical_work_mem and logical streaming of large in-progress transactions)
|
| List | pgsql-hackers |
I wrote:
> There are a couple of other API oddities that maybe we should think
> about while we're here:
> * Should we just have a blanket insistence that all callers supply
> HASH_ELEM? The default sizes that dynahash.c uses without that are
> undocumented and basically useless. We're already asserting that
> in the HASH_BLOBS path, which is the majority use-case, and this
> patch now asserts it for HASH_STRINGS too.
Here's a follow-up patch for that part, which also tries to respond
a bit to Heikki's complaint about skimpy documentation. While at it,
I const-ified the HASHCTL argument, since there's no need for
hash_create to modify that.
regards, tom lane
diff --git a/src/backend/utils/hash/dynahash.c b/src/backend/utils/hash/dynahash.c
index 07cae638df..49f21b77bb 100644
--- a/src/backend/utils/hash/dynahash.c
+++ b/src/backend/utils/hash/dynahash.c
@@ -317,11 +317,20 @@ string_compare(const char *key1, const char *key2, Size keysize)
* *info: additional table parameters, as indicated by flags
* flags: bitmask indicating which parameters to take from *info
*
- * The flags value must include exactly one of HASH_STRINGS, HASH_BLOBS,
+ * The flags value *must* include HASH_ELEM. (Formerly, this was nominally
+ * optional, but the default keysize and entrysize values were useless.)
+ * The flags value must also include exactly one of HASH_STRINGS, HASH_BLOBS,
* or HASH_FUNCTION, to define the key hashing semantics (C strings,
* binary blobs, or custom, respectively). Callers specifying a custom
* hash function will likely also want to use HASH_COMPARE, and perhaps
* also HASH_KEYCOPY, to control key comparison and copying.
+ * Another often-used flag is HASH_CONTEXT, to allocate the hash table
+ * under info->hcxt rather than under TopMemoryContext; the default
+ * behavior is only suitable for session-lifespan hash tables.
+ * Other flags bits are special-purpose and seldom used.
+ *
+ * Fields in *info are read only when the associated flags bit is set.
+ * It is not necessary to initialize other fields of *info.
*
* Note: for a shared-memory hashtable, nelem needs to be a pretty good
* estimate, since we can't expand the table on the fly. But an unshared
@@ -330,11 +339,19 @@ string_compare(const char *key1, const char *key2, Size keysize)
* large nelem will penalize hash_seq_search speed without buying much.
*/
HTAB *
-hash_create(const char *tabname, long nelem, HASHCTL *info, int flags)
+hash_create(const char *tabname, long nelem, const HASHCTL *info, int flags)
{
HTAB *hashp;
HASHHDR *hctl;
+ /*
+ * Hash tables now allocate space for key and data, but you have to say
+ * how much space to allocate.
+ */
+ Assert(flags & HASH_ELEM);
+ Assert(info->keysize > 0);
+ Assert(info->entrysize >= info->keysize);
+
/*
* For shared hash tables, we have a local hash header (HTAB struct) that
* we allocate in TopMemoryContext; all else is in shared memory.
@@ -385,7 +402,6 @@ hash_create(const char *tabname, long nelem, HASHCTL *info, int flags)
{
Assert(!(flags & HASH_STRINGS));
/* We can optimize hashing for common key sizes */
- Assert(flags & HASH_ELEM);
if (info->keysize == sizeof(uint32))
hashp->hash = uint32_hash;
else
@@ -402,7 +418,6 @@ hash_create(const char *tabname, long nelem, HASHCTL *info, int flags)
* it's almost certainly an integer or pointer not a string.
*/
Assert(flags & HASH_STRINGS);
- Assert(flags & HASH_ELEM);
Assert(info->keysize > 8);
hashp->hash = string_hash;
@@ -529,16 +544,9 @@ hash_create(const char *tabname, long nelem, HASHCTL *info, int flags)
hctl->dsize = info->dsize;
}
- /*
- * hash table now allocates space for key and data but you have to say how
- * much space to allocate
- */
- if (flags & HASH_ELEM)
- {
- Assert(info->entrysize >= info->keysize);
- hctl->keysize = info->keysize;
- hctl->entrysize = info->entrysize;
- }
+ /* remember the entry sizes, too */
+ hctl->keysize = info->keysize;
+ hctl->entrysize = info->entrysize;
/* make local copies of heavily-used constant fields */
hashp->keysize = hctl->keysize;
@@ -617,10 +625,6 @@ hdefault(HTAB *hashp)
hctl->dsize = DEF_DIRSIZE;
hctl->nsegs = 0;
- /* rather pointless defaults for key & entry size */
- hctl->keysize = sizeof(char *);
- hctl->entrysize = 2 * sizeof(char *);
-
hctl->num_partitions = 0; /* not partitioned */
/* table has no fixed maximum size */
diff --git a/src/include/utils/hsearch.h b/src/include/utils/hsearch.h
index 666ad33567..c3daaae92b 100644
--- a/src/include/utils/hsearch.h
+++ b/src/include/utils/hsearch.h
@@ -124,7 +124,7 @@ typedef struct
* one of these.
*/
extern HTAB *hash_create(const char *tabname, long nelem,
- HASHCTL *info, int flags);
+ const HASHCTL *info, int flags);
extern void hash_destroy(HTAB *hashp);
extern void hash_stats(const char *where, HTAB *hashp);
extern void *hash_search(HTAB *hashp, const void *keyPtr, HASHACTION action,
pgsql-hackers by date: