
Re: RFC: Packing the buffer lookup table - Mailing list pgsql-hackers

From	Zhang Mingli
Subject	Re: RFC: Packing the buffer lookup table
Date	February 1, 2025 08:01:31
Msg-id	ed906d93-c877-479b-84b1-bf8532d2768e@Spark
List	pgsql-hackers


Zhang Mingli

www.hashdata.xyz

On Jan 30, 2025 at 15:49 +0800, Matthias van de Meent <boekewurm+postgres@gmail.com>, wrote:

Some time ago I noticed that every buffer table entry is quite large at 40 bytes (+8): 16 bytes of HASHELEMENT header (of which the last 4 bytes are padding), 20 bytes of BufferTag, and 4 bytes for the offset into the shared buffers array, with generally 8 more bytes used for the bucket pointers. (32-bit systems: 32 (+4) bytes)

Does anyone know why we must have the buffer tag in the buffer table?
It seems to me we can follow the offset pointer into the shared BufferDesc array whenever we find out we need to compare the tags (as opposed to just the hash, which is stored and present in HASHELEMENT). If we decide to just follow the pointer, we can immediately shave 16 bytes (40%) off the lookup table's per-element size, or 24 if we pack the 4-byte shared buffer offset into the unused bytes in HASHELEMENT, reducing the memory usage of that hash table by ~50%: We'd have 16 bytes for every ELEMENT+shared_buffer_offset, plus 8 bytes for every bucket pointer (of which there are approximately as many as there are elements), resulting in 24 bytes /max_alloc elements.

Hi,

Thanks for your insights.
The buffer tag takes up relatively little space compared with the rest of the shared buffer architecture (the BufferDescriptors array and the page buffers themselves), but every little bit helps, I think.
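For reference, the per-entry arithmetic from the quoted message can be checked with a rough sketch like the one below. The Sketch* names are hypothetical stand-ins that only mirror the layouts described in the thread; they are not the real dynahash or buffer manager definitions. The numbers assume a 64-bit platform with 8-byte maximum alignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the HASHELEMENT header described in the thread. */
typedef struct SketchHashElement
{
    struct SketchHashElement *link; /* next entry in the same bucket */
    uint32_t    hashvalue;          /* cached hash of the key */
    /* 4 bytes of tail padding follow on 64-bit platforms */
} SketchHashElement;

/* Hypothetical stand-in for a 20-byte buffer tag: five 4-byte fields. */
typedef struct SketchBufferTag
{
    uint32_t    spcOid;
    uint32_t    dbOid;
    uint32_t    relNumber;
    int32_t     forkNum;
    uint32_t    blockNum;
} SketchBufferTag;

/* Current layout: the tag is the key, followed by the buffer id. */
typedef struct SketchLookupEnt
{
    SketchBufferTag key;
    int32_t     id;
} SketchLookupEnt;

/* Proposed layout: buffer id packed into the header's former padding. */
typedef struct SketchPackedElement
{
    struct SketchPackedElement *link;
    uint32_t    hashvalue;
    int32_t     buf_id;
} SketchPackedElement;

/* Round up to a multiple of 8 (assumed maximum alignment). */
#define SKETCH_MAXALIGN(n) ((((size_t) (n)) + 7) & ~((size_t) 7))

/* Bytes per entry today: aligned header plus aligned key+id payload. */
size_t
current_entry_bytes(void)
{
    return SKETCH_MAXALIGN(sizeof(SketchHashElement)) +
        SKETCH_MAXALIGN(sizeof(SketchLookupEnt));
}

/* Bytes per entry if only the packed header is stored. */
size_t
packed_entry_bytes(void)
{
    return sizeof(SketchPackedElement);
}
```

Under those assumptions this comes to 40 bytes per entry today versus 16 bytes packed, with the 8-byte bucket pointer added on top in both cases.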

I've read through some of the code. Here are my initial thoughts:

```
int
BufTableInsert(BufferTag *tagPtr, uint32 hashcode, int buf_id)
{
    const int   surrogateid = SurrogateBuffer;
    BufferLookupEnt *result;
    bool        found;

    Assert(buf_id >= 0);                /* -1 is reserved for not-in-table */
    Assert(tagPtr->blockNum != P_NEW);  /* invalid tag */

    MyBackendBufLookup = tagPtr;

    result = (BufferLookupEnt *)
        hash_search_with_hash_value(SharedBufHash,
                                    &surrogateid,
                                    hashcode,
                                    HASH_ENTER,
                                    &found);

    MyBackendBufLookup = NULL;

    if (found)                          /* found something already in the table */
    {
        Assert(result->id != SurrogateBuffer);
        return result->id;
    }
```

In BufTableInsert, the hash key appears to change from the BufferTag to an integer, surrogateid.
Since several entries can land in the same bucket for a given hash code, we need to walk the bucket chain and compare keys to find the right slot. If surrogateid is always set to -1, won't the comparison function always report a mismatch?

The comparison in question is match(ELEMENTKEY(currBucket), keyPtr, keysize) == 0, inside hash_search_with_hash_value:

```
while (currBucket != NULL)
{
    if (currBucket->hashvalue == hashvalue &&
        match(ELEMENTKEY(currBucket), keyPtr, keysize) == 0)
        break;
    prevBucketPtr = &(currBucket->link);
    currBucket = *prevBucketPtr;
}
```

Additionally, I'm curious about the purpose of MyBackendBufLookup, which is set and then reset around the hash_search_with_hash_value call. Is there a concurrency consideration here, even though we already hold a lock before the buffer insertion?
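To make the two questions concrete, here is a small standalone sketch. It is only a guess at the mechanism, not the patch's actual code: every sketch_* name and SKETCH_* constant is hypothetical. It contrasts a plain memcmp-style match, under which a surrogate key of -1 can never equal a stored buffer id, with a custom match function that follows the stored id into a descriptor array and compares against a backend-local tag pointer (the role MyBackendBufLookup might play).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical 20-byte tag; five 4-byte fields, so no padding. */
typedef struct SketchTag
{
    uint32_t    spcOid;
    uint32_t    dbOid;
    uint32_t    relNumber;
    int32_t     forkNum;
    uint32_t    blockNum;
} SketchTag;

#define SKETCH_SURROGATE (-1)   /* assumed "look at the side pointer" marker */
#define SKETCH_NBUFFERS  4

/* Stand-in for the tags held in the shared buffer descriptor array. */
SketchTag   sketch_descriptors[SKETCH_NBUFFERS];

/* Stand-in for a per-backend pointer set around the hash search. */
const SketchTag *sketch_lookup_tag = NULL;

/* Plain key comparison: returns 0 only if the key bytes are identical. */
int
sketch_match_plain(const void *stored, const void *probe, size_t keysize)
{
    return memcmp(stored, probe, keysize);
}

/*
 * Assumed custom comparison: when probed with the surrogate key, compare
 * the tag of the stored buffer id against the backend-local lookup tag.
 * Returns 0 on match, nonzero otherwise, like memcmp.
 */
int
sketch_match_via_descriptor(const void *stored, const void *probe,
                            size_t keysize)
{
    int32_t     stored_id = *(const int32_t *) stored;
    int32_t     probe_id = *(const int32_t *) probe;

    (void) keysize;             /* key is always a 4-byte buffer id here */

    if (probe_id != SKETCH_SURROGATE)
        return (stored_id == probe_id) ? 0 : 1;
    if (sketch_lookup_tag == NULL)
        return 1;               /* nothing to compare against */
    return memcmp(&sketch_descriptors[stored_id], sketch_lookup_tag,
                  sizeof(SketchTag));
}
```

If the patch works roughly like this, the side pointer would be private to each backend process, so setting and clearing it would not by itself need extra locking. That is an assumption on my part and worth confirming.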

One potential drawback: databases built on PostgreSQL might access the buffer table directly (e.g., reading it for their own purposes).
In that case, the buffer tags stored in the table provide the information without any need to reference the buffer descriptor array.
I understand that Postgres makes no promises about this; it is just something to consider.
