Use simplehash.h instead of dynahash in SMgr - Mailing list pgsql-hackers
From | David Rowley |
---|---|
Subject | Use simplehash.h instead of dynahash in SMgr |
Date | |
Msg-id | CAApHDvpkWOGLh_bYg7jproXN8B2g2T9dWDcqsmKsXG5+WwZaqw@mail.gmail.com |
Responses |
Re: Use simplehash.h instead of dynahash in SMgr
Re: Use simplehash.h instead of dynahash in SMgr |
List | pgsql-hackers |
Hackers,

Last year, when working on making compactify_tuples() go faster for 19c60ad69, I did quite a bit of benchmarking of the recovery process. The next thing that was slow after compactify_tuples() was the hash lookups done in smgropen().

Currently, we use dynahash hash tables to store the SMgrRelation so we can perform fast lookups by RelFileNodeBackend. However, I had in mind that a simplehash table might perform better. So I tried it...

The attached converts the hash table lookups done in smgr.c to use simplehash instead of dynahash.

This does require a few changes in simplehash.h to make it work. The reason is that RelationData.rd_smgr points directly into the hash table entries. This works ok for dynahash, as that hash table implementation does not reallocate existing items or move any items around in the table. However, simplehash moves entries around all the time, so we can't point any pointers directly at the hash entries and expect them to be valid after adding or removing anything else from the table.

To work around that, I've just made an additional type that serves as the hash entry type. It has a pointer to the SMgrRelationData along with the hash status and hash value. It's just 16 bytes (or 12 on 32-bit machines). I opted to keep the hash key in the SMgrRelationData rather than duplicating it, as it keeps the SMgrEntry struct nice and small. We only need to dereference the SMgrRelation pointer when we find an entry with the same hash value. The chances are quite good that an entry with the same hash value is the one that we want, so any additional dereferences to compare the key are not going to happen very often.

I did experiment with putting the hash key in SMgrEntry and found it to be quite a bit slower. I also did try to use hash_bytes() but found building a hash function that uses murmurhash32 to be quite a bit faster.

Benchmarking
===========

I did some of that. It made my test case about 10% faster.
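As an aside on the entry layout described above, here is a minimal sketch in plain C. It is not the code from the attached patch and it does not use simplehash.h. All Demo* and demo_* names are hypothetical, the table uses plain linear probing with no resizing or entry movement (a simplification), and the way the key fields are folded together in demo_hash_key() is an assumption made for illustration. The mixing step is the standard MurmurHash3 32-bit finalizer. The point it tries to show is that the table entry holds only status, hash value and a pointer, and the key is compared only after the stored hash matches.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a RelFileNodeBackend-like key. */
typedef struct DemoKey
{
    uint32_t spc;
    uint32_t db;
    uint32_t rel;
    int32_t  backend;
} DemoKey;

/* Stand-in for SMgrRelationData: allocated outside the table, so
 * pointers to it stay valid whatever the table does with its entries. */
typedef struct DemoRelation
{
    DemoKey key;      /* the key lives here, not in the table entry */
    int     payload;
} DemoRelation;

/* Small table entry: status, hash value and a pointer to the data. */
typedef struct DemoEntry
{
    int32_t       status;   /* 0 = empty, 1 = in use */
    uint32_t      hash;
    DemoRelation *data;
} DemoEntry;

#define DEMO_NBUCKETS 64                /* must be a power of two */
#define DEMO_MASK (DEMO_NBUCKETS - 1)

typedef struct DemoTable
{
    DemoEntry buckets[DEMO_NBUCKETS];
    size_t    nused;
} DemoTable;

/* MurmurHash3 32-bit finalizer. */
static inline uint32_t demo_mix32(uint32_t h)
{
    h ^= h >> 16;
    h *= 0x85ebca6b;
    h ^= h >> 13;
    h *= 0xc2b2ae35;
    h ^= h >> 16;
    return h;
}

/* Assumed way of folding the four key fields into one hash value. */
uint32_t demo_hash_key(const DemoKey *k)
{
    uint32_t h = demo_mix32(k->spc);
    h = demo_mix32(h ^ k->db);
    h = demo_mix32(h ^ k->rel);
    h = demo_mix32(h ^ (uint32_t) k->backend);
    return h;
}

int demo_key_equal(const DemoKey *a, const DemoKey *b)
{
    return a->spc == b->spc && a->db == b->db &&
           a->rel == b->rel && a->backend == b->backend;
}

void demo_init(DemoTable *t)
{
    for (size_t i = 0; i < DEMO_NBUCKETS; i++)
    {
        t->buckets[i].status = 0;
        t->buckets[i].hash = 0;
        t->buckets[i].data = NULL;
    }
    t->nused = 0;
}

/* Insert without a duplicate check; the caller keeps ownership of rel. */
void demo_insert(DemoTable *t, DemoRelation *rel)
{
    uint32_t h = demo_hash_key(&rel->key);
    size_t   i = h & DEMO_MASK;

    assert(t->nused < DEMO_NBUCKETS - 1);   /* always leave a free slot */
    while (t->buckets[i].status != 0)
        i = (i + 1) & DEMO_MASK;
    t->buckets[i].status = 1;
    t->buckets[i].hash = h;
    t->buckets[i].data = rel;
    t->nused++;
}

/* Compare the stored hash first; dereference only when it matches. */
DemoRelation *demo_lookup(DemoTable *t, const DemoKey *k)
{
    uint32_t h = demo_hash_key(k);
    size_t   i = h & DEMO_MASK;

    for (size_t probes = 0; probes < DEMO_NBUCKETS; probes++)
    {
        DemoEntry *e = &t->buckets[i];

        if (e->status == 0)
            return NULL;
        if (e->hash == h && demo_key_equal(&e->data->key, k))
            return e->data;
        i = (i + 1) & DEMO_MASK;
    }
    return NULL;
}
```

On common platforms sizeof(DemoEntry) is two 32-bit fields plus one pointer, which lines up with the 16 bytes (12 on 32-bit machines) quoted above.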
The test case was basically inserting 100 million rows one at a time into a hash partitioned table with 1000 partitions and 2 int columns and a primary key on one of those columns. It was about 12GB of WAL. I used a hash partitioned table in the hope of creating a fairly random-looking SMgr hash table access pattern. Hopefully something similar to what might happen in the real world.

Over 10 runs of recovery, master took an average of 124.89 seconds. The patched version took 113.59 seconds. About 10% faster.

I bumped shared_buffers up to 10GB, max_wal_size to 20GB and checkpoint_timeout to 60 mins.

To make the benchmark easier to repeat, I patched with the attached recovery_panic.patch.txt. This just PANICs at the end of recovery so that the database shuts down before performing the end of recovery checkpoint. Just start the database up again to do another run.

I did 10 runs. The end of recovery log message reported:

master (aa271209f)
CPU: user: 117.89 s, system: 5.70 s, elapsed: 123.65 s
CPU: user: 117.81 s, system: 5.74 s, elapsed: 123.62 s
CPU: user: 119.39 s, system: 5.75 s, elapsed: 125.20 s
CPU: user: 117.98 s, system: 4.39 s, elapsed: 122.41 s
CPU: user: 117.92 s, system: 4.79 s, elapsed: 122.76 s
CPU: user: 119.84 s, system: 4.75 s, elapsed: 124.64 s
CPU: user: 120.60 s, system: 5.82 s, elapsed: 126.49 s
CPU: user: 118.74 s, system: 5.71 s, elapsed: 124.51 s
CPU: user: 124.29 s, system: 6.79 s, elapsed: 131.14 s
CPU: user: 118.73 s, system: 5.67 s, elapsed: 124.47 s

master + v1 patch
CPU: user: 106.90 s, system: 4.45 s, elapsed: 111.39 s
CPU: user: 107.31 s, system: 5.98 s, elapsed: 113.35 s
CPU: user: 107.14 s, system: 5.58 s, elapsed: 112.77 s
CPU: user: 105.79 s, system: 5.64 s, elapsed: 111.48 s
CPU: user: 105.78 s, system: 5.80 s, elapsed: 111.63 s
CPU: user: 113.18 s, system: 6.21 s, elapsed: 119.45 s
CPU: user: 107.74 s, system: 4.57 s, elapsed: 112.36 s
CPU: user: 107.42 s, system: 4.62 s, elapsed: 112.09 s
CPU: user: 106.54 s, system: 4.65 s, elapsed: 111.24 s
CPU: user: 113.24 s, system: 6.86 s, elapsed: 120.16 s

I wrote this patch a few days ago. I'm only posting it now as I know a couple of other people have expressed an interest in working on this. I didn't really want any duplicate efforts, so thought I'd better post it now before someone else goes and writes a similar patch.

I'll park this here and have another look at it when the PG15 branch opens.

David
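For anyone wanting to re-derive the averages quoted in the message, the following is a small sketch in C that recomputes them from the elapsed times listed above. The helper names (demo_mean, demo_time_reduction_pct) and the array names are made up for this illustration and are not part of any patch. Only the elapsed figures come from the message.

```c
#include <stddef.h>

/* Elapsed seconds per recovery run, copied from the log lines above. */
const double demo_master_elapsed[10] = {
    123.65, 123.62, 125.20, 122.41, 122.76,
    124.64, 126.49, 124.51, 131.14, 124.47
};

const double demo_patched_elapsed[10] = {
    111.39, 113.35, 112.77, 111.48, 111.63,
    119.45, 112.36, 112.09, 111.24, 120.16
};

/* Arithmetic mean of n values. */
double demo_mean(const double *v, size_t n)
{
    double sum = 0.0;

    for (size_t i = 0; i < n; i++)
        sum += v[i];
    return sum / (double) n;
}

/* Percentage by which elapsed time dropped going from before to after. */
double demo_time_reduction_pct(double before, double after)
{
    return 100.0 * (before - after) / before;
}
```

Calling demo_mean() on the two arrays gives roughly 124.89 and 113.59 seconds, matching the averages in the message. Expressed as a drop in elapsed time that is about 9.0%; expressed the other way, master takes about 9.9% longer than the patched build, which is where "about 10% faster" comes from.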
Attachment