Thread: Do PostgreSQL have map and set structure(like STL in C++)?
Dear PostgreSQL Community:
I want to add a feature in PostgreSQL, and I need use map structure and set structure(like STL in C++). Do PostgreSQL have realized these structures? Where can I find the functions?
What I need in the code is just like this:
map<char*, set<char*> >
set<char*>
Thank you,
Liu Baozhu
Hello, > I want to add a feature in PostgreSQL, and I need use map structure and > set structure(like STL in C++). Do PostgreSQL have realized these > structures? Where can I find the functions? What I need in the code is > just like this: map<char*, set<char*>>, set<char*> You are looking for a hash table, see under "src/backend/utils/hash/". -- Fabien.
Hello Liu! > 21 апр. 2019 г., в 17:32, 梦旅人 <liubaozhu1258@qq.com> написал(а): > I want to add a feature in PostgreSQL, and I need use map structure and set structure(like STL in C++). Do PostgreSQLhave realized these structures? Where can I find the functions? > What I need in the code is just like this: > map<char*, set<char*> > > set<char*> You can use HTAB at utils/hsearch.h [0] It is Larson's dynamic hashing, implementation is in backend/utils/hash/dynahash.c Mostly like unordered_map. Accordingly, it lacks lower bound functionality as sorted sets do. Also, you can user RB-tree in lib/rbtree.h [1] It's usual red-black tree. Best regards, Andrey Borodin. [0] https://github.com/postgres/postgres/blob/master/src/include/utils/hsearch.h [1] https://github.com/postgres/postgres/blob/master/src/include/lib/rbtree.h
On Mon, Apr 22, 2019 at 5:22 AM Andrey Borodin <x4mmm@yandex-team.ru> wrote: > > 21 апр. 2019 г., в 17:32, 梦旅人 <liubaozhu1258@qq.com> написал(а): > > I want to add a feature in PostgreSQL, and I need use map structure and set structure(like STL in C++). Do PostgreSQLhave realized these structures? Where can I find the functions? > > What I need in the code is just like this: > > map<char*, set<char*> > > > set<char*> > > You can use HTAB at utils/hsearch.h [0] > > It is Larson's dynamic hashing, implementation is in backend/utils/hash/dynahash.c > Mostly like unordered_map. Accordingly, it lacks lower bound functionality as sorted sets do. > > Also, you can user RB-tree in lib/rbtree.h [1] It's usual red-black tree. There is also src/include/lib/simplehash.h. A bit like C++'s std::unordered_map and std::unordered_set templates, it is specialised by type with inlined hash and eq functions (though it's done with preprocessor tricks). Unlike those it it uses a variant of Robin Hood algorithm for collisions, while the C++ standard effectively requires collision chains. Like dynahash (hsearch.h), it's both map and set at the same time: instead of using a pair<key,value> it has a simple element type and it's your job to decide if the whole thing or just part of it is the key. Unlike dynahash, it doesn't have a way to work in shared memory. There is also src/include/lib/dshash.h. It uses collision chains, and works in DSA shared memory (a strange allocator that can deal with memory mapped at different addresses in different processes), and has a lock/partitioning scheme to deal with sharing, but it's not very featureful. Unlike dynahash in shared memory mode, it can allocate more memory as required, and can be created and destroyed any time (whereas dynahash in shared memory mode is trapped in a fixed sized region of memory that lives as long as the cluster). Andrey mentioned Larson's dynamic hashing; that's quite interesting, and can be seen in our on-disk hash indexes, but dynahash does it for in-memory hash tables, whereas simplehash and dshash just rebuild the whole hash table when one unlucky caller inserts the new element that exceeds the load factor target. AFAIK it's quite unusual to use the dynamic expansion trick for an in-memory-only hash table (?). So yeah, that's three general purpose hash table/set implementations. I wouldn't mind if we had some more generic container and algorithm code like simplehash.h in the tree; it's easy to show performance benefits of inlining in microbenchmarks, and it's nicer to program without explicitly dealing in void pointers all over the place. I suppose it would be theoretically possible to make some of these design choices into policies you could select when instantiating, rather than having separate library components. Another thing that is on the radar for future development is concurrent lock-free extensible hash tables as seen in some other projects. -- Thomas Munro https://enterprisedb.com