Thread: pgsql: Add template for adaptive radix tree
Add template for adaptive radix tree This implements a radix tree data structure based on the design in "The Adaptive Radix Tree: ARTful Indexing for Main-Memory Databases" by Viktor Leis, Alfons Kemper, and ThomasNeumann, 2013. The main technique that makes it adaptive is using several different node types, each with a different capacity of elements, and a different algorithm for accessing them. The nodes start small and grow/shrink as needed. The main advantage over hash tables is efficient sorted iteration and better memory locality when successive keys are lexicographically close together. The implementation currently assumes 64-bit integer keys, and traversing the tree is in general slower than a linear probing hash table, so this is not a general-purpose associative array. The paper describes two other techniques not implemented here, namely "path compression" and "lazy expansion". These can further reduce memory usage and speed up traversal, but the former would add significant complexity and the latter requires storing the full key with the value. We do trivially compress the path when leading bytes of the key are zeros, however. For value storage, we use "combined pointer/value slots", as recommended in the paper. Values of size equal or smaller than the the platform's pointer type are stored in the array of child pointers in the last level node, while larger values are each stored in a separate allocation. This is for now fixed at compile time, but it would be fairly trivial to allow determining at runtime how variable-length values are stored. One innovation in our implementation compared to the ART paper is decoupling the notion of node "size class" from "kind". The size classes within a given node kind have the same underlying type, but a variable capacity for children, so we can introduce additional node sizes with little additional code. To enable different use cases to specialize for different value types and for shared/local memory, we use macro-templatized code generation in the same manner as simplehash.h and sort_template.h. Future commits will use this infrastructure for storing TIDs. Patch by Masahiko Sawada and John Naylor, but a substantial amount of credit is due to Andres Freund, whose proof-of-concept was a valuable source of coding idioms and awareness of performance pitfalls, and who reviewed earlier versions. Discussion: https://postgr.es/m/CAD21AoAfOZvmfR0j8VmZorZjL7RhTiQdVttNuC4W-Shdc2a-AA%40mail.gmail.com Branch ------ master Details ------- https://git.postgresql.org/pg/commitdiff/ee1b30f128d8f63a5184d2bcf1c48a3efc3fcbf9 Modified Files -------------- src/backend/utils/mmgr/dsa.c | 13 + src/include/lib/radixtree.h | 3009 ++++++++++++++++++++ src/include/utils/dsa.h | 1 + src/test/modules/Makefile | 1 + src/test/modules/meson.build | 1 + src/test/modules/test_radixtree/.gitignore | 4 + src/test/modules/test_radixtree/Makefile | 23 + .../test_radixtree/expected/test_radixtree.out | 41 + src/test/modules/test_radixtree/meson.build | 34 + .../modules/test_radixtree/sql/test_radixtree.sql | 7 + .../modules/test_radixtree/test_radixtree--1.0.sql | 8 + src/test/modules/test_radixtree/test_radixtree.c | 473 +++ .../modules/test_radixtree/test_radixtree.control | 4 + src/tools/pgindent/typedefs.list | 1 + 14 files changed, 3620 insertions(+)
John Naylor <john.naylor@postgresql.org> writes: > This implements a radix tree data structure based on the design in > "The Adaptive Radix Tree: ARTful Indexing for Main-Memory Databases" > by Viktor Leis, Alfons Kemper, and ThomasNeumann, 2013. Hm ... do we know that this is not patent-encumbered technology? Commits citing such specific prior art make me nervous. regards, tom lane
On Thu, Mar 7, 2024 at 1:08 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > John Naylor <john.naylor@postgresql.org> writes: > > This implements a radix tree data structure based on the design in > > "The Adaptive Radix Tree: ARTful Indexing for Main-Memory Databases" > > by Viktor Leis, Alfons Kemper, and ThomasNeumann, 2013. > > Hm ... do we know that this is not patent-encumbered technology? > Commits citing such specific prior art make me nervous. There are several open source implementations in a variety of languages, so I assumed not.
John Naylor <johncnaylorls@gmail.com> writes: > On Thu, Mar 7, 2024 at 1:08 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: >> Hm ... do we know that this is not patent-encumbered technology? >> Commits citing such specific prior art make me nervous. > There are several open source implementations in a variety of > languages, so I assumed not. Are any of those implementations used in places that might entice a patent troll to come after them? (If you think Postgres isn't an inviting target for patent trolls, you're wrong. We've avoided getting sued so far, but man this topic scares me.) regards, tom lane
On Thu, Mar 7, 2024 at 1:32 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > > John Naylor <johncnaylorls@gmail.com> writes: > > On Thu, Mar 7, 2024 at 1:08 PM Tom Lane <tgl@sss.pgh.pa.us> wrote: > >> Hm ... do we know that this is not patent-encumbered technology? > >> Commits citing such specific prior art make me nervous. > > > There are several open source implementations in a variety of > > languages, so I assumed not. > > Are any of those implementations used in places that might > entice a patent troll to come after them? > > (If you think Postgres isn't an inviting target for patent > trolls, you're wrong. We've avoided getting sued so far, > but man this topic scares me.) Understandably so. FWIW, the use case proposed by the authors was for secondary indexes within in-memory databases, not as a type of associative array. I'm unable to find patents for the thing itself, but IANAL. I believe I've been in contact with some of the same authors about a different subject, and they seemed open to people trying to implement their ideas (it was a different paper, to be sure, and unfortunately I no longer email account).
On 2024-Mar-07, John Naylor wrote: > Understandably so. FWIW, the use case proposed by the authors was for > secondary indexes within in-memory databases, not as a type of > associative array. I'm unable to find patents for the thing itself, > but IANAL. I believe I've been in contact with some of the same > authors about a different subject, and they seemed open to people > trying to implement their ideas (it was a different paper, to be sure, > and unfortunately I no longer email account). Well, surely we can email them. Their profiles show they're still active. https://db.in.tum.de/people/sites/kemper/?lang=en https://db.in.tum.de/people/sites/neumann/?lang=en https://db.in.tum.de/people/sites/leis/?lang=en -- Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/ "Pido que me den el Nobel por razones humanitarias" (Nicanor Parra)
On Thu, Mar 7, 2024 at 4:36 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > Well, surely we can email them. Their profiles show they're still > active. > https://db.in.tum.de/people/sites/kemper/?lang=en > https://db.in.tum.de/people/sites/neumann/?lang=en > https://db.in.tum.de/people/sites/leis/?lang=en I thought the same, and have sent them a message after completing some research into remaining buildfarm failures.
On 2024-03-07 Th 00:49, John Naylor wrote: > Add template for adaptive radix tree drongo and fairywren don't like this, See e.g. <https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=drongo&dt=2024-03-07%2016%3A51%3A00> cheers andrew -- Andrew Dunstan EDB: https://www.enterprisedb.com
On Thu, Mar 7, 2024 at 4:47 PM John Naylor <johncnaylorls@gmail.com> wrote: > > On Thu, Mar 7, 2024 at 4:36 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote: > > > > Well, surely we can email them. Their profiles show they're still > > active. > > https://db.in.tum.de/people/sites/kemper/?lang=en > > https://db.in.tum.de/people/sites/neumann/?lang=en > > https://db.in.tum.de/people/sites/leis/?lang=en > > I thought the same, and have sent them a message after completing some > research into remaining buildfarm failures. The authors have confirmed they did not patent adaptive radix tree. Now, after I get some coffee I'll look into the Windows failures.
On Fri, 8 Mar 2024 at 13:48, John Naylor <johncnaylorls@gmail.com> wrote: > Now, after I get some coffee I'll look into the Windows failures. I had a look at this and the attached fixes the broken build on MSVC for me. I didn't take the time to fully understand it, but I did also try PGDLLEXPORT and that *didn't* fix it. My guess is it's because these are function pointer variables rather than functions. git grep -E "PGDLLIMPORT.*\(" does not show anything else that does this for function pointers. I did try commenting out "#define TRY_POPCNT_FAST 1" and the build still works. That was the extent of my research. David
Attachment
On Fri, Mar 8, 2024 at 10:29 AM David Rowley <dgrowleyml@gmail.com> wrote: > > On Fri, 8 Mar 2024 at 13:48, John Naylor <johncnaylorls@gmail.com> wrote: > > Now, after I get some coffee I'll look into the Windows failures. > > I had a look at this and the attached fixes the broken build on MSVC for me. > > I didn't take the time to fully understand it, but I did also try > PGDLLEXPORT and that *didn't* fix it. My guess is it's because these > are function pointer variables rather than functions. Thanks, I was getting close to committing a hackish workaround -- this seems better!
On Fri, 8 Mar 2024 at 16:37, John Naylor <johncnaylorls@gmail.com> wrote: > Thanks, I was getting close to committing a hackish workaround -- this > seems better! ok cool. I also noticed a typo. Maybe worth fixing that at the same time? Attached. David
Attachment
On Fri, Mar 8, 2024 at 10:43 AM David Rowley <dgrowleyml@gmail.com> wrote: > > On Fri, 8 Mar 2024 at 16:37, John Naylor <johncnaylorls@gmail.com> wrote: > > Thanks, I was getting close to committing a hackish workaround -- this > > seems better! > > ok cool. I also noticed a typo. Maybe worth fixing that at the same time? > > Attached. Done, thanks! As I mentioned, I was very close, and in fact accidentally committed both, so egg on my face :-( -- but have reverted the first one.