pgsql: Add template for adaptive radix tree
Add template for adaptive radix tree
This implements a radix tree data structure based on the design in
"The Adaptive Radix Tree: ARTful Indexing for Main-Memory Databases"
by Viktor Leis, Alfons Kemper, and ThomasNeumann, 2013. The main
technique that makes it adaptive is using several different node types,
each with a different capacity of elements, and a different algorithm
for accessing them. The nodes start small and grow/shrink as needed.
The main advantage over hash tables is efficient sorted iteration and
better memory locality when successive keys are lexicographically
close together. The implementation currently assumes 64-bit integer
keys, and traversing the tree is in general slower than a linear
probing hash table, so this is not a general-purpose associative array.
The paper describes two other techniques not implemented here,
namely "path compression" and "lazy expansion". These can further
reduce memory usage and speed up traversal, but the former would add
significant complexity and the latter requires storing the full key
with the value. We do trivially compress the path when leading bytes
of the key are zeros, however.
For value storage, we use "combined pointer/value slots", as
recommended in the paper. Values of size equal or smaller than the the
platform's pointer type are stored in the array of child pointers in
the last level node, while larger values are each stored in a separate
allocation. This is for now fixed at compile time, but it would be
fairly trivial to allow determining at runtime how variable-length
values are stored.
One innovation in our implementation compared to the ART paper is
decoupling the notion of node "size class" from "kind". The size
classes within a given node kind have the same underlying type, but
a variable capacity for children, so we can introduce additional node
sizes with little additional code.
To enable different use cases to specialize for different value types
and for shared/local memory, we use macro-templatized code generation
in the same manner as simplehash.h and sort_template.h.
Future commits will use this infrastructure for storing TIDs.
Patch by Masahiko Sawada and John Naylor, but a substantial amount of
credit is due to Andres Freund, whose proof-of-concept was a valuable
source of coding idioms and awareness of performance pitfalls, and
who reviewed earlier versions.
Discussion: /messages/by-id/CAD21AoAfOZvmfR0j8VmZorZjL7RhTiQdVttNuC4W-Shdc2a-AA@mail.gmail.com
Branch
------
master
Details
-------
https://git.postgresql.org/pg/commitdiff/ee1b30f128d8f63a5184d2bcf1c48a3efc3fcbf9
Modified Files
--------------
src/backend/utils/mmgr/dsa.c | 13 +
src/include/lib/radixtree.h | 3009 ++++++++++++++++++++
src/include/utils/dsa.h | 1 +
src/test/modules/Makefile | 1 +
src/test/modules/meson.build | 1 +
src/test/modules/test_radixtree/.gitignore | 4 +
src/test/modules/test_radixtree/Makefile | 23 +
.../test_radixtree/expected/test_radixtree.out | 41 +
src/test/modules/test_radixtree/meson.build | 34 +
.../modules/test_radixtree/sql/test_radixtree.sql | 7 +
.../modules/test_radixtree/test_radixtree--1.0.sql | 8 +
src/test/modules/test_radixtree/test_radixtree.c | 473 +++
.../modules/test_radixtree/test_radixtree.control | 4 +
src/tools/pgindent/typedefs.list | 1 +
14 files changed, 3620 insertions(+)
John Naylor <john.naylor@postgresql.org> writes:
This implements a radix tree data structure based on the design in
"The Adaptive Radix Tree: ARTful Indexing for Main-Memory Databases"
by Viktor Leis, Alfons Kemper, and ThomasNeumann, 2013.
Hm ... do we know that this is not patent-encumbered technology?
Commits citing such specific prior art make me nervous.
regards, tom lane
On Thu, Mar 7, 2024 at 1:08 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
John Naylor <john.naylor@postgresql.org> writes:
This implements a radix tree data structure based on the design in
"The Adaptive Radix Tree: ARTful Indexing for Main-Memory Databases"
by Viktor Leis, Alfons Kemper, and ThomasNeumann, 2013.Hm ... do we know that this is not patent-encumbered technology?
Commits citing such specific prior art make me nervous.
There are several open source implementations in a variety of
languages, so I assumed not.
John Naylor <johncnaylorls@gmail.com> writes:
On Thu, Mar 7, 2024 at 1:08 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Hm ... do we know that this is not patent-encumbered technology?
Commits citing such specific prior art make me nervous.
There are several open source implementations in a variety of
languages, so I assumed not.
Are any of those implementations used in places that might
entice a patent troll to come after them?
(If you think Postgres isn't an inviting target for patent
trolls, you're wrong. We've avoided getting sued so far,
but man this topic scares me.)
regards, tom lane
On Thu, Mar 7, 2024 at 1:32 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
John Naylor <johncnaylorls@gmail.com> writes:
On Thu, Mar 7, 2024 at 1:08 PM Tom Lane <tgl@sss.pgh.pa.us> wrote:
Hm ... do we know that this is not patent-encumbered technology?
Commits citing such specific prior art make me nervous.There are several open source implementations in a variety of
languages, so I assumed not.Are any of those implementations used in places that might
entice a patent troll to come after them?(If you think Postgres isn't an inviting target for patent
trolls, you're wrong. We've avoided getting sued so far,
but man this topic scares me.)
Understandably so. FWIW, the use case proposed by the authors was for
secondary indexes within in-memory databases, not as a type of
associative array. I'm unable to find patents for the thing itself,
but IANAL. I believe I've been in contact with some of the same
authors about a different subject, and they seemed open to people
trying to implement their ideas (it was a different paper, to be sure,
and unfortunately I no longer email account).
On 2024-Mar-07, John Naylor wrote:
Understandably so. FWIW, the use case proposed by the authors was for
secondary indexes within in-memory databases, not as a type of
associative array. I'm unable to find patents for the thing itself,
but IANAL. I believe I've been in contact with some of the same
authors about a different subject, and they seemed open to people
trying to implement their ideas (it was a different paper, to be sure,
and unfortunately I no longer email account).
Well, surely we can email them. Their profiles show they're still
active.
https://db.in.tum.de/people/sites/kemper/?lang=en
https://db.in.tum.de/people/sites/neumann/?lang=en
https://db.in.tum.de/people/sites/leis/?lang=en
--
Álvaro Herrera 48°01'N 7°57'E — https://www.EnterpriseDB.com/
"Pido que me den el Nobel por razones humanitarias" (Nicanor Parra)
On Thu, Mar 7, 2024 at 4:36 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
Well, surely we can email them. Their profiles show they're still
active.
https://db.in.tum.de/people/sites/kemper/?lang=en
https://db.in.tum.de/people/sites/neumann/?lang=en
https://db.in.tum.de/people/sites/leis/?lang=en
I thought the same, and have sent them a message after completing some
research into remaining buildfarm failures.
On 2024-03-07 Th 00:49, John Naylor wrote:
Add template for adaptive radix tree
drongo and fairywren don't like this, See e.g.
<https://buildfarm.postgresql.org/cgi-bin/show_log.pl?nm=drongo&dt=2024-03-07%2016%3A51%3A00>
cheers
andrew
--
Andrew Dunstan
EDB: https://www.enterprisedb.com
On Thu, Mar 7, 2024 at 4:47 PM John Naylor <johncnaylorls@gmail.com> wrote:
On Thu, Mar 7, 2024 at 4:36 PM Alvaro Herrera <alvherre@alvh.no-ip.org> wrote:
Well, surely we can email them. Their profiles show they're still
active.
https://db.in.tum.de/people/sites/kemper/?lang=en
https://db.in.tum.de/people/sites/neumann/?lang=en
https://db.in.tum.de/people/sites/leis/?lang=enI thought the same, and have sent them a message after completing some
research into remaining buildfarm failures.
The authors have confirmed they did not patent adaptive radix tree.
Now, after I get some coffee I'll look into the Windows failures.
On Fri, 8 Mar 2024 at 13:48, John Naylor <johncnaylorls@gmail.com> wrote:
Now, after I get some coffee I'll look into the Windows failures.
I had a look at this and the attached fixes the broken build on MSVC for me.
I didn't take the time to fully understand it, but I did also try
PGDLLEXPORT and that *didn't* fix it. My guess is it's because these
are function pointer variables rather than functions.
git grep -E "PGDLLIMPORT.*\("
does not show anything else that does this for function pointers.
I did try commenting out "#define TRY_POPCNT_FAST 1" and the build still works.
That was the extent of my research.
David
Attachments:
fix_popcount64_link_issue_on_windows.patchtext/plain; charset=US-ASCII; name=fix_popcount64_link_issue_on_windows.patchDownload+2-2
On Fri, Mar 8, 2024 at 10:29 AM David Rowley <dgrowleyml@gmail.com> wrote:
On Fri, 8 Mar 2024 at 13:48, John Naylor <johncnaylorls@gmail.com> wrote:
Now, after I get some coffee I'll look into the Windows failures.
I had a look at this and the attached fixes the broken build on MSVC for me.
I didn't take the time to fully understand it, but I did also try
PGDLLEXPORT and that *didn't* fix it. My guess is it's because these
are function pointer variables rather than functions.
Thanks, I was getting close to committing a hackish workaround -- this
seems better!
On Fri, 8 Mar 2024 at 16:37, John Naylor <johncnaylorls@gmail.com> wrote:
Thanks, I was getting close to committing a hackish workaround -- this
seems better!
ok cool. I also noticed a typo. Maybe worth fixing that at the same time?
Attached.
David
Attachments:
typo_fix.patchtext/plain; charset=US-ASCII; name=typo_fix.patchDownload+1-1
On Fri, Mar 8, 2024 at 10:43 AM David Rowley <dgrowleyml@gmail.com> wrote:
On Fri, 8 Mar 2024 at 16:37, John Naylor <johncnaylorls@gmail.com> wrote:
Thanks, I was getting close to committing a hackish workaround -- this
seems better!ok cool. I also noticed a typo. Maybe worth fixing that at the same time?
Attached.
Done, thanks! As I mentioned, I was very close, and in fact
accidentally committed both, so egg on my face :-( -- but have
reverted the first one.