And then maybe even juxtapose that with a linear search example, which also numbers every step. I bet this would make it really click for some people. And for free the user can also play with how a linear search can sometimes be faster when they just want the first element!
As a bonus: allow the user to change the cell count so they can really feel just how each method scales!
https://www.lindelystables.dk/en/posts/functional-quadtree-c...
Got very confused! I challenge the HN hivemind to figure out what's going on.
Well, I guess it was the first AI-first browser, hence all this bs. I uninstalled it months ago...
What does that mean?
Quad trees are abstract until you see what they look like. It’s a clever method to partition 2D points.
(Kd trees are even better)
Aren't Quadtrees covered by almost all basic data-structure books? It is the most simple form of taking the binary tree into the next (2D) dimension.
Edit: lol, downvoted for this post. Never change, HN.
https://proc0.github.io/HackerZen (it's also open source)
points.sort(key=lambda p: sum(((p[0]>>i&1)<<(2*i))|((p[1]>>i&1)<<(2*i+1)) for i in range(16)))https://forceflow.be/2013/10/07/morton-encodingdecoding-thro...
Here's an example from AWS, where lat/long pairs are put into a Z-index, which is used as a DynamoDB sort key, letting you efficiently query for items near a point.
https://aws.amazon.com/blogs/database/z-order-indexing-for-m...
And for a z-curve, the order is basically a depth-first traversal of a quadtree.
In short, it seems relatively few people have the skills to implement them and even fewer have the skills to come up with functional versions of ordinary data structures. They are almost completely absent from university lectures as well, as far as I am aware. For example for AVL trees I could only find a document from ETH from a lecture, that no longer exists or is taught. The language is Isabel and I need to understand its syntax first, before being able to translate it to Scheme.
If anyone has an obscure source for implementations one can learn from and implement oneself in another language, please share.
Functional Data Structures and Algorithms, A Proof Assistant Approach by Tobias Nipkow (Ed.) [https://fdsa-book.net/functional_data_structures_algorithms....]
Purely Functional Data Structures thesis by Chris Okasaki [https://www.cs.cmu.edu/~rwh/students/okasaki.pdf]
https://en.wikipedia.org/wiki/Purely_functional_data_structu...
Not all exercises have solutions. I get stuck on some exercise and have a hard time finding solutions to them, that I can compare with my implementations in Scheme. Not being a Haskell user (yet), I also have some issues translating Haskell to Scheme. In languages like Haskell one often controls flow by pattern matching on type. In Scheme one needs to make structures or records explicitly and use their predicates to explicitly check the type. It's all possible, but not exactly great for learning. Sort of the road has many sticks and stones to stumble.
[1]: https://archiv.infsec.ethz.ch/education/permanent/csmr/exerc...
[2]: https://archiv.infsec.ethz.ch/education/permanent/csmr.html
[3]: https://archiv.infsec.ethz.ch/education/permanent/csmr/exerc...
[4]: https://codeberg.org/ZelphirKaltstahl/guile-data-structures/...
Most code that I saw that used quadtrees were treating things as points and storing them only at the lowest level.
I also made mine auto-divide by counting items that are entirely in a quadrant as they are added to the node, with allocate and split triggered if a count went above a certain threshold.
Anything novel or oopsie?
That being said most ray tracing seems to do bounding volume hierarchies now, so maybe a bvh is the best way to deal with things that have volume.
Similar to yours I split the leaf nodes when they became too full, with some simple fixed threshold defining "full". Only leaf nodes contained references to the indexed objects, and all overlapping leaf nodes would reference the objects they overlapped. Search queries were done also using an AABB, and would iteratively invoke a callback for all overlapping object AABBs found in the overlapping leaf nodes. IIRC the object references hanging off the leaf nodes had a fixed number of linked list slots to be put on a results list during a search, to deduplicate the results before iterating that list with the provided candidate-found callback. Since any given indexed object could be on many leaf nodes in the index, if they spanned a large area shared with a high density of other indexed objects for instance.
It was up to the callback to do the narrow-phase collision detection / control the results list iterating stop vs. continue via return value.
I recall one of the annoyances of sticking the search results linked list entry in the object references hanging off the leaf nodes was it set a limit to the number of simultaneous searches one could perform against the index. Basically the game would initialize the index with a fixed maximum concurrent number of searches to handle, and that set the number of results-linked-list slots the object references would be allocated to accommodate. As long as that was never exceeded it worked great.
It's been a while so I may have gotten it wrong, but that sounds right to me.
Not sure what the most common approaches are...
The C source is @ https://git.pengaru.com/cgit/libix2/.git/tree/src/ix2.c
Quadtrees might look like natural generalization of binary trees, but some things that work very efficiently in binary trees don't work for naive generalization of binary trees into quadtrees. For a full binary tree with W leaves any segment can be described using log(W) nodes. Almost every use of binary tree more interesting than maintaining sorted list of numbers depends on this property. What happens in 2d with Quadtree, how many of quadtree nodes are necessary to exactly describe arbitrary rectangular area? Turn's out you get O(W+H) nodes, not log(N), not log(W)*log(H).
If you stored a rectangle in smallest node that completely contains that avoids the W+H problem during insertion operations, but during read operations you may end up with situation where all the information is stored at the root. This is no better having no quadtree at all and storing everything in plain list that has no special order. Such worse case scenario can be created very easily if every rectangle contains centre of area described by quadtree, then all rectangles will be placed at root.
Dynamically subdividing and or allocating tree nodes on demand isn't anything particularity novel. If anything the cases where you can use fixed sized static trees outside textbook examples is somewhat a minority. Not saying there are no such cases there are enough of them, but large fraction of practical systems need to be capable of scaling for arbitrary amount of data which get's added and removed over the time and total size is not known ahead of time and the systems need to be able to deal arbitrary user data that has maliciously crafted worst case distribution. Every B-tree will split leaves into smaller nodes when necessary. Almost every self balancing binary tree can be considered of doing automatic division just with threshold of no more than 1 item per node. For 2d many uses cases of k-d tree will also subdivide on demand.
With fixed point recursion, this allows for very tiny definitions of IFS fractals. For example, fractals like the Sierpinski triangle/carpet only require ~50 bit of binary lambda calculus [1] [2]!
[0]: https://text.marvinborner.de/2024-03-25-02.html
[1]: https://lambda-screen.marvinborner.de/?term=ERoc0CrYLYA%3D
[2]: https://lambda-screen.marvinborner.de/?term=QcCqqttsFtsI0OaA
This is the most pedantic way of saying "binary 2-tuples" I've ever seen. Also for quadtrees this is inferior to base 4 because you can assume clockwise (or counter) ordering.
I don't think this is something the article cares about, though.
I've found the book "Foundations of Multidimensional and Metric Data Structures" by Hanan Samet to be an excellent resource when looking for a slightly deeper dive than a more introductory algorithms course. It goes in depth on the nuances of these approaches, many of which are highly similar at a cursory glance.
I do have a semi-unrelated question though: does using the recursive approach prevent it from being calculated efficiently on the GPU/compute shaders? Not that it matters; plenty of value in a CPU-bound version of a solution and especially one that is easy to understand when recursive. I was just wondering why the prominent examples used a non-recursive approach, but then I was like "oh, because they expect you to use them on the GPU". ...and then I was like "wait, is that why?"
Historically speaking, the use of recursion in shaders and GPGPU kernels (e.g. OpenCL "C" prior to 1.2) was effectively prohibited. The shader/kernel compiler would attempt to inline function calls, since traditional GPU models had little in the way of supporting call stacks like normal CPU programs have, and thus recursion would be simply banned in the shader/kernel language.