Fast sorted collections for Swift using in-memory B-trees

Overview

Fast Sorted Collections for Swift
Using In-Memory B-Trees

Swift 4.0 License Platform

Build Status Code Coverage

Carthage compatible CocoaPod Version

Overview

This project provides an efficient in-memory B-tree implementation in pure Swift, and several useful sorted collection types that use B-trees for their underlying storage.

  • Map<Key, Value> implements a sorted mapping from unique comparable keys to arbitrary values. It is like Dictionary in the standard library, but it does not require keys to be hashable, it has strong guarantees on worst-case performance, and it maintains its elements in a well-defined order.

  • List<Element> implements a random-access collection of arbitrary elements. It is like Array in the standard library, but lookup, insertion and removal of elements at any index have logarithmic complexity. (Array has O(1) lookup, but insertion and removal at an arbitrary index costs O(n).) Concatenation of two lists of any size, inserting a list into another list at any position, removal of any subrange of elements, or extraction of an arbitrary sub-list are also operations with O(log(n)) complexity.

  • SortedSet<Element> implements a sorted collection of unique comparable elements. It is like Set in the standard library, but lookup, insertion and removal of any element has logarithmic complexity. Elements in an SortedSet are kept sorted in ascending order. Operations working on full sets (such as taking the union, intersection or difference) can take as little as O(log(n)) time if the elements in the source sets aren't interleaved.

  • SortedBag<Element> implements a sorted multiset with comparable elements. This is a generalization of a set that allows multiple instances of the same value. (The standard library does not include such a collection, although you can use a dictionary to emulate one by storing the multiplicities of the keys as values.) The implementation provided in this package stores each duplicate element separately, which may come useful if your elements are reference types with identities or you have some other means to distinguish between equal elements. SortedBag operations have the same time complexities as the equivalent operations in SortedSet.

  • BTree<Key, Value> is the underlying primitive collection that serves as base storage for all of the above collections. It is a general sorted key-value store with full support for elements with duplicate keys; it provides a sum of all operations individually provided by the higher-level abstractions above (and more!).

    The BTree type is public; you may want to use it if you need a collection flavor that isn't provided by default (such as a multimap) or if you need to use an operation that isn't exposed by the wrappers.

All of these collections are structs and they implement the same copy-on-write value semantics as standard Swift collection types like Array and Dictionary. (In fact, copy-on-write works even better with these than standard collections; continue reading to find out why!)

The latest version of BTree requires Swift 4. (The last release supporting Swift 3 was 4.0.2.)

Reference Documentation

The project includes a nicely formatted reference document generated from the documentation comments embedded in its source code.

Optimizing Collections: The Book

If you want to learn more about how this package works, the book Optimizing Collections includes detailed explanations of many of the algorithms and optimization tricks implemented by this package – and so, so much more. It is written by the same author, and published by the fine folks at objc.io. Buying a copy of the book is not only a nice way to support this project, it also gets you something quite interesting to read. Win-win!

Optimizing Collections (eBook)

What Are B-Trees?

B-trees are search trees that provide a sorted key-value store with excellent performance characteristics. In essence, each node maintains a sorted array of its own elements, and another array for its children. The tree is kept balanced by three constraints:

  1. Only the root node is allowed to be less than half full.
  2. No node may be larger than the maximum size.
  3. The leaf nodes are all at the same level.

Compared to other popular search trees such as red-black trees or AVL trees, B-trees have huge nodes: nodes often contain hundreds (or even thousands) of key-value pairs and children.

This module implements a "vanilla" B-tree where every node contains full key-value pairs. (The other popular type is the B+-tree where only leaf nodes contain values; internal nodes contain only copies of keys. This often makes more sense on an external storage device with a fixed block size, but it seems less useful for an in-memory implementation.)

Each node in the tree also maintains the count of all elements under it. This makes the tree an order statistic tree, where efficient positional lookup is possible.

Why In-Memory B-Trees?

The Swift standard library offers heavily optimized arrays and hash tables, but omits linked lists and tree-based data structures. This is a result of the Swift engineering team spending resources (effort, code size) on the abstractions that provide the biggest bang for the buck.

Indeed, the library lacks even a basic double-ended queue construct -- although Cocoa's Foundation framework does include one in NSArray.

However, some problems call for a wider variety of data structures.

In the past, linked lists and low-order search trees such as red-black trees were frequently employed; however, the performance of these constructs on modern hardware is greatly limited by their heavy use of pointers.

B-trees were originally invented in the 1970s as a data structure for slow external storage devices. As such, they are strongly optimized for locality of reference: they prefer to keep data in long contiguous buffers and they keep pointer derefencing to a minimum. (Dereferencing a pointer in a B-tree usually meant reading another block of data from the spinning hard drive, which is a glacially slow device compared to the main memory.)

Today's computers have multi-tiered memory architectures; they rely on caching to keep the system performant. This means that locality of reference has become a hugely important property for in-memory data structures, too.

Arrays are the epitome of reference locality, so the Swift stdlib's heavy emphasis on Array as the universal collection type is well justified.

For example, using a single array to hold a sorted list of items has quite horrible (quadratic) asymptotic complexity when there are many elements. However, up to a certain maximum size, a simple array is in fact the most efficient way to represent a sorted list.

Typical benchmark results for sorted collections

The benchmark above demonstrates this really well: insertion of n elements into a sorted array costs O(n^2) when there are many items, but for many reasonably sized data sets, it is still much faster than creating a red-black tree with its fancypants O(n * log(n)) complexity.

Near the beginning of the curve, up to about eighteen thousand items, a sorted array implementation imported from an external module is very consistently about 6-7 times faster than a red-black tree, with a slope that is indistinguishable from O(n * log(n)).

Even after it catches up to quadratic complexity, in this particular benchmark, it takes about a hundred thousand items for the sorted array to become slower than the red-black tree!

The exact cutoff point depends on the type/size of elements that you work with, and the capabilities of the compiler. This benchmark used tiny 8-byte integer elements, hence the huge number.

The benchmark is based on my own red-black tree implementation that uses a single flat array to store node data. A more typical implementation would store each node in a separately allocated object, so it would likely be even slower.

The chart above is a log-log plot which makes it easy to compare the polynomial exponents of the complexity curves of competing algorithms at a glance. The slope of a quadratic algorithm on a log-log chart (like insertion into a sorted array---the green curves) is twice of that of a linear algorithm (like appending n items to an unsorted array---light blue curve) or a quasilinear one (like inserting into a red-black tree, red curve).

Note that the big gap between collections imported from stdlib and those imported from external modules is caused by a limitation in the current Swift compiler/ABI: when this limitation is lifted, the gap will narrow considerably, which will reduce the element count at which you'll be able to reap the benefits of lower asymptotic complexity.

(This effect is already visible (albeit in reverse) on the benchmark for the "inlined" sorted array (light green), which is essentially the same code as the regular one (dark green) except it was implemented in the same module as the benchmarking loop, so the compiler has more options to optimize away witness tables and other levels of abstraction. That line starts curving up much sooner, at about 2000 items--imagine having a B-tree implementation that's equally fast! Or better, try it yourself and report your results. Producing benchmarks like this takes a lot of time and effort.) :-)

This remarkable result is due in large part to the vast number of (to a CPU, random-looking) memory references that are needed to operate on red-black trees. Their intricate ballet of tree rotations looks mighty impressive to us mere humans, but to the delicate caches of your poor CPU, it looks more like a drunken elephant moshing at a thrash metal concert.

Meanwhile, the humble Array does the only thing it knows: sliding around long contiguous memory regions. It does this over and over, ad nauseum. It doesn't look impressive, but (up to a point) it fits well with how computers work.

So a small Array is perfect for maintaining a sorted list. But what if the list gets too long? The B-tree's answer is to simply cut the array in half, and to create a new index tree node on top to allow it to quickly find its way around this more complex list structure. These internal index nodes can also consist of arrays of elements and node references, creating a nice recursive data structure.

Because their fanout number is so high, B-trees are extremely shallow: for a B-tree with order 100 (which is actually rather on the low end), you can fit a billion items into a tree that's not more than five levels deep.

Once you accept that small arrays are fast, it is easy to see why B-trees work so well: unless it holds more elements than its order, a B-tree quite literally is just an Array. So it has the same performance behavior as an Array for a small number of elements, and when it grows larger it prevents a quadratic upswing by never allowing its arrays to get too large. The yellow curve on the benchmark above demonstrates this behavior well.

Consider that each node in a typical B-tree can hold about ten full levels of a red-black tree (or AVL trees or whatever binary tree you like). Looking up an item in a B-tree node still requires a binary search of the node array, but this search works on a contiguous memory region, while the conventional search tree is fiddling around with loading pointer values and dereferencing them.

So it makes perfect sense to employ B-trees as an in-memory data structure.

Think about this, though: how many times do you need to work with a hundred thousand sorted items in a typical app? Or even twenty thousand? Or even just two thousand? The most interesting benefits of B-trees often occur at element counts well over a hundred thousand. However, B-trees are not much slower than arrays for low element counts (remember, they are arrays in that case), so it makes sense to use them when there's even a slight chance that the count will get large.

Laundry List of Issues with Standard Collection Types

The data structures implemented by Array, Dictionary and Set are remarkably versatile: a huge class of problems is easily and efficiently solved by simple combinations of these abstractions. However, they aren't without drawbacks: you have probably run into cases when the standard collections exhibit suboptimal behavior:

  1. Insertion and removal in the middle of an Array can be slow when there are many items. (Keep the previous section in mind, though.)

  2. The all-or-nothing copy-on-write behavior of Array, Dictionary and Set can lead to performance problems that are hard to detect and fix. If the underlying storage buffer is being shared by multiple collection instances, the modification of a single element in any of the instances requires creating a full copy of every element.

    It is not at all obvious from the code when this happens, and it is even harder to reliably check for. You can't (easily) write unit tests to check against accidental copying of items with value semantics!

  3. With standard collection types, you often need to think about memory management.

    Arrays and dictionaries never release memory until they're entirely deallocated; a long-lived collection may hold onto a large piece of memory due to an earlier, temporary spike in the number of its elements. This is a form of subtle resource leak that can be hard to detect. On memory-constrained systems, wasting too much space may cause abrupt process termination.

    Appending a new element to an array, or inserting a new element into a dictionary or a set are usually constant time operations, but they sometimes take O(n) time when the collection exhausts its allocated capacity. These spikes in execution time are often undesired, but preventing them requires careful size analysis.
    If you reserve too little space, you'll still get spikes; if you reserve too much, you're wasting memory.

  4. The order of elements in a Dictionary or a Set is undefined, and it isn't even stable: it may change after seemingly simple mutations. Two collections with the exact same set of elements may store them in wildly different order.

  5. Hashing collections require their keys to be Hashable. If you want to use your own type as the key, you need to write a hash function yourself. It is annoyingly hard to write a good hash function, and it is even harder to test that it doesn't produce too many collisions for the sets of values your code will typically use.

  6. The possibility of hash collisions make Dictionary and Set badly suited for tasks which require guaranteed worst-case performance. (E.g. server code may face low-bandwidth denial of service attacks due to artificial hash collisions.)

  7. Array concatenation takes O(n) time, because it needs to put a copy of every element from both arrays into a new contiguous buffer.

  8. Merging dictionaries or taking the union/intersection etc. of two sets are all costly O(n) operations, even if the elements aren't interleaved at all.

  9. Creating an independently editable sub-dictionary or subset requires elementwise iteration over either the entire collection, or the entire set of potential target items. This is often impractical, especially when the collection is large but sparse.

    Getting an independently editable sub-array out of an array takes time that is linear in the size of the result. (ArraySlice is often helpful, but it is most effective as a short-lived read-only view in temporary local variables.)

These issues don't always matter. In fact, lots of interesting problems can be solved without running into any of them. When they do occur, the problems they cause are often insignificant. Even when they cause significant problems, it is usually straightforward to work around them by chosing a slightly different algorithm.

But sometimes you run into a case where the standard collection types are too slow, and it would be too painful to work around them.

B-Trees to the Rescue!

B-trees solve all of the issues above. (Of course, they come with a set of different issues of their own. Life is hard.)

Let's enumerate:

  1. Insertion or removal from any position in a B-tree-based data structure takes O(log(n)) time, no matter what.

  2. Like standard collection types, B-trees implement full copy-on-write value semantics. Copying a B-tree into another variable takes O(1) time; mutations of a copy do not affect the original instance.

    However, B-trees implement a greatly improved version of copy-on-write that is not all-or-nothing: each node in the tree may be independently shared with other trees.

    If you need to insert/remove/update a single element, B-trees will copy at most O(log(n)) elements to satisfy value semantics, even if the tree was entirely shared before the mutation.

  3. Storage management in B-trees is granular; you do not need to reserve space for a B-tree in advance, and it never allocates more memory than it needs to store the actual number of elements it contains.

    Storage is gradually allocated and released in small increments as the tree grows and shrinks. Storage is only copied when mutating shared elements, and even then it is done in small batches.

    The performance of B-trees is extremely stable, with no irregular spikes ever.

    (Note that there is a bit of leeway in allocations to make it easy to balance the tree. In the worst case, a B-tree may only fill 50% of the space it allocates. The ratio is typically much higher than that, though.)

  4. B-trees always keep their items sorted in ascending key order, and they provide efficient positional lookups. You can get the ith smallest/largest item in a tree in O(log(n)) time.

  5. Keys of a B-tree need to be Comparable, not Hashable. It is often significantly easier to write comparison operators than hash functions; it is also much easier to verify that the implementation works correctly. A buggy < operator will typically lead to obvious issues that are relatively easy to catch; a badly collisioning hash may go undetected for years.

  6. Adversaries (or blind chance) will never produce a set of elements for which B-trees behave especially badly. The performance of B-trees only depends on the size of the tree, not its contents. (Provided that key comparison also behaves uniformly, of course. If you allow multi-megabyte strings as keys, you're gonna have a bad time.)

  7. Concatenation of any two B-trees takes O(log(n)) time. For trees that aren't of a trivial size, the result will share some of its nodes with the input trees, deferring most copying until the time the tree needs to be modified. (Which may never happen.) Copy-on-write really shines with B-trees!

  8. Merging the contents of two B-trees into a single tree takes O(n) time in the worst case, but if the elements aren't too badly interleaved, it can often finish in O(log(n)) time by linking entire subtrees into the result in one go.

    Set operations on the keys of a B-tree (such as calculating the intersection set, subtraction set, symmetric difference, etc.) also exploit the same trick for a huge performance boost. If the input trees are mutated versions of the same original tree, these operations are also able to skip elementwise processing of entire subtrees that are shared between the inputs.

  9. The SubSequence of a B-tree is also a B-tree. You can slice and dice B-trees any way you like: getting a fully independent copy of any prefix, suffix or subrange in a tree only takes O(log(n)) time. You can then take the subtree you extracted and insert it into another tree; this also costs O(log(n)), no matter where in the tree you want to put it. (You do need to keep the order of keys correct, though.)

Implementation Notes

  • BTree is a generic struct with copy-on-write value semantics. Internally, it stores its data in nodes with a fixed maximum size, arranged in a tree. BTree type provides a full set of hand-tuned high-level operations to work with elements of a B-tree.

    Nodes are represented by instances of a reference type that is not exported as public API. (Low-level access to individual tree nodes would be tricky to get right, and it would prevent future optimizations, such as moving node counts up to parent nodes.)

  • By default, the tree order (a.k.a., the fanout, or the maximum number of children) is set such that each node stores about 16KiB data. Larger node sizes make lookups faster, while insertion/removal becomes slower -- 16KiB is a good enough approximation of the optimal node size on most modern systems. (But you can also set a custom node size if you know better. Note though that you cannot mix-n-match trees of different orders.) Thus, on a 64-bit system, a B-tree holding Int elements will store about 2047 elements per node. Wow!

  • Individual B-tree nodes may be independently shared between multiple B-trees. When mutating a (partially or fully) shared tree, copy-on-write is restricted to only clone the nodes whose subtree is actually affected by the mutation. This has the following consequences:

    • Nodes cannot contain a reference to their parent node, because it is not necessarily unique.

    • Mutations of shared trees are typically much cheaper than copying the entire collection at once, which is what standard collection types do.

    • The root node is never shared between trees that are not equal.

  • BTree allows elements with duplicate keys to be stored in the tree. (In fact, List works by using the same (empty) key for all elements.)

    All methods that take a key to find an element let you (optionally) specify if you want to work with the first or last matching element, or if you're happy with any match. The latter option is sometimes faster as it often allows the search to stop at the topmost matching element. There is also a selector that looks for the element after the specified key -- this can be nice to determine the position of the end of a range of matching items.

  • Each node keeps track of the number of items in its entire subtree, so efficient positional lookup is possible. For any i, you can get, set, remove or insert the ith item in the tree in log(n) time.

  • There is a BTreeIterator and a BTreeIndex that provide the usual generator/indexing semantics. While individual element lookup usually takes O(log(n)) operations, iterating over all elements via these interfaces requires linear time. Using the generator is faster than indexing, so you should prefer using it whenever possible. There are methods to start an iterator from the middle of the tree: from any offset, any index, or any key.

  • Note that forEach has a specialized recursive implementation, which makes it the fastest way to iterate over B-trees. There is even a variant that allows you to stop the iteration when you had seen enough items and want to get off the carousel.

  • BTreeCursor is an easy-to-use, general-purpose batch editing facility that allows you to manipulate the elements of a B-tree conveniently and highly efficiently. You can use a cursor to walk over the contents of a tree, modifying/inserting/removing elements as needed without a per-element log(n) lookup overhead. If you need to insert or remove a bunch or consecutive elements, it is better to use the provided bulk removal/insertion methods than to process them individually (Range operations have O(log(n)) complexity vs. elementwise processing takes O(k * log(n)).)

  • Internally, navigation in a B-Tree is based on abstract primitives that maintain a path to a particular position in the tree, as described by the BTreePath protocol. The methods directly provided by this protocol are too low-level for convenient use, but the protocol has extension methods built on top of these that support familiar concepts like moving back and forth step by step, jumping to a specific offset in the tree, or looking up a particular key.

    Indexes, generators and cursors use their particular implementation of BTreePath to represent their own path flavors. All three of them maintain a path of nodes from the root of the tree to a particular slot of a particular node, but the details are very different:

    • A BTreeIndex may not hold a strong reference to its tree, because that would interfere with copy-on-write when you want to mutate the tree at a certain index. Thus, indices are wrappers around a BTreeWeakPath, which uses weak references, and needs to tread very carefully in order to detect when one of its references gets out of date.

    • Meanwhile a BTreeIterator is supposed to support standalone iteration over the contents of the tree, so it must contain strong references. It uses a BTreeStrongPath to represent the path of its next element. While an iterator only needs to be able to move one step forward, BTreeStrongPath supports the full tree navigation API, making it very useful elsewhere in the codebase whenever we need a kind of read-only cursor into a tree. For example, the tree merging algorithm uses strong paths to represent its current positions in its input trees.

    • Finally, a BTreeCursor needs to maintain a path where each node is uniquely held by the cursor, ready for mutation. (A cursor owns its own copy of the tree, and does not share it with the outside world until it is finished.) This special path flavor is implemented by BTreeCursorPath. To speed things up, this struct intentionally breaks the node counts on its current path, to allow for super speedy elementwise insertions and removals. The counts are carefully recalculated whenever the path moves off a node's branch in the tree.

  • It would be overkill to create an explicit path to look up or modify a single element in the tree on its own, so BTree also provides a set of recursive methods that implement the same sort of lookups and simple mutations. They are faster when you need to retrieve a single item, but they aren't efficient when called repeatedly.
  • BTree includes a bulk loading algorithm that efficiently initializes fully loaded trees from any sorted sequence. You can also specify a fill factor that's less than 100% if you expect to insert data into the middle of the tree later; leaving some space available may reduce work to keep the tree balanced. The bulk loader can optionally filter out duplicate keys for you. It verifies that the elements are in the correct order and traps if they aren't.

    The bulk loader is based on a general BTreeBuilder struct that specializes on appending elements to a newly created tree. Beside individual elements, it also supports efficiently appending entire B-trees. This comes useful in optimized tree merging algorithms.

  • Constructing a B-tree from an unsorted sequence of elements inserts the elements into the tree one by one; no buffer is allocated to sort elements before loading them into the tree. This is done more efficiently than calling an insertion method with each element one by one, but it is likely still slower than a quicksort. (So sort elements on your own if you can spare the extra memory.)

Remark on Performance of Imported Generics

Current versions of the Swift compiler are unable to specialize generic types that are imported from external modules other than the standard library. (In fact, it is not entirely incorrect to say that the standard library works as if it was compiled each time anew as part of every Swift module rather than linked in as an opaque external binary.)

This limitation puts a considerable limit on the raw performance achievable by collection types imported from external modules, especially if they are parameterized with simple, extremely optimizable value types such as Int or even String. Relying on import will incur a 10-200x slowdown when your collection is holding these most basic value types. (The effect is much reduced for reference types, though.)

Without access to the full source code of the collection, the compiler is unable to optimize away abstractions like virtual dispatch tables, function calls and the rest of the fluff we've learned to mostly ignore inside a module. In cross-module generics, even retrieving a single Int will necessarily go through at least one lookup to a virtual table. This is because the code that implements the unspecialized generic also executes for type parameters that contain reference types, whose reference count needs to be maintained.

If raw performance is essential, currently the only way out of this pit is to put the collection's code inside your module. (Other than hacking stdlib to include these extra types, of course -- but that is a bad idea for a thousand obvious reasons.) However, having each module maintain its own set of collections would smell horrible, plus it would make it hard or impossible to transfer collection instances across module boundaries. Plus, if this strategy would be used across many modules, it would lead to a C++ templates-style (or worse) code explosion. A better (but still rather unsatisfactory) workaround is to compile the collection code with the single module that benefits most from specialization. The rest of the modules will still have access to it, if in a much slower way.

The Swift compiler team has plans to address this issue in future compiler versions, e.g., by allowing library authors to manually specialize generics for a predetermined set of type parameters.

Comments
  • Unable to build

    Unable to build

    I am attempting to use BTree in a project but I'm receiving build errors. Here is my Package.swift:

    import PackageDescription

    let package = Package( name: "swift-spell-checker", dependencies: [ .Package(url: "https://github.com/lorentey/BTree.git", versions: Version(2,1,0)..<Version(2,2,0)), ] )

    Packages/BTree-2.1.0/Sources/BTreeIndex.swift:166:89: error: expected initial value after '=' internal func expectValid(@autoclosure expression: Void->Bool, file: StaticString = #file, line: UInt = #line) {

    opened by cliffordh 11
  • Thread safety

    Thread safety

    Given the copy-on-write semantics, is reading the original tree thread safe?

    let set_1: SortedSet = ...
    var set_2 = set_1
    set_2.insert(new_element)
    

    Can we safely assume that set_1 can safely be read in a different thread while set_2 is mutated?

    opened by wildthink 4
  • BTree(Key|Value)Iterator cannot be constructed out of module

    BTree(Key|Value)Iterator cannot be constructed out of module

    While the BTree class is exposed to allow mere mortals to create collections based on it, the BTreeKeyIterator and BTreeValueIterator types have internal accessors and cannot be constructed from outside the BTree module. That feels somewhat inconsistent.

    opened by fay59 4
  • codebeat badge

    codebeat badge

    Is it fine to add codebeat badge to README?

    codebeat is automated code review tool for Swift, Ruby & Go that helps get instant feedback on code quality.

    "Quick wins" suggested by codebeat could be a nice candidate for a pull request and help other developers become contributors.

    FYI. To be fully open and honest. I'm co-founder of codebeat.

    opened by korzonek 3
  • OrdererSet updating and removing equal objects with different order question

    OrdererSet updating and removing equal objects with different order question

    I don't fully understand the behaviour below or is this a bug? Updating and removing equal objects with different order fails or succeeds if the comparable object is below or above the other object in set.

    Thanks.

    import Foundation
    import XCTest
    
    class DummyObject: Comparable
    {
        let name:String
        let order:Int
        
        init(name:String, order:Int) {
            self.name = name
            self.order = order
        }
        
        static func ==(a: DummyObject, b: DummyObject) -> Bool { return a.name == b.name }
        static func <(a: DummyObject, b: DummyObject)  -> Bool { return a.order > b.order }
    }
    
    class RandomTests: XCTestCase {
        
        func testSet()
        {
            var testSet = SortedSet<DummyObject>()
            
            // Creates dummy objects with name and order
            let createObject: (Int, Int) -> DummyObject = { index, order in
                let newName = "name \(index)"
                let newOrder = order
                return DummyObject(name: newName, order: newOrder)
            }
            
            // Inserts and logs update or replace
            let insertObject: (DummyObject) -> Void = { object in
                if (testSet.update(with: object) != nil) {
                    print("updated", object.name, object.order);
                } else {
                    print("inserted", object.name, object.order);
                }
            }
            
            // Removes and logs remove or not found
            let removeObject: (DummyObject) -> Bool = { object in
                if (testSet.remove(object) != nil) {
                    print("removed", object.name, object.order);
                    return true
                } else {
                    print("could not find \(object.name, object.order) to remove");
                    return false
                }
            }
            
            let object0 = createObject(0, 100)
            let object1 = createObject(1, 101)
            let object2 = createObject(2, 102) // Update test
            let object3 = createObject(2, 103) // Update test
            
            insertObject(object0)
            insertObject(object1)
            insertObject(object2) // Update test
            insertObject(object3) // update test
            
            // Update object object2 with comparable object with different order
            // -> (FAILS WITH 101 BUT NOT WITH 103, ACTUAL OBJECT IN SET IS 102)
            XCTAssertEqual(testSet.count, 3, "object2 should've been replaced by object3 and not inserted")
    
            // Remove object 3 with comparable object with different order
            // -> (FAILS WITH 102 BUT NOT WITH 104, ACTUAL OBJECT IN SET IS 103)
            let likeObject3 = createObject(2, 104)
            XCTAssert(removeObject(likeObject3), "object3 should've been removed")
            
            // Remove object 1 with comparable object with different order (FAILS)
            // -> (FAILS)
            let likeObject1 = createObject(1, 10)
            XCTAssert(removeObject(likeObject1), "object1 should've been removed")
        }
    }
    
    
    opened by Fabezi 2
  • Update `.gitignore` file for convenient use with Xcode.

    Update `.gitignore` file for convenient use with Xcode.

    Hello~

    Xcode constantly makes changes in xcuserdata, and Finder also make changes in .DS_Store files. I believe this addition to gitignore can help many people' life easier.

    opened by eonil 2
  • BTree is not buildable with the Swift Package Manager

    BTree is not buildable with the Swift Package Manager

    If I try to import this package using SPM, I get an error saying that the test folder has an unsupported layout, and that it can be fixed by moving those files inside a module. So, moving the test files into a separate folder (e.g. Tests/BTreeTests/... should fix it. :)

    bug 
    opened by timvermeulen 2
  • Corrupt BTree merge results when duplicate keys leak across common subtree boundaries

    Corrupt BTree merge results when duplicate keys leak across common subtree boundaries

    Most of BTree's merge methods have subtle bugs in their common subtree detection logic, which can sometimes lead to corrupt results and/or assertions when duplicate keys cross the boundaries of a common subtree.

    It is not easy to reproduce this bug using public API, but the following test manages to do it:

        func test_Intersection_withModifiedSelf() {
            var tree = BTree<Int, Int>(order: 5)
            for i in 0 ..< 10 {
                for j in 0 ..< 2 {
                    tree.insert((i, j))
                }
            }
    
            for i in 0 ..< 10 {
                var other = tree
                other.withCursor(atOffset: 2 * i) { cursor in
                    cursor.remove(2)
                }
                other.assertValid()
                let test = tree.intersection(other)
                XCTAssertEqual(test.map { $0.0 }, other.map { $0.0 })
            }
        }
    

    Results on current master branch:

    Test Case '-[BTreeTests.BTreeMergeTests test_Intersection_withModifiedSelf]' started.
    XCTAssertEqual failed: ("[1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9]") is not equal to ("[1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9]") - 
    XCTAssertEqual failed: ("[0, 0, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 0, 0, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9]") is not equal to ("[0, 0, 2, 2, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9]") - 
    XCTAssertEqual failed: ("[0, 0, 1, 1, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 0, 0, 1, 1, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9]") is not equal to ("[0, 0, 1, 1, 3, 3, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9]") - 
    XCTAssertEqual failed: ("[0, 0, 1, 1, 2, 2, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 0, 0, 1, 1, 2, 2, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9]") is not equal to ("[0, 0, 1, 1, 2, 2, 4, 4, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9]") - 
    XCTAssertEqual failed: ("[0, 0, 1, 1, 2, 2, 3, 3, 5, 5, 6, 6, 7, 7, 8, 8, 0, 0, 1, 1, 2, 2, 3, 3, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9]") is not equal to ("[0, 0, 1, 1, 2, 2, 3, 3, 5, 5, 6, 6, 7, 7, 8, 8, 9, 9]") - 
    XCTAssertEqual failed: ("[0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 6, 6, 7, 7, 8, 8, 0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 6, 6, 7, 7, 8, 8, 9, 9]") is not equal to ("[0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 6, 6, 7, 7, 8, 8, 9, 9]") - 
    XCTAssertEqual failed: ("[0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 7, 7, 8, 8, 0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 7, 7, 8, 8, 9, 9]") is not equal to ("[0, 0, 1, 1, 2, 2, 3, 3, 4, 4, 5, 5, 7, 7, 8, 8, 9, 9]") - 
    Test Case '-[BTreeTests.BTreeMergeTests test_Intersection_withModifiedSelf]' failed (0.076 seconds).
    

    Note how the returned results are often not even sorted correctly.

    The bug affects distinctUnion, subtracting, symmetricDifference, intersection, and the new bag variants, but not union.

    opened by lorentey 2
  • Make BTreeKeyIterator public

    Make BTreeKeyIterator public

    My case for them is that I made a sorted list structure, which is basically an OrderedSet that allows duplicates. It surprised me that I wasn't able to create the same type of iterator that OrderedSet was creating, especially since it has no dependency.

    I bailed out on BTreeValueIterator because it needs the EmptyKey structure, which is clearly an implementation detail, and it would be weird to expose an EmptyKey structure publicly. Of course, if the lack of symmetry is deemed a bad thing, I don't mind going back to duplicating like 10 lines of code in my project.

    opened by fay59 2
  • Proof or argument that BTree concatenation is O(log n) for arbitrary operation sequences?

    Proof or argument that BTree concatenation is O(log n) for arbitrary operation sequences?

    Thanks for publishing this library.

    I have been looking for something like this, including "Relaxed Radix Balanced Trees" (RRB trees), trying to determine precisely how to guarantee O(log n) time for arbitrary sequences of operations that include concatenating two B-trees used to represent integer-indexed arrays. The challenge seems to be to guarantee that the height/depth of the tree does not exceed O(log n) for an arbitrary sequence of concatenate, split, append, prepend, etc. operations.

    Does this library, and/or your book "Optimizing Collections", contain any kind of proof or argument that this is the case?

    For example, do you have any kind of invariants on the "shape" of the BTree data structure that is always true, regardless of the sequence of operations that produced them, that guarantee the depth remains at most O(log n)?

    opened by jafingerhut 1
  • Removed dynamic from library option in Package.swift

    Removed dynamic from library option in Package.swift

    Removed dynamic from library option in Package.swift to let client project determine whether to link dynamically or staticly against BTree.

    This is the recommended approach as can be read here: https://github.com/apple/swift-package-manager/blob/master/Documentation/PackageDescriptionV4.md

    I suffer from this in my own project where I want to link staticly against BTree.

    Mind also that it is not possible to override it once specified: https://forums.swift.org/t/building-distributable-executable-using-swiftpm/13792

    opened by werner77 1
  • Add ability to pass in comparator instead of making keys implement Comparable

    Add ability to pass in comparator instead of making keys implement Comparable

    Maybe I'm missing something, but it seems like with some algorithms there is an advantage to passing in the function that does the comparison. As an example, I'm computing line segment intersections with a sweep line algorithm. I wanted to try your BTree or SortedSet because my current data structure (a red black tree) seems slow. But it's difficult to use these classes because I need a comparison function that uses state from the algorithm: the sweep line, current point, etc. I can get around this by adding a reference to every single element in the SortedSet, but they were just integers, so that's at least doubled the storage needed per element.

    opened by rnikander 0
  • Code that produced performance graph is missing

    Code that produced performance graph is missing

    I am asking myself questions about the code that was run to produce the insertion performance graph. In particular, I'm wondering whether the binary searches for insertion position were done on an UnsafeBufferPointer projected from the array or not. It would be nice to be able to see the code so I could answer questions like this for myself.

    (It would be great to have an numbers update using more recent compilers, too)

    opened by dabrahams 0
  • BTree supports Codable

    BTree supports Codable

    Hey,

    thank you for the BTree, I use it in my other pet project and it works flawlessly!

    This is an attempt to support of Swift Codable to BTree.

    The pull request might be much smaller if Swift Tuple type and Void type supported Codable out of the box. With the current Swift state, tuple support can be fixed "internally", without many changes.

    But SortedSet and SortedBag being BTree<Element, Void> is a bummer, as in extension that means, that BTree<Codable, Codable> should be Codable, and BTree<Codable, Void> should be Codable and it appears that is not allowed. Hence the need for EmptyValue.

    I understand it is a big code change just to support Codable, so feel free to reject. I would be very excited if we agree on a different workaround for this, looking forward to your comments!

    Also, this works with Swift 4.2, what would be the best way to change the Swift version in podspec? Having multiple pods for different versions? To be honest, I am not very familiar with versioning pods for different Swift versions.

    opened by bogdad 1
  • Add fully open-ended range sub tree support

    Add fully open-ended range sub tree support

    Right now half open ranges and closed ranges are supported, but there is no support for getting a fully open range:

    (start, end)

    or a half open range where the half-open boundary is on the lower end:

    (start, end]

    I suggest to add methods.

    suffix(after: ...)

    to BTree (and SortedSet where I need it).

    I can help with a MR if wanted.

    opened by werner77 0
  • Slices of BTree collections do not share indices with the original collection

    Slices of BTree collections do not share indices with the original collection

    Swift's Collection protocol has a number of nontrivial semantic constraints that are rather under-documented; one of these is that a collection's SubSequence must be a collection that shares indices with the original instance. (The type system is currently not even able to enforce the requirement about SubSequence being a collection, or that its index type must be the same as the top-level collection. Presumably, these are going to be fixed, but the requirement about sharing index values is unlikely to be ever enforced by the type system.)

    Slices of valid collections should have their start index the same as the index of their starting element in the original collection. BTree gets this wrong:

    let list = List(0 ..< 10)
    print(list[3 ..< 10].startIndex) // ==> 0
    let array = Array(0 ..< 10)
    print(array[3 ..< 10].startIndex) // ==> 3
    

    This means BTree's collection types aren't really collections.

    The way to fix this is to replace the custom SubSequence typealiases with one of the default Slice types in each. This is a source-breaking change, so it requires a major version bump.

    The original BTree slicing behavior (i.e., extracting a separate sub-collection of the same type) will be moved to a family of functions (list.extract(3..<10) or list.sublist(3..<10) or somesuch).

    opened by lorentey 0
Releases(v4.1.0)
  • v4.1.0(Sep 7, 2017)

    This release updates the project to Swift 4 with no functional changes.

    BTree is now part of the Attaswift project. The bundle identifiers in the supplied Xcode project have been updated accordingly.

    Note that the URL for the package's Git repository has changed; please update your references.

    Source code(tar.gz)
    Source code(zip)
  • v4.0.2(Feb 7, 2017)

    This release contains the following changes:

    • BTree now compiles in Swift 3.1.
    • Issue #5: There is a new PerformanceTests target in the Xcode project containing some simple benchmarks. This facility is experimental and may be replaced later.
    • (Xcode project) The macOS deployment target was corrected to 10.9. Previously it was set at 10.11 by mistake.
    • (Xcode project) The build number is now correctly set in the tvOS framework.
    • (Xcode project) Code signing has been disabled, following Xcode 8 best practices.
    Source code(tar.gz)
    Source code(zip)
  • v4.0.1(Nov 8, 2016)

    This is a quick bugfix release restoring support for the Swift Package Manager. It includes no source-level changes.

    Bug Fixes

    • Issue #23: BTree is not buildable with the Swift Package Manager
    Source code(tar.gz)
    Source code(zip)
  • v4.0.0(Nov 7, 2016)

    This is a major release incorporating API-breaking changes. It also includes fixes for several high-severity bugs uncovered while working on new features, so this is a highly recommended upgrade.

    Breaking changes

    • To support multiset operations, some of BTree's methods have grown a new required parameter specifying the key matching strategy. To get the original behavior, specify .groupingMatches as the matching strategy, except for union, as noted below. The compiler will provide fixits, but you'll still need to update the code by hand. This affects the following methods:
      • BTree.isSubset(of:)
      • BTree.isStrictSubset(of:)
      • BTree.isSuperset(of:)
      • BTree.isStrictSuperset(of:)
      • BTree.union(:) -- use the .countingMatches strategy to get the original, multiset-appropriate, behavior.
      • BTree.distinctUnion(:) -- removed; use union with the .groupingMatches strategy instead.
      • BTree.subtracting(:) (both overloads)
      • BTree.intersection(:) (both overloads)
      • BTree.symmetricDifference(:)

    New Features

    • SortedBag is a new generic collection implementing an ordered multiset.
    • BTreeMatchingStrategy is a new public enum for selecting one of two matching strategies when comparing elements from two trees with duplicate keys.
    • BTree.index(forInserting:at:) is a new method that returns the index at which a new element with the given key would be inserted into the tree.
    • SortedSet.indexOfFirstElement(after:) is a new method that finds the lowest index whose key is greater than the specified key.
    • SortedSet.indexOfFirstElement(notBefore:) is a new method that finds the lowest index whose key is greater than or equal to the specified key.
    • SortedSet.indexOfLastElement(before:) is a new method that finds the greatest index whose key is less than the specified key.
    • SortedSet.indexOfLastElement(notAfter:) is a new method that finds the greatest index whose key is less than or equal to the specified key.

    Bug Fixes

    • Issue #19: BTree concatenation, set operations sometimes corrupt their input trees
    • Issue #20: Corrupt BTree merge results when duplicate keys leak across common subtree boundaries
    • Issue #21: BTree comparisons (subset/superset) may assert on certain shared subtrees
    • SortedSet.update(with:) now has a discardable result.
    Source code(tar.gz)
    Source code(zip)
  • v3.1.0(Oct 6, 2016)

    This is a feature release extending the functionality of SortedSet and Map with several new methods.

    New Features

    SortedSet

    Offset-based access

    • SortedSet.offset(of:) returns the offset to a particular member of the set.
    • SortedSet.remove(atOffset:) removes and returns the element at a particular offset in the set.

    Range-based operations

    • SortedSet.count(elementsIn:) returns the number of elements in the given open or closed range.
    • SortedSet.intersection(elementsIn:) returns the result of intersecting the set with a given open or closed range.
    • SortedSet.formIntersection(elementsIn:) is the in-place editing version of the above.
    • SortedSet.subtracting(elementsIn:) returns a set without members in the given open or closed range.
    • SortedSet.subtract(elementsIn:) is the in-place editing version of the previous method.

    Shifting

    • SortedSet.shift(startingAt start: Element, by delta: Element.Stride) is a new method for sorted sets with strideable elements. It adds delta to every element in the set that is greater than or equal to start. The elements are modified in place.

    All of these new methods run in logarithmic time, except for shift whose complexity is linear.

    Map

    • Map.offset(of:) is a new method for finding the offset of a particular key in the map. It has logarithmic complexity.

    Bug fixes

    • The tvOS target now generates a framework that is explicitly restricted to only use extension-safe API.
    Source code(tar.gz)
    Source code(zip)
  • v3.0.0(Sep 24, 2016)

    This release of BTree provides support for Swift 3.0, which involves extensive breaking API changes.

    • All API names have been reviewed and renamed to follow current Swift guidelines. (See SE-0023, SE-0005, SE-0006, SE-0118, and possibly others.) The resulting changes are too numerous to list here. Unfortunately, resource constraints prevented me from including forwarding availability declarations for renamed APIs; fixits won't be available, you'll have to rename usages in your code by hand. (Sorry about that.)
    • BTree's collection types now implement the new collection model described in SE-0065. BTreeIndex has been stripped of its public methods; use the new index manipulation methods in the various collection types instead. The underlying implementation hasn't been changed, but making the standalone index methods internal now allows for experimentation with more efficient indices without breaking API changes in the future.
    • OrderedSet was renamed to SortedSet, to prevent confusion with the similar class in Foundation. For a short while, SE-0086 renamed NSOrderedSet to OrderedSet in the Foundation framework, leading to a naming conflict with BTree. This was further aggravated by a naming lookup issue in the language itself that made it impossible to use the explicit name BTree.OrderedSet to work around the conflict. NSOrderedSet was quickly changed back to its original name, but the issue revealed that the two names are much too similar.
    • SortedSet was adapted to implement the new SetAlgebra protocol in SE-0059.
    • Lists that contain objects now have an arrayView property that returns an NSArray with the exact same values as the List in O(1) time. This is useful for using B-tree based lists in APIs that need arrays, without copying elements. (For example, you can now use NSCoder to encode Lists directly.)
    • Collection protocol conformance has been improved. List now explicitly conforms to RandomAccessCollection, while Map and SortedSet are now BidirectionalCollections. This required no major changes as these types already implemented everything that was necessary for conformance to these stricter protocols, but now conformance is made explicit.
    Source code(tar.gz)
    Source code(zip)
  • v3.0.0-rc.1(Sep 22, 2016)

    This release of BTree provides support for Swift 3.0, which involves extensive breaking API changes.

    • All API names have been reviewed and renamed to follow current Swift guidelines. (See SE-0023, SE-0005, SE-0006, SE-0118, and possibly others.) The resulting changes are too numerous to list here. Unfortunately, resource constraints prevented me from including forwarding availability declarations for renamed APIs; fixits won't be available, you'll have to rename usages in your code by hand. (Sorry about that.)
    • BTree's collection types now implement the new collection model described in SE-0065. BTreeIndex has been stripped of its public methods; use the new index manipulation methods in the various collection types instead. The underlying implementation hasn't been changed, but making the standalone index methods internal now allows for experimentation with more efficient indices without breaking API changes in the future.
    • OrderedSet was renamed to SortedSet, to prevent confusion with the similar class in Foundation. For a short while, SE-0086 renamed NSOrderedSet to OrderedSet in the Foundation framework, leading to a naming conflict with BTree. This was further aggravated by a naming lookup issue in the language itself that made it impossible to use the explicit name BTree.OrderedSet to work around the conflict. NSOrderedSet was quickly changed back to its original name, but the issue revealed that the two names are much too similar.
    • SortedSet was adapted to implement the new SetAlgebra protocol in SE-0059.
    • Lists that contain objects now have an arrayView property that returns an NSArray with the exact same values as the List in O(1) time. This is useful for using B-tree based lists in APIs that need arrays, without copying elements. (For example, you can now use NSCoder to encode Lists directly.)
    Source code(tar.gz)
    Source code(zip)
  • v2.1.0(Mar 23, 2016)

  • v2.0.0(Mar 6, 2016)

    This is a major release that includes breaking API changes, plus major new features and bug fixes.

    The package now implements all major features that were on my initial roadmap; further development will likely concentrate on refining the API, improving performance and adapting the package to new Swift versions. (Although it is never too late to propose new features!)

    This release supports Swift 2.1.1.

    Swift 2.2 is conditionally supported; add -DSwift22 to the Swift compiler flags to enable it. Note that this version of the module will compile with a number of warnings on 2.2; these will be fixed when Swift 2.2 is released.

    Swift 3 is not yet supported. In particular, API names mostly follow Swift 2 conventions, although certain internal APIs are following the new design conventions.

    New Features

    General

    • The README has been rewritten and greatly expanded in scope. It now includes a short intro and a detailed rationale section.
    • The term "position" has been systematically replaced with "offset" throughout the entire API and documentation.

    BTree

    • The second component of BTree's elements has been renamed from "payload" to "value" throughout the entire API and documentation. This is for consistency with other Swift key-value collections.
    • BTree now includes efficient set operations: union, distinctUnion, subtract, exclusiveOr, and intersect. These are based on keys, and exploit the order of elements to detect when they can skip elementwise processing for specific subtrees. subtract and intersect also have overloads for selecting for keys contained in a sorted sequence.
    • BTree now supports efficient tree comparison operations: elementsEqual, isDisjointWith, isSubsetOf, isStrictSubsetOf, isSupersetOf, and isStrictSupersetOf. All of these except the first work like set comparisons on the keys of the tree. They exploit the element order and detect shared nodes to skip over subtrees whenever possible. When Value is Equatable, you can now compare B-trees for equality using the == operator.
    • BTreeKeySelector now includes an After case, which selects first the element that has a key that is larger than the specified key. This is occasionally useful.
    • BTree now defines explicit overrides for the following methods on SequenceType: prefix, suffix, prefixUpTo, prefixThrough, suffixFrom, dropLast, dropFirst, first, last, popFirst, popLast, removeFirst and removeLast. The new implementations perform better than the default, index-based implementations provided by CollectionType. There are also new overloads for key-based slicing.
    • BTree gained methods for starting a generator at any specific key, index or offset.
    • The withCursor family of methods now allow their closure to return a value.
    • BTree.remove(:) now returns the full element that was removed, not just the value component.
    • Bulk loading initializers of BTree now respect the original order of elements, and optionally allow filtering out elements with duplicate keys. When initializing a Map from a sequence that contains duplicates, only the last element is kept for any key.
    • New methods: BTree.removeAtIndex(:), and BTree.removeAll()
    • BTreeIndex now contains efficient O(log(n)) implementations for advancedBy and distanceTo, replacing their default O(n) variants.
    • BTreeIndex is now Comparable. However, comparing indices only makes sense if the indices belong to the same tree.
    • BTreeCursor now has an element property that allows you to get or update the entire (key, value) pair at the current position.

    OrderedSet

    • OrderedSet is a new general-use wrapper around BTree, implementing a sorted analogue of Set.

    List

    • The generator type of List has been renamed from ListGenerator to BTreeValueGenerator.
    • List explicitly implements RangeReplaceableCollectionType.
    • You can now use + to concatenate two List values, just like you can with Arrays.

    Map

    • Map now supports elementsEqual, complete with an == operator when its Value is Equatable.
    • Map gained two methods for offset-based removal of elements: removeAtOffset and removeAtOffsets
    • You can now merge two Map values into a single map using merge.
    • You can extract a submap from a Map that includes or excludes a specific set or sequence of keys.

    Improvements

    • Navigation inside the B-tree is now unified under a single protocol, BTreePath, for all three flavors of tree paths: BTreeStrongPath, BTreeWeakPath and BTreeCursorPath.
    • The complexity of BTree.endIndex has been reduced from O(log(n)) to O(1). This also improves the endIndex properties of Map and OrderedSet.
    • Iterating over B-trees is now a bit faster, as getting to the successor of an item does not normally involve array lookups.
    • BTree.shiftSlots is a new internal method for shifting elements between nodes at the same level. This operation is often useful while reorganizing/rebalancing the tree.
    • The bulk loading algorithm has been extracted into a separate internal struct and generalized to allow appending full trees, not just elements.
    • The generated docs now include nice method group headers splitting the method lists into organized chunks.

    Bug fixes

    • Fixed issue #3, "Crash when inserting Element in List".
    • The copy-on-write semantics of BTree.withCursor(at: Index) have been fixed.
    • BTree now never allows its arrays to get larger than their specified order. (Previously, BTree.join could allow arrays to grow twice the maximum size, leading to wasted capacity.)
    Source code(tar.gz)
    Source code(zip)
  • v1.1.0(Feb 24, 2016)

    This is a feature release that includes the following changes:

    New features

    • BTree, List and Map are now their own subsequences. This makes slicing them much more convenient.
    • BTree now includes a family of subtree() methods to create subtrees based on index, position or key ranges.
    • Map now includes a family of submap() methods to create subtrees based on index, position or key ranges.
    • BTreeCursor.moveToKey(key, choosing: selector) is a new method for repositioning a cursor at a specific key.
    • BTreeCursor.removeAll() is a new method for removing all elements.
    • BTreeCursor.removeAllBefore(includingCurrent:) is a new method that removes all elements before the current position.
    • BTreeCursor.removeAllAfter(includingCurrent:) is a new method that removes all elements after the current position.
    • BTree.withCursorAt(index) is a new method that allows you to create a cursor positioned at an index. (Note that it doesn't make sense to reposition a cursor that's already been created, since creating a cursor invalidates existing indices.)

    Improvements

    • The default tree order is now based on the full element size, not just the size of the key (like in 1.0.0). The maximum node size has been bumped to 16kiB to compensate for this.
    • BTreeIndex now includes the slot number for each node on its path, making navigation a bit faster.
    • Position-based searches now start at the end of the tree if the position we're looking for is likely closer to the end than the start. This is a potential 2x improvement.
    • BTree.indexOfPosition now accepts the element count as a position, returning the tree's end index.

    Other changes

    • BTreeCursor.insertBefore has been renamed to insert.
    • BTreeCursor.isValid is not a public property anymore; invalid cursors aren't supposed to leak to correct programs.
    • There is now a shared tvOS scheme in the Xcode project. D'oh.
    • All public APIs are now documented.
    • @warn_unused_result attributes have been added for APIs that need them.
    Source code(tar.gz)
    Source code(zip)
  • v1.0.0(Feb 24, 2016)

Owner
A collection of useful Swift packages
null
A fast Swift diffing library.

HeckelDiff Pure Swift implementation of Paul Heckel's A Technique for Isolating Differences Between Files Features This is a simple diff algorithm tha

Matias Cudich 166 Oct 6, 2022
🦀Amazingly incredible extraordinary lightning fast diffing in Swift

DeepDiff ❤️ Support my apps ❤️ Push Hero - pure Swift native macOS application to test push notifications PastePal - Pasteboard, note and shortcut man

Khoa 2k Dec 29, 2022
💻 A fast and flexible O(n) difference algorithm framework for Swift collection.

A fast and flexible O(n) difference algorithm framework for Swift collection. The algorithm is optimized based on the Paul Heckel's algorithm. Made wi

Ryo Aoyama 3.3k Jan 4, 2023
Differific - a fast and convenient diffing framework.

Differific Description Differific is a diffing tool that helps you compare Hashable objects using the Paul Heckel's diffing algorithm. Creating a chan

Christoffer Winterkvist 127 Jun 3, 2022
Swift-extensions - Swift package extending the Swift programming language.

swift-extensions A package containing extensions for the Swift programming language. Contribution Reporting a bug If you find a bug, please open a bug

Alexandre H. Saad 2 Jun 12, 2022
Commonly used data structures for Swift

Swift Collections is an open-source package of data structure implementations for the Swift programming language.

Apple 2.7k Jan 5, 2023
Examples of commonly used data structures and algorithms in Swift.

Swift Structures This project provides a framework for commonly used data structures and algorithms written in a new iOS development language called S

Wayne Bishop 2.1k Dec 28, 2022
Simple diff library in pure Swift

Diff Simple diffing library in pure Swift. Installing You can use Carthage or Swift Package Manager to install Diff. Usage Start by importing the pack

Sam Soffes 120 Sep 9, 2022
A functional tool-belt for Swift Language similar to Lo-Dash or Underscore.js in Javascript

Dollar Dollar is a Swift library that provides useful functional programming helper methods without extending any built in objects. It is similar to L

Ankur Patel 4.2k Jan 4, 2023
Swift type modelling the success/failure of arbitrary operations.

Result This is a Swift µframework providing Result<Value, Error>. Result<Value, Error> values are either successful (wrapping Value) or failed (wrappi

Antitypical 2.5k Dec 26, 2022
Swift μ-framework for efficient array diffs and datasource adapters.

Buffer Swift μ-framework for efficient array diffs, collection observation and data source implementation. C++11 port here Installation cd {PROJECT_RO

Alex Usbergo 348 Aug 2, 2022
A Graph Data Structure in Pure Swift

SwiftGraph SwiftGraph is a pure Swift (no Cocoa) implementation of a graph data structure, appropriate for use on all platforms Swift supports (iOS, m

David Kopec 700 Dec 16, 2022
A Generic Priority Queue in Pure Swift

SwiftPriorityQueue SwiftPriorityQueue is a pure Swift (no Cocoa) implementation of a generic priority queue data structure, appropriate for use on all

David Kopec 350 Dec 22, 2022
Super lightweight DB written in Swift.

Use of value types is recommended and we define standard values, simple structured data, application state and etc. as struct or enum. Pencil makes us

Naruki Chigira 88 Oct 22, 2022
NSCoding's counterpart for Swift structs.

Dekoter Why You Might Be Interested How Much Familiar It Feels One More Example What We've Learned from It Features Save an Object to UserDefaults Arc

Artem Stepanenko 25 May 15, 2022
Algorithms and data structures in Swift, with explanations!

Welcome to the Swift Algorithm Club! Here you'll find implementations of popular algorithms and data structures in everyone's favorite new language Sw

raywenderlich 27.3k Jan 8, 2023
A Distributed Value Store in Swift.

Impeller is a Distributed Value Store (DVS) written in Swift. It was inspired by successful Distributed Version Control Systems (DVCSes) like Git and

David Coyle 1 Jun 17, 2020
A Swift probability and statistics library

Probably Probably is a set of Swift structures for computing the probability and cumulative distributions of different probablistic functions. Right n

Harlan Haskins 270 Dec 2, 2022
The simplest abstraction to synchronize local data with remote source. For iOS, wirtten in swift.

Purpose The simplest abstraction to synchronize local data with remote source. For iOS, written in swift. Overview Many applications uses remote serve

Siarhei Ladzeika 7 Mar 17, 2022