## Issue Description
We don't have a system in place to profile the memory allocated to a hash table. We should have a workflow to reliably detect memory efficiency improvements and regressions measured by load factor.
## Proposed Solution
Create a new test case for memory efficiency tests. Instead of relying on XCTest performance tests, which do not work reliably on all platforms, we can use standard assertions over the `count` and `capacity` of a given collection through a series of insertions.
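As a rough illustration, such a test might look like the sketch below. The `HashTable` type and its `count`/`capacity` properties are placeholders for the project's actual collection, and the asserted bounds assume the current 0.5 load factor cap described later in this issue:

```swift
import XCTest

final class MemoryEfficiencyTests: XCTestCase {
    func testLoadFactorAfterManyInsertions() {
        // Hypothetical collection under test; substitute the project's
        // actual hash table type.
        var table = HashTable<Int>()
        for member in 0..<10_000 {
            table.insert(member)
        }
        // Load factor is the ratio of occupied buckets to total buckets.
        let loadFactor = Double(table.count) / Double(table.capacity)
        // With a load factor cap of 0.5, the table should settle between
        // 0.25 (right after an expansion) and 0.5.
        XCTAssertGreaterThanOrEqual(loadFactor, 0.25)
        XCTAssertLessThanOrEqual(loadFactor, 0.5)
    }
}
```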
## Parameters
The memory efficiency of the current insertion algorithm depends on two variables:
- The load factor cap, or the maximum ratio of `count / capacity`.
- The bump count cap, or the maximum number of consecutive bucket displacements.
### Load Factor Cap
Bumps are caused by collisions, so the likelihood of a bump increases as the load factor increases. Keeping the load factor low decreases the complexity of each insert operation, but increases the number of empty buckets in the table. Setting aside the bump count cap, the ideal load factor cap is one where any further increase in load would trigger a bump chain that takes more time to traverse than it would take to expand the table.
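As a sketch, the pre-insertion check for this cap could be as simple as the following (the function name is hypothetical; the default of `0.5` mirrors the current cap described under Current Profile):

```swift
// Hypothetical check: expand when accepting one more member would
// push the load factor over the cap.
func shouldExpandForLoad(count: Int, capacity: Int, loadFactorCap: Double = 0.5) -> Bool {
    Double(count + 1) / Double(capacity) > loadFactorCap
}
```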
### Bump Count Cap
The primary function of the bump count cap is to avoid bump loops. A bump loop occurs when the same sequence of members is bumped more than once. This is the only case where an insertion might fail in a cuckoo hashing algorithm, so these loops must be detected. The number of consecutive bumps per insertion is also a good secondary indicator of load factor, so keeping the bump count cap low can avoid unnecessary work caused by a high or absent load factor cap.
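To make the mechanism concrete, here is a deliberately tiny cuckoo-style table showing where the bump count cap enters the insertion loop. This is an illustrative sketch, not the project's implementation; every name in it is hypothetical:

```swift
// A minimal cuckoo-style table over Int members, written only to show
// how a bump count cap breaks potential bump loops.
struct TinyCuckooTable {
    private var buckets: [Int?]
    private let bumpCountCap = 20

    init(capacity: Int = 32) {
        buckets = Array(repeating: nil, count: capacity)
    }

    // Alternate between two hash functions, as in standard cuckoo hashing.
    private func bucket(for member: Int, round: Int) -> Int {
        let h = round % 2 == 0 ? member.hashValue : member.hashValue &* 31
        return ((h % buckets.count) + buckets.count) % buckets.count
    }

    mutating func insert(_ member: Int) {
        var displaced = member
        for round in 0..<bumpCountCap {
            let i = bucket(for: displaced, round: round)
            guard let occupant = buckets[i] else {
                buckets[i] = displaced // empty bucket found; done
                return
            }
            buckets[i] = displaced // bump the occupant and re-place it
            displaced = occupant
        }
        // Cap reached: assume a bump loop (or an overfull table),
        // expand, and retry the member that is still displaced.
        expand()
        insert(displaced)
    }

    private mutating func expand() {
        let members = buckets.compactMap { $0 }
        buckets = Array(repeating: nil, count: buckets.count * 2)
        for member in members {
            insert(member)
        }
    }
}
```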
## Current Profile
The current `master` branch uses two methods to determine when to expand the hash table:
- A load factor cap of 0.5, meaning the hash table is expanded any time `count` becomes greater than `capacity / 2`. The likelihood of collisions increases exponentially as `count` approaches `capacity`, so keeping the table's load factor low avoids doing unnecessary bumps.
- A bump count cap of 20, meaning if 20 consecutive members are displaced from their buckets to insert a new member, the hash table is expanded. As the number of consecutive bumps increases, the probability the insert operation is in a "bump loop" increases.
According to these conditions, the best-case load factor with the current design is 0.5, and the worst case could be as low as `2 / capacity`: for example, a run of colliding insertions that keeps hitting the bump count cap can repeatedly double `capacity` while `count` stays at 2. In the event of a hash flooding attack, the capacity of the table would grow exponentially until the machine runs out of memory.
## Optimizing for Memory Efficiency
Since memory efficiency and insert performance are inversely related, the optimal cap is where any further increase in the cap would cause an increase in the rate of change of time per operation.
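One way to state this criterion formally (a suggested reading of the sentence above, not something from the original design): let `T(c)` be the mean time per insert operation as a function of a cap `c`; the optimal cap is then the largest value before the marginal time cost starts to grow.

```latex
% T(c): mean time per insert as a function of a cap c
% (either the load factor cap or the bump count cap).
% The optimal cap c* is the largest c at which the rate of change
% of time per operation, T'(c), is not yet increasing:
c^* = \max \{\, c : T''(c) \le 0 \,\}
```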
## Detailed Design
The benchmark should be run as an executable that takes the following arguments:
- A bump count cap range or a ratio of bump count cap to the number of occupied buckets.
- A number of consecutive benchmarks to run.
- A maximum number of insertions to perform (optional, defaults to 1M).
Usage:

```
swift run MemoryBenchmarks <file>
    --cycles <Int>
    --size <Int> (optional)
    --max-load-factor <Double> (optional)
    [--min-bump-count <Int> | --min-scaled-bump-count <Double>]
    [--max-bump-count <Int> | --max-scaled-bump-count <Double>]
```
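If the executable is built on swift-argument-parser, the interface could be declared along these lines. This is a sketch assuming that dependency, with the scaled bump-count variants omitted for brevity; the actual implementation may parse arguments differently:

```swift
import ArgumentParser

// Hypothetical declaration of the benchmark's command-line interface.
// ArgumentParser derives the flag names (e.g. --max-load-factor)
// from the property names.
@main
struct MemoryBenchmarks: ParsableCommand {
    @Argument(help: "Path of the JSON document to write results to.")
    var file: String

    @Option(help: "Number of consecutive benchmarks to run.")
    var cycles: Int

    @Option(help: "Maximum number of insertions to perform.")
    var size: Int = 1_000_000

    @Option(help: "Load factor cap to apply during insertion.")
    var maxLoadFactor: Double?

    @Option(help: "Lower bound of the bump count cap range.")
    var minBumpCount: Int?

    @Option(help: "Upper bound of the bump count cap range.")
    var maxBumpCount: Int?

    func run() throws {
        // Run `cycles` benchmarks over the configured cap range,
        // then encode the results and write them to `file`.
    }
}
```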
The executable should write a JSON document to `<file>` with the following structure:
```json
{
  "runs": [
    {
      "bumpCountCap": 0.05,
      "loadFactorCap": 0.7,
      "results": [
        {
          "count": 2,
          "capacity": 32
        }
      ]
    }
  ]
}
```
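On the Swift side, the document could be modeled with Codable types along these lines (hypothetical names, assuming the structure above):

```swift
import Foundation

// Hypothetical Codable model mirroring the JSON document above.
struct BenchmarkDocument: Codable {
    var runs: [Run]

    struct Run: Codable {
        var bumpCountCap: Double
        var loadFactorCap: Double
        var results: [Measurement]
    }

    struct Measurement: Codable {
        var count: Int
        var capacity: Int
    }
}

// Writing the document once a benchmark finishes:
func write(_ document: BenchmarkDocument, to url: URL) throws {
    let data = try JSONEncoder().encode(document)
    try data.write(to: url)
}
```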
This data should allow us to plot a two-dimensional graph with:
- The bump count cap on the X axis.
- The load factor on the Y axis.
## Interpreting Results
When interpreting this graph:
- A slope of 0 would imply memory efficiency is unaffected by the bump count cap (but may be limited by a load factor cap).
- A positive slope would imply memory efficiency increases as the bump count cap increases.
- A negative slope would imply memory efficiency decreases as the bump count cap increases.
This information should help identify when these limits should be changed as the implementation of the insertion algorithm changes. It can also help detect regressions and improvements as the mean load factor across runs increases or decreases. It should be used in conjunction with a time performance graph to determine how a change affects performance, and to optimize algorithm variables.
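As an example of the kind of post-processing this enables, the mean final load factor across runs could be computed from the document (reusing the hypothetical `BenchmarkDocument` model sketched earlier):

```swift
// Hypothetical post-processing: average the final load factor of each run.
func meanFinalLoadFactor(of document: BenchmarkDocument) -> Double {
    let finals = document.runs.compactMap { run -> Double? in
        guard let last = run.results.last else { return nil }
        return Double(last.count) / Double(last.capacity)
    }
    guard !finals.isEmpty else { return 0 }
    return finals.reduce(0, +) / Double(finals.count)
}
```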