How to migrate from BenchmarkTools to Chairmarks
Chairmarks has a similar samples/evals model to BenchmarkTools. It preserves the keyword arguments `samples`, `evals`, and `seconds`. Unlike BenchmarkTools, the `seconds` argument is honored even as it drops down to the order of 30μs (`@b @b hash(rand()) seconds=.00003`). While accuracy does decay as the total number of evaluations and samples decreases, it remains quite reasonable (e.g. I see noise of about 30% when benchmarking `@b hash(rand()) seconds=.00003`). This makes it much more practical to perform meta-analysis, such as computing the time it takes to hash arrays of a thousand different lengths with `[@b hash(rand(n)) seconds=.001 for n in 1:1000]`.
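One way to post-process such a sweep (a sketch; the length range and time budget are taken from the example above, and extracting `.time` is just one option):

```julia
using Chairmarks

# Benchmark hashing arrays of each length from 1 to 1000, giving each
# length a budget of one millisecond.
results = [@b hash(rand(n)) seconds=.001 for n in 1:1000]

# Each element is a benchmark result; `.time` is the runtime in seconds.
times_ns = [r.time * 1e9 for r in results]
```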
Both BenchmarkTools and Chairmarks use an evaluation model structured like this:
```julia
# Schematic of the samples/evals model: `init`, `setup`, `f`, and `teardown`
# are the user-provided phases; `samples` and `evals` are the keyword arguments.
function evaluation_model(f; init, setup, teardown, samples, evals)
    init()
    results = []
    for _ in 1:samples
        setup()
        t0 = time()
        for _ in 1:evals
            f()
        end
        t1 = time()
        push!(results, t1 - t0)
        teardown()
    end
    return results
end
```
In BenchmarkTools, you specify `f` and `setup` with the invocation `@benchmark f setup=(setup)`. In Chairmarks, you specify `f` and `setup` with the invocation `@be setup f`. In BenchmarkTools, `setup` and `f` communicate via shared local variables in code generated by BenchmarkTools. In Chairmarks, the function `f` is passed the return value of the function `setup` as an argument. Chairmarks also lets you specify `teardown`, which is not possible with BenchmarkTools, and an `init`, which can be emulated with interpolation in BenchmarkTools.
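Concretely, the positional arguments of `@b`/`@be` map onto the phases of the evaluation model above. Here is an annotated sketch of one such call (it assumes `Random` is loaded for `rand!`; the same call appears in the table below):

```julia
using Chairmarks, Random

# init     = rand(100)               runs once, before any samples
# setup    = rand!                   runs before each sample
# f        = sort!                   the timed function, run `evals` times per sample
# teardown = issorted(_) || error()  runs after each sample
@be rand(100) rand! sort! issorted(_) || error() evals=1
```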
Here are some examples of corresponding invocations in BenchmarkTools and Chairmarks:
| BenchmarkTools | Chairmarks |
|---|---|
| `@btime rand();` | `@b rand()` |
| `@btime sort!(x) setup=(x=rand(100)) evals=1;` | `@b rand(100) sort! evals=1` |
| `@btime sort!(x, rev=true) setup=(x=rand(100)) evals=1;` | `@b rand(100) sort!(_, rev=true) evals=1` |
| `@btime issorted(sort!(x)) \|\| error() setup=(x=rand(100)) evals=1` | `@b rand(100) sort! issorted(_) \|\| error() evals=1` |
| `let X = rand(100); @btime issorted(sort!($X)) \|\| error() setup=(rand!($X)) evals=1 end` | `@b rand(100) rand! sort! issorted(_) \|\| error() evals=1` |
| `BenchmarkTools.DEFAULT_PARAMETERS.seconds = 1` | `Chairmarks.DEFAULTS.seconds = 1` |
For automated regression tests, RegressionTests.jl is a work-in-progress replacement for the `BenchmarkGroup` and `@benchmarkable` system. Because Chairmarks is efficiently and stably autotuned and RegressionTests.jl is inherently robust to noise, there is no need for parameter caching.
Toplevel API
Chairmarks always returns the benchmark result, while BenchmarkTools mirrors the more diverse Base API.
| BenchmarkTools | Chairmarks | Base |
|---|---|---|
| `minimum(@benchmark _)` | `@b` | N/A |
| `@benchmark` | `@be` | N/A |
| `@belapsed` | `(@b _).time` | `@elapsed` |
| `@btime` | `display(@b _); _` | `@time` |
| N/A | `(@b _).allocs` | `@allocations` |
| `@ballocated` | `(@b _).bytes` | `@allocated` |
Chairmarks may provide `@belapsed`, `@btime`, `@ballocated`, and `@ballocations` in the future.
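For instance, here is a rough sketch of reading those quantities from a Chairmarks result (the benchmarked expression is arbitrary):

```julia
using Chairmarks

result = @b sum(rand(1000))

result.time       # runtime in seconds, comparable to `@belapsed`
result.bytes      # bytes allocated, comparable to `@ballocated`
result.allocs     # number of allocations, comparable to `@allocations`
display(result)   # human-readable report, roughly what `@btime` prints
```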
Fields
Benchmark results have the following fields:
| Chairmarks | BenchmarkTools | Description |
|---|---|---|
| `x.time` | `x.time/1e9` | Runtime in seconds |
| `x.time*1e9` | `x.time` | Runtime in nanoseconds |
| `x.allocs` | `x.allocs` | Number of allocations |
| `x.bytes` | `x.memory` | Number of bytes allocated across all allocations |
| `x.gc_fraction` | `x.gctime / x.time` | Fraction of time spent in garbage collection |
| `x.gc_fraction*x.time` | `x.gctime/1e9` | Time spent in garbage collection, in seconds |
| `x.compile_fraction` | N/A | Fraction of time spent compiling |
| `x.recompile_fraction` | N/A | Fraction of compilation time which was recompilation |
| `x.warmup` | `true` | Whether or not the sample had a warmup run before it |
| `x.evals` | `x.params.evals` | Number of evaluations in the sample |
Note that more fields may be added as more information becomes available.
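As a small sketch of the unit conversions implied by the table (again with an arbitrary benchmarked expression):

```julia
using Chairmarks

x = @b rand(10_000)                   # arbitrary benchmark for illustration

time_ns    = x.time * 1e9             # runtime in nanoseconds, matching BenchmarkTools' x.time
gc_seconds = x.gc_fraction * x.time   # time spent in garbage collection, in seconds
```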
Comparisons
Chairmarks does not provide a `judge` function to decide if two benchmarks are significantly different. However, you can get accurate data to inform that judgement by passing a comma-separated list of functions to `@b` or `@be`.
**Warning**: Comparative benchmarking is experimental and may be removed or changed in future versions.
```julia-repl
julia> using Chairmarks, BenchmarkTools

julia> f() = sum(rand() for _ in 1:1000)
f (generic function with 1 method)

julia> g() = sum(rand() for _ in 1:1010)
g (generic function with 1 method)

julia> @b f,g
(1.121 μs, 1.132 μs)

julia> @b f,g
(1.063 μs, 1.073 μs)

julia> judge(minimum(@benchmark(f())), minimum(@benchmark(g())))
BenchmarkTools.TrialJudgement:
  time:   -5.91% => improvement (5.00% tolerance)
  memory: +0.00% => invariant (1.00% tolerance)

julia> judge(minimum(@benchmark(f())), minimum(@benchmark(g())))
BenchmarkTools.TrialJudgement:
  time:   -0.78% => invariant (5.00% tolerance)
  memory: +0.00% => invariant (1.00% tolerance)
```
Nonconstant globals and interpolation
As in BenchmarkTools, benchmarks that access nonconstant globals incur a performance overhead for that access, and you can avoid this via interpolation.
However, Chairmarks's arguments are functions evaluated in the scope of the macro call, not quoted expressions `eval`ed at global scope. This makes nonconstant global access much less of an issue in Chairmarks than in BenchmarkTools, which, in turn, eliminates much of the need to interpolate variables. For example, the following invocations are all equally fast:
```julia-repl
julia> using Chairmarks

julia> x = 6 # nonconstant global
6

julia> f(len) = @b rand(len) # put the `@b` call in a function (highest performance for repeated benchmarks)
f (generic function with 1 method)

julia> f(x)
15.318 ns (2 allocs: 112 bytes)

julia> @b rand($x) # interpolate (most familiar to BenchmarkTools users)
15.620 ns (2 allocs: 112 bytes)

julia> @b x rand # put the access in the setup phase (most concise in simple cases)
15.507 ns (2 allocs: 112 bytes)
```
BenchmarkGroups
It is possible to use `BenchmarkTools.BenchmarkGroup` with Chairmarks. Replacing `@benchmarkable` invocations with `@be` invocations and wrapping the group in a function suffices. You don't have to run `tune!`, and instead of calling `run`, call the function. Even running `Statistics.median(suite)` works, although any custom plotting might need a couple of tweaks.
```julia
using BenchmarkTools, Statistics

function create_benchmarks()
    functions = Function[sqrt, inv, cbrt, sin, cos]
    group = BenchmarkGroup()
    for (index, func) in enumerate(functions)
        group[index] = @benchmarkable $func(x) setup=(x=rand())
    end
    group
end

suite = create_benchmarks()
tune!(suite)
median(run(suite))
# edit code
median(run(suite))
```
```julia
using BenchmarkTools, Chairmarks, Statistics  # BenchmarkGroup comes from BenchmarkTools

function run_benchmarks()
    functions = Function[sqrt, inv, cbrt, sin, cos]
    group = BenchmarkGroup()
    for (index, func) in enumerate(functions)
        group[nameof(func)] = @be rand func
    end
    group
end

median(run_benchmarks())
# edit code
median(run_benchmarks())
```
This behavior emerged naturally rather than being intentionally designed, so expect some rough edges. See https://github.com/LilithHafner/Chairmarks.jl/issues/70 for more info.