Benchmark against scipy

As biteopt is implemented in C++ its inner loops can be expected to run a lot faster than scipy’s solvers which are coded in Python. Here we will benchmark biteopt against scipy’s most powerful stochastic optimizers on a famous test function for optimization which can be computed very quickly: the Rastrigin function.

[1]:

import numpy as np
from scipybiteopt import biteopt
from scipy.optimize import dual_annealing, differential_evolution
import timeit

The rules are: No local optimizations allowed (therefore no_local_search=True for dual_annealing and polish = False for differential_evolution) as we want to compare the performance of the stochastic optimizers. We keep the default number of function evaluations ad termination criteria. We use a moderately large number of dimensions: d = 10.

differential_evolution does not always run for the same number of function evaluations: in my tests it always took between 80.000 and 100.000 evaluations. For the measurement of time per function evaluations, this detail is not graciously overlooked as it does not change the overall result a lot.

[10]:

def rastrigin(x):
    return np.sum(x*x - 10*np.cos(2*np.pi*x)) + 10*x.shape[0]

lower_bounds = [-8.12] * 10
upper_bounds = [5.12] * 10
bounds=list(zip(lower_bounds, upper_bounds))

res_da = dual_annealing(rastrigin, bounds, no_local_search=True)
print("Scipy Dual Annealing: minimal value={}, number of function evaluations={}".format(res_da.fun, res_da.nfev))
time_da = %timeit -n 1 -o dual_annealing(rastrigin, bounds, no_local_search=True)
print("Full optimization time in ms: {}".format(1000 * time_da.average))
print("Time per function evaluation in ms: {}".format(1000 * time_da.average/res_da.nfev))
print()

res_de = differential_evolution(rastrigin, bounds, polish = False)
print("Scipy Differential Evolution: minimal value={}, number of function evaluations={}".format(res_de.fun, res_de.nfev))
time_de = %timeit -n 1 -o differential_evolution(rastrigin, bounds, polish = False)
print("Full optimization time in ms: {}".format(1000 * time_de.average))
print("Time per function evaluation in ms: {}".format(1000 * time_de.average/res_da.nfev))
print()

res_bo = biteopt(rastrigin, bounds)
print("Biteopt: minimal value={}, number of function evaluations={}".format(res_bo.fun, res_bo.nfev))
time_biteopt = %timeit -n 1 -o biteopt(rastrigin, bounds)
print("Full optimization time in ms: {}".format(1000 * time_biteopt.average))
print("Time per function evaluation in ms: {}".format(1000 * time_biteopt.average/res_da.nfev))

Scipy Dual Annealing: minimal value=1.4864591690866291e-05, number of function evaluations=20001
861 ms ± 13.8 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Full optimization time in ms: 860.5632999999833
Time per function evaluation in ms: 0.0430260136993142

Scipy Differential Evolution: minimal value=0.9984855135245283, number of function evaluations=85200
4.7 s ± 707 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Full optimization time in ms: 4704.701642857166
Time per function evaluation in ms: 0.23522332097680945

Biteopt: minimal value=0.0, number of function evaluations=13788
109 ms ± 3.05 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Full optimization time in ms: 108.57158571429538
Time per function evaluation in ms: 0.005428307870321253

We see that biteopt is app. 8 times faster than dual_annealing and almost 50 times faster than differential-evolution. Note that the timing difference does not

Let’s see how this changes when the objective function is also compiled and no longer uses the Python interpreter. For that we will use numba’s convenient just in time compiler.

[12]:

from numba import njit

#let numba JIT compile
rastrigin_compiled = njit(rastrigin)
out = rastrigin_compiled(np.zeros((10, )))

res_da = dual_annealing(rastrigin_compiled, bounds, no_local_search=True)
print("Scipy Dual Annealing: minimal value={}, number of function evaluations={}".format(res_da.fun, res_da.nfev))
time_da = %timeit -n 1 -o dual_annealing(rastrigin_compiled, bounds, no_local_search=True)
print("Full optimization time in ms: {}".format(1000 * time_da.average))
print("Time per function evaluation in ms: {}".format(1000 * time_da.average/res_da.nfev))
print()

res_de = differential_evolution(rastrigin_compiled, bounds, polish = False)
print("Scipy Differential Evolution: minimal value={}, number of function evaluations={}".format(res_de.fun, res_de.nfev))
time_de = %timeit -n 1 -o differential_evolution(rastrigin_compiled, bounds, polish = False)
print("Full optimization time in ms: {}".format(1000 * time_de.average))
print("Time per function evaluation in ms: {}".format(1000 * time_de.average/res_da.nfev))
print()

res_bo = biteopt(rastrigin_compiled, bounds)
print("Biteopt: minimal value={}, number of function evaluations={}".format(res_bo.fun, res_bo.nfev))
time_biteopt = %timeit -n 1 -o biteopt(rastrigin_compiled, bounds)
print("Full optimization time in ms: {}".format(1000 * time_biteopt.average))
print("Time per function evaluation in ms: {}".format(1000 * time_biteopt.average/res_da.nfev))

Scipy Dual Annealing: minimal value=1.5468483397285127e-05, number of function evaluations=20001
679 ms ± 16.6 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Full optimization time in ms: 679.4589142857603
Time per function evaluation in ms: 0.03397124715193042

Scipy Differential Evolution: minimal value=0.0, number of function evaluations=91650
3.85 s ± 854 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Full optimization time in ms: 3849.0725999999995
Time per function evaluation in ms: 0.19244400779961

Biteopt: minimal value=0.0, number of function evaluations=14399
16.8 ms ± 639 µs per loop (mean ± std. dev. of 7 runs, 1 loop each)
Full optimization time in ms: 16.775514285687418
Time per function evaluation in ms: 0.0008387337775954911

Now biteopt is app. 45 times faster than dual_annealing and 200 times faster than differential-evolution!

Note that this comes with a tradeoff though: biteopt does not perform any sanity checks. Using scipybiteopt can be like using raw C/C++: Errors or exceptions occuring during evaluation of the objective are not caught and might crash the Python interpreter. Help is always welcome to make this wrapper more robust.