Hooke and Jeeves Pattern Search#
The Hooke and Jeeves algorithm is a classic pattern search method for nonlinear optimization, particularly suited for problems where derivative information is unavailable or unreliable.
This method alternates between two key steps:
Exploratory moves that test directions along coordinate axes.
Pattern moves that use momentum from successful steps to accelerate progress.
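The pattern move can be written as \( x_{\text{pattern}} = 2x_{\text{new}} - x_{\text{old}} \): the search extrapolates through the improved point along the direction of the last success. A minimal standalone sketch (the point values here are hypothetical, just for illustration):

```python
# Illustrative sketch (not from the module below): a pattern move extrapolates
# through the improved point, reusing the direction of the last successful step.
old_base = [0.0, 0.0]    # base point before the exploratory move (hypothetical)
new_base = [0.5, -0.25]  # improved point found by exploration (hypothetical)

pattern = [2 * n - o for n, o in zip(new_base, old_base)]
print(pattern)  # [1.0, -0.5]
```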
It is a derivative-free method, making it well suited for:
Black-box functions
Discontinuous or noisy objective landscapes
Engineering problems where gradients are hard to compute
The hookeNjeeves module#
The optimization logic is implemented in a Python module, hookeNjeeves.py.
Hooke and Jeeves Search Module Code
# Hooke and Jeeves Search as python script for import into Jupyter
# Minimal Script.
#
# This is a direct python port of Mark G. Johnson's C implementation of the Hooke and Jeeves algorithm
#
# Sean R. Johnson, July 7, 2013
#
# immediately below is the original documentation
# in the body, comments marked with ## are from the original,
# other comments are my own
#
#/* Nonlinear Optimization using the algorithm of Hooke and Jeeves */
#/* 12 February 1994 author: Mark G. Johnson */
#/* Find a point X where the nonlinear function f(X) has a local */
#/* minimum. X is an n-vector and f(X) is a scalar. In mathe- */
#/* matical notation f: R^n -> R^1. The objective function f() */
#/* is not required to be continuous. Nor does f() need to be */
#/* differentiable. The program does not use or require */
#/* derivatives of f(). */
#/* The software user supplies three things: a subroutine that */
#/* computes f(X), an initial "starting guess" of the minimum point */
#/* X, and values for the algorithm convergence parameters. Then */
#/* the program searches for a local minimum, beginning from the */
#/* starting guess, using the Direct Search algorithm of Hooke and */
#/* Jeeves. */
#/* This C program is adapted from the Algol pseudocode found in */
#/* "Algorithm 178: Direct Search" by Arthur F. Kaupe Jr., Commun- */
#/* ications of the ACM, Vol 6. p.313 (June 1963). It includes the */
#/* improvements suggested by Bell and Pike (CACM v.9, p. 684, Sept */
#/* 1966) and those of Tomlin and Smith, "Remark on Algorithm 178" */
#/* (CACM v.12). The original paper, which I don't recommend as */
#/* highly as the one by A. Kaupe, is: R. Hooke and T. A. Jeeves, */
#/* "Direct Search Solution of Numerical and Statistical Problems", */
#/* Journal of the ACM, Vol. 8, April 1961, pp. 212-229. */
#/* Calling sequence: */
#/* int hooke(nvars, startpt, endpt, rho, epsilon, itermax) */
#/* */
#/* nvars {an integer} This is the number of dimensions */
#/* in the domain of f(). It is the number of */
#/* coordinates of the starting point (and the */
#/* minimum point.) */
#/* startpt {an array of doubles} This is the user- */
#/* supplied guess at the minimum. */
#/* endpt {an array of doubles} This is the location of */
#/* the local minimum, calculated by the program */
#/* rho {a double} This is a user-supplied convergence */
#/* parameter (more detail below), which should be */
#/* set to a value between 0.0 and 1.0. Larger */
#/* values of rho give greater probability of */
#/* convergence on highly nonlinear functions, at a */
#/* cost of more function evaluations. Smaller */
#/* values of rho reduces the number of evaluations */
#/* (and the program running time), but increases */
#/* the risk of nonconvergence. See below. */
#/* epsilon {a double} This is the criterion for halting */
#/* the search for a minimum. When the algorithm */
#/* begins to make less and less progress on each */
#/* iteration, it checks the halting criterion: if */
#/* the stepsize is below epsilon, terminate the */
#/* iteration and return the current best estimate */
#/* of the minimum. Larger values of epsilon (such */
#/* as 1.0e-4) give quicker running time, but a */
#/* less accurate estimate of the minimum. Smaller */
#/* values of epsilon (such as 1.0e-7) give longer */
#/* running time, but a more accurate estimate of */
#/* the minimum. */
#/* itermax {an integer} A second, rarely used, halting */
#/* criterion. If the algorithm uses >= itermax */
#/* iterations, halt. */
#/* The user-supplied objective function f(x,n) should return a C */
#/* "double". Its arguments are x -- an array of doubles, and */
#/* n -- an integer. x is the point at which f(x) should be */
#/* evaluated, and n is the number of coordinates of x. That is, */
#/* n is the number of coefficients being fitted. */
#/* rho, the algorithm convergence control */
#/* The algorithm works by taking "steps" from one estimate of */
#/* a minimum, to another (hopefully better) estimate. Taking */
#/* big steps gets to the minimum more quickly, at the risk of */
#/* "stepping right over" an excellent point. The stepsize is */
#/* controlled by a user supplied parameter called rho. At each */
#/* iteration, the stepsize is multiplied by rho (0 < rho < 1), */
#/* so the stepsize is successively reduced. */
#/* Small values of rho correspond to big stepsize changes, */
#/* which make the algorithm run more quickly. However, there */
#/* is a chance (especially with highly nonlinear functions) */
#/* that these big changes will accidentally overlook a */
#/* promising search vector, leading to nonconvergence. */
#/* Large values of rho correspond to small stepsize changes, */
#/* which force the algorithm to carefully examine nearby points */
#/* instead of optimistically forging ahead. This improves the */
#/* probability of convergence. */
#/* The stepsize is reduced until it is equal to (or smaller */
#/* than) epsilon. So the number of iterations performed by */
#/* Hooke-Jeeves is determined by rho and epsilon: */
#/* rho**(number_of_iterations) = epsilon */
#/* In general it is a good idea to set rho to an aggressively */
#/* small value like 0.5 (hoping for fast convergence). Then, */
#/* if the user suspects that the reported minimum is incorrect */
#/* (or perhaps not accurate enough), the program can be run */
#/* again with a larger value of rho such as 0.85, using the */
#/* result of the first minimization as the starting guess to */
#/* begin the second minimization. */
#/* Normal use: (1) Code your function f() in the C language */
#/* (2) Install your starting guess {or read it in} */
#/* (3) Run the program */
#/* (4) {for the skeptical}: Use the computed minimum */
#/* as the starting point for another run */
#/* Data Fitting: */
#/* Code your function f() to be the sum of the squares of the */
#/* errors (differences) between the computed values and the */
#/* measured values. Then minimize f() using Hooke-Jeeves. */
#/* EXAMPLE: you have 20 datapoints (ti, yi) and you want to */
#/* find A,B,C such that (A*t*t) + (B*exp(t)) + (C*tan(t)) */
#/* fits the data as closely as possible. Then f() is just */
#/* f(x) = SUM (measured_y[i] - ((A*t[i]*t[i]) + (B*exp(t[i])) */
#/* + (C*tan(t[i]))))^2 */
#/* where x[] is a 3-vector consisting of {A, B, C}. */
#/* */
#/* The author of this software is M.G. Johnson. */
#/* Permission to use, copy, modify, and distribute this software */
#/* for any purpose without fee is hereby granted, provided that */
#/* this entire notice is included in all copies of any software */
#/* which is or includes a copy or modification of this software */
#/* and in all copies of the supporting documentation for such */
#/* software. THIS SOFTWARE IS BEING PROVIDED "AS IS", WITHOUT */
#/* ANY EXPRESS OR IMPLIED WARRANTY. IN PARTICULAR, NEITHER THE */
#/* AUTHOR NOR AT&T MAKE ANY REPRESENTATION OR WARRANTY OF ANY */
#/* KIND CONCERNING THE MERCHANTABILITY OF THIS SOFTWARE OR ITS */
#/* FITNESS FOR ANY PARTICULAR PURPOSE. */
#/* */
import sys # needed below for reporting the line number in error handling
#def rosenbrock(x): # a classic example function - deactivated here and moved to actual call
#    '''
#    ## Rosenbrock's classic parabolic valley ("banana") function
#    '''
#    a = x[0]
#    b = x[1]
#    return ((1.0 - a)**2) + (100.0 * (b - (a**2))**2)
def _hooke_best_nearby(f, delta, point, prevbest, bounds=None, args=[]):
    '''
    ## given a point, look for a better one nearby
    one coord at a time
    f is a function that takes a list of floats (of the same length as point) as an input
    args is a list of any additional positional arguments to pass to f
    delta and point are same-length lists of floats
    prevbest is a float
    point and delta are both modified by the function
    '''
    z = [x for x in point]
    minf = prevbest
    ftmp = 0.0
    fev = 0
    for i in range(len(point)):
        # see if moving point in the positive delta direction decreases the objective
        z[i] = _value_in_bounds(point[i] + delta[i], bounds[i][0], bounds[i][1])
        ftmp = f(z, *args)
        fev += 1
        if ftmp < minf:
            minf = ftmp
        else:
            # if not, try moving it in the other direction
            delta[i] = -delta[i]
            z[i] = _value_in_bounds(point[i] + delta[i], bounds[i][0], bounds[i][1])
            ftmp = f(z, *args)
            fev += 1
            if ftmp < minf:
                minf = ftmp
            else:
                # if moving the point in both delta directions results in no improvement,
                # just keep the point where it is
                z[i] = point[i]
    for i in range(len(z)):
        point[i] = z[i]
    return (minf, fev)
def _point_in_bounds(point, bounds):
    '''
    shifts the point so it is within the given bounds
    '''
    for i in range(len(point)):
        if point[i] < bounds[i][0]:
            point[i] = bounds[i][0]
        elif point[i] > bounds[i][1]:
            point[i] = bounds[i][1]

def _is_point_in_bounds(point, bounds):
    '''
    true if the point is in the bounds, else false
    '''
    out = True
    for i in range(len(point)):
        if point[i] < bounds[i][0]:
            out = False
        elif point[i] > bounds[i][1]:
            out = False
    return out

def _value_in_bounds(val, low, high):
    if val < low:
        return low
    elif val > high:
        return high
    else:
        return val
def hooke(f, startpt, bounds=None, rho=0.5, epsilon=1E-6, itermax=5000, args=[]):
    '''
    In this version of the Hooke and Jeeves algorithm, we coerce the function into staying
    within the given bounds: any time the function tries to pick a point outside the bounds,
    we shift the point to the boundary on whatever dimension it is out of bounds in.
    Implementing bounds this way may be questionable from a theory standpoint,
    but that's how COPASI does it, so that's how it is done here too.
    '''
    result = dict()
    result['success'] = True
    result['message'] = 'success'
    delta = [0.0] * len(startpt)
    endpt = [0.0] * len(startpt)
    if bounds is None:
        # if bounds is None, make it [None, None] for every dimension (converted to +/-inf below)
        bounds = [[None, None] for x in startpt]
    else:
        bounds = [[x[0], x[1]] for x in bounds]  # copy so we won't update the original
    startpt = [x for x in startpt]  # copy so we won't update the original
    fmin = None
    nfev = 0
    iters = 0
    for bound in bounds:
        if bound[0] is None:
            bound[0] = float('-inf')
        else:
            bound[0] = float(bound[0])
        if bound[1] is None:
            bound[1] = float('inf')
        else:
            bound[1] = float(bound[1])
    try:
        _point_in_bounds(startpt, bounds)  # shift startpt so it is within the bounds
        xbefore = [x for x in startpt]
        newx = [x for x in startpt]
        for i in range(len(startpt)):
            delta[i] = abs(startpt[i] * rho)
            if (delta[i] == 0.0):
                # we always want a non-zero delta, because otherwise we'd just be checking
                # the same point over and over, and wouldn't find a minimum
                delta[i] = rho
        steplength = rho
        fbefore = f(newx, *args)
        nfev += 1
        newf = fbefore
        fmin = newf
        while ((iters < itermax) and (steplength > epsilon)):
            iters += 1
            # print("after %5d , f(x) = %.4le at" % (nfev, fbefore))
            # for j in range(len(startpt)):
            #     print(" x[%2d] = %4le" % (j, xbefore[j]))
            ## /* find best new point, one coord at a time */
            newx = [x for x in xbefore]
            (newf, evals) = _hooke_best_nearby(f, delta, newx, fbefore, bounds, args)
            nfev += evals
            ## /* if we made some improvements, pursue that direction */
            keep = 1
            while ((newf < fbefore) and (keep == 1)):
                fmin = newf
                for i in range(len(startpt)):
                    ## /* firstly, arrange the sign of delta[] */
                    if newx[i] <= xbefore[i]:
                        delta[i] = -abs(delta[i])
                    else:
                        delta[i] = abs(delta[i])
                    ## /* now, move further in this direction */
                    tmp = xbefore[i]
                    xbefore[i] = newx[i]
                    newx[i] = _value_in_bounds(newx[i] + newx[i] - tmp, bounds[i][0], bounds[i][1])
                fbefore = newf
                (newf, evals) = _hooke_best_nearby(f, delta, newx, fbefore, bounds, args)
                nfev += evals
                ## /* if the further (optimistic) move was bad.... */
                if (newf >= fbefore):
                    break
                ## /* make sure that the differences between the new */
                ## /* and the old points are due to actual */
                ## /* displacements; beware of roundoff errors that */
                ## /* might cause newf < fbefore */
                keep = 0
                for i in range(len(startpt)):
                    keep = 1
                    if (abs(newx[i] - xbefore[i]) > (0.5 * abs(delta[i]))):
                        break
                    else:
                        keep = 0
            if ((steplength >= epsilon) and (newf >= fbefore)):
                steplength = steplength * rho
                delta = [x * rho for x in delta]
        for x in range(len(xbefore)):
            endpt[x] = xbefore[x]
    except Exception as e:
        exc_type, exc_obj, exc_tb = sys.exc_info()
        result['success'] = False
        result['message'] = str(e) + ". line number: " + str(exc_tb.tb_lineno)
    finally:
        result['nit'] = iters
        result['fevals'] = nfev
        result['fun'] = fmin
        result['x'] = endpt
        return result
This module defines the pattern search behavior and allows us to easily apply it to new problems by supplying:
An objective function
An initial guess
A step size
A shrink factor
A tolerance
The next cell demonstrates the module in use with a classic optimization benchmark.
Example 1: Rosenbrock Function#
The Rosenbrock function, also known as the banana function, is a standard test case in optimization. Its curved, narrow valley challenges many optimization methods.
The function is defined as:

\[ f(x, y) = (1 - x)^2 + 100\,(y - x^2)^2 \]
A contour plot of the function over the search range is shown below.
The function has a global minimum at \((x, y) = (1, 1)\), where \(f(x, y) = 0\).
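A short standalone snippet (independent of the module above; the gradient formulas are worked out by hand from the definition) confirms that both the function value and the gradient vanish at \((1, 1)\):

```python
# Verify that (1, 1) is a stationary point of the Rosenbrock function.
def rosenbrock(x, y):
    return (1 - x) ** 2 + 100 * (y - x ** 2) ** 2

def rosenbrock_grad(x, y):
    # Partial derivatives of the definition above, derived by hand.
    dfdx = -2 * (1 - x) - 400 * x * (y - x ** 2)
    dfdy = 200 * (y - x ** 2)
    return (dfdx, dfdy)

print(rosenbrock(1, 1))       # 0
print(rosenbrock_grad(1, 1))  # (0, 0)
```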
Let’s use Hooke and Jeeves to find the minimum without using gradients.
import hookeNjeeves # import the module from the local file hookeNjeeves.py

def rosenbrock(x):
    '''
    Rosenbrock's classic parabolic valley ("banana") function.
    '''
    return (1 - x[0])**2 + 100 * (x[1] - x[0]**2)**2

bounds = [[-2, 2], [-1, 3]] # lower/upper bounds for each dimension
start = [-1.0, 2.12]        # initial guess
rho = 0.25                  # step size reduction factor
epsilon = 1.0e-9            # convergence threshold
itermax = 50                # maximum iterations

print("Initial Objective Value :", rosenbrock(start))
res = hookeNjeeves.hooke(rosenbrock, start, bounds, rho, epsilon, itermax) # bounded search
#print(res)
print("Minimum Value :", res['fun'])
print("Optimum at :", res['x'])
print("Function evaluations :", res['fevals'])
print("Number of Iterations :", res['nit'])
Initial Objective Value : 129.44
Minimum Value : 1.802202459302945e-14
Optimum at : [1.0000001341104507, 1.0000002676174993]
Function evaluations : 1870
Number of Iterations : 14
Hooke and Jeeves Search Process#
The search proceeds as follows:
Start at an initial guess \( x_0 \)
Perform exploratory search by evaluating function values in each coordinate direction.
If an improvement is found, take a pattern move in that direction.
If no improvement, reduce the step size.
Repeat until convergence criteria are met (e.g., small step size or no improvement).
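The steps above can be sketched in a few lines of standalone Python. This is a simplified illustration on a convex bowl function, not the module's implementation: it omits bounds handling and the iteration cap, and performs at most one pattern move per outer iteration.

```python
def f(p):
    # Simple convex bowl with its minimum at (3, -1).
    return (p[0] - 3.0) ** 2 + (p[1] + 1.0) ** 2

def explore(f, base, step):
    """Exploratory move: try +/- step along each axis, keeping any improvement."""
    point = list(base)
    fbest = f(point)
    for i in range(len(point)):
        for trial in (point[i] + step, point[i] - step):
            candidate = list(point)
            candidate[i] = trial
            fc = f(candidate)
            if fc < fbest:
                point, fbest = candidate, fc
                break
    return point, fbest

def hooke_jeeves(f, start, step=1.0, shrink=0.5, tol=1e-8):
    base, fbase = list(start), f(start)
    while step > tol:
        point, fpoint = explore(f, base, step)
        if fpoint < fbase:
            # Pattern move: extrapolate from the old base through the new point.
            pattern = [2 * p - b for p, b in zip(point, base)]
            base, fbase = point, fpoint
            ppoint, fp = explore(f, pattern, step)
            if fp < fbase:
                base, fbase = ppoint, fp
        else:
            step *= shrink  # no improvement: contract the step
    return base, fbase

x, fx = hooke_jeeves(f, [0.0, 0.0])
print(x, fx)
```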
Advantages:
Simple and intuitive
No derivatives needed
Easily adapted to bound constraints and black-box evaluations
Disadvantages:
Slower than gradient-based methods in smooth problems
Sensitive to step size and shrink factor
Discussion Questions#
How did the search perform on the Rosenbrock function?
How would you tune the step size and shrink factor for better performance?
What kinds of problems might benefit from a Hooke and Jeeves approach in your domain?
Try It!#
Try using this same pattern search on a custom function — perhaps one from your design or lab work!