Author: Aaron Watters
I created a Notebook that describes how to examine, illustrate, and solve a geometric mathematical problem called "House Location" using Python mathematical and numeric libraries. The discussion uses symbolic computation, visualization, and numerical computations to solve the problem while exercising the NumPy, SymPy, Matplotlib, IPython and SciPy packages.
I hope that this discussion will be accessible to people with a minimal background in programming and a high-school level background in algebra and analytic geometry. There is a brief mention of complex numbers, but the use of complex numbers is not important here except as "values to be ignored". I also hope that this discussion illustrates how to combine different mathematically oriented Python libraries and explains how to smooth out some of the rough edges between the library interfaces.
The House Location Problem is taken from the HackerRank.com programming challenges and is described here: http://www.hackerrank.com/challenges/house-location. The input to the problem are the coordinates of Kimberly, Bob, Jack, and Janet's houses and two ratios 'a' and 'b'. The goal of the problem is to find a location (x,y) where
The distance to Kimberly's house = 'a' times the distance to Bob's house
The distance to Janet's house = 'b' times the distance to Jack's house.
The problem statement also provides one example input set with the expected output for those inputs. Let's express the example as code. Here are the inputs and the expected output for the example provided
# coordinates of Kimberly's house (xk,yk) = (4,0) # coordinates of Bob's house (xb,yb) = (0,0) # coordinates of Jack's house (xj,yj) = (-2,-4) # coordinates of Janet's house (xn,yn) = (-2,-1) # the ratios (a,b) = (3,4) # expected solution (xsoln,ysoln) = (-2,0)
Let's check that the solution actually solves the problem by calculating the distances between the solution point and the house locations. I will define the distance between points using the squared distance as an intermediate value because the squared distance function will be useful later.
def dist2(x0,y0,x1,y1): "distance between (x0,y0) and (x1,y1) squared" return (x0-x1)**2 + (y0-y1)**2 # use the numpy sqrt function, because it is the most general import numpy as np def distance(x0,y0,x1,y1): "distance between (x0,y0) and (x1,y1)" return np.sqrt(dist2(x0,y0,x1,y1)) BobDist = distance(xsoln, ysoln, xb,yb) KimDist = distance(xsoln, ysoln, xk,yk) print "comparing Bob and Kim", (BobDist, KimDist), "equal?", (BobDist*a, KimDist)
comparing Bob and Kim (2.0, 6.0) equal? (6.0, 6.0)
So the distance to Bob and Kim satisfy the requested relationship.
JackDist = distance(xsoln, ysoln, xj, yj) JanetDist = distance(xsoln, ysoln, xn, yn) print "comparing Jack and Janet", (JackDist, JanetDist), "equal?", (JanetDist*b, JackDist)
comparing Jack and Janet (4.0, 1.0) equal? (4.0, 4.0)
... and the example solution checks out.
Can we draw a diagram that illustrates the House Location Problem? How can we compute a solution location (xsoln,ysoln) from the input locations?
Evidently we need to find the set of points where JanetDist $\times$ b == JackDist and intersect that set with the set of points where BobDist $\times$ a == KimDist: the points in the intersection will solve the problem. How can we represent those sets?
It turns out that we can represent each set as the set of pairs (x,y) that satisfy an algebraic equation. The symbolic mathematics library sympy can make creating and manipulating these equations easy. The following code creates an "equation object" representing the set of points where JackDist=JanetDist*b.
import sympy as s # Let x and y represent variable symbolic values. x,y = s.symbols("x y") # define symbolic expression objects for the squares of distances to Janet and Jacks house. JackDist2 = dist2(x, y, xj, yj) JanetDist2 = dist2(x, y, xn, yn) print "JackDist2:", JackDist2 print "JanetDist2:", JanetDist2 # Relate the squared distances in an equation (square b because the distances are squared). # JackDist2 == JanetDist2 * b**2 Jack_Janet_shape = s.Eq( JackDist2, JanetDist2 * b**2 ) # Eq creates an equation object print "Equation:", Jack_Janet_shape
JackDist2: (x + 2)**2 + (y + 4)**2 JanetDist2: (x + 2)**2 + (y + 1)**2 Equation: (x + 2)**2 + (y + 4)**2 == 16*(x + 2)**2 + 16*(y + 1)**2
For people not used to symbolic computing these steps may be a little mind blowing. The above steps created symbolic expression objects JackDist2 and JanetDist2 which can be manipulated in Python much like numbers or arrays, and then created an equation object for expressing a relationship between the expressions. We avoid the need for square roots in the equation by using squared distances.
The text representation for Jack_Janet_shape is a little hard to read. Let's use a trick to put it into more standard mathematical notation.
from IPython.display import display, Math def showsym(expression): display(Math( s.latex(expression) ) ) # format the Jack_Janet_shape equation object using standard LaTeX conventions. showsym(Jack_Janet_shape)
Sadly, by itself the Jack_Janet_shape equation does not really help me understand much about the set of
points the right distance from Jack and Janet's house.
I would really like to know what the set of points looks like on the XY plane. How can we plot the points?
The sympy module has an automatic plot function, but I was not able to figure out how to get it to display visualizations the way I wanted (probably due to my ignorance), so I'll plot it myself.
First note that we can solve for x in terms of y in the Jack_Janet_shape equation using the sympy.solve function. This works well because the Jack_Janet_shape equation is not too complex. [In general the sympy.solve function may not work or may give horribly complex results for more difficult equations.] Below we use the sympy.solve function to solve the Jack_Janet_shape equation for y.
ysolutions = s.solve(Jack_Janet_shape, y) ysolutions
[-sqrt(-25*x**2 - 100*x - 84)/5 - 4/5, sqrt(-25*x**2 - 100*x - 84)/5 - 4/5]
# Format the solutions using more conventional mathematical notation for soln in ysolutions: showsym(soln)
Here we have 2 solutions that express y in terms of x and we would like to use these expressions to plot the (x,y) points that satisfy the Jack_Janet_shape equation.
However, the symbolic solutions as shown are not suitable for actually computing y values from x values because they are symbolic expressions and not numerical functions. The sympy.lambdify function will convert a symbolic expression to a numerical function, but you have to be a bit careful with it as we will see. In particular this first attempt below doesn't work well:
# let's convert the first solution into a numeric function of x ysoln0 = ysolutions yfunction0 = s.lambdify(x, ysoln0)
# let's try out the yfunction0 on some values yfunction0(-1.5)
--------------------------------------------------------------------------- ValueError Traceback (most recent call last) <ipython-input-10-f440f216a2c3> in <module>() ----> 1 yfunction0(0) /Users/awatters/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/numpy/__init__.pyc in <lambda>(x) ValueError: math domain error
# Uh oh. Looks like the computation above tried to find the square root of a negative number. # You can't do that for real numbers. Maybe it will work for a complex number input? yfunction0(complex(0))
--------------------------------------------------------------------------- TypeError Traceback (most recent call last) <ipython-input-11-9e19acfcdf76> in <module>() 1 # Uh oh. Looks like the computation above tried to find the square root of a negative number. 2 # You can't do that for real numbers. Maybe it will work for a complex number input? ----> 3 yfunction0(complex(0)) /Users/awatters/Library/Enthought/Canopy_64bit/User/lib/python2.7/site-packages/numpy/__init__.pyc in <lambda>(x) TypeError: can't convert complex to float
As shown above the yfunction0 function is not defined at all values and will not work for complex numbers even though the algebraic expression ysoln0 is well defined for all complex numbers. The underlying problem is that lambdify used the standard math.sqrt function in the definition of yfunction0 and math.sqrt will not operate on complex numbers and will generate an error for negative numbers.
To make things easy it is best to use a function which is defined everywhere even though we are not interested in function values that are not real numbers.
We can create a different function that will work for negative arguments, complex number arguments and even array arguments too if we direct lambdify to use the numpy implementation of sqrt as follows:
yfunction0 = s.lambdify(x, ysoln0, modules="numpy") # by using numpy.sqrt we have a yfunction0 that is defined for all reals (and all complexes) yfunction0( complex(0) )
Now that we have a well behaved function we can plot it as follows:
# Let's guess that some x values between -8 and 8 map to # interesting y values. Let's try 2000 of them. x_range = np.linspace(-8.0, 8.0, 2000) # Convert the x_range array to complex numbers (with 0 imaginary part) # so that roots of negative numbers work. x_range_complex = np.array(x_range, dtype=np.complex) # Compute the y values for the range all at once (array application) y_values = yfunction0( x_range_complex ) # We only want to plot points for y_values which are "real" # -- that is, have imaginary part of 0. # The following finds the indices of y_values where the # imaginary part is 0. real_indices = np.where( np.imag(y_values)==0 ) # Find the "real" y values. real_y = np.real( y_values[ real_indices ] ) # Find the x values that correspond to the "real" y values. real_x = x_range[ real_indices ] # Plot the (x,y) pairs using matplotlib from matplotlib.pyplot import plot, text, axes, show plot(real_x, real_y) # For illustration also show where Jack and Janet's houses are by marking them J and N text(xj,yj, "J") text(xn,yn, "N") # put a reference line between the J and the N plot([xj, xn], [yj, yn]) # Tell matplotlib to keep the axes equal (don't distort them to fit) axes().set_aspect("equal", "datalim") # show the result show()
Of course this plot only shows half of the (x,y) pairs which satisfy the equation Jack_Janet_shape because there was another way to solve for y.
But at this point we have the tools to build an illustration for the whole problem.
The following functions automate the process we used to plot equation solutions.
def xplot(x, xexpr, xrng): a = np.array(xrng, np.complex) f = s.lambdify(x, xexpr, modules="numpy") fa = f(a) realrng = np.where(np.imag(fa)==0) realx = xrng[realrng] realy = np.real( fa[realrng] ) return (realx, realy) def plot_x_solutions(x, y, equality, xrng): solns = s.solve(equality, y) for soln in solns: #print "plot", soln (xx,yy) = xplot(x, soln, xrng) plot(xx,yy)
We can create an equation for the points that are the right distance from Kim and Bob's house just like we did for Jack and Janet.
# define symbolic expression objects for the squares of distances to Kim and Bob's house. KimDist2 = dist2(x, y, xk, yk) BobDist2 = dist2(x, y, xb, yb) print "KimDist2:", KimDist2 print "BobDist2:", BobDist2 # Relate the squared distances in an equation (square 'a' because the distances are squared). # KimDist2 == BobDist2 * a**2 Kim_Bob_shape = s.Eq( KimDist2, BobDist2 * a**2 ) # Eq creates an equation object showsym(Kim_Bob_shape)
KimDist2: y**2 + (x - 4)**2 BobDist2: x**2 + y**2
The following generates a diagram illustrating the House location problem for the example inputs.
plot_x_solutions(x,y, Kim_Bob_shape, x_range) plot_x_solutions(x,y, Jack_Janet_shape, x_range) # mark the input points and put lines between them plot([xk,xb], [yk,yb]) text(xk,yk, "K") text(xb,yb, "B") plot([xj,xn], [yj,yn]) text(xj,yj, "J") text(xn,yn, "N") axes().set_aspect("equal", "datalim") show()
It looks like the House Location problem is looking for the intersection of two ellipses. In fact the ellipses look like they might be circles. Are they?
Also note that one of the intersections is at the solution offered for the example (x,y) = (-2, -1).
Now that we have equations for the two shapes, we can solve for the intersection points by simply letting the sympy solver magically do all the work for us.
s.solve( [Kim_Bob_shape, Jack_Janet_shape], x, y)
[(-2, 0), (-386/289, -360/289)]
Using sympy.solve works and it will work for any other set of input parameters, but it's a little disappointing because the black box solver doesn't give us any additional insight into the problem. It's also disappointing because this solution cannot be used to solve the HackerRank challenge since the HackerRank Python instances do not include the sympy package.
I will leave the reader to find a solution that is acceptible to HackerRank.
The Enthought Canopy environment automatically includes all software used in this example. Get the free version of Canopy from Enthought here: http://www.enthought.com/.
As mentioned above, the official statement of the House Location problem is here: https://www.hackerrank.com/challenges/house-location.
The official documentation for sympy is here: http://docs.sympy.org/latest/index.html.
The scipy and numpy documentation is here: http://docs.scipy.org/doc/.