Solution of Nonlinear Equations

Author: Coleen Norton

1 Introduction

Finding solutions of nonlinear equations occurs often in scientific computing. For example, consider the problem of finding the parameter λ in the curve y = λ cosh(x/λ) such that the length of the arc between x = 0 and x = 5 is 10. Now

10 = ∫_0^5 (ds/dx) dx = ∫_0^5 cosh(x/λ) dx = λ sinh(5/λ).

Hence, finding the appropriate curve requires the value of λ, which can be found by solving the transcendental equation

λ sinh(5/λ) = 10.

In this chapter, we discuss a few methods along with their convergence properties. Let α be a solution of f(x) = 0 and let c_n be an approximation of the root; the subscript n denotes an iteration index that will be introduced later. The error at the n-th stage is e_n = α − c_n. If |e_{n+1}| ≈ A|e_n|^p, then p is the order of convergence and A is called the asymptotic rate constant. Clearly, for p = 1 we need A < 1 for convergence.

2 Bisection method

Suppose that f is a continuous function on the interval [a, b] and f(a)f(b) < 0. By the intermediate value theorem, f has at least one zero in [a, b]. We calculate c = (a + b)/2 and test f(c). If f(c) = 0, then c is the root and we are done. If not, then either f(a)f(c) < 0 or f(c)f(b) < 0. In the former case a root lies in [a, c], so we rename c as b and repeat the process; in the latter case we rename c as a and continue likewise. The root now lies in an interval whose length is half that of the original interval. The process is repeated, and we stop the iteration when f(c) is very nearly zero or the length of the interval [a, b] is very small.
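As a concrete illustration, here is a minimal Python sketch of the method (the function name `bisect`, the tolerances, and the iteration cap are our own illustrative choices), applied to the introductory equation λ sinh(5/λ) = 10:

```python
import math

def bisect(f, a, b, tol=1e-10, max_iter=200):
    """Bisection: f continuous on [a, b] with f(a) f(b) < 0."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    for _ in range(max_iter):
        c = (a + b) / 2
        fc = f(c)
        # stop when f(c) is nearly zero or the interval is very small
        if fc == 0 or (b - a) / 2 < tol:
            return c
        if fa * fc < 0:
            b, fb = c, fc     # a root lies in [a, c]
        else:
            a, fa = c, fc     # a root lies in [c, b]
    return (a + b) / 2

# solve the introductory example: lambda * sinh(5/lambda) = 10
lam = bisect(lambda t: t * math.sinh(5 / t) - 10, 1.0, 10.0)
```

The bracket [1, 10] works because t sinh(5/t) − 10 changes sign on it; the computed λ is roughly 2.3.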

2.1 Convergence

Let a_0 = a, b_0 = b, and let [a_n, b_n] (n ≥ 0) be the successive intervals in the bisection process. Clearly

a_0 ≤ a_1 ≤ a_2 ≤ ··· ≤ b_0 = b   and   b_0 ≥ b_1 ≥ b_2 ≥ ··· ≥ a_0 = a.

The sequence {a_n} is monotonically increasing and bounded above, and {b_n} is monotonically decreasing and bounded below; hence both sequences converge. Further,

b_n − a_n = (b_{n−1} − a_{n−1})/2 = ··· = (b − a)/2^n,

so lim_{n→∞} a_n = lim_{n→∞} b_n = α (say). Taking the limit in f(a_n)f(b_n) ≤ 0, we get [f(α)]^2 ≤ 0, which implies f(α) = 0. Hence both a_n and b_n converge to a root of f(x) = 0.

Apply the bisection step to [a_n, b_n] and calculate the midpoint c_n = (a_n + b_n)/2. The root then lies either in [a_n, c_n] or in [c_n, b_n]. In either case

|α − c_n| ≤ (b_n − a_n)/2 = (b − a)/2^{n+1}.

Hence c_n → α as n → ∞. In this method we can calculate in advance the number of iterations n needed to achieve a specified accuracy. Suppose we want relative accuracy ε in the root, i.e.

|α − c_n|/|α| ≤ ε.

Suppose the root lies in [a, b] with b > a > 0. Then |α| > a, so the above relation holds if |α − c_n| ≤ εa, which in turn holds if

(b − a)/2^{n+1} ≤ εa.

Solving this inequality gives the minimum number of iterations needed for the desired accuracy. Now

|e_{n+1}| = |α − c_{n+1}| ≤ (b_{n+1} − a_{n+1})/2 = (b_n − a_n)/4

and |e_n| = |α − c_n| ≤ (b_n − a_n)/2. Thus

|e_{n+1}| ∼ (1/2)|e_n|,

and hence the bisection method converges linearly.
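The bound (b − a)/2^{n+1} ≤ εa can be solved for the smallest n in closed form. A small Python sketch (the helper name is our own):

```python
import math

def bisection_iterations(a, b, rel_tol):
    """Smallest n with (b - a)/2**(n+1) <= rel_tol * a, assuming 0 < a < b."""
    # (b - a)/2**(n+1) <= eps*a  <=>  n >= log2((b - a)/(eps*a)) - 1
    return max(0, math.ceil(math.log2((b - a) / (rel_tol * a)) - 1))

n = bisection_iterations(2.0, 3.0, 1e-8)
```

For example, a root bracketed in [2, 3] with relative tolerance 10^{-8} needs 25 bisection steps.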

[Figure: graph of f on [a, b] with a root between a and b; the first bisection approximation is marked by a small circle.]

3 Regula falsi

Consider the figure in which the root lies between a and b. In the first iteration of the bisection method, the approximation lies at the small circle. However, since |f(a)| is small, we expect the root to lie near a. This can be achieved if we join the points (a, f(a)) and (b, f(b)) by a straight line and take its intersection c with the x-axis as the first approximation. Hence the new approximation c satisfies

(c − a)/(b − a) = (0 − f(a))/(f(b) − f(a))  ⟹  c = (a f(b) − b f(a))/(f(b) − f(a)).

Since f(a) and f(b) are of opposite sign, the method is well defined. If f(c) = 0, then c is the exact root α. Otherwise we take b = c if f(a)f(c) < 0, and a = c if f(c)f(b) < 0. This process is then repeated. Of course, the method does not work satisfactorily in all cases, and certain modifications can be made.

Let [a_n, b_n] be the successive intervals of the regula falsi method. In this case b_n − a_n may not tend to zero as n → ∞; nevertheless, the method still converges to the root. To show this, we consider the worst case. Assume that f''(x) exists and, for some i, f''(x) ≥ 0 in [a_i, b_i]; the case f''(x) ≤ 0 can be treated similarly. Also suppose that f(a_i) < 0 and f(b_i) > 0. Let c_i be the new approximation, i.e. the intersection of the straight line through (a_i, f(a_i)) and (b_i, f(b_i)) with the x-axis. We claim that a_i < a_{i+1} = c_i < b_{i+1} = b_i. To see this, note that the straight line is the degree-one interpolating polynomial p(x), with p(c_i) = 0, and obviously a_i < c_i < b_i. For x ∈ [a_i, b_i], the interpolation error formula gives

f(x) − p(x) = (x − a_i)(x − b_i) f''(η)/2,   η ∈ (a_i, b_i).

Putting x = c_i and using the given conditions, we get f(c_i) ≤ 0. If f(c_i) ≠ 0, then a_{i+1} = c_i > a_i and b_{i+1} = b_i. Hence, if the condition holds, then b_i = b_f (say) for all i ≥ i_0. Now a_i is monotonically increasing and bounded above by b_f; hence lim_{i→∞} a_i exists, equal to ζ (say). Since f is continuous, f(ζ) = lim_{i→∞} f(a_i) ≤ 0 while f(b_f) > 0, and hence ζ ≠ b_f. Taking the limit in

c_i = (a_i f(b_i) − b_i f(a_i))/(f(b_i) − f(a_i)),

we find

ζ = (ζ f(b_f) − b_f f(ζ))/(f(b_f) − f(ζ)),

which simplifies to

(ζ − b_f) f(ζ) = 0.

Since ζ ≠ b_f, we must have f(ζ) = 0, and so a_i converges to a root ζ.

To find the order of convergence, consider the worst case discussed above. Writing x_i instead of a_i and b in place of b_i = b_f, the iteration can be written as

x_{i+1} = (b f(x_i) − x_i f(b))/(f(x_i) − f(b)) = φ(x_i),

where

φ(x) = (b f(x) − x f(b))/(f(x) − f(b)).

Hence, for i ≥ i_0, the regula falsi method becomes a fixed point iteration, which converges if |φ'(ζ)| < 1. Now

φ'(ζ) = 1 − (ζ − b) f'(ζ)/(f(ζ) − f(b)) = 1 − f'(ζ)/f'(η_1),   ζ < η_1 < b,

using the mean value theorem in the last step. Again by the mean value theorem, there exists η_2 ∈ (x_i, ζ) such that f(x_i) − f(ζ) = (x_i − ζ) f'(η_2). Since f''(x) ≥ 0 in [x_i, b], f'(x) is monotonically increasing there, and f'(η_2) = f(x_i)/(x_i − ζ) > 0. This implies

0 < f'(η_2) ≤ f'(ζ) ≤ f'(η_1)  ⟹  0 < f'(ζ)/f'(η_1) ≤ 1.

Hence

0 ≤ 1 − f'(ζ)/f'(η_1) < 1  ⟹  0 ≤ φ'(ζ) < 1.

Thus the fixed point iteration converges to ζ, i.e. ζ = φ(ζ). Now e_{n+1} = ζ − x_{n+1} = φ(ζ) − φ(x_n) = φ'(η) e_n ≈ K e_n with 0 ≤ K < 1. Hence the convergence is linear.
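A Python sketch of regula falsi (names and tolerances are illustrative). The test function x^3 − 2 on [1, 2] has f'' ≥ 0 with f(a) < 0 < f(b), i.e. exactly the worst case analysed in this section, so the right endpoint stays fixed:

```python
def regula_falsi(f, a, b, tol=1e-12, max_iter=200):
    """False position on [a, b] with f(a) f(b) < 0."""
    fa, fb = f(a), f(b)
    if fa * fb > 0:
        raise ValueError("f(a) and f(b) must have opposite signs")
    c = a
    for _ in range(max_iter):
        # intersection of the chord through (a, f(a)), (b, f(b)) with the x-axis
        c = (a * fb - b * fa) / (fb - fa)
        fc = f(c)
        if abs(fc) < tol:
            return c
        if fa * fc < 0:
            b, fb = c, fc
        else:
            a, fa = c, fc
    return c

# f''(x) = 6x >= 0 on [1, 2], f(1) < 0 < f(2): the worst case of the analysis
root = regula_falsi(lambda x: x**3 - 2, 1.0, 2.0)
```

Note the stopping test uses |f(c)|, since the bracket length need not shrink to zero here.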

4 Secant method

Here we do not insist on bracketing the root. Given two approximations x_{n−1} and x_n, we take the next approximation x_{n+1} as the intersection of the line joining (x_{n−1}, f(x_{n−1})) and (x_n, f(x_n)) with the x-axis. Thus x_{n+1} need not lie in the interval [x_{n−1}, x_n]. If α is a simple zero of f, then it can be proved that the method converges for initial guesses in a sufficiently small neighbourhood of α. Now x_{n+1} is given by

x_{n+1} = (x_{n−1} f(x_n) − x_n f(x_{n−1}))/(f(x_n) − f(x_{n−1})).

4.1 Convergence

Now

e_{n+1} = α − x_{n+1}
 = α − (x_{n−1} f(x_n) − x_n f(x_{n−1}))/(f(x_n) − f(x_{n−1}))
 = α − x_n + ((x_n − x_{n−1})/(f(x_n) − f(x_{n−1}))) f(x_n)
 = α − x_n + (f(x_n) − f(α))/f[x_n, x_{n−1}]
 = −(α − x_n)(α − x_{n−1}) f[α, x_n, x_{n−1}]/f[x_n, x_{n−1}]
 = −e_n e_{n−1} f''(η_1)/(2 f'(η_2)),

where η_1 ∈ I(α, x_n, x_{n−1}) and η_2 ∈ I(x_n, x_{n−1}). Here I(a, b, c) denotes the interior of the smallest interval containing a, b and c. Since α is a simple zero, f'(α) ≠ 0. Consider the interval J = {x : |x − α| ≤ δ} such that

|f''(η_1)/(2 f'(η_2))| ≤ M,   η_1, η_2 ∈ J.

Now we have |e_{n+1}| ≤ M|e_n||e_{n−1}|. Let ε_n = M|e_n|; then ε_{n+1} ≤ ε_n ε_{n−1}. Choose initial guesses x_0 and x_1 such that

|x_i − α| < min{1/M, δ},   i = 0, 1.

This implies ε_i = M|x_i − α| < min{1, Mδ} for i = 0, 1. Now choose D with 0 < D < min{1, Mδ}, so that 0 < D < 1 and ε_0, ε_1 ≤ D < 1. Then ε_2 ≤ ε_1 ε_0 ≤ D^2, ε_3 ≤ ε_2 ε_1 ≤ D^3, etc. By induction, ε_n ≤ D^{λ_n}, where λ_0 = λ_1 = 1 and λ_n = λ_{n−1} + λ_{n−2} for n ≥ 2. Using the ansatz λ_n ∝ r^n, we find

λ_n = (1/√5)[((1 + √5)/2)^{n+1} − ((1 − √5)/2)^{n+1}] ∼ (1/√5)((1 + √5)/2)^{n+1}  as n → ∞.

Since D < 1 and λ_n → ∞, we get ε_n → 0 as n → ∞ and thus x_n → α. As x_n → α, also η_1, η_2 → α, which implies

|e_{n+1}| ∼ C|e_n||e_{n−1}|,   C = |f''(α)/(2 f'(α))|.

Let

|e_{n+1}| ∼ C^β |e_n|^p  ⟹  |e_n| ∼ C^β |e_{n−1}|^p  ⟹  |e_{n−1}| ∼ C^{−β/p} |e_n|^{1/p}.

Substituting into |e_{n+1}| ∼ C|e_n||e_{n−1}|, we get C^β |e_n|^p ∼ C|e_n| · C^{−β/p} |e_n|^{1/p}, which holds provided

p = 1 + 1/p,   β = 1 − β/p  ⟹  β = p/(1 + p) = p − 1.

Taking the positive value of p, we find p = (1 + √5)/2 = r (the golden ratio) and β = r − 1. Hence

|e_{n+1}| ∼ C^{r−1} |e_n|^r.

The order of convergence is non-integer and greater than one; hence the convergence is superlinear.
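A Python sketch of the secant iteration (illustrative names and tolerances; no bracketing is enforced):

```python
def secant(f, x0, x1, tol=1e-13, max_iter=100):
    """Secant iteration; the iterates need not bracket the root."""
    for _ in range(max_iter):
        f0, f1 = f(x0), f(x1)
        if f1 == f0:
            break  # chord is horizontal; cannot proceed
        x2 = (x0 * f1 - x1 * f0) / (f1 - f0)
        if abs(x2 - x1) < tol:
            return x2
        x0, x1 = x1, x2
    return x1

# sqrt(5) as the simple root of f(x) = x^2 - 5
root = secant(lambda x: x**2 - 5, 2.0, 3.0)
```

With order r ≈ 1.618, the number of correct digits grows by roughly 62% per step once the iterates are close to the root.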

5 Newton-Raphson method

Let x_0 be an initial guess to the root α of f(x) = 0, and let h be the correction, i.e. α = x_0 + h. Then f(α) = 0 implies f(x_0 + h) = 0. Assuming h small and f twice continuously differentiable, Taylor expansion gives

f(x_0) + h f'(x_0) + (h^2/2) f''(η) = 0,   η ∈ I(x_0, x_0 + h).

Neglecting the quadratic term and assuming that α is a simple root, we find

h ≈ −f(x_0)/f'(x_0)  ⟹  x_1 = x_0 − f(x_0)/f'(x_0),

which might be a better approximation to α than x_0. We can continue this process with x_1. Hence the method is given by

x_{n+1} = x_n − f(x_n)/f'(x_n),   n = 0, 1, 2, …

Geometrically, x_{n+1} is the intersection with the x-axis of the tangent to f through the point (x_n, f(x_n)). The method can also be derived from the secant method by letting x_{n−1} approach x_n. The method may or may not converge if the initial guess is too far from the root.
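A Python sketch of the iteration (illustrative names; the stopping test uses the difference of successive iterates):

```python
def newton(f, fprime, x0, tol=1e-12, max_iter=50):
    """Newton-Raphson iteration x_{n+1} = x_n - f(x_n)/f'(x_n)."""
    x = x0
    for _ in range(max_iter):
        x_new = x - f(x) / fprime(x)
        # the difference of successive iterates approximates the error
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# sqrt(5) as the simple root of f(x) = x^2 - 5
root = newton(lambda x: x**2 - 5, lambda x: 2 * x, 2.0)
```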

5.1 Convergence

If α is a simple root, then f'(α) ≠ 0 and hence f'(x) ≠ 0 in a sufficiently small neighbourhood of α. Consider the interval J = {x : |x − α| ≤ δ} in which f'(x) ≠ 0 and

|f''(ξ_n)/(2 f'(x_n))| ≤ M,   x_n, ξ_n ∈ J.

If e_n = α − x_n, then from f(x_n + e_n) = 0 we find

f(x_n) + e_n f'(x_n) + e_n^2 f''(ξ_n)/2 = 0.

Dividing both sides by f'(x_n) and using the iteration formula for the Newton-Raphson method, we arrive at

e_{n+1} = −(f''(ξ_n)/(2 f'(x_n))) e_n^2  ⟹  |e_{n+1}| ≤ M|e_n|^2.

Let ε_n = M|e_n|; then ε_{n+1} ≤ ε_n^2. Choose the initial guess x_0 such that |x_0 − α| < min{1/M, δ}. This implies ε_0 = M|x_0 − α| < min{1, Mδ}. Now choose D with 0 < D < min{1, Mδ}, so that 0 < D < 1 and ε_0 ≤ D < 1. Then

ε_n ≤ ε_{n−1}^2 ≤ ε_{n−2}^{2^2} ≤ ··· ≤ ε_0^{2^n} ≤ D^{2^n}.

Since D < 1, ε_n → 0 as n → ∞ and thus x_n → α as n → ∞. Also, |e_{n+1}| ∼ C|e_n|^2, which means quadratic convergence with asymptotic rate constant C = |f''(α)/(2f'(α))|. Also note that

−f(x_n) = f(α) − f(x_n) = (α − x_n) f'(c_n) ∼ (α − x_n) f'(x_n),

which implies e_n = α − x_n ∼ −f(x_n)/f'(x_n) = x_{n+1} − x_n. Hence the error is approximately the difference between two successive iterates, and this difference can be used as a stopping criterion.

6 Fixed point iteration

In this method, one writes f(x) = 0 in the form x = g(x), so that any solution of x = g(x) (called a fixed point of g) is a solution of f(x) = 0. This can be accomplished in many ways. For example, with f(x) = x^2 − 5, we can write g(x) = (x + 5/x)/2, or g(x) = x + 5 − x^2, or g(x) = x − (5 − x^2)/2, etc. The function g(x) is called an iteration function. Once a g(x) is chosen, we carry out the iteration (starting from an initial guess x_0)

x_{n+1} = g(x_n),   n = 0, 1, 2, …
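A Python sketch of the iteration (illustrative names), using the choice g(x) = (x + 5/x)/2 for f(x) = x^2 − 5:

```python
def fixed_point(g, x0, tol=1e-12, max_iter=100):
    """Iterate x_{n+1} = g(x_n) until successive values agree."""
    x = x0
    for _ in range(max_iter):
        x_new = g(x)
        if abs(x_new - x) < tol:
            return x_new
        x = x_new
    return x

# g(x) = (x + 5/x)/2 satisfies g'(sqrt(5)) = 0, so convergence is fast
root = fixed_point(lambda x: (x + 5 / x) / 2, 2.0)
```

This particular g satisfies g'(√5) = 0, which is why it converges so quickly; a choice such as g(x) = x + 5 − x^2, with |g'(√5)| > 1, would not converge from nearby starting points.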

Theorem: Let g be defined on an interval I = [a, b] such that g(x) ∈ I, i.e. g maps I into itself. Further, suppose that g is differentiable on I and there exists a nonnegative constant K < 1 such that |g'(x)| ≤ K for all x ∈ I. Then there exists a unique fixed point ξ in I, and x_n → ξ as n → ∞.

Proof: If g(a) = a or g(b) = b, then obviously g has a fixed point. Hence suppose that g(a) ≠ a and g(b) ≠ b. Since g maps I into I, we must have g(a) > a and g(b) < b. Consider the function h(x) = x − g(x); then h(a) < 0 and h(b) > 0. By the intermediate value theorem, there exists ξ such that h(ξ) = 0, which proves the existence of a fixed point. To prove uniqueness, suppose that ξ and η are distinct fixed points. Then

|ξ − η| = |g(ξ) − g(η)| = |g'(ζ)||ξ − η| ≤ K|ξ − η| < |ξ − η|,   ζ ∈ I(ξ, η),

which is a contradiction. Hence the fixed point is unique. To prove convergence, consider e_n = ξ − x_n. Then

e_n = g(ξ) − g(x_{n−1}) = g'(c_n) e_{n−1},   c_n ∈ I(ξ, x_{n−1}).

Hence

|e_n| ≤ K|e_{n−1}| ≤ K^2|e_{n−2}| ≤ ··· ≤ K^n|e_0|.

Since 0 ≤ K < 1, we have K^n → 0 as n → ∞ and hence e_n → 0 as n → ∞. Also, assuming that g is twice differentiable, we have

e_{n+1} = ξ − x_{n+1} = g(ξ) − g(ξ − e_n) = e_n g'(ξ) − (e_n^2/2) g''(c_n),   c_n ∈ I(ξ, x_n).

If g'(ξ) ≠ 0, then

|e_{n+1}| ∼ A|e_n|,   A ≈ |g'(ξ)|,

showing that the convergence is first order. On the other hand, when g'(ξ) = 0,

|e_{n+1}| ∼ C|e_n|^2,   C ≈ |g''(ξ)|/2,

showing that the convergence is second order. For example, the Newton-Raphson method is a special case of fixed point iteration with g(x) = x − f(x)/f'(x); if ξ is a simple root of f, its convergence is second order.

It is often very difficult to verify the assumptions of the theorem above. Hence one often checks the following condition instead: if g is continuously differentiable in some open interval containing ξ and |g'(ξ)| < 1, then there exists a δ > 0 such that the fixed point iteration converges whenever we start with x_0 satisfying |x_0 − ξ| ≤ δ.

To show this, take q = (1 + |g'(ξ)|)/2 < 1 and ε = (1 − |g'(ξ)|)/2 > 0. Since g' is continuous, there exists a δ > 0 such that

|g'(x) − g'(ξ)| ≤ ε   whenever |x − ξ| ≤ δ.

Then |g'(x)| ≤ |g'(x) − g'(ξ)| + |g'(ξ)| ≤ ε + |g'(ξ)| = q. Consider I = [ξ − δ, ξ + δ]; we show that g maps I into itself. For x ∈ I,

|g(x) − ξ| = |g(x) − g(ξ)| = |g'(η)||x − ξ|,

where η ∈ I(x, ξ) and hence η ∈ I. Thus |g(x) − ξ| ≤ q|x − ξ| ≤ qδ < δ, showing that g maps I into I, and the theorem applies with K = q.

7 Roots of polynomials

Finding roots of polynomials also involves complex roots, and sometimes we are interested in finding all the roots of a polynomial. We know that a polynomial of degree n has n roots (counting multiplicity) in the complex field. We first state a localization theorem.

Theorem: All roots of the polynomial p(z) = a_n z^n + a_{n−1} z^{n−1} + ··· + a_0 lie in the open disk whose centre is at the origin of the complex plane and whose radius is

ρ = 1 + |a_n|^{−1} max_{0≤i≤n−1} |a_i|.

Proof: Let c = max_{0≤i≤n−1} |a_i|. If c = 0, there is nothing to prove. Hence assume c > 0, so that ρ > 1. We show that p(z) does not vanish in the region |z| ≥ ρ. Noting that |z| ≥ ρ > 1 and c|a_n|^{−1} = ρ − 1, we find

|p(z)| ≥ |a_n z^n| − |a_{n−1} z^{n−1} + a_{n−2} z^{n−2} + ··· + a_0|
 ≥ |a_n z^n| − c Σ_{i=0}^{n−1} |z|^i
 > |a_n z^n| − c |z|^n (|z| − 1)^{−1}
 = |a_n z^n| [1 − c|a_n|^{−1} (|z| − 1)^{−1}]
 ≥ |a_n z^n| [1 − c|a_n|^{−1} (ρ − 1)^{−1}] = 0.

Here we have used |z| ≥ ρ ⟹ |z| − 1 ≥ ρ − 1.

Corollary: If we consider s(z) = z^n p(1/z), then s(z) = a_0 z^n + a_1 z^{n−1} + ··· + a_n, and p(z_0) = 0 implies s(1/z_0) = 0. Hence, if all the roots of s lie inside the disk |z| < ρ_s (where ρ_s is the bound of the theorem applied to s), then all the nonzero roots of p satisfy |z| > 1/ρ_s.
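The bound ρ is immediate to compute. A Python sketch (the coefficient ordering [a_n, …, a_0] and the helper name are our own conventions):

```python
def root_radius(coeffs):
    """rho = 1 + max(|a_0|, ..., |a_{n-1}|)/|a_n| for
    p(z) = a_n z^n + ... + a_0, given as coeffs = [a_n, ..., a_0]."""
    an, rest = coeffs[0], coeffs[1:]
    c = max((abs(a) for a in rest), default=0)
    if c == 0:
        return 1.0  # p(z) = a_n z^n: the only root z = 0 lies in |z| < 1
    return 1 + c / abs(an)

# p(z) = (z - 1)(z - 2)(z - 3) = z^3 - 6z^2 + 11z - 6
rho = root_radius([1, -6, 11, -6])
```

Here ρ = 1 + 11 = 12, and indeed the roots 1, 2, 3 all lie inside |z| < 12; the bound is valid but often far from sharp.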

8 Horner's algorithm

This is also known as nested multiplication, and as synthetic division. For a polynomial p(z) = a_n z^n + a_{n−1} z^{n−1} + ··· + a_0 and a number z_0, we can write

p(z) = (z − z_0) q(z) + p(z_0),

where q(z) = b_{n−1} z^{n−1} + b_{n−2} z^{n−2} + ··· + b_0 is a polynomial of degree one less than that of p. Substituting q(z) and equating like powers, we find

b_{n−1} = a_n,   b_{n−2} = a_{n−1} + b_{n−1} z_0,   …,   b_0 = a_1 + b_1 z_0,   p(z_0) = a_0 + b_0 z_0.

Thus we can use Horner's algorithm to find the value of a polynomial at any point z_0. It can also be used to deflate a polynomial once we know that z_0 is a root. The method can further be used to find the Taylor expansion of a polynomial about any point. Suppose

p(z) = a_n z^n + a_{n−1} z^{n−1} + ··· + a_0 = c_n (z − z_0)^n + c_{n−1} (z − z_0)^{n−1} + ··· + c_0.

Clearly c_k = p^{(k)}(z_0)/k!, and we can use Horner's algorithm to find the c_k efficiently. First, c_0 = p(z_0), which is obtained by applying nested multiplication to p(z). The same computation also gives

q(z) = (p(z) − p(z_0))/(z − z_0) = c_n (z − z_0)^{n−1} + c_{n−1} (z − z_0)^{n−2} + ··· + c_1.

Hence we obtain c_1 by applying nested multiplication to q(z), and this process can be repeated.
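A Python sketch of nested multiplication and its repeated use for Taylor coefficients (function names are ours; coefficients are ordered [a_n, …, a_0]):

```python
def horner(coeffs, z0):
    """Nested multiplication: return (p(z0), coefficients of q), where
    p(z) = (z - z0) q(z) + p(z0) and coeffs = [a_n, ..., a_0]."""
    b = coeffs[0]
    q = []
    for a in coeffs[1:]:
        q.append(b)          # b_{k} becomes a coefficient of the quotient
        b = a + b * z0       # synthetic-division recurrence
    return b, q              # b = p(z0), q = [b_{n-1}, ..., b_0]

def taylor_coeffs(coeffs, z0):
    """c_k = p^(k)(z0)/k! by repeated synthetic division."""
    cs = []
    while coeffs:
        val, coeffs = horner(coeffs, z0)
        cs.append(val)
    return cs                # [c_0, c_1, ..., c_n]

# p(z) = z^3 - 6z^2 + 11z - 6 = (z - 1)(z - 2)(z - 3), evaluated at z0 = 2
value, quotient = horner([1, -6, 11, -6], 2.0)
```

Since 2 is a root, the returned value is 0 and the quotient is the deflated polynomial z^2 − 4z + 3.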

9 Newton-Raphson method for polynomials

We use the iteration

z_{k+1} = z_k − p(z_k)/p'(z_k),   k = 0, 1, 2, …

Note that p(z) = (z − z_k) q(z) + p(z_k) and p'(z_k) = q(z_k). Hence both the numerator and the denominator can be obtained by two steps of nested multiplication. To obtain complex roots, we need a complex initial guess. Alternatively, we can substitute z = α + iβ into p(z) = 0 to get two equations F(α, β) = 0 and G(α, β) = 0, which can be solved together using the Newton-Raphson technique in real arithmetic. Suppose we obtain a root α_1 by the Newton-Raphson method. Then, by deflation,

p(z) ≈ (z − α_1) q_1(z),

where q_1 is a polynomial of degree one less than p. Now we apply Newton-Raphson to q_1 and find another root α_2. Proceeding this way, we find all the roots α_1, α_2, …, α_n such that

p(z) ≈ (z − α_1)(z − α_2) ··· (z − α_n).

Of course, the error in the roots increases from α_2 to α_n, since error gradually builds up in the deflation process. One remedy is to take each of α_2, …, α_n as an initial guess and refine it against the full polynomial.
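A Python sketch combining Newton's iteration with synthetic division and deflation (names, the complex starting guess, and tolerances are all illustrative choices):

```python
def poly_newton(coeffs, z0, tol=1e-12, max_iter=100):
    """Newton on a polynomial; p(z) and p'(z) = q(z) both come from
    nested multiplication (coeffs = [a_n, ..., a_0])."""
    z = z0
    for _ in range(max_iter):
        # first synthetic division: p(z) and quotient q with p'(z) = q(z)
        p, q = coeffs[0], []
        for a in coeffs[1:]:
            q.append(p)
            p = a + p * z
        # second pass: evaluate q at z to get p'(z)
        dp = q[0]
        for b in q[1:]:
            dp = b + dp * z
        if dp == 0:
            break
        z_new = z - p / dp
        if abs(z_new - z) < tol:
            return z_new
        z = z_new
    return z

def all_roots(coeffs):
    """Find all roots by Newton plus deflation (errors accumulate)."""
    roots, c = [], list(coeffs)
    while len(c) > 2:
        r = poly_newton(c, 0.4 + 0.9j)   # complex guess can reach complex roots
        roots.append(r)
        # deflate: divide out (z - r) by synthetic division
        b, new_c = c[0], []
        for a in c[1:]:
            new_c.append(b)
            b = a + b * r
        c = new_c
    roots.append(-c[1] / c[0])           # remaining linear factor
    return roots

roots = all_roots([1, -6, 11, -6])       # p(z) = (z - 1)(z - 2)(z - 3)
```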

10 Müller's method

This is a generalization of the secant method. The method works well for simple and multiple roots, may converge to a complex root even if we start with real initial guesses, and works for non-polynomial functions too. Here we fit a polynomial of degree 2 through three interpolatory points x_{i−2}, x_{i−1}, x_i:

p(x) = f[x_i] + f[x_i, x_{i−1}](x − x_i) + f[x_i, x_{i−1}, x_{i−2}](x − x_i)(x − x_{i−1})
 = f[x_i] + f[x_i, x_{i−1}](x − x_i) + f[x_i, x_{i−1}, x_{i−2}][(x − x_i)^2 + h_i(x − x_i)]
 = a_i (x − x_i)^2 + 2 b_i (x − x_i) + c_i,

where a_i = f[x_i, x_{i−1}, x_{i−2}], b_i = (h_i f[x_i, x_{i−1}, x_{i−2}] + f[x_i, x_{i−1}])/2, c_i = f[x_i], and h_i = x_i − x_{i−1}. Let δ_i be the root of smallest absolute value of the quadratic equation a_i δ^2 + 2 b_i δ + c_i = 0. Then x_{i+1} = x_i + δ_i is the root of p(x) = 0 closest to x_i. Note that

δ_i = (−b_i ± √(b_i^2 − a_i c_i))/a_i = −c_i/(b_i ± √(b_i^2 − a_i c_i)).

We need the sign that makes the denominator largest in absolute value; for real b_i this gives

δ_i = −c_i/(b_i + sgn(b_i) √(b_i^2 − a_i c_i)).

Complex arithmetic may have to be used, since b_i^2 − a_i c_i might be negative. Once we get x_{i+1}, we repeat the same procedure with the interpolatory points x_{i−1}, x_i and x_{i+1}. It can be shown that

e_{i+1} ∼ e_i e_{i−1} e_{i−2} (−f'''(α)/(6 f'(α))).

Further, if |e_{i+1}| ∼ |e_i|^p, then p ≈ 1.84; hence the convergence is superlinear and faster than the secant method.
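A Python sketch of Müller's method (illustrative names; `cmath.sqrt` supplies the complex square root when b_i^2 − a_i c_i < 0):

```python
import cmath

def muller(f, x0, x1, x2, tol=1e-12, max_iter=100):
    """Muller's method: fit a quadratic through the last three points and
    step to its nearest root; complex arithmetic handles b^2 - ac < 0."""
    for _ in range(max_iter):
        f0, f1, f2 = f(x0), f(x1), f(x2)
        h = x2 - x1
        d01 = (f1 - f0) / (x1 - x0)      # f[x_{i-1}, x_{i-2}]
        d12 = (f2 - f1) / (x2 - x1)      # f[x_i, x_{i-1}]
        a = (d12 - d01) / (x2 - x0)      # f[x_i, x_{i-1}, x_{i-2}]
        b = (h * a + d12) / 2
        c = f2
        disc = cmath.sqrt(b * b - a * c)
        # choose the sign that makes the denominator largest in magnitude
        denom = b + disc if abs(b + disc) >= abs(b - disc) else b - disc
        if denom == 0:
            break
        x3 = x2 - c / denom
        if abs(x3 - x2) < tol:
            return x3
        x0, x1, x2 = x1, x2, x3
    return x2

# real starting points, yet the iteration reaches a complex root of x^2 + 1
root = muller(lambda x: x * x + 1, 0.0, 0.5, 1.0)
```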

11 Sensitivity of polynomial roots

Consider p(x) = x^2 − 2x + 1, which has the real roots x_1 = x_2 = 1. If we change the coefficient −2 to −1.9999, the roots become complex: the character of the roots changes from real to complex under a tiny perturbation. Of course, the roots are equal here, but even the well-separated roots of a polynomial can be sensitive. For example, consider the Wilkinson polynomial p(x) = (x − 1)(x − 2) ··· (x − 20), whose roots are simple, real and well separated. A small perturbation of the coefficient of x^19 from −210 to −210 + 2^{−23} changes the roots significantly, and some of them become complex. Hence, care must be exercised while finding the roots of a polynomial.
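The quadratic example is easy to check directly; a Python sketch using the complex-safe quadratic formula (helper name is ours):

```python
import cmath

def quad_roots(a, b, c):
    """Roots of a x^2 + b x + c via the quadratic formula (complex-safe)."""
    disc = cmath.sqrt(b * b - 4 * a * c)
    return (-b + disc) / (2 * a), (-b - disc) / (2 * a)

r1, r2 = quad_roots(1.0, -2.0, 1.0)       # double root at x = 1
s1, s2 = quad_roots(1.0, -1.9999, 1.0)    # tiny perturbation of -2
```

The perturbed discriminant 1.9999^2 − 4 ≈ −4 × 10^{-4} is negative, so the perturbed roots pick up imaginary parts of about ±0.01, a change four orders of magnitude larger than the perturbation itself.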
