A Nonmonotone Inexact Newton Method



Silvia Bonettini
Dipartimento di Matematica, Università di Modena e Reggio Emilia

This research was supported by the Italian Ministry for Education, University and Research (MIUR), FIRB Project RBAU01JYPN.

Abstract. In this paper we describe a variant of the Inexact Newton method for solving nonlinear systems of equations. We define a nonmonotone Inexact Newton step and a nonmonotone backtracking strategy, and we prove convergence theorems for the resulting scheme. Finally, we show how these strategies can be applied to the Inexact Newton Interior–Point method and we present some numerical examples.

Keywords: Nonlinear Systems, Inexact Newton Methods, Nonmonotone Convergence, Newton Interior–Point Methods.

1 Introduction

A classical way to solve a system of nonlinear equations
$$F(x) = 0 \tag{1}$$
where $F : \mathbb{R}^n \to \mathbb{R}^n$ is continuously differentiable, is the Newton method: given a starting point $x_0$, at each step $k$ the Newton equation
$$F'(x_k)\,s_k = -F(x_k) \tag{2}$$
has to be solved in order to determine the Newton direction $s_k$. Then the new iterate is computed by the rule $x_{k+1} = x_k + s_k$. Convergence theorems for the Newton method can be found, for example, in [11]. The main computational task is the solution of (2), which can be very expensive if $n$ is large. The idea of the Inexact Newton method introduced in [2] is to substitute (2) with a condition on its residual:
$$\|F(x_k) + F'(x_k)s_k\| \le \eta_k\|F(x_k)\|,$$
where $\eta_k \in [0,1)$ and $\|\cdot\|$ is an $n$-dimensional vector norm. In order to obtain global convergence properties, the Global Inexact Newton method presented in [5] also requires a further condition that guarantees a "sufficient decrease" of the norm of $F$ at each iterate. A general scheme for this method can be written as follows:

Let $x_0 \in \mathbb{R}^n$ and $\beta \in (0,1)$ be given.
For $k = 0, 1, 2, \ldots$:
  find some $\eta_k \in [0,1)$ and a vector $s_k$ that satisfy
  $$\|F(x_k) + F'(x_k)s_k\| \le \eta_k\|F(x_k)\| \tag{3}$$
  and
  $$\|F(x_k + s_k)\| \le (1 - \beta(1 - \eta_k))\|F(x_k)\|; \tag{4}$$
  set $x_{k+1} = x_k + s_k$.

A vector that satisfies (3) is called an Inexact Newton step at the level $\eta_k$, and the parameter $\eta_k$ is the forcing term. In [5], global convergence theorems have been established for some particular algorithms following this scheme, under the assumption that the sequence of iterates $\{x_k\}$ has a limit point where the Jacobian matrix $F'$ is nonsingular. We observe that condition (3) is a generalization of the Newton equation; hence, in order to satisfy (3), it may be sufficient to solve (2) inexactly, for example by means of an iterative solver. Furthermore, the required accuracy of the solution is tied to the norm of $F$ at the current iterate, so unnecessary computations can be avoided when we are far from the solution. This is an advantage of Inexact Newton methods, especially for large scale problems. Note that, when $\|\cdot\|$ is the Euclidean norm $\|\cdot\|_2$, condition (3) guarantees that the Inexact Newton step is a descent direction for the scalar merit function
$$\Phi(x) = \frac{1}{2}\|F(x)\|_2^2. \tag{5}$$
Indeed, we have the following inequality (we omit the iteration index):
$$
\begin{array}{rcl}
\nabla\Phi(x)^t s &=& F(x)^t F'(x)s = F(x)^t\left[-F(x) + F'(x)s + F(x)\right]\\[2pt]
&=& -\|F(x)\|_2^2 + F(x)^t\left(F'(x)s + F(x)\right) \;\le\; -(1-\eta)\|F(x)\|_2^2 \;\le\; 0.
\end{array}
$$
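As a concrete check of this descent property, here is a small numerical sketch (the two-dimensional system, the point and all names are illustrative choices, not taken from the paper): we build a step whose residual satisfies (3) with $\eta = 0.5$ and verify that $\nabla\Phi(x)^t s < 0$.

```python
import numpy as np

def F(x):                                  # illustrative 2x2 system, not from the paper
    return np.array([x[0]**2 + x[1] - 3.0, x[0] - x[1]**2])

def J(x):                                  # its Jacobian F'(x)
    return np.array([[2.0 * x[0], 1.0],
                     [1.0, -2.0 * x[1]]])

x = np.array([2.0, 1.5])
eta = 0.5                                  # forcing term in [0, 1)

# Prescribe the residual r = F(x) + F'(x)s with ||r|| = 0.5*eta*||F(x)||,
# so that condition (3) holds with room to spare.
rng = np.random.default_rng(0)
r = rng.standard_normal(2)
r *= 0.5 * eta * np.linalg.norm(F(x)) / np.linalg.norm(r)
s = np.linalg.solve(J(x), -F(x) + r)

grad_phi = J(x).T @ F(x)                   # gradient of Phi(x) = 0.5*||F(x)||_2^2
assert grad_phi @ s <= -(1.0 - eta) * np.linalg.norm(F(x))**2 + 1e-10
print(grad_phi @ s)                        # strictly negative: s is a descent direction
```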

Condition (4) ensures that at every iteration the norm of $F$ is reduced, so the sequence $\{\|F(x_k)\|\}$ is monotone nonincreasing. Then we conclude that the Inexact Newton method with the Euclidean norm can be considered as a descent method with line search (4) for the merit function $\Phi(x)$. However, it can be observed that if $x_*$ is a root of $F(x)$, it is also a minimizer of the norm of $F$, but the converse is not true.

In this paper we present a nonmonotone version of the Inexact Newton method, in which both conditions (3) and (4) are relaxed. First of all, it is useful to introduce the following notation. Given $N \in \mathbb{N}$ and a sequence $\{x_k\}$, we denote by $x_{\ell(k)}$ the element with the following property:
$$\|F(x_{\ell(k)})\| = \max_{0 \le j \le \min(N,k)} \|F(x_{k-j})\|. \tag{6}$$
Note that we have $k - \min(N,k) \le \ell(k) \le k$. For example, with $N = 2$, if $\|F(x_2)\| = 4$, $\|F(x_3)\| = 2$ and $\|F(x_4)\| = 1$, then $\ell(4) = 2$: the reference value at iteration 4 is the largest norm over the last three iterates. The modified scheme can be written as follows:

Let $x_0 \in \mathbb{R}^n$ and $\beta \in (0,1)$ be given.
For $k = 0, 1, 2, \ldots$:
  find some $\eta_k \in [0,1)$ and a vector $s_k$ that satisfy
  $$\|F(x_k) + F'(x_k)s_k\| \le \eta_k\|F(x_{\ell(k)})\| \tag{7}$$
  and
  $$\|F(x_k + s_k)\| \le (1 - \beta(1 - \eta_k))\|F(x_{\ell(k)})\|; \tag{8}$$
  set $x_{k+1} = x_k + s_k$.

By analogy with (3), we call a vector $s_k$ satisfying (7) a nonmonotone Inexact Newton step at the level $\eta_k$. Note that a sequence $\{\|F(x_k)\|\}$ satisfying (7) and (8) is nonmonotone, but $\{\|F(x_{\ell(k)})\|\}$ is a monotone nonincreasing subsequence of it. Furthermore, the nonmonotone step is not necessarily a descent direction for the merit function defined in (5). This fact may be useful in some cases to avoid local minima of the merit function where $F'$ is singular.

In the next section we present a backtracking algorithm following the nonmonotone scheme and we state its convergence theorems. In section 3, as a special case, we consider the Inexact Newton interior–point method and we show that, by applying the nonmonotone strategies, the perturbation parameter of the interior–point method can be chosen in a larger range of values. Finally, in section 4, we present some numerical experiments related to the nonmonotone interior–point method.

For the remainder of the paper, we denote $N_\delta(x) = \{y \in \mathbb{R}^n : \|y - x\| < \delta\}$ for $\delta > 0$, and we use the following results.

Lemma 1.1 [11, 2.3.3] Assume that $F'(x)$ is invertible. Then, for any $\epsilon > 0$ there exists $\delta > 0$ such that $F'(y)$ is invertible and $\|F'(x)^{-1} - F'(y)^{-1}\| < \epsilon$, for all $y \in N_\delta(x)$.

Lemma 1.2 [11, 3.1.5] For any $x$ and $\epsilon > 0$, there exists $\delta > 0$ such that $\|F(z) - F(y) - F'(y)(z - y)\| \le \epsilon\|z - y\|$, for all $z, y \in N_\delta(x)$.

2 A nonmonotone Inexact Newton method

The nonmonotone Inexact Newton method can be implemented by using a backtracking strategy. At each step $k$, we determine a forcing term $\bar\eta_k$ and a vector $\bar s_k$ that satisfy the nonmonotone condition (7); then we reduce $\bar s_k$ by means of a damping parameter $\alpha_k$ obtained by a nonmonotone backtracking rule. The nonmonotone Inexact Newton step $s_k = \alpha_k\bar s_k$ satisfies conditions (7) and (8) with $\eta_k = 1 - \alpha_k(1 - \bar\eta_k)$. The algorithm can be stated as follows.

Algorithm 2.1
Step 1. Set $x_0 \in \mathbb{R}^n$, $\beta \in (0,1)$, $0 < \theta_{min} < \theta_{max} < 1$, $\eta_{max} \in (0,1)$, $k = 0$.
Step 2. Determine $\bar\eta_k \in [0, \eta_{max}]$ and $\bar s_k$ that satisfy
$$\|F(x_k) + F'(x_k)\bar s_k\| \le \bar\eta_k\|F(x_{\ell(k)})\|.$$
Set $\alpha_k = 1$.
Step 3. While $\|F(x_k + \alpha_k\bar s_k)\| > (1 - \alpha_k\beta(1 - \bar\eta_k))\|F(x_{\ell(k)})\|$:
  Step 3a. choose $\theta \in [\theta_{min}, \theta_{max}]$;
  Step 3b. set $\alpha_k = \theta\alpha_k$.
Step 4. Set $x_{k+1} = x_k + \alpha_k\bar s_k$, $k = k + 1$, and go to Step 2.
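To make the control flow of Algorithm 2.1 concrete, here is a minimal Python sketch (not from the paper). It assumes a user-supplied routine `inexact_step(x, ref)` returning a pair $(\bar\eta_k, \bar s_k)$ with $\|F(x) + F'(x)\bar s\| \le \bar\eta\,\mathrm{ref}$, for instance an iterative linear solver stopped at that residual level; the window list, the fixed reduction factor and all names are illustrative.

```python
import numpy as np

def nonmonotone_inexact_newton(F, inexact_step, x0, N=4, beta=1e-4, theta=0.5,
                               tol=1e-8, max_it=500, max_backtracks=40):
    """Sketch of Algorithm 2.1 with a fixed reduction factor theta."""
    x = np.asarray(x0, dtype=float)
    norms = [np.linalg.norm(F(x))]            # ||F(x_j)|| for the last N+1 iterates
    for _ in range(max_it):
        if norms[-1] <= tol:
            break
        ref = max(norms)                      # ||F(x_{l(k)})||, definition (6)
        # Step 2: eta in [0, eta_max] and s with ||F(x) + F'(x)s|| <= eta * ref
        eta, s = inexact_step(x, ref)
        alpha = 1.0
        # Step 3: nonmonotone backtracking against the reference value ref
        for _ in range(max_backtracks):
            if np.linalg.norm(F(x + alpha * s)) <= (1 - alpha * beta * (1 - eta)) * ref:
                break
            alpha *= theta                    # Steps 3a-3b
        else:
            raise RuntimeError("breakdown: backtracking failed")
        x = x + alpha * s                     # Step 4
        norms.append(np.linalg.norm(F(x)))
        norms = norms[-(N + 1):]              # keep the window defining l(k)
    return x
```

The damping parameter `alpha` found by the loop plays the role of $\alpha_k$, so the accepted step satisfies (7) and (8) with $\eta_k = 1 - \alpha_k(1 - \bar\eta_k)$, as discussed above.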

The following lemma shows that if $\bar\eta_k \in [0, \eta_{max}]$ and $\bar s_k$ satisfy the condition at Step 2, then the vector $\alpha\bar s_k$ is a nonmonotone Inexact Newton step at the level $\eta(\alpha) = 1 - \alpha(1 - \bar\eta_k)$ for any $\alpha \in (0, \alpha_{max}]$, where $\alpha_{max} \le 1$. Furthermore, in $(0, \alpha_{max}]$ the condition $\|F(x_k + \alpha\bar s_k)\| \le (1 - \alpha\beta(1 - \bar\eta_k))\|F(x_{\ell(k)})\|$ is verified.

Lemma 2.1 Let $\beta \in (0,1)$; suppose that there exist $\bar\eta \in [0,1)$ and $\bar s$ satisfying $\|F(x_k) + F'(x_k)\bar s\| \le \bar\eta\|F(x_{\ell(k)})\|$. Then there exist $\alpha_{max} \in (0,1]$ and a vector $s$ such that
$$\|F(x_k) + F'(x_k)s\| \le \eta\|F(x_{\ell(k)})\| \tag{9}$$
$$\|F(x_k + s)\| \le (1 - \beta\alpha(1 - \eta))\|F(x_{\ell(k)})\| \tag{10}$$
hold for any $\alpha \in (0, \alpha_{max}]$, where $\eta = 1 - \alpha(1 - \bar\eta) \in [\bar\eta, 1)$.

Proof. Let $s = \alpha\bar s$. Then we have
$$
\begin{array}{rcl}
\|F(x_k) + F'(x_k)s\| &=& \|F(x_k) - \alpha F(x_k) + \alpha F(x_k) + \alpha F'(x_k)\bar s\|\\
&\le& (1-\alpha)\|F(x_k)\| + \alpha\|F(x_k) + F'(x_k)\bar s\|\\
&\le& (1-\alpha)\|F(x_{\ell(k)})\| + \alpha\bar\eta\|F(x_{\ell(k)})\|\\
&=& \eta\|F(x_{\ell(k)})\|,
\end{array}
$$
so (9) is proved. Now let
$$\varepsilon = \frac{(1-\beta)(1-\bar\eta)}{\|\bar s\|}\,\|F(x_{\ell(k)})\| \tag{11}$$
and let $\delta > 0$ be sufficiently small (see Lemma 1.2) that
$$\|F(x_k + s) - F(x_k) - F'(x_k)s\| \le \varepsilon\|s\| \tag{12}$$
whenever $\|s\| < \delta$. Choosing $\alpha_{max} = \min(1, \delta/\|\bar s\|)$, for any $\alpha \in (0, \alpha_{max}]$ we have $\|s\| < \delta$ and then, using (11) and (12), we obtain the following inequality:
$$
\begin{array}{rcl}
\|F(x_k + s)\| &\le& \|F(x_k + s) - F(x_k) - F'(x_k)s\| + \|F(x_k) + F'(x_k)s\|\\
&\le& \varepsilon\alpha\|\bar s\| + \eta\|F(x_{\ell(k)})\|\\
&=& \left((1-\beta)(1-\bar\eta)\alpha + 1 - \alpha(1-\bar\eta)\right)\|F(x_{\ell(k)})\|\\
&=& (1 - \beta\alpha(1-\bar\eta))\|F(x_{\ell(k)})\|\\
&\le& (1 - \beta\alpha(1-\eta))\|F(x_{\ell(k)})\|,
\end{array}
$$
which completes the proof. □

A consequence of the previous lemma is that the while loop at Step 3 terminates. Indeed, at each iteration $k$ the backtracking condition
$$\|F(x_k + \alpha\bar s_k)\| \le (1 - \alpha\beta(1 - \bar\eta_k))\|F(x_{\ell(k)})\| \tag{13}$$
is satisfied for $\alpha < \alpha_{max}$, where $\alpha_{max}$ depends on $k$. Since the value of $\alpha_k$ is reduced at Step 3b by a factor $\theta \le \theta_{max} < 1$, there exists a positive integer $p$ such that $(\theta_{max})^p < \alpha_{max}$, and so the while loop terminates after at most $p$ steps. When it is impossible to determine $x_{k+1}$ we say that the algorithm breaks down. Then, Lemma 2.1 yields that Algorithm 2.1 breaks down if and only if it is impossible to find a nonmonotone Inexact Newton step at any level.

Theorem 2.1 Let $\{x_k\}$ be a sequence such that $\lim_{k\to\infty} F(x_k) = 0$ and for each $k$ the following conditions hold:
$$\|F(x_k) + F'(x_k)s_k\| \le \eta\|F(x_{\ell(k)})\|, \tag{14}$$
$$\|F(x_{k+1})\| \le \|F(x_{\ell(k)})\|, \tag{15}$$
where $s_k = x_{k+1} - x_k$ and $\eta < 1$. If $x_*$ is a limit point of $\{x_k\}$, then $F(x_*) = 0$; moreover, if $F'(x_*)$ is nonsingular, then the sequence $\{x_k\}$ converges to $x_*$.

Proof. If $x_*$ is a limit point of the sequence $\{x_k\}$, there exists a subsequence $\{x_{k_j}\}$ of $\{x_k\}$ convergent to $x_*$. By the continuity of $F$, we obtain
$$F(x_*) = F\Big(\lim_{j\to\infty} x_{k_j}\Big) = \lim_{j\to\infty} F(x_{k_j}) = 0.$$
Furthermore, since $\{x_{\ell(k)}\}$ is a subsequence of $\{x_k\}$, the sequence $\{F(x_{\ell(k)})\}$ also converges to zero as $k$ diverges. Denote $K = \|F'(x_*)^{-1}\|$ and let $\delta > 0$ be sufficiently small that $F'(y)^{-1}$ exists whenever $y \in N_\delta(x_*)$; thus we can suppose
$$\|F'(y)^{-1}\| \le 2K, \tag{16}$$
$$\|F(y) - F(x_*) - F'(x_*)(y - x_*)\| \le \frac{1}{2K}\|y - x_*\|.$$
Then for any $y \in N_\delta(x_*)$ we have
$$
\begin{array}{rcl}
\|F(y)\| &=& \|F'(x_*)(y - x_*) + F(y) - F(x_*) - F'(x_*)(y - x_*)\|\\
&\ge& \|F'(x_*)(y - x_*)\| - \|F(y) - F(x_*) - F'(x_*)(y - x_*)\|\\
&\ge& \frac{1}{K}\|y - x_*\| - \frac{1}{2K}\|y - x_*\| \;=\; \frac{1}{2K}\|y - x_*\|.
\end{array}
$$
Then
$$\|y - x_*\| \le 2K\|F(y)\| \tag{17}$$
holds for any $y \in N_\delta(x_*)$. Now let $\epsilon \in (0, \delta/4)$; since $x_*$ is a limit point of $\{x_k\}$ and $\|F(x_{\ell(k)})\| \to 0$, there exists a $k$ sufficiently large that $x_k \in N_{\delta/2}(x_*)$ and
$$x_{\ell(k)} \in S_\epsilon \equiv \left\{y : \|F(y)\| < \frac{\epsilon}{2K(1+\eta)}\right\}.$$
For such a $k$, conditions (14) and (16) yield
$$\|s_k\| \le \|F'(x_k)^{-1}\|\left(\|F(x_k) + F'(x_k)s_k\| + \|F(x_k)\|\right) \le 2K(1+\eta)\|F(x_{\ell(k)})\| < \epsilon,$$
so that $x_{k+1} \in N_\delta(x_*)$; then (17) and (15) give $\|x_{k+1} - x_*\| \le 2K\|F(x_{k+1})\| \le 2K\|F(x_{\ell(k)})\| < \epsilon < \delta/2$. By induction, all the subsequent iterates remain in $N_{\delta/2}(x_*)$ and, again from (17), $\|x_j - x_*\| \le 2K\|F(x_j)\| \to 0$; that is, $\{x_k\}$ converges to $x_*$. □

Lemma 2.2 Suppose that Algorithm 2.1 does not break down. If $x_*$ is a limit point of $\{x_k\}$ such that $F'(x_*)$ is nonsingular, then there exists $\tau$ such that, for infinitely many $k$, $\alpha_k \ge \tau > 0$.

Proof. Denoting $K = \|F'(x_*)^{-1}\|$, we can find $\delta > 0$ such that:
(i) $F'(x)^{-1}$ exists whenever $x \in N_\delta(x_*)$;
(ii) $\|F'(x)^{-1}\| \le 2K$ for all $x \in N_\delta(x_*)$;
(iii) $\|F(x) - F(y) - F'(y)(x - y)\| \le \dfrac{(1-\beta)(1-\eta_{max})}{2K(1+\eta_{max})}\,\|x - y\|$ for all $x, y \in N_{2\delta}(x_*)$.

Since $x_*$ is a limit point, there exist infinitely many $k$ such that $x_k \in N_\delta(x_*)$, for which the following condition holds:
$$\|\bar s_k\| \le \|F'(x_k)^{-1}\|\left(\|F'(x_k)\bar s_k + F(x_k)\| + \|F(x_k)\|\right) \le 2K(1+\eta_{max})\|F(x_{\ell(k)})\|. \tag{18}$$
Since $s_k = \alpha\bar s_k$, formula (18) can be written as
$$\|s_k\| \le \Gamma\alpha\|F(x_{\ell(k)})\| \tag{19}$$
where $\Gamma = 2K(1+\eta_{max})$. Now we show that if $\alpha \le \delta/(\Gamma\|F(x_{\ell(k)})\|)$, then the while loop terminates. By means of condition (iii), Lemma 2.1 and formula (19) we can write
$$
\begin{array}{rcl}
\|F(x_k + s_k)\| &\le& \|F(x_k) + F'(x_k)s_k\| + \|F(x_k + s_k) - F(x_k) - F'(x_k)s_k\|\\[2pt]
&\le& \eta\|F(x_{\ell(k)})\| + \dfrac{(1-\beta)(1-\eta_{max})}{\Gamma}\,\|s_k\|\\[2pt]
&\le& \left((1 - \alpha(1-\bar\eta)) + (1-\beta)\alpha(1-\bar\eta)\right)\|F(x_{\ell(k)})\|.
\end{array}
$$
Thus
$$\|F(x_k + \alpha\bar s_k)\| \le (1 - \alpha\beta(1-\bar\eta))\|F(x_{\ell(k)})\|.$$
This inequality shows that the backtracking condition (13) is satisfied for $\alpha \le \delta/(\Gamma\|F(x_{\ell(k)})\|)$ and, since $\alpha$ is reduced at every step by a factor $\theta \le \theta_{max} < 1$, the while loop terminates. Suppose now that the while loop has been executed at least once; denote by $\alpha_k$ the final value (i.e. the value of $\alpha$ for which (13) is satisfied) and by $\bar\alpha_k$ the previous one. At the penultimate step condition (13) is not satisfied, so necessarily we have
$$\bar\alpha_k > \frac{\delta}{\Gamma\|F(x_{\ell(k)})\|}$$
and so
$$\alpha_k = \theta\bar\alpha_k > \frac{\delta\theta_{min}}{\Gamma\|F(x_{\ell(k)})\|} \ge \frac{\delta\theta_{min}}{\Gamma\|F(x_0)\|}.$$
Hence Lemma 2.2 is proved with $\tau = \min\left(1, \delta\theta_{min}/(\Gamma\|F(x_0)\|)\right)$. □

From Lemma 2.2 we can derive the following corollary, which is used in the proof of the convergence theorem.

Corollary 2.1 Suppose that Algorithm 2.1 does not break down. If $x_*$ is a limit point of $\{x_k\}$ such that $F'(x_*)$ is nonsingular and $\{x_{k_j}\}$ is a subsequence converging to $x_*$, then the sequence $\{\alpha_{k_j}\}$ is bounded away from zero.

Now we can state the convergence theorem. The proof is similar to that of the theorem in section 3 of [7].

Theorem 2.2 Suppose that Algorithm 2.1 does not break down and that the norm of the inexact Newton step is bounded for every $k$ by a positive constant $M$:
$$\|\bar s_k\| \le M. \tag{20}$$
Assume also that at least one of the two following properties holds:
$$F \text{ is Lipschitz continuous;} \tag{21}$$
$$\text{the set } \Omega(0) = \{x \in \mathbb{R}^n : \|F(x)\| \le \|F(x_0)\|\} \text{ is compact.} \tag{22}$$
If $x_*$ is a limit point of $\{x_k\}$ such that $F'(x_*)$ is invertible, then $F(x_*) = 0$ and $\{x_k\}$ converges to $x_*$ as $k$ diverges.

Proof. Since $\{\|F(x_{\ell(k)})\|\}$ is a monotone nonincreasing, bounded sequence, there exists $L \ge 0$ such that
$$L = \lim_{k\to\infty}\|F(x_{\ell(k)})\|.$$

Thus, writing the backtracking condition (13) for the iterate $\ell(k)$, we obtain
$$\|F(x_{\ell(k)})\| \le (1 - \alpha_{\ell(k)-1}\beta(1 - \bar\eta_{\ell(k)-1}))\|F(x_{\ell(\ell(k)-1)})\|. \tag{23}$$
When $k$ diverges, we can write
$$L \le L - L\cdot\lim_{k\to\infty}\alpha_{\ell(k)-1}\beta(1 - \bar\eta_{\ell(k)-1}). \tag{24}$$
Since $\beta$ is a constant and $1 - \bar\eta_j \ge 1 - \eta_{max} > 0$ for any $j$, (24) yields
$$L\cdot\lim_{k\to\infty}\alpha_{\ell(k)-1} \le 0,$$
which implies
$$L = 0 \qquad\text{or}\qquad \lim_{k\to\infty}\alpha_{\ell(k)-1} = 0. \tag{25}$$
Suppose that $L \ne 0$, so that (25) holds. Let $\hat\ell(k) = \ell(k+N+1)$, so that $k + N + 1 \ge \hat\ell(k) > k$, and let us show by induction that for any $j \ge 1$ we have
$$\lim_{k\to\infty}\alpha_{\hat\ell(k)-j} = 0 \tag{26}$$
and
$$\lim_{k\to\infty}\|F(x_{\hat\ell(k)-j})\| = L. \tag{27}$$
For $j = 1$, since $\{\alpha_{\hat\ell(k)-1}\}$ is a subsequence of $\{\alpha_{\ell(k)-1}\}$, (25) implies (26). From (20) we also obtain
$$\lim_{k\to\infty}\|x_{\hat\ell(k)} - x_{\hat\ell(k)-1}\| = 0. \tag{28}$$
If (21) holds, from $\big|\,\|F(x)\| - \|F(y)\|\,\big| \le \|F(x) - F(y)\|$ and (28) we obtain
$$\lim_{k\to\infty}\|F(x_{\hat\ell(k)-1})\| = L. \tag{29}$$
If, instead of (21), (22) holds, then, exploiting the uniform continuity of $F$ on $\Omega(0)$, we can again derive (29). Assume now that (26) and (27) hold for a given $j$. We have
$$\|F(x_{\hat\ell(k)-j})\| \le (1 - \alpha_{\hat\ell(k)-(j+1)}\beta(1 - \eta_{\hat\ell(k)-(j+1)}))\|F(x_{\ell(\hat\ell(k)-(j+1))})\|.$$
Using the same arguments employed above, since $L > 0$, we obtain
$$\lim_{k\to\infty}\alpha_{\hat\ell(k)-(j+1)} = 0$$
and so
$$\lim_{k\to\infty}\|x_{\hat\ell(k)-j} - x_{\hat\ell(k)-(j+1)}\| = 0,\qquad \lim_{k\to\infty}\|F(x_{\hat\ell(k)-(j+1)})\| = L.$$
Thus we conclude that (26) and (27) hold for any $j \ge 1$. Now, for any $k$, we can write
$$\|x_{k+1} - x_{\hat\ell(k)}\| \le \sum_{j=1}^{\hat\ell(k)-k-1}\alpha_{\hat\ell(k)-j}\|\bar s_{\hat\ell(k)-j}\|,$$
so that, since $\hat\ell(k) - k - 1 \le N$, we have
$$\lim_{k\to\infty}\|x_{k+1} - x_{\hat\ell(k)}\| = 0. \tag{30}$$
Furthermore, we have
$$\|x_{\hat\ell(k)} - x_*\| \le \|x_{\hat\ell(k)} - x_{k+1}\| + \|x_{k+1} - x_*\|. \tag{31}$$
Since $x_*$ is a limit point of $\{x_{k+1}\}$ and (30) holds, (31) implies that $x_*$ is a limit point of the sequence $\{x_{\hat\ell(k)}\}$. From (28) we conclude that $x_*$ is a limit point also of the sequence $\{x_{\hat\ell(k)-1}\}$, which contradicts Corollary 2.1: indeed, there would exist $\tau > 0$ such that $\alpha_{\hat\ell(k)-1} > \tau$ for infinitely many $k$, while (26) holds. Hence we necessarily have $L = 0$, which implies
$$\lim_{k\to\infty}\|F(x_k)\| = 0.$$
Now Theorem 2.1 completes the proof. □

Theorem 2.3 Under the hypotheses of Theorem 2.2, the sequence $\{\|F(x_k)\|\}$ converges and
$$\lim_{k\to\infty}\|F(x_k)\| = \lim_{k\to\infty}\|F(x_{\ell(k)})\|.$$

Proof. If $\lim_{k\to\infty}\|F(x_{\ell(k)})\| = 0$, then $\lim_{k\to\infty}\|F(x_k)\| = 0$. If $\lim_{k\to\infty}\|F(x_{\ell(k)})\| = L > 0$, using the same arguments as in the first part of the proof of Theorem 2.2 we can conclude that (30) holds. If (21) or (22) holds, then $\lim_{k\to\infty}\|F(x_k)\| = L = \lim_{k\to\infty}\|F(x_{\ell(k)})\|$. □

3 An application: a nonmonotone Inexact Newton Interior–Point method

First, we recall the basic concepts of the Inexact Newton interior–point method as a special case of the Inexact Newton method; for the details we refer to [4]. Here and for the remainder of the paper, we assume $\|\cdot\| = \|\cdot\|_2$. Consider the nonlinear programming problem
$$\min f(x) \quad\text{s.t.}\quad g_1(x) = 0,\; g_2(x) \ge 0, \tag{32}$$
where $x \in \mathbb{R}^n$, $f : \mathbb{R}^n \to \mathbb{R}$, $g_1 : \mathbb{R}^n \to \mathbb{R}^{neq}$, $g_2 : \mathbb{R}^n \to \mathbb{R}^p$. By introducing the slack variables $s$ on the inequality constraints, the Karush–Kuhn–Tucker (KKT) optimality conditions for problem (32) are given by the following system of nonlinear equations:
$$H(v) \equiv \begin{pmatrix} \nabla f(x) - \nabla g_1(x)\lambda - \nabla g_2(x)w \\ -g_1(x) \\ -g_2(x) + s \\ SWe_p \end{pmatrix} = 0, \qquad s, w \ge 0, \tag{33}$$
where $\lambda \in \mathbb{R}^{neq}$, $s, w \in \mathbb{R}^p$ and $W = \mathrm{diag}(w)$, $S = \mathrm{diag}(s)$. Here $\lambda$ and $w$ are the Lagrange multipliers related to the equality and inequality constraints respectively; the vector $e_j$ denotes the vector of $j$ components all equal to 1. Furthermore, we set $v = (x^t, \lambda^t, w^t, s^t)^t$ and $\tilde n \equiv n + neq + 2p$ (the size of the system (33)). The first $n + neq + p$ components of the vector $H(v)$,
$$G(v) = \begin{pmatrix} \nabla f(x) - \nabla g_1(x)\lambda - \nabla g_2(x)w \\ -g_1(x) \\ -g_2(x) + s \end{pmatrix},$$
represent the gradient of the Lagrangian function of the minimum problem, while the last $p$ equations in (33), $SWe_p = 0$, are called the complementarity conditions. In the framework of the Newton interior–point method, instead of (33) we consider the perturbed KKT conditions
$$H(v) = \rho\tilde e, \qquad s, w > 0, \tag{34}$$
with $\rho > 0$ and $\tilde e = (0^t_{n+neq+p}, e^t_p)^t$; given a starting point $v_0$ with $(s_0, w_0) > 0$, at the iteration $k$ we have to solve the perturbed Newton equation
$$H'(v_k)\Delta v = -H(v_k) + \rho_k\tilde e, \tag{35}$$
so that the iterates satisfy the positivity condition on $(s_k, w_k)$. The perturbation parameter $\rho_k$ can be defined as
$$\rho_k = \sigma_k\mu_k, \tag{36}$$
with $\sigma_k \in (0,1)$ and $\mu_k > 0$.
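For concreteness, here is a minimal sketch of how $H(v)$ in (33) and the perturbed equation (35) might be assembled for a tiny instance of (32) with $n = 2$, $neq = p = 1$; the objective, the constraints, the point $v$ and the finite-difference Jacobian are illustrative choices, not taken from the paper.

```python
import numpy as np

# Tiny instance of (32): n = 2, neq = 1, p = 1 (all functions illustrative).
f_grad = lambda x: np.array([2.0 * x[0], 2.0 * x[1]])   # f(x) = x1^2 + x2^2
g1     = lambda x: np.array([x[0] + x[1] - 1.0])        # equality constraint
g1_jac = lambda x: np.array([[1.0, 1.0]])
g2     = lambda x: np.array([x[0] - 0.2])               # inequality constraint
g2_jac = lambda x: np.array([[1.0, 0.0]])

def H(v):
    """KKT residual (33) for v = (x, lambda, w, s)."""
    x, lam, w, s = v[:2], v[2:3], v[3:4], v[4:5]
    return np.concatenate([
        f_grad(x) - g1_jac(x).T @ lam - g2_jac(x).T @ w,
        -g1(x),
        -g2(x) + s,
        s * w,                                          # SWe_p (complementarity)
    ])

v = np.array([0.5, 0.5, 1.0, 1.0, 0.3])                 # (x, lambda, w, s), s, w > 0
sigma = 0.3
mu = v[4] * v[3] / 1.0                                  # s^t w / p with p = 1
rho = sigma * mu                                        # perturbation parameter (36)
e_tilde = np.zeros(5); e_tilde[-1] = 1.0

# Forward-difference approximation of H'(v), purely for illustration.
h = 1e-6
Hp = np.array([(H(v + h * np.eye(5)[i]) - H(v)) / h for i in range(5)]).T
dv = np.linalg.solve(Hp, -H(v) + rho * e_tilde)         # perturbed Newton equation (35)
```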

Now we briefly recall the conditions that enable us to view the Newton interior–point method as an Inexact Newton method applied to the system (33). Consider the Newton equation for (33):
$$H'(v_k)\Delta v_k = -H(v_k). \tag{37}$$
The residual vector $r_k \in \mathbb{R}^{\tilde n}$ for (37) can be written as
$$r_k = H'(v_k)\Delta v_k + H(v_k). \tag{38}$$
If we suppose that $r_k$ is given by the following expression,
$$r_k = \rho_k\begin{pmatrix} 0_{n+neq+p} \\ e_p \end{pmatrix}, \tag{39}$$
then we obtain $\|r_k\| = \rho_k\|e_p\| = \rho_k\sqrt{p}$. Note that if we choose
$$\mu_k \le \frac{\|H(v_k)\|}{\sqrt{p}}, \tag{40}$$
as in [3], and if $\Delta v_k$ satisfies (38) where $r_k$ is given by (39), then $\Delta v_k$ is an Inexact Newton step at the level $\sigma_k$ for the system (33). In interior–point methods a suitable choice of $\mu_k$ is $\mu_k = \frac{s_k^t w_k}{p}$; we have $\frac{s_k^t w_k}{p} \le \frac{\|H(v_k)\|}{\sqrt{p}}$, so (40) is satisfied. Furthermore, a sufficient condition for (39) is that $\Delta v_k$ is an exact solution of the perturbed equation (35), so (40) guarantees that the vector computed at every step of the interior–point method by solving (35) exactly is an Inexact Newton step. Suppose now that the residual of (37) at the iteration $k$ has the following expression, instead of (39):
$$r_k = \begin{pmatrix} \bar r_k \\ \rho_k e_p \end{pmatrix}, \tag{41}$$
where $\bar r_k \in \mathbb{R}^{n+neq+p}$ satisfies the condition
$$\|\bar r_k\| \le \delta_k\|H(v_k)\|. \tag{42}$$
Now, if (40) and (42) hold and $\sigma_k + \delta_k < 1$, then $\Delta v_k$ in (38) is an Inexact Newton step at the level $\sigma_k + \delta_k$ for the system (33). Indeed, we have
$$\|r_k\|^2 = p\rho_k^2 + \|\bar r_k\|^2 \le (\sigma_k^2 + \delta_k^2)\|H(v_k)\|^2 \le (\sigma_k + \delta_k)^2\|H(v_k)\|^2,$$
which implies $\|r_k\| \le (\delta_k + \sigma_k)\|H(v_k)\|$. In order to obtain a residual vector as in (41), one may solve the equation (35) inexactly in the first $n + neq + p$ equations, by means of an iterative solver, using condition (42) as inner stopping criterion. So, the conditions on $\|\bar r_k\|$, $\delta_k$, $\sigma_k$ and $\mu_k$ allow us to compute only an inexact solution of (35), obtaining again an Inexact Newton step. This approach is useful when $\tilde n$ is large and the computation of an exact solution can be too expensive. If we replace (40) and (42) with
$$\mu_k \in \left[\frac{s_k^t w_k}{p},\; \frac{\|H(v_{\ell(k)})\|}{\sqrt{p}}\right] \tag{43}$$
and
$$\|\bar r_k\| \le \delta_k\|H(v_{\ell(k)})\|, \tag{44}$$
then it is easy to verify, using the same observations employed above, that a vector $\Delta v_k$ for which the residual $r_k$ has the form in (41) is a nonmonotone Inexact Newton step.
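This level bound is easy to check numerically. The following sketch (illustrative sizes and data only) builds a residual of the form (41) with $\|\bar r_k\|$ satisfying (44) as an equality and $\mu_k$ at the upper end of the interval (43), and verifies that the resulting level does not exceed $\sigma_k + \delta_k$.

```python
import numpy as np

def residual_level(rbar, sigma, mu, p, H_ref):
    """Norm of a residual of the form (41), r = (rbar, rho*e_p) with
    rho = sigma*mu as in (36), divided by the reference value ||H(v_l(k))||."""
    rho = sigma * mu
    return np.sqrt(np.linalg.norm(rbar)**2 + p * rho**2) / H_ref

p, H_ref = 3, 2.0                     # illustrative size and reference norm
sigma, delta = 0.4, 0.1               # with sigma + delta < 1
mu = H_ref / np.sqrt(p)               # upper endpoint of the interval (43)
rng = np.random.default_rng(1)
rbar = rng.standard_normal(5)
rbar *= delta * H_ref / np.linalg.norm(rbar)   # enforce (44) with equality
level = residual_level(rbar, sigma, mu, p, H_ref)
assert level <= sigma + delta         # nonmonotone inexact Newton level sigma_k + delta_k
```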

After the computation of the direction $\Delta v_k$, the next iterate of an interior–point method is determined by the updating rule $v_{k+1} = v_k + \alpha_k\Delta v_k$, where $\alpha_k \in (0,1]$ has to be chosen in order to guarantee the positivity of the components of $s_{k+1}$ and $w_{k+1}$. Furthermore, the parameter $\alpha_k \in (0,1]$ must be selected so that the centrality conditions (see e.g. [6]) are satisfied. Finally, we include in the method the nonmonotone backtracking strategy seen in the previous section. We can now state the nonmonotone Inexact Newton interior–point method.

Algorithm 3.1
Step 1. Fix $v_0$ such that $(s_0, w_0) > 0$ and choose the positive parameters as follows:
- $\tau_1 < 1$ and $\tau_1 \le p\left(\min_{i=1,\ldots,p}(s_0)_i(w_0)_i\right)/s_0^t w_0$;
- $\tau_2 \le (s_0^t w_0)/\|G(v_0)\|$;
- $\tilde\delta, \beta, \theta, tol \in (0,1)$;
- $\delta_{max} + \sigma_{max} < 1$ and $\sigma_{max} \ge \sqrt{2}\,\tau_2\,\delta_{max}/\min(1,\tau_2) + \tilde\delta$;
- $\tilde\delta \le \sigma_{min} < \sigma_{max}$.
Set $k \leftarrow 0$.
Step 2. If $\|H(v_k)\| \le tol$ then stop; else choose the positive parameters $\sigma_k$, $\delta_k$, $\mu_k$ such that:
- $0 \le \delta_k \le \delta_{max}$;
- $\sigma_{min} \le \sigma_k \le \sigma_{max}$ and $\sigma_k \ge \tilde\delta + \sqrt{2}\,\delta_k/\min(1,\tau_2)$;
- $\mu_k \in \left[\frac{s_k^t w_k}{p}, \frac{\|H(v_{\ell(k)})\|}{\sqrt{p}}\right]$, as in (43).
Step 3. Find $\Delta v_k = (\Delta x_k^t, \Delta\lambda_k^t, \Delta w_k^t, \Delta s_k^t)^t$ such that (35) holds with $r_k$ defined in (41) and $\|\bar r_k\| \le \delta_k\|H(v_{\ell(k)})\|$, as in (44).
Step 4. Compute $\tilde\alpha_k = \min(\alpha_k^{(1)}, \alpha_k^{(2)})$, where $\alpha_k^{(1)}$ and $\alpha_k^{(2)}$ are the largest numbers in $(0,1]$ such that the following centrality conditions hold for any $\alpha \in (0, \alpha_k^{(1)}]$ and any $\alpha \in (0, \alpha_k^{(2)}]$ respectively:
$$\min_{i=1,\ldots,p} s_k(\alpha)_i w_k(\alpha)_i \ge (\tau_1/p)\,s_k(\alpha)^t w_k(\alpha), \tag{45}$$
$$s_k(\alpha)^t w_k(\alpha) \ge \tau_2\|G(v(\alpha))\|, \tag{46}$$
where $v(\alpha) = v_k + \alpha\Delta v_k$.
Step 5. If
$$\|H(v_k + \tilde\alpha_k\Delta v_k)\| \le (1 - \tilde\alpha_k\beta(1 - (\delta_k + \sigma_k)))\|H(v_{\ell(k)})\|, \tag{47}$$
go to Step 6; else update $\tilde\alpha_k = \theta\tilde\alpha_k$ and go to Step 5. Denote by $\alpha_k$ the last value of $\tilde\alpha_k$.
Step 6. Update $v_{k+1} = v_k + \alpha_k\Delta v_k$. Set $k \leftarrow k + 1$ and go to Step 2.

Steps 2, 3, 5 and 6 enable us to consider Algorithm 3.1 as a special case of Algorithm 2.1. At the first step all the parameters are set, while at Step 4 the centrality conditions are stated. One can observe that conditions (45) and (46) prevent the last $p$ components of the vector $H(v)$ (related to the complementarity equations) from becoming smaller than $\|G(v)\|$ at every iterate.
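In an implementation, the step lengths $\alpha_k^{(1)}$, $\alpha_k^{(2)}$ of Step 4 are often obtained by trial reduction. The sketch below approximates the largest admissible $\alpha$ by simple backtracking on conditions (45)–(46); the helper `G_norm` and the index arrays for the $s$ and $w$ blocks are assumptions for illustration (Algorithm 3.1 itself takes the exact largest values).

```python
import numpy as np

def centrality_steplength(v, dv, s_idx, w_idx, tau1, tau2, G_norm,
                          theta=0.5, max_halvings=60):
    """Backtracking approximation of Step 4: returns an alpha in (0, 1] such
    that the centrality conditions (45) and (46) hold at v(alpha) = v + alpha*dv."""
    p = len(s_idx)
    alpha = 1.0
    for _ in range(max_halvings):
        va = v + alpha * dv
        s, w = va[s_idx], va[w_idx]
        if np.all(s > 0) and np.all(w > 0):   # interior-point positivity
            sw = s @ w
            if np.min(s * w) >= (tau1 / p) * sw and sw >= tau2 * G_norm(va):
                return alpha                  # conditions (45) and (46) both hold
        alpha *= theta
    raise RuntimeError("no admissible step length found")
```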

For the analysis of the convergence, it is useful to introduce the set
$$\Omega(\epsilon) = \{v \in \mathbb{R}^{\tilde n} : \epsilon \le \|H(v)\| \le \|H(v_0)\|,\ v \text{ satisfies conditions (45) and (46)}\}. \tag{48}$$
We observe that all the iterates $v_k$ belong to $\Omega(0)$. For the convergence of Algorithm 3.1 we make the following assumptions:

A1. In $\Omega(0)$, $f(x)$, $g_1(x)$, $g_2(x)$ are twice continuously differentiable and the derivative of $G(v)$ is Lipschitz continuous. Moreover, the columns of $\nabla g_1(x)$ are linearly independent.
A2. $\Omega(0)$ is a compact set.
A3. The matrix $H'(v_k)$ is nonsingular for any $k \ge 0$.

Assumption A2 implies that the iteration sequence $\{v_k\}$ is bounded. First we prove some lemmas used in the proof of the convergence theorem presented below.

Lemma 3.1 Let $\{v_k\}$ be generated by Algorithm 3.1. Under the assumptions A1–A3 there exists a positive constant $M$ such that $\|\Delta v_k\| \le M$.

Proof. Recalling that the direction $\Delta v_k$ computed at Step 3 is a nonmonotone Inexact Newton step at the level $\sigma_k + \delta_k$, we obtain the following inequality:
$$\|\Delta v_k\| \le \|H'(v_k)^{-1}\|\left(\|H'(v_k)\Delta v_k + H(v_k)\| + \|H(v_k)\|\right) \le \|H'(v_k)^{-1}\|\,(1 + \sigma_k + \delta_k)\|H(v_{\ell(k)})\|. \tag{49}$$
Denoting $M = \left(\max_{v\in\Omega(0)}\|H'(v)^{-1}\|\right)(1 + \sigma_{max} + \delta_{max})\|H(v_0)\|$, (49) yields $\|\Delta v_k\| \le M$. □

For the proof of the following lemma we refer to [4]: taking into account Lemma 3.1 and (43), it is possible to use the same arguments.

Lemma 3.2 Let $\{v_k\}$ be generated by Algorithm 3.1, so that the settings at Steps 1 and 2 hold, and assume that $\{v_k\} \subset \Omega(\epsilon)$ with $\epsilon > 0$. Then $\tilde\alpha_k$ computed at Step 4 is bounded away from zero.

Now we prove the following convergence result.

Theorem 3.1 Under the assumptions A1–A3, Algorithm 3.1 with $tol = 0$ generates a sequence $\{v_k\}$ such that $\{\|H(v_k)\|\}$ converges to zero and each limit point of $\{v_k\}$ satisfies the KKT conditions for (32). Furthermore, if $v_*$ is a limit point of $\{v_k\}$ such that $H'(v_*)$ is nonsingular, then the sequence $\{v_k\}$ converges to $v_*$.

Proof. Denote $L = \lim_{k\to\infty}\|H(v_{\ell(k)})\|$. From Lemma 3.1 and Theorem 2.3 we obtain $\lim_{k\to\infty}\|H(v_k)\| = L$. Suppose now that $L > 0$. This implies that $\{v_k\} \subset \Omega(\epsilon)$ with $\epsilon > 0$ (at least for $k$ large). Consequently, from Lemma 3.2, $\tilde\alpha_k$ is bounded away from zero. If $v_*$ is a limit point of $\{v_k\}$, then $v_* \in \Omega(\epsilon)$ and $H'(v_*)$ is a nonsingular matrix. Then, from Theorem 2.2, we deduce that $L = 0$, a contradiction. Notice that we can use Theorem 2.2 even if the starting value of the backtracking procedure is $\tilde\alpha_k$ instead of 1, because $\tilde\alpha_k$ is bounded away from zero. Hence $\{\|H(v_k)\|\}$ converges to zero. So, if $v_*$ is a limit point of $\{v_k\}$ such that $H'(v_*)$ is nonsingular, then, using Theorem 2.1, the sequence $\{v_k\}$ converges to $v_*$. □

4 Numerical examples

In this section we report some numerical experiments, obtained by coding Algorithm 3.1 in FORTRAN90, using double precision, on a Compaq XP1000 workstation. In particular we set $\beta = 10^{-4}$, $\theta = 0.5$, $tol = 10^{-8}$. We declare a failure of the algorithm when the tolerance $tol$ is not satisfied after 500 iterations or when at some step more than 10 backtracking reductions are needed. Furthermore, we set $\mu_k = \frac{s_k^t w_k}{p}$ as in [6]. The aim of the numerical experiments is to compare the behaviour of the monotone and nonmonotone algorithms; for the nonmonotone one, the parameter $N$ has been chosen equal to 2, 4 and 9. The comparison has been performed in two different cases. In the first one, $\delta_k$ is set equal to 0: this means that the perturbed Newton equation (35) is solved exactly at each iteration; the solution of the linear system is computed by the MA27 subroutine of the Harwell library, which performs an LU factorization. In the second case, the Hestenes multipliers scheme has been adopted as iterative inner solver (see [1]), so an inexact solution of (35) is calculated.

Table 1: References and starting points

P    Reference                              (x0)_i direct   (x0)_i iterative   chi
P1   Example 5.5 in [8]                     1               1                  10^7 - 10^8
P2   Example 5.6 in [8]                     0.01            2                  10^6 - 10^7
P3   Example 5.7 in [8]                     0.5             1.5                10^6 - 10^7
P4   Example 5.8 in [8]                     3.995           2 ; 7              10^6 - 10^7
P5   Example 4 in [9]                                       0 ; 3              10^6 - 10^7
P6   Example 5 in [9]                                       0 ; 3              10^6 - 10^7
P7   Example 4.2, M = 1, K = 0.8 in [10]                    1.75 ; 5           10^7 - 10^8
P8   Example 4.2, M = 0, K = 1 in [10]                      3                  10^7 - 10^8

The nonlinear programming problems considered here arise from the discretization by finite differences of the elliptic control problems described in [8], [9] and [10]; the references are listed in Table 1. The third and fourth columns of Table 1 report the starting points for the two choices of inner solver, direct and iterative; when two values are listed on the same row, the first one is the value of the components of $x_0$ related to the state variables, while the second one is related to the control variables. Only the values of $x_0$ are reported; the other components of the vector $v_0$ have always been set equal to 1. The last column of Table 1 specifies the interval to which the parameter $\chi$ of [1] belongs.

In Table 2 the results of the monotone and nonmonotone algorithms with the direct inner solver are compared in terms of number of iterations (it.) and total number of backtracking reductions (b.). Each test problem has been executed three times, changing the mesh size: the values in the first column indicate the number of mesh points on the $x$ and $y$ axes. In this case, since an exact solver has been adopted, the nonmonotone scheme differs from the monotone one only in the backtracking rule. The results show that in some cases the two algorithms behave in a similar way. In more critical cases, in order to satisfy the monotone backtracking rule the damping parameter is reduced to a very small value, which yields the failure of the algorithm; the nonmonotone rule allows larger values of the damping parameter to be accepted, avoiding in many cases the stagnation of the iterates. In general, a reduction of the number of backtracking steps can be observed. Figure 1 illustrates the decrease of $\|H(v_k)\|$ for P1 with $n = 10593$.

The results in Table 3 have been obtained employing the iterative inner solver, so the number of inner iterations (inn.) is also reported. Now the monotone and nonmonotone schemes differ not only in the backtracking rule, but also in the stopping criterion of the inner solver. The last column lists the final values of the objective function (obj.). A general reduction of the number of inner iterations can be observed and, in many cases, the number of external iterations (ext.) and of the backtracking reductions (b.) is also reduced.

Figure 1: decrease of $\|H(v_k)\|$ for P1 with $n = 10593$, monotone (Mon.) versus nonmonotone (Nonm.) algorithm.

5 Conclusions

We proposed a variant of the Inexact Newton method in which the monotonicity requirements have been relaxed. For the modified scheme we devised conditions under which we proved convergence theorems. Then we applied the nonmonotone techniques to the Inexact Newton interior–point method, as a special case of the Inexact Newton method, and we proved the convergence of the whole scheme. As shown in Tables 2 and 3, the nonmonotone approach can reduce the number of backtracking steps and the number of inner iterations when an iterative solver is employed.


Table 2: Numerical results, direct inner solver (it. = iterations, b. = backtracking reductions; "-" denotes a failure)

                        Monotone      Nonmonotone
                        N = 0         N = 2         N = 4         N = 9
Grid   n        P       it.    b.     it.    b.     it.    b.     it.    b.
50     2793     P1      27     1      26     0      26     0      26     0
                P2      30     15     -      -      26     2      26     2
                P3      -      -      25     0      25     0      25     0
                P4      33     6      -      -      -      -      34     2
100    10593    P1      46     55     -      -      -      -      33     0
                P2      -      -      -      -      -      -      32     13
                P3      -      -      26     1      27     0      27     0
                P4      31     0      31     0      31     0      31     0
150    23393    P1      -      -      -      -      -      -      -      -
                P2      -      -      -      -      -      -      41     19
                P3      -      -      26     1      27     0      27     0
                P4      29     2      32     1      32     1      32     1

Table 3: Numerical results, iterative inner solver (ext. = external iterations, inn. = inner iterations, b. = backtracking reductions; "-" denotes a failure)

                      Monotone N = 0      N = 2               N = 4               N = 9
P    Grid   n         ext. inn.  b.       ext. inn.  b.       ext. inn.  b.       ext. inn.  b.       obj.
P1   50     2793      25   27    1        22   22    0        22   22    0        22   22    0        .5479649
     100    10593     35   36    1        29   29    1        29   29    1        29   29    1        .5522459
     200    41193     -    -     -        53   53    1        53   53    1        53   53    1        .5543686
P2   50     2793      23   24    0        23   23    0        23   23    0        22   23    0        .0140651
     100    10593     31   33    0        31   31    0        31   31    0        31   31    0        .0150786
     150    23393     -    -     -        39   39    0        39   39    0        39   39    0        .0154262
P3   50     2793      17   19    0        18   18    0        18   18    0        18   18    0        .2575581
     100    10593     24   26    0        23   23    0        23   23    0        23   23    0        .2638984
     200    41193     31   34    0        32   32    0        32   32    0        32   32    0        .2671221
P4   50     2793      18   19    0        18   18    0        18   18    0        18   18    0        .1539771
     100    10593     26   28    0        26   26    0        26   26    0        26   26    0        .1616639
     200    41193     39   41    0        37   37    0        37   37    0        37   37    0        .1657634
P5   50     4998      17   18    0        17   17    0        17   17    0        17   17    0        .0773888
     100    19998     17   18    0        17   17    0        17   17    0        17   17    0        .0780638
     200    79998     20   23    0        19   19    0        19   19    0        19   19    0        .0784259
P6   50     4998      29   30    0        29   29    0        29   29    0        29   29    0        .0521892
     100    19998     41   42    0        41   41    0        41   41    0        41   41    0        .0526638
     200    79998     69   72    0        59   59    0        59   59    0        59   59    0        .0529328
P7   50     4998      21   21    0        21   21    0        21   21    0        21   21    0        -6.4857811
     100    19998     27   28    0        27   27    0        27   27    0        27   27    0        -6.5764272
     200    79998     46   47    0        46   46    0        46   46    0        46   46    0        -6.6200922
P8   50     4998      28   28    0        28   28    0        28   28    0        28   28    0        -18.4825400
     100    19998     42   43    0        42   42    0        42   42    0        42   42    0        -18.7361482
     200    79998     51   96    0        51   66    0        51   51    0        51   51    0        -18.8633116

References

[1] S. Bonettini, E. Galligani and V. Ruggiero (2003). A Newton inexact interior–point method combined with Hestenes' multipliers scheme, Technical Report n. 334, Dipartimento di Matematica dell'Università di Ferrara.

[2] R. S. Dembo, S. C. Eisenstat and T. Steihaug (1982). Inexact Newton methods, SIAM Journal on Numerical Analysis, 19, 400–408.

[3] C. Durazzi (2000). On the Newton interior–point method for nonlinear programming problems, Journal of Optimization Theory and Applications, 104, 73–90.

[4] C. Durazzi and V. Ruggiero (2003). A Newton inexact interior–point method for large scale nonlinear optimization problems, Annali dell'Università di Ferrara, Sezione VII Scienze Matematiche, 49, 333–357.

[5] S. C. Eisenstat and H. F. Walker (1994). Globally convergent inexact Newton methods, SIAM Journal on Optimization, 4, 393–422.

[6] A. S. El-Bakry, R. A. Tapia, T. Tsuchiya and Y. Zhang (1996). On the formulation and theory of the Newton interior–point method for nonlinear programming, Journal of Optimization Theory and Applications, 89, 507–541.

[7] L. Grippo, F. Lampariello and S. Lucidi (1986). A nonmonotone line search technique for Newton's method, SIAM Journal on Numerical Analysis, 23, 707–716.

[8] H. D. Mittelmann and H. Maurer (1999). Optimization techniques for solving elliptic control problems with control and state constraints: Part 1. Boundary control, Computational Optimization and Applications, 16, 29–55.

[9] H. D. Mittelmann and H. Maurer (2001). Optimization techniques for solving elliptic control problems with control and state constraints: Part 2. Distributed control, Computational Optimization and Applications, 18, 141–160.

[10] H. D. Mittelmann and H. Maurer (2000). Solving elliptic control problems with interior point and SQP methods: control and state constraints, Journal of Computational and Applied Mathematics, 120, 175–195.

[11] J. M. Ortega and W. C. Rheinboldt (1970). Iterative Solution of Nonlinear Equations in Several Variables, Academic Press, New York.
