Numerical Methods for Optimization-based Model Validation

Numerical Methods for Optimization-based Model Validation Ekaterina A. Kostina Institute for Aplplied Mathematics (IAM) Interdisciplinary Center for ...
2 downloads 0 Views 2MB Size
Numerical Methods for Optimization-based Model Validation

Ekaterina A. Kostina Institute for Aplplied Mathematics (IAM) Interdisciplinary Center for Scientific Computing (IWR) Heidelberg Collaboratory for Industrial Optimization (HCO)

SAMSI Optimization Workshop August 29 - September 2, 2016, Research Triangle Park, USA

What Models We Are Looking For? Why Do We Need These Models?

What Models We Are Looking For?

These? ... No!

What Models We Are Looking For?

Maybe this? ... Also no!

Photo by Manfred Werner-Tsui

Barbara Meier, studied mathematics, the winner of the second cycle of Germany’s Next Topmodel

What Models We Are Looking For? ... these also no!

Möbius band

Klein bottle

But The Models That Allow Simulation From Merriam-Webster’s Dictionary

Why Do We Need Simulation Models? E.g. the concept of “digital twins” (GE, Siemens, ...) GE Report Oct 4, 2015

every new product or a physical economical process has a digital twin which includes a collection of models and algorithms; which accompanies the process from the “origin” (modeling) through its life time, “growing” together with the process; is used among others to analyze data, predict malfunctioning and perform optimal operation. “The Wall Street Journal”, Jan 15, 2016: GE Digital CEO Says “Digital Twins” Will Optimize Machinery, Human Health

Another example: development and admission of drugs Necessary precondition: validated models with reliable parameter estimates

Life Cycle of the Mathematical Model

Outline

1

Parameter Estimation

2

Sensitivity Analysis of Parameter Estimates

3

Design of Optimal Experiments

4

Applications

First Step in Model Validation: Parameter Estimation

Constrained Parameter Estimation Problem determine parameters p and states y to minimize deviation of the model response M[y; p] from measurements η in a suitable norm (e.g. weighted l2 , l1 , hybrid) min y,p

||η − M[y; p]|| ,

such that model equations F[y; p] = 0 and additional “experimental constraints” are satisfied C[y; p] ≥ 0

Mathematical Model Classes for Dynamical Processes F[y; p] = 0 Differential-Algebraic Equations (DAE) x: differential variables z: algebraic variables, y = (x, z)

B(x, z, p)˙x

=

f (x, z, p, q, u)

p: unknown parameters (to be estimated by PE)

0

=

g(x, z, p, q, u)

q: design parameters (to be chosen in OED) u: controls (to be chosen OED)

Partial-Differential Equations (PDE)

ut − ∇(K∇u) = f (u, p) → Semidiscretization in space or Rothe method initial and boundary conditions!

Observation Model and Data

ηi

=

M(ti , ytrue (ti ), ptrue ) + εi at time points ti , i

= 1, ..., m1

Data from multiple experiments under varying conditions

M: nonlinear function of states (differential and algebraic) εi : measurement errors independent in this talk: εi ∼ N (0, σi2 )

Numerical Methods for Parameter Estimation in ODE/DAE/PDE Bock et al 1981, ..., K. et al 1997, ...

Direct “all-at-once” multiple shooting method Constrained Gauss Newton method Structure exploitation (multiple experiments, parameterization in time, spatial discretization of PDE) Efficient line search strategies Use of inexact Jacobians Robust parameter estimation based on l1 or Huber

After (Multiple Shooting) Discretization: Large-Scale l2 Parameter Estimation Problem 1 min kF1 (x)k22 , x 2

s.t.

F2 (x) = 0

Optimization variables: xT := (pT , sT ) ∈ Rn , with dicretization variables s  η1 − M(t1 , y(t1 ; p, s), p)   .. F1 (x) := Σ−1   ∈ Rm1 . ηm1 − M(tm1 , y(tm1 ; p, s), p) 

Cost function:

  wi , i = 1, ..., m1 , wi ∈ {0, 1} Σ−1 = diag σ i

 Constraints:

F2 (x) :=

Jacobians:

J1 (x) :=

C(p, s)



Discretized BVP ∂F1 (x) ∂x

∈ Rm2

and J2 (x) :=

∂F2 (x) ∂x

Solution Approach: Generalized Gauss-Newton Solve iteratively, given initial guess x0 xk+1 = xk + [tk ]∆xk ∆xk solves linearized problem min

∆x∈Ωk

s.t.

∆xk

=

1 ||F1 (xk ) + J1 (xk )∆x||22 2 F2 (xk ) + J2 (xk )∆x = 0

−J + (xk )F(xk ) J + is a generalized inverse: J + JJ + = J +     J1 F1 J= , F= J2 F2

Parameter Estimation and Sensitivity Analysis Dynamical Process

Measurements with Errors

Mathematical Model

Parameter Estimation Problem

Estimates x∗

Qualitative Assessment of x∗

Quantitative Output

Qualitative Output?

Likelihood Ratio Confidence Regions

Dlr (α) = {x | F2 (x) = 0, kF1 (x)k22 − kF1 (x∗ )k22 ≤ γm2 (α)}

Beale 60, Draper and Smith 81, Pazman 93

γm2 (α) is (1 − α)-quantile of χ2 -distribution with m := n − m2 degrees of freedom

Advantage High approximation quality Disadvantage Very difficult to compute

Linearized Confidence Regions

Dlin (α) = {x | J2 (x∗ )(x − x∗ ) = 0, kJ1 (x∗ )(x − x∗ )k22 ≤ γm2 (α)}

Beale 60, Draper and Smith 81, Pazman 93

Advantage Can be computed very efficiently

Another View: Parameter Sensitivity w.r.t. Measurement Errors Introduce an error weight τ ∈ [0, 1]: min kF1 (x)k22

min kF1 (x, τ )k22

s.t. F2 (x) = 0

s.t. F2 (x) = 0

x

x

where  (M(t1 ) + τ 1 ) − M(t1 , y(t1 ; s), p)   .. F1 (x, τ ) = Σ−1   . (M(tm1 ) + τ m1 ) − M(tm1 , y(tm1 ; s), p) 

Measurements M ⇒ estimates x(M) resp. x(0) Measurements M + τ  ⇒ estimates x(M + τ ) resp. x(τ ) Question: what is the distribution of x(τ )?

Lemma Taylor series for the  solution  of the modified problem −1  + Σ • x(τ ) = x + τ J + O(τ 2 ) 0

Another View: Linearized Confidence Regions Dlin (α) = {x∗ + ∆x | ∆x = −J + (x∗ )

  η , kηk22 ≤ γm2 (α)} 0

Dlin (α) = {x | J2 (x∗ )(x − x∗ ) = 0, kJ1 (x∗ )(x − x∗ )k22 ≤ γm2 (α)} Linear approximation of the covariance matrix  C = E ∆x∆xT   Im1 0 + ∗ = J (x ) J +T (x∗ ) 0 0m2 Dlin (α) is enclosed by a (minimal) box n

GL (α) ⊂

×[x

∗ i

− θi , xi∗ + θi ],

i=1

p where θi = Cii γm2 (α) and Cii are the diagonal elements of the covariance matrix Bock 1987

Next Step in Model Validation: Design of Optimal Experiments

Design of Optimal Experiments

Experimental Setup 1

Experimental Setup 2



B 

B bad



B

good



B BN

  

Confidence Regions

Aim: choose optimal experimental conditions (design variables) to gain information about parameter estimates (to reduce parameter uncertainty) Design variables  ξ T := qT , wT ∈ Ξ q ∈ Rnq (discretized) control functions and control parameters w ∈ {0, 1}K sampling

Design of Optimal Experiments: Problem Formulation

min φ(ξ; x) ξ

s. t. ml ≤ ψ(t, x, ξ) ≤ mu , t ∈ T 0 = χ(t, x, ξ), t ∈ T w ∈ {0, 1}M

ξ T = (qT , wT ) are design variables x is an estimate of true parameter and (discretized) states values xtrue satisfying the underlying parameter estimation problem

Design of Optimal Experiments: Cost Functionals Typical cost functionals φ is a functional of the covariance matrix   Im1 0 C = J + (x) J +T (x) = Im1 0 0m2

0

  J1T J1 J2

J2T 0

−1 

Im1 0



A: φA (C) = 1n Trace(C) 1

D: φD (C) = det(K T CK) n mit K ∈ Rn×nk E: φE (C) = max{λ| λ is an eigenvalue of C}

OED problem is a complex non-standard nonlinear mixed-integer optimal control problem Cost function already involves 1st order derivatives of PE problem and is implicitly defined by a generalized inverse Integer decisions: sampling design

Optimal Experimental Design is a Complex Non-Standard Nonlinear Mixed-Integer Optimal Control Problem Theory originally developed for linear models e.g. Kiefer, Wolfowitz 50’s, Box, Draper 1987, Pukelsheim 1993

Algorithmic developments: Lohmann 1990, Körkel, 1998, ..., Bauer, Körkel et al, 1998, ..., K. et al 2004, ..., Software VPLAN

structure exploiting, direct, all-at-once optimization algorithms for DE models, based on multiple shooting efficient evaluation of higher order derivatives by structure exploiting “internal” and “algorithmic” differentiation optimal integer controls by exact lower bounds and intelligent approximation strategies robustification of optimization against uncertainties experimental design for model discrimination

What Improvements Can We Expect From Optimal Experimental Designs?

Application: Transport and Degradation of Xenobiotics in Soil

Transport and Degradation of Xenobiotics in Soil Dieses, Bock, Schlöder, Richter 2002

investigation of fate of pesticides in soil needs expensive lysimeter experiments for registration replacement by computer experiments requires validated models! here: optimal mini-lysimeter experiments optimal irrigation optimal solute application

Typical Column Outflow Experiment Water transport: =⇒ 3 unknown parameters n, α, Ks

C(ψ)

K(ψ) = Ks

 ∂ψ ∂ ∂ = K(ψ) (ψ − z) ∂t ∂z ∂z

(1 − (α|ψ|)n−1 (1 + (α|ψ|n )1/n−1 )2 (1 + (α|ψ|)n )

1−1/n 2

C(ψ) = α(n − 1)(θs − θr )(α|ψ|)n−1 (1 + (α|ψ|)n )1/n−2 z≥0

Initial condition: ψ(0, z) = 670, Upper boundary condition: q(t, 0) = −K(ψ(t, 0)) Lower boundary condition:

 ∂ψ(t, 0) −1 ∂z

control: water flux

∂ψ(t,Le ) ∂z

= 0,

t≥0

Typical Column Outflow Experiment Solute transport: =⇒ 3 unknown parameters k, b, Dm  ∂ θDh (θ) ∂c ∂(qc) ∂(θc) ∂z = − − kc ∂t ∂z ∂z 0.0046ebθ + Dm θ

Dh (θ) =

Initial condition:

c(0, z) = 0,

z≥0

Upper boundary condition: −Dh (θ(t, 0))

∂c(t, 0) q(t, 0) q(t, 0) + c(t, 0) = c0 (t, 0) control: solute input ∂z θ(t, 0) θ(t, 0)

Lower boundary condition:

∂c(t,Le ) ∂z

= 0,

t≥0

Typical Column Outflow Experiment

=⇒ Coupled water and solute transport: 6 unknown parameters

Optimization of Experimental Conditions 2 boundary controls - piecewise constant water flux q(t, 0) changeble only every 24 h 0 ≤ q(t, 0) ≤ 0.6 solute input c0 (t, 0) changeble only every 24 h 50 ≤ c0 (t, 0) ≤ 200 sampling scheme: fixed measurement times 24 outflow measurements - 12 days, every 12h (σ = 0.01) 6 profile measurements (σ = 0.1) only at end of experiment evaluation of solute concentrations at 6 locations

Parameterized Expert Designs Input water flux q(t, 0)

Substance input concentration c0 (t, 0)

Quality of Expert Designs: Criterion Values and Standard Deviations

n α Ks k b Dm

Expert Design A

Expert Design B

A-Criterion = 3.8859

A-Criterion = 0.75

scaled 1.0 1.0 1.0 1.0 1.0 1.0

± ± ± ± ± ±

standard deviation 0.1604 1.5968 3.3752 0.0124 3.0460 0.2644

n α Ks k b Dm

scaled 1.0 1.0 1.0 1.0 1.0 1.0

± ± ± ± ± ±

standard deviation 0.0800 0.8600 1.6736 0.0083 0.9712 0.1033

Optimal Experimental Conditions Input water flux q(t, 0)

Substance input concentration c0 (t, 0)

Comparison: Optimal Design vs Expert Designs A and B

A-Criterion n α Ks k b Dm

1.0 1.0 1.0 1.0 1.0 1.0

Optimal design

Expert design A

Expert design B

0.0472

3.8859

0.750

± 0.0191 ± 0.2438 ± 0.4348 ± 0.0086 ± 0.1852 ± 0.0108

± 0.1604 ± 1.5968 ± 3.3751 ± 0.0124 ± 3.0460 ± 0.2644

± 0.0800 ± 0.8600 ± 1.6735 ± 0.0083 ± 0.9712 ± 0.1032

All parameters can be simultaneously estimated!

Ongoing Developments Rich and well advanced theory of optimal experimental design for linear models Very high relevance in industrial applications BASF: “We save up to 80% costs with these methods and get much more precise results!”

“Bildzeitung”, 17.06.2008

Actual Developments

Rich and well advanced theory of optimal experimental design for linear models Very high relevance in industrial applications BASF: “We save up to 80% costs with these methods and get much more precise results!” Efficient numerical methods for robustification of OED against uncertainties in parameter estimates parabolic PDE models on-line design of optimal experiments

Robustification of OED Based on Second Order Sensitivity Analysis

Robustification of OED Based on Second Order Sensitivity Analysis Consider again parametric PE problem depending on an error weight τ ∈ [0, 1]: min kF1 (x)k22

min kF1 (x, τ )k22

s.t. F2 (x) = 0

s.t. F2 (x) = 0

x

x

where 

 (M(t1 ) + τ 1 ) − M(t1 , y(t1 ; s), p)   ..  F1 (x, τ ) = Σ−1    . (M(tm1 ) + τ m1 ) − M(tm1 , y(tm1 ; s), p)

Lemma Taylor series for the  solution  of the modified problem −1  + Σ • x(τ ) = x + τ J + O(τ 2 ) 0   −1    Σ−1  Σ  • x(τ ) = x + τ J + + τ 2 dJ + (I − JJ + ) − 12 J + (dJ)J + + O(τ 3 ) 0 0

Properties of Taylor Series Consider Taylor expansion at the parameter estimates x = x∗ + ∆x + 12 ∆x with      η η ∆x = −J + , ∆x = −2 dJ + (I − JJ + ) − 12 J + (dJ)J + , kηk22 ≤ γm2 (α) 0 0 Then: k∆xk2 ≤ θ,

 

1 ∗ ∗

∆x + 1 ∆x ≤ θ + κ ˜ (x ) + ω ˜ (x )θ θ

2 2 2

where q

Trace(C(x∗ ))γm2 (α),

 T ∗



+ ∗ ∂J(x∗ )



, κ

C ∂J1 (x ) (I ⊗ F1 (x∗ )) ˜ (x ) := ω ˜ (x∗ ) := J (x )



∂x ∂x θ :=

Design of Optimal Experiments: “Q” Cost Functional K., Nattermann 2015, 2016

New cost functional   ω ˜ (·) Trace(C(·)) Trace(C(·)), φQ (·) = Trace(C(·)) + κ ˜ (·) + 2

ω ˜ is a weighted Lipschitz constant of J (nonlinearity of the model functions) κ ˜ is a weighted Lipschitz constant of J + (compatibility of the data and the model, asymptotic convergence rate of Gauss-Newton) Numerical experiments show that Q-optimal designs are significantly insensitive to uncertainties in parameter estimates, the condition of the corresponding PE problem is improved



OED for PDE Models

Optimal Experimental Design for Parabolic PDE Models

Choose design controls ζ as solution of the optimal control problem tr(Cov[u, p, ξ]) + ckζk2L2 (Ω×[0,T])

min

ξ∈L2 (Ω×[0,T]),u∈W(0,T)

s.t.

∂t u + a(u, p, ξ)

= 0,

u(0)

= u0 ,

Cov

=

(Jred T Jred )−1 ,

+ additional constraints on controls

Optimal Experimental Design for Parabolic PDE Models: Numerical Framework K., Kriwet 2016

Parametrization of the underlying parameter estimation problem with multiple shooting approach Condensing technique in function space and efficient computation of the reduced ovariance matrix for parameters Formulation of the OED problem as a multi-stage optimal control problem Efficient computation of reduced gradients by an adjoint approach which requires np + 1 PDE solves Efficient computation of Hesse operator by Lagrange function and adjoint equations Positive definite approximation of Hesse operator which requires solution of decoupled np × np adjoint equations Software on the basis of DEAL-II

Application: Catalytic Plug Flow Reactor

A fluid flux through a packed bed of a porous catalyst, where an exothermic reaction takes place

Aim: to calibrate the mathematical model for the reactor

Catalytic Plug Flow Reactor

Mathematical Model Mass balance: 

∂(uf ci ) ∂ci ∂ 2 ci =− + Deff ,z 2 + Deff ,r ∂t ∂z ∂z



1 ∂ci ∂ 2 ci + ∂r2 r ∂r

 +

ρs,bed Ri ρf

Energy balance: ( ρf cp,f + (1 − )ρs,bed cp,s )

∂(uf ρf T) ∂T ∂2T = −cp,f + λeff ,z 2 + λeff ,r ∂t ∂z ∂z +ρs,bed

N X

∆HR,i Ri

i=1

Catalyst activity:    ∂acat Ea,des = cA k0,des + A0,des exp − ∂t Rg T



∂2T 1 ∂T + ∂r2 r ∂r



The Optimal Experiment Design Problem

4 parameters to estimate: activation energies and frequency factors of the main and deactivation reaction 16 measurements of the temperature in the middle of the reactor at 5 different time points Bondary control: The temperature of the gas at the inlet of the reactor Tent , and the concentration cent λeff ,z

∂T(t, 0, r) = −uf ρf cp,f (Tent (t, r) − T(t, 0, r)) ∂z c(t, 0, r) = cent (t, r)

z=0 z=0

Intuitive vs Optimal Design Parameter k1 k2 k1,deac k2,deac trace

True Values −1.76416 −3.37131 −5.60809 −5.62849 -

Estimates −1.79401 ± 0.949 −3.83068 ± 0.232 −4.98277 ± 8.117 −5.61164 ± 1.160 7.18573

Uncertainty 52.9% 6.1% 162.9% 20.6% -

Estimates Based on Intuitive Experimental Design

Parameter k1 k2 k1,deac k2,deac trace

True Values −1.76416 −3.37131 −5.60809 −5.62849 -

Estimates −1.77712 ± 0.628 −3.84067 ± 0.228 −5.57985 ± 4.914 −5.61718 ± 0.977 2.692312

Estimates Based on the Optimal Experiment

Uncertainty 35.4% 5.9% 88.1% 17.4% -

Refinements: Online Optimal Experimental Design Idea: new measurements gained in experiments can be used to improve the parameter estimates p∗ re-optimize optimum experimental design over remaining experiment min

ϕ(C(ξ, p∗ ))

ξ∈Ω×[t,tN ]

Challenges: real-time iteration and multi-level iteration for optimum experimental design

Parameter k1 k2 k1,deac k2,deac trace

True Values −1.76416 −3.37131 −5.60809 −5.62849 -

Estimates −1.77712 ± 0.628 −3.84067 ± 0.228 −5.57985 ± 4.914 −5.61718 ± 0.977 2.692312

Uncertainty 35.4% 5.9% 88.1% 17.4% -

Estimates Based on Offline Optimal Experimental Design

Parameter k1 k2 k1,deac k2,deac trace

True Values −1.76416 −3.37131 −5.60809 −5.62849 -

Estimates −1.79392 ± 0.492 −3.8335 ± 0.203 −5.42928 ± 2.653 −5.59868 ± 0.872 0.851882

Uncertainty 27.4% 5.3% 48.9% 15.6% -

Estimates Based on Online Optimal Experimental Design

One More Application: Select Optimal Catalyst Geometry by Heat Transfer Coefficient – High Impact on the Value Chain

Estimation of Catalyst Properties

Cooperation with

silver catalyst on aluminum supports of different geometries ultimate aim: maximize selectivity of catalytic production process, e.g., ethene to ethene oxide

important criterion: high heat transfer coefficient to avoid hot spots or explosion!

Experimental setup

Cooperation with

standard heat conduction test reactor

special features: non-stationary heating, temperature range 20-350◦ C temperature gradient over 200◦ C

Experimental setup

Cooperation with

2D nonstationary PDE τ

∂ ∂T  1 ∂ ∂T  ∂T ∂T = λz + rλr − Gcp ∂t ∂z ∂z r ∂r ∂r ∂z

≈ 1000 spatial grid points 2 controls: boundary temperature TW , mass current density G 4 parameters to estimate: λz , λr (heat conductivity) τ , cp (heat capacity)

measurements: outlet temperature

Simulation of Heat Conduction Cooperation with

Optimal Experimental Design

Cooperation with

four experiments intuitive expert design experimental time 16h

optimal design

with constant G = 0, 0.9, 1.8, 4 kg/m2 s non-stationary heating with ramp TW = 20 − 350◦ C

one experiment

experimental time 2.5h

with varying steps of G = 0, 0.49, 4 kg/m2 s

uncertainty is halved

stationary heating with TW = 350◦ C

Optimal Experimental Design

Cooperation with

Results: total heat transfer coefficients for 3 candidate shapes including error bars

high accuracy decision basis for reactor design market idea revised: shape 1 instead of shape 2 (original choice) Impact on value chain: catalyst investment over 1 Me!

Conclusions Numerical methods and software for parameter estimation and optimum experimental design Complex nonlinear problems modelled by ODE/DAE/PDE can be treated Optimal experimental design leads to acceleration of modeling process, allows to reduce e.g. development times for new products, prevents from making wrong decisions based on not calibrated models Challenges: High potential, but practically no theory for nonlinear dynamical processes available Also more efficient numerical methods needed robustification of OED against uncertainties more general problem classes, e.g., hybrid systems, PDE models incorporation of OED in real time state and parameter estimation and feedback control

New application areas

THANK YOU!