FITTING OF NONLINEAR STATISTICAL MODELS USING SAS

FITTING OF NONLINEAR STATISTICAL MODELS USING SAS Savita Wadhwa I.A.S.R.I., Library Avenue, Pusa, New Delhi – 110 012 [email protected] 1. The fol...
Author: Lindsey Allison
19 downloads 0 Views 229KB Size
FITTING OF NONLINEAR STATISTICAL MODELS USING SAS Savita Wadhwa I.A.S.R.I., Library Avenue, Pusa, New Delhi – 110 012 [email protected]

1.

The following data gives proportion of area under High yield varieties (HYVs) of wheat for the State of Punjab during the years 1966-67 to 1985-86: t: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 yt: 0.0365, 0.3542, 0.4851, 0.6549, 0.6520, 0.7259, 0.7837, 0.8430, 0.8839, 0.9003, 0.9213, 0.9370, 0.9627, 0.9826, 0.9804, 0.9883, 0.9964, 0.9942, 0.9987, 0.9994.

Fit the following Monomolecular Model using SAS9.2 software package for the above data using Levenberg - Marquardt’s method: yt = 1  (1  a) * exp(b * t ) +et 2.

Wheat productivity data (in quintals/hectare) of the country during the years 1973-74 to 1996-97 is as follows: t: 0 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 yt: 11.72, 13.38, 14.10, 13.87, 14.80, 15.68, 14.36, 16.30, 16.91, 18.16, 18.43, 18.70,20.46, 19.16, 20.02, 22.41, 21.21, 22.81, 23.97, 23.27, 23.80, 25.59, 24.93, 26.59.

Fit Logistic and Gompertz growth models to above data using ‘Levenberg-Marquardt’ nonlinear estimation procedure available in SAS 9.2 package. Carry out residual analysis and forecast the values for the year 2010 by each model. Logistic Model yt = c /(1  b * exp(a * t )) +et Gompertz Model yt = c * exp(b * exp(a * t )) +et

SAS Commands for Nonlinear Statistical Models Title 'Monomolecular Model'; data hyv; input t yt; cards; 0 0.0365 1 0.3542 2 0.4851 3 0.6549

Fitting of Nonlinear Statistical Models Using SAS 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19

0.6520 0.7259 0.7837 0.8430 0.8839 0.9003 0.9213 0.9370 0.9627 0.9826 0.9804 0.9883 0.9964 0.9942 0.9987 0.9994 ; proc nlin best = 10 method = marquardt data = hyv; parms a = 0 to 1 by .1 b = 0 to .1 by .01; model yt = 1-(1-a)*exp(-b*t); output out = resultmono p = pred r = resid; proc print data = resultmono; run;

proc sgplot data= resultmono; scatter x=t y=yt; series x=t y=pred/lineattrs=(color=blue pattern=1); run; quit ;

Title'Logistic Model'; Title1 'wheat yield 1973-74 to 1996-97(24 years data in q/ha)'; data wheat; input t yt; cards; 0 11.72 1 13.38 2 14.10 3 13.87 4 14.80 5 15.68 6 14.36 7 16.30 104

Fitting of Nonlinear Statistical Models Using SAS 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23

16.91 18.16 18.43 18.70 20.46 19.16 20.02 22.41 21.21 22.81 23.97 23.27 23.80 25.59 24.93 26.59

; proc nlin best=10 method=marquardt data=wheat; parms c=30 to 80 by 10 b=0 to 1 by .1 a=0 to .1 by .01; model yt=c/(1+b*exp(-a*t)); output out=reslogistic p=pred r=resid parms=c b a; proc print data=reslogistic; run; data forecast; set reslogistic (obs = 1 keep = c b a); forecast = c/(1+b*exp(-a*37)); run; proc print data = forecast; run;

To test the Normality of given data Proc univariate data=reslogistic normal; var resid; run; Title 'Gompertz Model'; proc nlin best=10 method=marquardt data=wheat; parms c=30 to 50 by 10 b=0 to .5 by .1 a=0 to .5 by .1; model yt=c*exp(-b*exp(-a*t)); output out=resgompertz p=pred r=resid; proc print data= resgompertz ; run;

105

Fitting of Nonlinear Statistical Models Using SAS

To test the Normality of given data Proc univariate data=reslogistic normal; var resid; run;

To test the Randomness of given data /*The MEAN=0 option in the PROC STANDARD step below centers the variable resid about its mean. */ proc standard data=reslogistic out=two mean=0; var resid; run; /*The following DATA step computes the total number of runs (RUNS), the number of positive runs (NUMPOS), and the number of negative runs (NUMNEG). */ data runcount; keep runs numpos numneg n; set two nobs=nobs end=last; retain runs 1 numpos 0; prevpos=( lag(resid) GE 0 ); currpos=( resid GE 0 ); if _n_=1 and resid GE 0 then numpos+1; else do; if currpos and prevpos then numpos+1; else if currpos and ^prevpos then do; runs+1; numpos+1; end; else if ^currpos and prevpos then runs+1; end; if last then do; numneg=nobs-numpos; 106

Fitting of Nonlinear Statistical Models Using SAS n=nobs; output; end; run; /*Finally, these steps compute and display the Wald-Wolfowitz (or Runs) test statistic and its pvalue. */ data waldwolf; label z='Wald-Wolfowitz Z' pvalue='Pr > |Z|'; set runcount; mu = ( (2*numpos*numneg) / (numpos+numneg) ) + 1; sigmasq = ( (2*numpos*numneg) * (2*numpos*numneg-numneg-numpos) ) / ( ( (numpos+numneg)**2 ) * (numpos+numneg-1) ); sigma=sqrt(sigmasq); drop sigmasq; if N GE 50 then Z = (Runs - mu) / sigma; else if Runs-mu LT 0 then Z = (Runs-mu+0.5)/sigma; else Z = (Runs-mu-0.5)/sigma; pvalue=2*(1-probnorm(abs(Z))); run; title 'Wald-Wolfowitz Test for Randomness'; title2 'H0: The data are random'; proc print data=waldwolf label noobs; var z pvalue; format pvalue pvalue.; run; Some more steps on fitting of non-linear models are also available at http://iasri.res.in/design/Analysis%20of%20data/nonlinear_models.html

107

Fitting of Nonlinear Statistical Models Using SAS

RESULTS Practical 1: Monomolecular Model Parameter Estimates: Parameter Estimate Standard Error a .090 .023 b .264 .011

95% Confidence Interval Lower Bound Upper Bound .042 .139 .241 .286

Correlations between Parameter Estimates a b a 1.000 -.610 b -.610 1.000 M.S.E. = 0.001

Graph of observed and fitted series of 'Monomolecular Model'

108

Fitting of Nonlinear Statistical Models Using SAS

Practical 2: Logistic Model Parameter Estimates Parameter Estimate Standard Error a .062 b 2.260 c 40.391 M.S.E.=0.48

.012 .574 7.849

95% Confidence Interval Lower Bound .037 1.065 24.069

Upper Bound .086 3.454 56.713

Run Test Wald-Wolfowitz Z Pr > |Z| 1.044

0.297

Tests for Normality Kolmogrov-Smirnov Shapiro-Wilk Statistic Sig. Statistic df Sig. df Residuals 0.123 24 0.200 0.963 24 0.511 Forecast value for year 2010 = 32.86 (quintals/hectare) Practical 2: Gompertz Model Parameter Estimates Parameter Estimate Standard Error

95% Confidence Interval Lower Bound

a b c M.S.E. = 0.479

.030 1.527 56.820

.011 .385 23.023

0 .006 0.727 8.942

Upper Bound .053 2.327 104.699

Run Test Wald-Wolfowitz Z Pr > |Z| 1.044

0.297

Tests for Normality Kolmogrov-Smirnov Shapiro-Wilk Statistic df Sig. Statistic df Sig. Residuals 0.109 24 0.200 0.968 24 0.627 Forecast value for year 2010= 34.04 (quintals/hectare) 109

Fitting of Nonlinear Statistical Models Using SAS

1

e

μ1

β1

y

FITTING OF NONLINEAR STATISTICAL MODEL(NLMIXED) USING SAS 3. The data below arose from the five orange trees grown at Riverside, California, during the period 1969-1973. The response y in the body of the table is the trunk circumference in millimeters, and the predictor variable t is the time in weeks, with an arbitrary origin taken on December 31, 1968. Fit the logistic model  to this data.    x  2e 3 Response variable (y) Week 1 2 3 4 5 16.86 30 33 30 32 30 69.14 58 69 51 62 49 94.86 87 111 75 112 81 143.43 115 156 108 167 125 175.86 120 172 115 179 142 196 142 203 139 209 174 226 145 203 140 214 177 1 estimates the upper horizontal asymptote, 1/(1+2) estimates the y-intercept, and 3 indicates the shape of the curve (the biggest the value, the steeper the curve).

SAS COMMANDS data trees; input tree week y; datalines; 1 16.86 30 1 69.14 58 1 94.86 87 1 143.43 115 1 175.86 120 1 196 142 1 226 145 2 16.86 33 2 69.14 69 2 94.86 111 2 143.43 156 2 175.86 172 2 196 203 2 226 203 3 16.86 30 3 69.14 51 3 94.86 75 3 143.43 108 3 175.86 115 3 196 139 3 226 140 110

Fitting of Nonlinear Statistical Models Using SAS 4 16.86 32 4 69.14 62 4 94.86 112 4 143.43 167 4 175.86 179 4 196 209 4 226 214 5 16.86 30 5 69.14 49 5 94.86 81 5 143.43 125 5 175.86 142 5 196 174 5 226 177 ; /* Graph of the data to see the change in circumferences over time and to get the idea of initial values for the parameters */ proc sgplot data=trees; series y=y x=week / group=tree; Title ‘Orange Trees Growth Curve’; run; title;

/* To group the data by trees */ proc sort data =trees; by tree; run; /* Fitting of the nonlinear mixed model (Logistic growth model) */ proc nlmixed data=trees; parms b1=180 b2=8 b3=0.6 s2u=10 s2e=5; num=b1+u; den=1+b2*exp(-b3*week); pred=num/den; model y ~ normal( pred, s2e); random u ~ normal( 0,s2u) subject=tree; predict pred out=preddata; run; /* Graph of the predicted curve along with data */ proc sgplot data=preddata; scatter y=y x=week / group=tree; series y=pred x=week /group=tree name="fit"; keylegend "fit" / title= 'Tree'; title ‘ Predicated Orange Tree Growth Curve’; 111

Fitting of Nonlinear Statistical Models Using SAS run; title; RESULTS:

Parameter Estimates

Parameter Estimate Standard Error b1 192.05 15.6584 b2 8.0950 0.8566 b3 0.02011 0.001565 s2u 1001.61 649.62 s2e 61.5069 15.8813

DF t Value 4 12.27 4 9.45 4 12.85 4 1.54 4 3.87

Pr > |t|

Alpha Lower

Upper

Gradient

0.0003 0.0007 0.0002 0.1980 0.0179

0.05 0.05 0.05 0.05 0.05

235.53 10.4733 0.02445 2805.23 105.60

1.507E-6 0.000082 -0.10576 2.568E-7 4.924E-6

148.58 5.7167 0.01577 -802.02 17.4132

b1 b2 b3 (Betas) are significant. The variance of upper horizontal asymptote among trees is 1001.61, shows that there is a large amount of variation among the trees, suggesting a fairly large amount of variation among the trees regarding the ultimate circumference a tree approaches.

112

Suggest Documents