Knowledge-Based Risk Assessment and Cost Estimation

Automated Software Engineering, 2, 219-230 (1995) © 1995 Kluwer Academic Publishers, Boston. Manufactured in The Netherlands.

RAYMOND J. MADACHY
USC Center for Software Engineering, University of Southern California, Los Angeles, CA 90089-0781; and Software Engineering Process Group, Litton Data Systems, Agoura Hills, CA 91376-6008

Abstract. A knowledge-based method for software project risk assessment and cost estimation has been implemented on multiple platforms. As an extension to the Constructive Cost Model (COCOMO), it aids in project planning by identifying, categorizing, quantifying and prioritizing project risks. It also detects cost estimate input anomalies and provides risk control advice in addition to conventional COCOMO cost and schedule calculation. The method has been developed in conjunction with a system dynamics model of the software development process, and serves as an intelligent front end to the simulation model. It extends previous research in the knowledge-based cost estimation domain by focusing on risk assessment, incorporating substantially more rules, going beyond standard COCOMO, performing quantitative validation, providing a user-friendly interface, and integrating it with a dynamic simulation model. Results of the validation are promising, and the method is being used at Litton Data Systems and other industrial environments. It will be undergoing further enhancement as part of an integrated capability for software engineering to assist in system acquisition, project planning and risk management.

Keywords: software cost estimation, software risk management, knowledge-based software engineering, COCOMO.

Introduction

The objective of software risk management is to identify, address and eliminate risk items before undesirable outcomes occur. It is often very difficult to implement because of the scarcity of seasoned experts and the unique characteristics of individual projects. However, the practice of risk management can be improved by leveraging existing knowledge and expertise. In particular, expert knowledge can be employed during cost estimation activities by using cost factors for risk identification and assessment to detect patterns of project risk.

During cost estimation, consistency constraints and cost model assumptions may be violated, or an estimator may overlook project planning discrepancies and fail to recognize risks. Approaches for identifying risks are usually separate from cost estimation, so a technique that identifies risk in conjunction with cost estimation is an improvement.

COCOMO is a widely used cost model that incorporates the use of cost drivers to adjust effort calculations. As significant project factors, cost drivers can be used for risk assessment through sensitivity analysis or Monte-Carlo simulation, but the approach described here instead uses them to infer specific risk situations. At project inception, for example, a manager who is inexperienced or lacks sufficient time for a thorough analysis may have a vague idea that the project is risky, but will not know exactly which risks to mitigate and how. With automated assistance, the risks identified from the cost inputs are used to create mitigation plans based on the relative risk severities and the advice provided.

The method described herein is encapsulated in a tool called Expert COCOMO. In conjunction with a dynamic model of an inspection-based software lifecycle process to support quantitative evaluation of the process, these modeling techniques can support project planning and management, and aid in process improvement.

The remainder of this paper provides background and related work, the methodology used during development, implementation details including an example session, and conclusions. The background and details of the simulation aspect of this work can be found in (Madachy, 1994).

Background

This research has drawn upon the related software engineering disciplines of knowledge-based methods, risk management and cost estimation. The sections below provide a brief background on the major areas relevant to this work.

Recent research in knowledge-based assistance for software engineering aims at supporting all lifecycle activities (Green et al., 1983), though much past work has focused on automating coding activities. Improvements have been made in transformational methods, but there has been much less progress towards accumulating knowledge bases for large-scale software engineering processes (Boehm, 1992). Despite the potential of capturing expertise to assist in project management functions such as cost estimation and risk management, few applications have specifically addressed such concerns.

Cost Estimation

Cost models are commonly used for project planning and estimation to predict both the person effort and elapsed time of a project. The most widely accepted and thoroughly documented software cost model is Boehm's COCOMO (Boehm, 1981). The model is incorporated in many of the estimation tools used in industry and research. The multi-level model provides formulas for estimating effort and schedule using cost driver ratings to adjust the estimated effort.

The COCOMO model estimates software effort as a nonlinear function of the product size and modifies it by a geometric product of effort multipliers associated with cost driver ratings. The cost driver variables include product attributes, computer attributes, personnel attributes and project attributes. The revised Ada COCOMO model adds some more cost drivers, including process attributes (Boehm-Royce, 1989). The COCOMO 2.0 project is currently underway to update the model for new development processes and products, and incorporates a revised set of cost drivers (Boehm et al., 1995).

Mitre developed an Expert System Cost Model (ESCOMO) employing an expert system shell on a PC (Day, 1987). It used 46 rules involving COCOMO cost drivers and other model inputs to focus on input anomalies and consistency checks, with no quantitative risk assessment.
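
To make the shape of the COCOMO calculation concrete, the following Python sketch computes effort as a nonlinear function of size scaled by the product of effort multipliers. The coefficient table holds the nominal Intermediate COCOMO values as commonly tabulated from (Boehm, 1981); it is quoted here as an assumption rather than taken from this paper, and the function name `estimate` and the example ratings are hypothetical.

```python
from math import prod

# Nominal Intermediate COCOMO coefficients per development mode, as commonly
# tabulated from Boehm (1981): effort = a * KDSI**b * EAF and
# schedule = c * effort**d. Quoted as an assumption, not from this paper.
MODE_COEFFICIENTS = {
    "organic": (3.2, 1.05, 2.5, 0.38),
    "semidetached": (3.0, 1.12, 2.5, 0.35),
    "embedded": (2.8, 1.20, 2.5, 0.32),
}


def estimate(kdsi, mode, effort_multipliers):
    """Return (effort in person-months, schedule in months)."""
    a, b, c, d = MODE_COEFFICIENTS[mode]
    # Effort adjustment factor: geometric product of the rated multipliers.
    eaf = prod(effort_multipliers.values())
    effort = a * kdsi ** b * eaf
    schedule = c * effort ** d
    return effort, schedule


if __name__ == "__main__":
    # Illustrative ratings: very high complexity and low analyst capability.
    multipliers = {"CPLX": 1.30, "ACAP": 1.19}
    effort, schedule = estimate(50, "embedded", multipliers)
    print(f"effort = {effort:.0f} person-months, schedule = {schedule:.1f} months")
```

Drivers rated nominal have a multiplier of 1.0 and can simply be left out of the dictionary, which is why an empty dictionary yields the unadjusted estimate.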

Risk Management

Risk is the possibility of an undesirable outcome, or a loss. Risk impact, or risk exposure, is defined as the probability of loss multiplied by the cost of the loss. Risk management is a new discipline whose objectives are to identify, address and eliminate software risk items before they become either threats to successful software operation or major sources of software rework (Boehm, 1989). Many documented software development failures highlight the need for risk management practice (Charette, 1989).

Examples of risk in software development include exceeding budget, schedule overrun, or delivering an unsuitable product. These illustrate the classic risk taxonomy of cost, schedule and performance (technical). Boehm identifies the top 10 generic software risk items in (Boehm, 1989). In practice, risks must be identified as specific instances to be manageable.

Software risk management involves a combination of methods used to assess and control risk, and is an ongoing activity throughout a development project. Common techniques include performance models, cost models, checklists, network analysis, decision analysis, quality factor analysis and others (Boehm, 1989; Rook, 1993). Management of risk involves both risk assessment and risk control (Boehm, 1989; Charette, 1989). The substeps in risk assessment are risk identification, risk analysis (evaluating the magnitudes of loss probability and consequence) and risk prioritization, whereas risk control entails risk management planning, risk resolution and risk monitoring.

Risk management is heavily allied with cost estimation (Boehm, 1989; Charette, 1989; Rook, 1993). Cost estimates are used to evaluate risk and perform risk tradeoffs, risk methods such as Monte-Carlo simulation can be applied to cost models, and the likelihood of meeting cost estimates depends on risk management. Risk management attempts to balance the triad of cost, schedule and functionality (Charette, 1989; Boehm, 1989). Though cost, schedule and product risks are interrelated, they can also be analyzed independently. Methods used to quantify cost, schedule and performance risk include table methods, analytical methods, knowledge-based techniques, questionnaire-based methods and others.

A risk identification scheme has been developed by the Software Engineering Institute (SEI) that is based on a risk taxonomy (Carr et al., 1993). A hierarchical questionnaire is used by trained assessors to interview project personnel. The risk classes are product engineering, development environment and program constraints.

Knowledge-based methods can be used to assess risk and provide advice for risk mitigation. Incorporation of expert system rules can place considerable added knowledge at the disposal of the software project planner or manager to help avoid high-risk development situations and cost overruns. Toth has developed a knowledge-based software technology risk advisor (STRA) (Toth, 1994), which provides assistance in identifying and managing software technology risks. Whereas Expert COCOMO uses knowledge of risk situations based on cost factors to identify and quantify risks, STRA uses a knowledge base of software product and process needs, satisfying capabilities and maturity factors. Risk areas are inferred by evaluating disparities between needs and capabilities. STRA focuses on technical product risk while Expert COCOMO focuses on cost and schedule risk.

A knowledge-based project management tool has also been developed to assist in choosing the software development process model that best fits the needs of a given project (Sabo, 1993). It also performs remedial risk management tasks by alerting the developer to potential conflicts in the project metrics. This work was largely based on a process structuring decision table in (Boehm, 1989), and operates on knowledge of the growth envelope of the project, understanding of the requirements, robustness, available technology, budget, schedule, haste, downstream requirements, size, nucleus type, phasing, and architecture understanding.
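
The definition of risk exposure given in this section (probability of loss multiplied by cost of loss) and the prioritization substep of risk assessment can be illustrated with a short Python sketch. The class `RiskItem`, the function `prioritize` and the sample figures are hypothetical and serve only to show the arithmetic.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RiskItem:
    name: str
    probability_of_loss: float  # between 0 and 1
    cost_of_loss: float         # any consistent unit, e.g. person-months

    @property
    def exposure(self) -> float:
        # Risk exposure = probability of loss * cost of the loss.
        return self.probability_of_loss * self.cost_of_loss


def prioritize(items):
    """Order risk items from highest to lowest exposure."""
    return sorted(items, key=lambda item: item.exposure, reverse=True)


if __name__ == "__main__":
    # Hypothetical risk items for illustration only.
    items = [
        RiskItem("schedule overrun", 0.5, 50.0),
        RiskItem("unsuitable product", 0.1, 100.0),
        RiskItem("budget overrun", 0.9, 10.0),
    ]
    for item in prioritize(items):
        print(f"{item.name}: exposure {item.exposure:.1f}")
```

A likely but cheap loss can rank below an unlikely but expensive one, which is why the two factors are multiplied rather than considered separately.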


Method

Knowledge engineering involved choosing appropriate abstractions for formulating heuristics, iterative elicitation of expert knowledge, representation of the knowledge for diagnosis, and testing of the expert system. Additionally, a risk quantification scheme was devised. Cost drivers in the COCOMO model were identified very early as a complete set of attributes for project risk diagnosis, and this approach builds on them.

Knowledge was acquired from written sources on cost estimation (Boehm, 1981; Boehm-Royce, 1989; Day, 1987), risk management (Boehm, 1989; Charette, 1989) and domain experts including Dr. Barry Boehm, Walker Royce and this author. A matrix of COCOMO cost drivers was used as a starting point for identifying risk situations as combinations of multiple cost attributes, and the risks were formulated into a set of rules. As such, the risk assessment scheme represents a heuristic decomposition of cost driver effects into constituent risk-escalating situations.

A risk situation can be described as a combination of extreme cost driver values indicating increased effort, whereas an input anomaly may be a violation of COCOMO consistency constraints, such as an invalid development mode given the size or certain cost driver ratings. Risk items are identified, quantified, prioritized and classified depending on the cost drivers involved and their ratings. A typical risk situation can be visualized in a 2-D plane as shown in Figure 1, where each axis is defined as a cost attribute rating range. The curves represent iso-risk contours, and the figure shows risk increasing towards the top right corner. An example is complex product development in conjunction with low analyst capability; risk increases in the direction of increasing product complexity (CPLX) and decreasing analyst capability (ACAP).

[Figure 1. Typical assignment of risk levels. Iso-risk contours over two cost attribute rating ranges (Attribute 1 from very low to extra high, Attribute 2 from very low to very high), discretized into a table whose cells hold the risk levels MODERATE, HIGH and VERY HIGH.]

As seen in the figure, the continuous representation is discretized into a table. A risk condition corresponds to an individual cell containing an identified risk level. The rules use cost driver ratings to index directly into these tables of risk levels. The tables constitute the knowledge base for risk situations defined as interactions of cost attributes.

After several iterations of the prototype, the experts were engaged again to help quantify the risks. A quantitative risk weighting scheme was developed that accounts for the nonlinearity of the assigned risk levels and cost multiplier data to compute overall risks for each category and the entire project according to

$$\text{project risk} = \sum_{j=1}^{\#\text{categories}} \; \sum_{i=1}^{\#\text{category risks}} \text{risk level}_{i,j} \times \text{effort multiplier product}_{i,j}$$

where

risk level = 1 (moderate), 2 (high), 4 (very high)

effort multiplier product = (driver #1 effort multiplier) * (driver #2 effort multiplier) * ... * (driver #n effort multiplier).

If the risk involves a schedule constraint (SCED), then

effort multiplier product = (SCED effort multiplier)/(relative schedule) * (driver #2 effort multiplier) * ... * (driver #n effort multiplier).

The risk level corresponds to the probability of the risk occurring, and the effort multiplier product represents the cost consequence of the risk. The product includes only those effort multipliers involved in the risk situation. When the risk involves a schedule constraint, the product is divided by the relative schedule to obtain the change in the average personnel level (the near-term cost), since the staffing profile is compressed into a shorter project time.

The risk assessment calculates general project risks, indicating a probability of not meeting cost, schedule or performance goals. The risk levels were normalized to provide meaningful relative risk indications. Sensitivity analysis was performed to determine the sensitivity of the quantified risks to varying inputs, and extreme conditions were tested. An initial scale with benchmarks for low, medium and high overall project risk was developed as follows: 0-15 low risk, 15-50 medium risk, 50-350 high risk.
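
The table lookup and weighting scheme described in this section can be sketched in Python as follows. The risk level values (1, 2, 4) and the schedule adjustment come from the text; the table cells, the function names and the multiplier values in the example are hypothetical, and the sketch omits the normalization step because its details are not given here.

```python
from math import prod

# Risk level values as stated in the text.
RISK_LEVEL = {"moderate": 1, "high": 2, "very high": 4}

# Hypothetical risk table for one rule, indexed by (SCED rating, CPLX rating).
# The cell contents are illustrative and are not the tool's actual table.
SCED_CPLX_TABLE = {
    ("very low", "high"): "moderate",
    ("very low", "very high"): "high",
    ("very low", "extra high"): "very high",
    ("low", "very high"): "moderate",
    ("low", "extra high"): "high",
}


def lookup_risk_level(table, rating_1, rating_2):
    """Return the risk level for a pair of ratings, or None for an empty cell."""
    return table.get((rating_1, rating_2))


def risk_weight(level, multipliers, relative_schedule=None):
    """Risk level times the product of the effort multipliers involved.

    For risks involving SCED, pass relative_schedule (e.g. 0.75 for a schedule
    compressed to 75% of nominal) so the product is divided by it.
    """
    product = prod(multipliers)
    if relative_schedule is not None:
        product /= relative_schedule
    return RISK_LEVEL[level] * product


def project_risk(weights_by_category):
    """Sum the risk weights over every category and every risk within it."""
    return sum(sum(weights) for weights in weights_by_category.values())


if __name__ == "__main__":
    level = lookup_risk_level(SCED_CPLX_TABLE, "very low", "extra high")
    # Illustrative multipliers for SCED and CPLX with a 75% relative schedule.
    weight = risk_weight(level, [1.23, 1.65], relative_schedule=0.75)
    print(level, round(weight, 2))
```

Keeping each rule as a plain table keyed by rating pairs mirrors the statement that the rules index directly into tables of risk levels, and it keeps the tables easy to edit for a specific environment.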

Implementation

A working prototype assistant called Expert COCOMO was developed that runs on a Macintosh using HyperCard. Earlier versions utilized an expert system shell, but the prototype was recoded to eliminate the need for a separate inference engine. The Litton Software Engineering Process Group (SEPG) has also ported the rule base to a Windows environment, and has incorporated the risk assessment technique into standard planning and management practices. The tool, Litton COCOMO, encapsulates cost equations calibrated to historical Litton data and was written in Visual Basic as a set of macros within Microsoft Excel.

The tools evaluate user inputs for risk situations or inconsistencies and perform calculations for the intermediate versions of standard COCOMO (Boehm, 1981) and Ada COCOMO (Boehm-Royce, 1989; Royce, 1990). They operate on project inputs, encoded knowledge about the cost drivers, associated project risks, cost model constraints and other information received from the user. The screen snapshots in subsequent figures are from the prototype, and Litton COCOMO looks very similar.

The graphical user interface provides an interactive form for project input using radio buttons, provides access to the knowledge base, and provides output in the form of dialog box warnings, risk summary tables and charts, calculated cost and schedule, and graphs of effort phase distributions. It provides multiple windowing, hypertext help utilities and several operational modes. The risk weight tables (seen in the bottom of Figure 1) are user-editable and can be dynamically changed for specific environments, resulting in different risk weights.

The following example is for a very risky project where many cost drivers are rated at their costliest values. Figure 2 is the input screen showing the rated attributes for the project. This data also constitutes the input for a cost estimate. In this example, the project has a tightly constrained schedule as well as stringent product attributes and less than ideal personnel attributes.

[Figure 2. Sample input screen. USC Expert COCOMO 1.7 input form with fields for project name, size (SLOC), schedule (months) and development mode (organic, semidetached, embedded), and rating buttons from very low to extra high for the cost drivers. Product attributes: RELY (required software reliability), DATA (data base size), CPLX (product complexity). Computer attributes: TIME (execution time constraint), STOR (main storage constraint), VIRT (virtual machine volatility), TURN (computer turnaround time). Personnel attributes: AEXP (applications experience), PCAP (programmer capability), VEXP (virtual machine experience), LEXP (programming language experience). Project attributes: MODP (use of modern programming practices), TOOL (use of software tools), SCED (required development schedule). Ada process attributes include RISK (risks eliminated by PDR) and RVOL (requirements volatility).]

With this input data, the expert system identifies specific risk situations and quantifies them per the aforementioned formulas. The individual risks are also ranked, and the different risk summaries are presented in a set of tables. The interface supports embedded hypertext links, so the user can click on a risk item in a list to traverse to a screen containing the associated risk table and related information.

An example risk output is seen in Figure 3, showing the overall project risk and risks for subcategories. The leading subcategories of risk are schedule, product and personnel. Other outputs include a prioritized list of risk situations, as seen in Figure 4, and a list of advice to help manage the risks. The highest risks in this example deal with schedule, and appropriate advice is provided to the user. Standard COCOMO cost and schedule estimates are also provided.

[Figure 3. Sample risk outputs. Risk summary screen for the example project showing the overall project risk, the risk for each subcategory (schedule, product, personnel, process, computer), and the weighted rules that contribute to each subcategory.]

Figure 4. Prioritized risks.

Rank  Rule       Warning
1     SCED_TIME  Tight schedule and high computer time constraint.
2     SCED_CPLX  Tight schedule and a highly complex system.
3     SCED_AEXP  Tight schedule with low applications experience.
4     SCED_MODP  Tight schedule and low use of modern programming practices.
5     SCED_VEXP  Tight schedule with low virtual machine experience.
6     RELY_MODP  High reliability with low use of modern programming practices.
7     SCED_RELY  Tight schedule and a highly reliable system.
8     TIME_ACAP  Execution time constraint and low analyst capability.
9     CPLX_ACAP  High complexity and low analyst capability.
10    SCED_ACAP  Tight schedule with low analyst capability.
11    TIME_PCAP  Execution time constraint and low programmer capability.
12    CPLX_PCAP  High complexity and low programmer capability.
13    SCED_PCAP  Tight schedule with low programmer capability.
14    SCED_TURN  Tight schedule with high turnaround time.
15    RELY_ACAP  High reliability with low analyst capability.
16    RELY_PCAP  High reliability with low programmer capability.
17    SCED_VIRT  Tight schedule with high virtual machine volatility.
18    SCED_TOOL  Tight schedule with low use of software tools.
19    STOR_ACAP  Storage constraint and low analyst capability.
20    STOR_PCAP  Storage constraint and low programmer capability.
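
The ranking and subcategory summaries in this example can be pictured with the following Python sketch. The driver-to-category mapping is an assumption read from the rule taxonomy figure, the weights are hypothetical, and counting a rule once in every category its drivers belong to is likewise an assumption about how the summary tables are totalled.

```python
from collections import defaultdict

# Assumed mapping of cost drivers to risk categories (read from the rule
# taxonomy figure; treat it as illustrative rather than authoritative).
DRIVER_CATEGORY = {
    "SCED": "schedule",
    "RELY": "product", "DATA": "product", "CPLX": "product",
    "ACAP": "personnel", "AEXP": "personnel", "LEXP": "personnel",
    "PCAP": "personnel", "VEXP": "personnel",
    "MODP": "process", "TOOL": "process",
    "TIME": "computer", "STOR": "computer", "VIRT": "computer", "TURN": "computer",
}


def rank_risks(weights):
    """Return (rule, weight) pairs ordered from highest to lowest weight."""
    return sorted(weights.items(), key=lambda pair: pair[1], reverse=True)


def category_totals(weights):
    """Add each rule's weight to every category touched by its drivers."""
    totals = defaultdict(float)
    for rule, weight in weights.items():
        categories = {DRIVER_CATEGORY[driver] for driver in rule.split("_")}
        for category in categories:
            totals[category] += weight
    return dict(totals)


if __name__ == "__main__":
    # Hypothetical weights for three of the rules named in the example.
    weights = {"SCED_TIME": 10.0, "SCED_CPLX": 8.0, "CPLX_ACAP": 5.0}
    for rank, (rule, weight) in enumerate(rank_risks(weights), start=1):
        print(rank, rule, weight)
    print(category_totals(weights))
```

Because rule names are built from the four-letter driver abbreviations, the categories of a rule can be derived from its name alone, which keeps the rule list and the summaries consistent.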

Rule base

Currently, 77 rules have been identified, of which 52 deal with project risk, 17 are input anomalies and 8 provide advice. There are over 320 risk conditions, or discrete combinations of input parameters, that are covered by the rule base. The knowledge is represented as a set of production rules for a forward chaining inference engine. The current rule base for project risks, anomalies and advice is shown in Table 1. The four-letter identifiers in the rule names are the standard abbreviations for the COCOMO cost drivers.

Figure 5 shows the rule taxonomy and corresponding risk taxonomy as previously described. For each risk category, the cost drivers involved for the particular risk type are shown in boldface. Note that most rules show up in more than one category. Figure 5 also shows the cost factors and rules for input anomalies and advice.

Table 1. Rule base.

Risk: sced_cplx, sced_rely, sced_time, sced_virt, sced_tool, sced_turn, sced_acap, sced_aexp, sced_pcap, sced_vexp, sced_lexp, sced_modp, rely_acap, modp_acap, modp_pcap, tool_acap, tool_pcap, tool_modp, rely_pcap, rely_modp, cplx_acap, cplx_pcap, time_acap, time_pcap, stor_acap, stor_pcap, virt_vexp, rvol_rely, rvol_acap, rvol_aexp, rvol_sced, rvol_cplx, rvol_stor, rvol_time, rvol_turn, size_pcap, cplx_tool, time_tool, cplx_acap_pcap, rely_acap_pcap, rely_data_sced, rely_stor_sced, cplx_time_sced, cplx_stor_sced, time_stor_sced, time_virt_sced, acap_risk, increment_drivers, sced_vexp_pcap, virt_sced_pcap, lexp_aexp_sced, ruse_aexp, ruse_lexp

Anomaly: mode_cplx, mode_rely, size_sced, size_mode, size_cplx, mode_virt, mode_time, mode_aexp, mode_vexp, size_pcap, size_acap, tool_modp, tool_modp_turn, pmex_pdrt_risk_rvol, increment_personnel, modp_pcap, sced, acap_anomaly

Advice: size_turn, rely_data, data_turn, time, stor, pcap_acap, data

[Figure 5. Rule taxonomy. Overall project risk is divided into five categories, each listing its cost drivers and the rules that involve them: schedule risk (SCED); product risk (RELY, DATA, SIZE, CPLX); personnel risk (ACAP, AEXP, LEXP, PCAP, VEXP); process risk (MODP, TOOL, RVOL, RUSE, INCREMENTS, RISK, PMEX, PDRT); and computer risk (TIME, STOR, VIRT, TURN). The figure also lists the cost factors and rules for input anomalies and advice.]

Validation

Testing and evaluation of the expert system has been done against the COCOMO project database and other industrial data. In one test, the quantified risks are correlated with actual cost and schedule project performance. Using the rule set on the COCOMO database shows a correlation coefficient of .74 between the calculated risk and actual realized cost in person-months/KDSI, as shown in Figure 6. Figure 7 shows risk by project number and grouped by project type for the COCOMO database. This depiction also appears reasonable and provides confidence in the method. For example, a control application is on average riskier than a business or support application.

[Figure 6. Correlation against actual cost. Scatter plot of actual cost in person-months/KDSI against calculated risk (risk axis from 0 to 120).]

Industrial data from Litton and other affiliates of the USC Center for Software Engineering is also being used for evaluation, where the calculated risks are compared to actual cost and schedule variances from estimates. Data is still being collected, and correlation will be performed against the actual cost and schedule variance from past projects. In another test, the risk taxonomy is being used as a basis for post-mortem assessments of completed projects.
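
The correlation test described in this section can be reproduced in outline with a plain Pearson coefficient. The following Python sketch assumes that the reported coefficient is the usual Pearson product-moment correlation, which the text does not state explicitly; the function name `pearson` and the sample data are hypothetical.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return covariance / sqrt(spread_x * spread_y)


if __name__ == "__main__":
    # Hypothetical data: calculated risk and realized cost per project.
    risk = [5.0, 20.0, 35.0, 60.0, 110.0]
    person_months_per_kdsi = [2.0, 4.5, 6.0, 14.0, 30.0]
    print(f"r = {pearson(risk, person_months_per_kdsi):.2f}")
```

The same function applies unchanged when the second sequence holds cost or schedule variances from estimates instead of realized cost.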

[Figure 7. Risk by COCOMO project number. Chart of calculated risk (0 to 120) by project number, grouped by project type: BUS, CTL, HMI, SCI, SYS, SUP.]

Software engineering practitioners have been evaluating the system and providing feedback and additional project data. Many of the USC Affiliate companies are testing the tool in-house. At Litton, nine evaluators consisting of the SEPG and other software managers have unanimously evaluated the risk output of the tool as reasonable for a given set of test cases, including past projects and sensitivity tests.

Conclusions and Future Work

The existing set of COCOMO cost drivers served well as a core set of abstractions to map onto decision drivers for project risk assessment. The completeness of the attribute set for the cost estimation domain was vital for generating a critical mass of rules from them. Common inputs between the expert system and cost model also ensured unambiguous mapping from project data to the rule set.

This work is another example of cost drivers providing a powerful mechanism to identify risks. Explication of risky attribute interactions helps illuminate underlying reasons for risk escalation as embodied in cost drivers, thus providing insight into software development risk. Analysis has shown that risk is highly correlated with the total effort multiplier product associated with the cost drivers, and the value of this approach is that it identifies specific risk situations that need attention.

More refined calibrations are still needed for meaningful risk scales. Consistency with other risk taxonomies and assessment schemes is also desired, such as the SEI risk assessment method (Carr et al., 1993) and the Software Technology Risk Advisor method of calculating the disparities between functional needs and capabilities (Toth, 1994).

At Litton, the knowledge base will be extended for specific product lines and environments to assist in consistent estimation and risk assessment. It is being implemented in the risk management process, and is being used as a basis for risk assessment of ongoing projects. Additional risk data from industrial projects will be collected and reported on.

The following additional features are also planned for the tool: incremental development support, additional and refined rules, cost risk analysis using Monte-Carlo simulation to assess the effects of uncertainty in the COCOMO inputs, and automatic sensitivity analysis to vary cost drivers for risk minimization. For Monte-Carlo simulation, users will be able to specify probabilistic distributions for cost factors.

The prototype will continue to be enhanced and rehosted if necessary to attain wider usage. Additional rules will be identified and incorporated to handle more cost driver interactions, cost model constraints, incremental development inheritance rules, rating of consistency violations and advice. Substantial additions are expected for process-related factors and advice to control the risks. The domain experts will continue to provide feedback and clarification.

The current risk model is well-suited for integration with a dynamic project model as demonstrated in (Madachy, 1994), since cost (drivers), schedule and risk are interrelated factors that interact in a dynamic fashion throughout a project lifecycle. By combining quantitative techniques with expert judgement in a common model, an intelligent simulation capability results to support planning and management functions for software development projects.

This work is also coordinated with other relevant research at the USC Center for Software Engineering. The evolving COCOMO 2.0 model has updated cost and scale drivers relative to the original COCOMO upon which the risk assessment heuristics are based. The risk and anomaly rule bases will be updated to correspond with the new set of cost factors and model definitions. A working hypothesis for COCOMO 2.0 is that risk assessment should be a feature of the cost model (Boehm et al., 1994). Towards this, graduate students at USC have incorporated the rule base into the next revision of the public domain USC COCOMO tool. The assessment scheme will also be incorporated into the WinWin spiral model prototype (Boehm et al., 1993) to support negotiation based on COCOMO parameters.
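
The planned Monte-Carlo cost risk analysis mentioned in this section can be pictured with a brief Python sketch. It is not the tool's implementation: triangular distributions, the function names and every numeric value are assumptions chosen only to show how uncertainty in cost driver multipliers propagates to an effort distribution.

```python
import random
from math import prod


def simulate_effort(kdsi, a, b, driver_ranges, trials=10_000, seed=0):
    """Sample effort = a * KDSI**b * EAF with uncertain effort multipliers.

    driver_ranges maps a cost driver to (low, most likely, high) multiplier
    values; each is sampled from a triangular distribution (an assumption).
    Returns the sampled efforts in ascending order.
    """
    rng = random.Random(seed)
    efforts = []
    for _ in range(trials):
        eaf = prod(
            rng.triangular(low, high, mode)
            for low, mode, high in driver_ranges.values()
        )
        efforts.append(a * kdsi ** b * eaf)
    return sorted(efforts)


def percentile(sorted_values, fraction):
    """Return the value at the given fraction of an ascending list."""
    index = min(int(fraction * len(sorted_values)), len(sorted_values) - 1)
    return sorted_values[index]


if __name__ == "__main__":
    # Hypothetical uncertainty about two cost driver ratings.
    ranges = {"CPLX": (1.00, 1.15, 1.30), "ACAP": (0.86, 1.00, 1.19)}
    runs = simulate_effort(20, 2.8, 1.20, ranges)
    print(f"median effort: {percentile(runs, 0.5):.0f} person-months")
    print(f"90th percentile: {percentile(runs, 0.9):.0f} person-months")
```

Reporting a high percentile alongside the median gives a planner a cost figure with a stated chance of being exceeded, which a single point estimate cannot provide.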

Acknowledgments

The author would like to thank Dr. Barry Boehm for his guidance and inspiration in this work in addition to serving as a domain expert, and Dr. Prasanta Bose for his insightful comments and suggestions on this paper. Thanks also to the Litton Data Systems division SEPG personnel and management for their support.

References

Boehm, B. 1981. Software Engineering Economics. Englewood Cliffs, NJ: Prentice-Hall.
Boehm, B. 1989. Software Risk Management. Washington, D.C.: IEEE-CS Press.
Boehm, B. 1992. Knowledge-based process assistance for large software projects, white paper in response to Rome Laboratories PRDS #92-08-PKRD, USC.
Boehm, B., Bose, P., Horowitz, E., Scacchi, W., et al. 1993. Next generation process models and their environment support. Proceedings of the USC Center for Software Engineering Convocation, USC.
Boehm, B., Clark, B., Horowitz, E., Westland, C., Madachy, R., and Selby, R. 1995. Cost models for future software life cycle processes: COCOMO 2.0, to appear in Annals of Software Engineering Special Volume on Software Process and Product Measurement, J.D. Arthur and S.M. Henry (Eds.), J.C. Baltzer AG, Science Publishers, Amsterdam, The Netherlands.
Boehm, B. and Royce, W. 1989. Ada COCOMO and the Ada process model. Proceedings, Fifth COCOMO Users' Group Meeting, SEI.
Boehm, B. and Bose, P. 1994. Critical success factors for knowledge-based software engineering applications. Proceedings of the Ninth Knowledge-Based Software Engineering Conference, Monterey, CA: IEEE Computer Society Press.
Carr, M., Konda, S., Monarch, I., Ulrich, F., and Walker, C. 1993. Taxonomy-Based Risk Identification. Technical Report CMU/SEI-93-TR-06, Software Engineering Institute.
Charette, R. 1989. Software Engineering Risk Analysis and Management. Intertext Publications/Multiscience Press and McGraw-Hill, New York, NY.
Conte, S., Dunsmore, H., and Shen, V. 1986. Software Engineering Metrics and Models. Menlo Park, CA: Benjamin/Cummings Publishing Co., Inc.
Day, V. 1987. Expert System Cost Model (ESCOMO) Prototype. Proceedings, Third Annual COCOMO Users' Group Meeting, SEI.
Green, C., Luckham, D., Balzer, R., Cheatham, T., and Rich, C. 1983. Report on a Knowledge-Based Software Assistant. Kestrel Institute, RADC#TR83-195, Rome Air Development Center, NY.
Madachy, R. 1994. A software project dynamics model for process cost, schedule and risk assessment. Ph.D. Dissertation, Department of Industrial and Systems Engineering, USC.
Rook, P. 1993. Cost estimation and risk management tutorial. Proceedings of the Eighth International Forum on COCOMO and Software Cost Modeling, SEI, Pittsburgh, PA.
Royce, W. 1990. TRW's Ada process model for incremental development of large software systems. TRW-TS-90-01, TRW, Redondo Beach, CA.
Sabo, J. 1993. Process model advisor. CSCI 577A class project, University of Southern California.
Toth, G. 1994. Software technology risk advisor. Proceedings of the Ninth Knowledge-Based Software Engineering Conference, Monterey, CA: IEEE Computer Society Press.