Empirical Studies of Inspection and Test Data at Ericsson
Amarjit Singh Marjara, Cap Gemini Norway AS; Reidar Conradi, NTNU; Børge Skåtevik, STC
Profes'99, 22-24 June 1999, Oulu, Finland
Agenda
■ Background
■ The inspection method
■ Data
■ Observations/questions
■ Results
■ Conclusions
■ Recommendations
Purpose of the studies
• H1: To investigate whether there is a correlation between defects found during inspection/test and module complexity.
• H2: To investigate whether there is a correlation between the number of defects found in field-use and the complexity and modification rate of a module.
• H3: To investigate whether there is a correlation between defect rates across phases and deliveries for individual documents/modules.
• Based on two diploma works at NTNU, Sept.-Dec. 1997 and Oct. 1998-Feb. 1999, carried out in cooperation with Ericsson AS, Norway.
Background 1(3)
■ Data are collected at Ericsson AS, AXE division, Oslo.
■ Every development document (design, code, ...) is inspected, using the Gilb method, an extension of Fagan's.
■ Data for many projects are analysed.
■ The analysed data originate from design, unit test, function test and system test.
■ Code is not inspected in this manner.
Background 2(3)
■ The paper is divided into two studies:
➤ Study one:
– Data from one project of 20,000 man-hours.
– It includes design, implementation, unit test and function test.
– The initial phases, such as prestudy and system study, are excluded from the 20,000 mh.
Background 3(3)
■ The second study:
➤ A study of 6 projects, about 100,000 man-hours.
➤ It includes design, implementation, unit test and function test.
➤ The initial phases, such as prestudy and system study, are excluded from the 100,000 mh.
The inspection method 1(2)
Entry Evaluation and Planning → Kickoff → Reading (individual) → Inspection Meeting → Causal Analysis → Discussion Meeting → Rework → Follow-up and Exit Evaluation
The inspection method 2(2)
• Provide special training for the moderators.
• Limit the inspection meeting to a maximum of two hours.
• Follow the recommended, "optimal" inspection rates for the actual document type.
• Do not cover too much complex material in a single review.
• Invite the most competent inspectors to the meeting.
• Avoid personal criticism.
• Postpone long discussions until the end of the meeting.
Inspection Data
■ Block - name of the block (module).
■ Document type - the type of document being inspected.
■ Hours saved/lost - for every inspection, an estimate of whether time has been saved or not.
■ Number of persons - reading and participating in the inspection meeting.
■ Planning - time spent on planning the inspection.
■ Kickoff meeting - time spent on introducing the document to the participants.
■ Reading - total time spent on individual preparation.
■ Inspection meeting - time spent in the meeting.
■ Rework, follow-up - time spent.
■ Defects found during reading - classified as Super major or Major.
■ Defects found in the inspection meeting, pages read, pages treated in the inspection meeting.
➤ Defect classification: Super major, Major (no more refined defect classification).
===> Summarised in an inspection survey stored in a database.
Collecting Data - testing
■ Unit test
➤ Number of defects found and time spent.
➤ These data are available per module (= unit).
■ Function test, system test, field-use
➤ Cause.
➤ Priority - indicates the seriousness of the defect.
➤ These data are available per module.
➤ The time spent on system test and field-use is not available (final integration in Stockholm).
Results Study 1 - How cost-effective are inspections?
[Table omitted: the results table from study 1 is garbled in the source and not recoverable.]
Results Study 1 - Cost of inspection and testing, defects per hour
[Table omitted: the per-activity cost table is garbled in the source and not recoverable.]
Results Study 1 - time usage for inspections
■ Time usage for inspections:
➤ Spent: 1474 hrs, whereof 1162.5 hrs on individual reading and the inspection meeting.
➤ Planned: 2723 hrs, according to internal Gilb guidelines.
➤ Saved: 8200 hrs - the estimated cost if the defects had not been detected by inspection, but detected and repaired in the later phases (not field-use).
➤ Inspections detect almost 65% of the registered defects, and unit test 27%. The remaining 7% is found in the later testing activities and 2% in field use!
➤ At Ericsson, the Gilb inspection process focuses on finding new defects in inspections, but only 2% of the defects are actually found in the inspection meeting itself. No data are stored to verify "false positives".
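These hour figures support a quick back-of-the-envelope check (a sketch in Python; the three totals are taken directly from this slide):

```python
# Inspection effort figures from study 1 (hours), as reported on this slide
spent = 1474.0    # inspection effort actually spent
planned = 2723.0  # effort planned per the internal Gilb guidelines
saved = 8200.0    # estimated later-phase rework avoided

net_saving = saved - spent        # hours gained, net of inspection cost
roi = saved / spent               # hours saved per inspection hour
spent_fraction = spent / planned  # fraction of the planned effort used

print(f"net saving: {net_saving:.0f} h, ROI: {roi:.1f}x, "
      f"spent vs. planned: {spent_fraction:.0%}")
```

So every inspection hour returned roughly 5-6 hours of avoided rework, even though only about half of the planned inspection effort was actually spent.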
Results Study 1
[Figure omitted: the chart and its caption are garbled in the source and not recoverable.]
Results Study 1 - preparation rate and defect detection rate
[Figure omitted.]
Results Study 1 - field-use defects versus defects found during inspections, per module
There does not seem to be a well-defined correlation between these two variables. The dashed line shows the intuitively expected values.
[Figure omitted: field-use defects plotted against inspection defects per module.]
Results Study 1 - defects found in inspection versus number of states in a module
Again, there does not seem to be a well-defined correlation between these two variables. Surprisingly, the number of defects detected in inspections seems to be rather constant if the topmost value along the y-axis is removed. Intuitively, it should be more difficult to inspect a document with a large number of states than one with a small number of states. The dashed line shows the intuitively expected values.
[Figure omitted: inspection defects plotted against number of states.]
Results Study 1 - number of defects in field-use versus states in a module
The number of states does seem to be correlated with the number of defects in field-use, as indicated by the dashed line. The number of system failures increases with an increasing number of states. Thus, the number of states represents the inherent complexity of a module.
[Figure omitted: field-use defects plotted against number of states.]
Results Study 1 - defects found in unit test versus states in a module
Surprisingly, the number of defects found in unit test seems to be independent of the number of states in a module. This is indicated in the figure, if the topmost value along the y-axis is not considered.
[Figure omitted: unit-test defects plotted against number of states.]
Regression Analysis Study 1 - Hypothesis 1
The number of states (Ns) is an important variable, because it correlates with the number of system failures (field-use defects, Ffd). The modification rate, Nmod, is included in the following regression equation:

Ffd = α + β·Ns + λ·Nmod

where α, β and λ are constants.
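Once the coefficients are estimated (the results slide later reports α ≈ -1.73, β ≈ 0.084, λ ≈ 0.097), the model is a plain linear predictor. A minimal Python sketch; the 30-state, rate-20 module in the example is hypothetical:

```python
def predict_field_defects(n_states, mod_rate,
                          alpha=-1.73, beta=0.084, lam=0.097):
    """Linear model Ffd = alpha + beta*Ns + lam*Nmod.

    Default coefficients are the study-1 estimates reported on the
    results slide; n_states and mod_rate are per-module inputs."""
    return alpha + beta * n_states + lam * mod_rate

# hypothetical module: 30 states, modification rate 20
print(round(predict_field_defects(30, 20), 2))
```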
Regression Analysis - study 1
■ H0: the fault density of a module in field-use depends on the complexity of the module (number of states) and its modification rate. High complexity and a high modification rate will thus result in high fault density in operation.
■ HA: the fault density of a module in field-use does not depend on the module's complexity and modification rate.
■ H0 is the null hypothesis, and HA is the alternative hypothesis.
H0 can only be accepted if β and λ are significantly different from zero and the significance level for each of the coefficients is better than 0.10.
Regression Analysis - study 1 äæå8çéèêEëaë^êEìCíaîðïñÝòëaóç}ôõòöçéçô^÷4íaøùò÷4çúæû Ð7Ñ©Ò× −1.73+0.084ΝÎ +0.097ΝÔEÕ Predictor
Coefficient
Constant (α)
StDev
t
p
-1.732
1.067
-1.62
0.166
States (β)
0.084
0.035
2.38
0.063
Modrate (λ)
0.097
0.034
2.89
0.034
üýEþdþÿ aþ þ4ÿ@ýþ:þ @þÿþ !þ þ! #"$ ®ý%©ý þ ÿ & ?þ ''Eÿ () þ þ>ÿ *+®ýE þ (þ '®þ ,-®!ÿ . /01©ý þ 2!) 'E þ ©ý3®!ÿ þ4ÿ & The analysis of variance is summarised below: Source Regression
DF 2
SS 28.68
MS 14.34 1.44
Error
5
7.20
Total
7
35.88
F 9.96
P 0.018
0ÿ@ý0 5þ36þ7®ý38@ýþ ÿ!@ÿ/30ÿ 90''7! ©ý þdÿ þ3 ÿ 0197 aþ4þ ÿ 9'!& H % þ (aþ 2!þ(& 4
0
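The ANOVA figures are internally consistent and can be cross-checked mechanically; a small Python sketch using the sums of squares from the table:

```python
# Sums of squares and degrees of freedom from the ANOVA table above
ss_regression, df_regression = 28.68, 2
ss_error, df_error = 7.20, 5

ms_regression = ss_regression / df_regression  # mean square, regression
ms_error = ss_error / df_error                 # mean square, error
f_statistic = ms_regression / ms_error         # matches the reported F
r_squared = ss_regression / (ss_regression + ss_error)  # variance explained

print(round(f_statistic, 2), round(r_squared, 2))
```

The recomputed F matches the table (9.96), and the model explains about 80% of the variance in field-use defect density across these modules.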
Results Study 2 - total defects found
■ Study 2:

Activity                 Defects [#]    [%]
Inspection preparation      4478        71.1
Inspection meeting           392         6.2
Desk check                   832        13.2
Emulator test                598         9.5
Total                       6300       100.0
Results Study 1 - are inspections performed at recommended rates?

Document   Number of   Total   Average      Actual   Planning   Recommended   Total     Defect density
type       documents   pages   doc length   time     constant   time          defects   per page
ADI             1          7     7.00          36       20          20           12       1.71
AI             29        241     8.31        1019       72        2088          197       0.82
BD             41       1038    25.32        1438       40        1640          468       0.45
BDFC           54       3376    62.52        3531      104        5616          802       0.24
COD             4         31     7.75         105       14          56           38       1.23
FD             33       1149    34.82        2432       38        1254          784       0.68
FDFC           19        897    47.21        1230       26         494          338       0.38
FF             14        366    26.14         868       20         280          363       0.99
FS             14        244    17.43         950       24         336          205       0.84
FTI             2        605   302.50         216       14          28           22       0.04
FTS             2        154    77.00         840       14          28           44       0.29
IP              3         65    21.67         257       15          45           73       1.12
OPI             5         61    12.20         130       20         100           14       0.23
POD             4         23     5.75         116       20          80           29       1.26
PRI            57        582    10.21        1651       96        5472          399       0.69
SD              4         59    14.75         300       18          72           47       0.80
SPL            27        141     5.22         417       80        2160           69       0.49
Total         313       9039       -        15536        -       19769         3904       0.43
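The derived columns of this table (recommended time = planning constant × number of documents; defect density = defects / pages) can be reproduced mechanically. A Python sketch over three sample rows:

```python
# Three sample rows from the table above:
# doc type -> (documents, pages, actual time, planning constant, defects)
rows = {
    "AI":  (29, 241, 1019, 72, 197),
    "BD":  (41, 1038, 1438, 40, 468),
    "PRI": (57, 582, 1651, 96, 399),
}

for doc, (n_docs, pages, actual, constant, defects) in rows.items():
    recommended = constant * n_docs  # recommended time = constant x documents
    density = defects / pages        # defect density per page
    print(f"{doc}: recommended {recommended} h, density {density:.2f}, "
          f"used {actual / recommended:.0%} of recommended time")
```

For most document types the actual time falls well below the recommended time, which is the point of this slide.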
Regression Analysis Study 2 - Hypothesis 2
Hypothesis 2 uses the data presented above and checks whether there exists a correlation between defects found during inspection/test and the complexity of a module. The regression equation used to state this hypothesis can be written as Y = αX + β, where Y is the defect density, X is the complexity, and α and β are constants. H0 can only be accepted if α and β are significantly different from zero and the significance level for each of the coefficients is better than 0.10.
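The coefficients of such a model come from ordinary least squares. A minimal pure-Python sketch; the four (x, y) pairs are made-up illustration data, not the study's data set:

```python
def fit_line(xs, ys):
    """Ordinary least squares fit of y = alpha*x + beta."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    alpha = sxy / sxx                # slope
    beta = mean_y - alpha * mean_x   # intercept
    return alpha, beta

# illustration only: complexity (x) vs. defect density (y)
alpha, beta = fit_line([10, 20, 30, 40], [12, 15, 17, 20])
print(round(alpha, 2), round(beta, 1))
```

The hypothesis test then asks whether the fitted slope and intercept differ significantly from zero, given their standard errors.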
Regression Analysis Study 2 - Hypothesis 2
■ The following values were estimated: Y = 0.1023·X + 13.595

Predictor   Estimate     Standard error   t      p
β           13.595002    18.52051         0.73   0.4729
α            0.1022985    0.093689        1.09   0.2901

This indicates that the linear regression line must be rejected if a significance level of 0.10 is assumed; H0 must therefore be rejected.
Regression Analysis Study 2 - Hypothesis 3
■ To check for correlation between defect densities across phases and deliveries, we have analysed the correlation between defect densities for modules over two projects.

                             Defect density - Project A   Defect density - Project B
Defect density - Project A          1.0000                       0.4672
Defect density - Project B          0.4672                       1.0000

With a correlation coefficient of 0.4672, we cannot conclude that there exists a correlation between the two data sets. We had only 6 modules with complete data for both projects for this test.
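The coefficient reported here is a plain Pearson correlation; a pure-Python sketch (the short series below are made-up illustration data, not the six modules from the study):

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# perfectly correlated series give r = 1.0
print(pearson([1, 2, 3], [2, 4, 6]))
```

With only 6 paired modules, even a moderate r such as 0.4672 is far from statistically significant, which is why no conclusion is drawn.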
Conclusions 1(2)
■ The data analysis indicates:
➤ Inspections proved to be the most cost-effective defect-detection process, function testing the least effective.
➤ Inspections find 70% of the recorded defects, cost 10% of the development time, and yield an estimated saving of 20%.
➤ 8% of the defects (study 1: 2%, study 2: 6%) are found during the final meeting, 92% during the individual reading.
➤ But inspection meetings are still more cost-effective than function test.
➤ Individual inspections and individual desk reviews are the most cost-effective techniques to detect defects.
➤ The recommended inspection rates are not followed; only 2/3 of the recommended time is spent.
Conclusions 2(2)
■ The defect density of a module in field-use depends on its complexity and modification rate. The literature indicates that designs with lower complexity lead to lower defect rates. However, is it possible to create designs with low complexity in the area of real-time telecom software? Maybe the solution is to pay extra attention when designing the most complex parts of such a system?
■ Finding new defects during meetings has been a focus at Ericsson, yet only 8% (study 2) of the defects found in inspections are found there.
■ The defect classification is too coarse.
Recommendations
Record the correct data properly, and later analyse the data to answer questions and test hypotheses such as:
• Is a module that is defect-prone during inspection/test also defect-prone during field use?
• If one type of defect dominates one project, will the subsequent projects have the same type of defect, or will these be eliminated?
• Does the process lead to finding the most serious defects during reading and meetings?
• The defects found during the inspections should be classified (categories from basic test can be applied, or ISO/IEEE recommendations can be followed) to find out which types of defects are found in inspections.
• One should consider omitting the inspection meetings for some document types, and maybe performing asynchronous inspection "meetings" (utilising web technology).