Improving Requirements Testing with Defect Taxonomies
Michael Felderer
Research Group Quality Engineering, Institute of Computer Science, University of Innsbruck, Austria
The 25th CREST Open Workshop, London, February 11, 2013
Motivation
• Defect taxonomies (DT) provide information about the distribution of failures in projects and are valuable for learning about the errors being made
• In practice, DT are only used for a-posteriori allocation of testing resources, i.e. to prioritize failures for debugging purposes or for prediction
• But DT have the potential to control and improve the overall system test process:
  • Design of requirements-based tests
  • Prioritization of test cases
  • Tracing of requirements, tests and defects
  • Controlling of defect management
  • Providing a precise statement about release quality
Defect Taxonomy Supported Testing (DTST)
[Figure: the four phases of the standard test process, Test Planning and Control, Test Analysis and Design, Test Implementation and Execution, and Test Evaluation and Reporting, shown side by side before DTST and with DTST]
Outline
• A process for system testing with defect taxonomies, called Defect Taxonomy-Supported Testing (DTST)
  • Aligned with ISTQB-based standard test processes
  • Prioritized requirements, defect categories and failures
  • Traceability between requirements, defect categories and failures
  • Application of specific, goal-oriented test design techniques
  • Detailed statement about release quality
• Empirical evaluation of the effectiveness of DTST in an industrial case study, compared to a standard test process
• Decision support for the application of defect taxonomy-supported testing, based on a cost comparison with the standard test process
Basic Concepts
[Figure: concept model. A Test Strategy comprises Test Patterns; a Test Pattern combines a Test Technique with a Test Strength (low, medium, high) and is assigned to Defect Categories (+severity). Defect Categories form a Defect Taxonomy and are linked to Requirements (+priority), which may be grouped by Use Cases and Hierarchy Elements. Test Cases are derived from Test Patterns and reveal Failures (+severity), which are categorized by Defect Categories. The model is sketched as code below.]
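Read as code, the concept model might look like the following minimal Python sketch. The class and attribute names are taken from the figure; the concrete fields and simplifications (strings for severity and priority, plain lists for associations) are assumptions, not the schema used in the projects.

```python
from dataclasses import dataclass, field
from enum import Enum

class TestStrength(Enum):       # low / medium / high, as in the figure
    LOW = 1
    MEDIUM = 2
    HIGH = 3

@dataclass
class DefectCategory:           # part of a defect taxonomy
    identifier: str             # e.g. "R1"
    description: str
    severity: str               # e.g. "critical"

@dataclass
class Requirement:
    req_id: str
    priority: str               # e.g. "high"
    defect_categories: list[DefectCategory] = field(default_factory=list)

@dataclass
class TestPattern:              # test technique + strength, assigned to categories
    technique: str              # e.g. "state transition testing"
    strength: TestStrength
    defect_categories: list[DefectCategory] = field(default_factory=list)
```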
DTST Steps and Integration into the Standard Test Process
• (1) Test Planning
  • Step 1: Analysis and Prioritization of Requirements
  • Step 2: Creation of a Product-Specific Defect Taxonomy
  • Step 3: Linkage of Requirements and Defect Categories
• (2) Test Analysis and Design
  • Step 4: Definition of a Test Strategy with Test Patterns
• (3) Test Execution
• (4) Test Evaluation and Reporting
  • Step 5: Analysis and Categorization of Failures after a Test Cycle
Step 1: Analysis and Prioritization of Requirements
• Prioritized requirements are assigned to use cases, which are additionally defined by business processes, business rules and user interfaces
• Example: use case UC Identification with its assigned requirements (the table for UC Search Client is analogous):

| REQ# | Description | Priority of REQ |
|---|---|---|
| 39 | ANF_0053 Logging and Auditing | normal |
| 40 | ANF_0057 LOGIN via OID | high |
| … | … | … |
Step 2: Creation of a Product-Specific Defect Taxonomy
• Top-level categories of Beizer are mapped to product-specific defect categories, which are then further refined to concrete low-level defect categories (DC) with an assigned identifier and severity

| Defect Category of Beizer | Product-Specific Category | DC | Description of DC | Severity |
|---|---|---|---|---|
| 1xxx Requirements; 11xx Requirements incorrect; 16xx Requirement changes | Unsuitability of the system taking the organizational processes and procedures into account | R1 | Client not identified correctly | critical |
| | | R2 | Goals and measures of case manager are not processed correctly | normal |
| | | R3 | Update and termination of case incorrect | normal |
| 12xx Logic; 13xx Completeness | Incorrect handling of the syntactic or semantic constraints of the GUI; GUI layout | R4 | Syntactic specifications of input fields, error messages | major |
| 4xxx Data; 42xx Data access and handling | | D1 | Incorrect access / update of client information, states etc. | normal |
| | | D2 | Erroneous save of critical data | critical |
Step 3: Linkage of Requirements and Defect Categories
• Experience-based assignment of requirements to defect categories
• Peer review of the assignment is important
• Tests are derived for each requirement-to-category assignment (see the sketch below)
• The slide juxtaposes the defect taxonomy from Step 2 with the prioritized requirements from Step 1 (both tables shown above)
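As a minimal sketch, the linkage can be kept as a simple mapping from requirement to defect categories. The concrete assignments below are illustrative, since the slides do not list them.

```python
# Requirement -> linked defect categories (illustrative assignments).
linkage = {
    "ANF_0053": ["D1", "D2"],   # Logging and Auditing: data-related categories
    "ANF_0057": ["R1", "D1"],   # LOGIN via OID: identification, data access
}

# Tests are derived per assignment: each (requirement, defect category)
# pair is one design unit to which a test pattern is applied in Step 4.
design_units = [(req, dc) for req, dcs in linkage.items() for dc in dcs]
```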
Step 4: Definition of a Test Strategy with Test Patterns
• A test pattern consists of a test design technique with three test strengths and has assigned defect categories (see the lookup sketch below)

S: Sequence oriented

| Id | Test Design Technique | Defect Categories | Test Strength 1 (low) | Test Strength 2 (normal) | Test Strength 3 (high) |
|---|---|---|---|---|---|
| S1 | Use case-based testing; process cycle tests | R1, R2, R3, F1, F2, F3 | Main paths | Branch coverage | Loop coverage |
| S3 | State transition testing | I1, I2, F7, F8, F9 | State coverage | State transition coverage | Path coverage |

D: Data oriented

| Id | Test Design Technique | Defect Categories | Test Strength 1 (low) | Test Strength 2 (normal) | Test Strength 3 (high) |
|---|---|---|---|---|---|
| D1 | CRUD (Create, Read, Update and Delete) | D1, D2 | Data cycle tests | Data cycle tests | Data cycle tests |
| D3 | EP: Equivalence partitioning | F3, F5, F6 | EP valid | EP valid+invalid | EP valid+invalid |
| D4 | BVA: Boundary value analysis | F3, F5, F6 | BVA valid | BVA valid+invalid | BVA values at boundaries |
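The test pattern table can be used as a lookup from defect category to candidate test patterns. A sketch, with identifiers and coverage criteria copied from the table (only the first three patterns shown; the data structure itself is an assumption):

```python
# Pattern id -> (technique, design rule per strength, assigned defect categories).
PATTERNS = {
    "S1": ("Use case-based testing; process cycle tests",
           {1: "main paths", 2: "branch coverage", 3: "loop coverage"},
           {"R1", "R2", "R3", "F1", "F2", "F3"}),
    "S3": ("State transition testing",
           {1: "state coverage", 2: "state transition coverage", 3: "path coverage"},
           {"I1", "I2", "F7", "F8", "F9"}),
    "D1": ("CRUD",
           {1: "data cycle tests", 2: "data cycle tests", 3: "data cycle tests"},
           {"D1", "D2"}),
}

def patterns_for(defect_category: str) -> list[str]:
    """Ids of all test patterns assigned to the given defect category."""
    return [pid for pid, (_, _, cats) in PATTERNS.items() if defect_category in cats]

print(patterns_for("R1"))   # ['S1']
```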
Test Design and Execution
• Test design applies the test patterns (Step 4) to the requirements, which are linked to defect categories via the taxonomy (Steps 1-3); the slide combines the tables shown above
• The test strategy maps the priority of a requirement (PR) and the severities of defect categories and failures (SDC, SF) to a test strength (encoded as a lookup in the sketch below):

| PR | SDC, SF | Test Strength |
|---|---|---|
| high | blocker, critical, major | 3 |
| normal | blocker, critical, major | 3 |
| normal | normal, minor | 2 |
| low | minor, trivial | 1 |

• Test execution records the result per test case and, for failed tests, the severity of the revealed failure:

| ID | Description | Result | Comments | Severity |
|---|---|---|---|---|
| 1 | see test spec. | pass | no | |
| 2 | see test spec. | pass | no | |
| 3 | see test spec. | fail | no | critical |
| 4 | see test spec. | pass | no | |
| 5 | see test spec. | fail | no | minor |
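The strategy table is essentially a lookup from requirement priority and severity to a test strength. A minimal sketch; the fallback for combinations the slide does not list is an assumption:

```python
# (priority of requirement, severity) -> test strength, as in the strategy table.
STRATEGY = {
    ("high", "blocker"): 3, ("high", "critical"): 3, ("high", "major"): 3,
    ("normal", "blocker"): 3, ("normal", "critical"): 3, ("normal", "major"): 3,
    ("normal", "normal"): 2, ("normal", "minor"): 2,
    ("low", "minor"): 1, ("low", "trivial"): 1,
}

def test_strength(priority: str, severity: str) -> int:
    # Combinations not shown on the slide default to the lowest strength
    # (an assumption, not part of the original table).
    return STRATEGY.get((priority, severity), 1)

print(test_strength("high", "critical"))   # 3
```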
Step 5: Analysis and Categorization of Failures after a Test Cycle
• Defects are exported from the defect management tool (Bugzilla)
• The severity assigned by testers is compared to the severity of the defect category and the priority of the requirement (a sketch follows below)
• Weights are adapted if needed
• A precise statement about release quality becomes possible
• The additional information is valuable for release planning
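A sketch of the severity comparison in Step 5, assuming the Bugzilla defects were exported to a CSV file with the hypothetical column names id, defect_category and tester_severity (the slides only say the export goes to Excel):

```python
import csv

# Severities assigned to defect categories in Step 2 (running example).
CATEGORY_SEVERITY = {"R1": "critical", "R2": "normal", "R3": "normal",
                     "R4": "major", "D1": "normal", "D2": "critical"}

def severity_mismatches(path: str):
    """Yield defects whose tester-assigned severity deviates from the
    severity of their defect category; candidates for adapting weights."""
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            expected = CATEGORY_SEVERITY.get(row["defect_category"])
            if expected and row["tester_severity"] != expected:
                yield row["id"], row["tester_severity"], expected
```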
Case Study
• Research questions:
  • (RQ1) Defect taxonomy-supported testing reduces the number of system test cases.
  • (RQ2) Defect taxonomy-supported testing increases the number of identified system failures per system test case.
• Evaluation by comparing
  • (RQ1) the normalized number of tests (NOT/SIZE)
  • (RQ2) the test effectiveness (NOF/NOT)
  in two similar projects from a public health insurance institution
Studied Projects
• Two projects performed by the same test organization in a public health insurance institution

| | Project A | Project B |
|---|---|---|
| Area | Application for case managers | Administration of clients of the public health insurance institution |
| Staff | About 7 | Up to 10 |
| Duration | 9 months development, now under maintenance | 9 months development, now under maintenance |
| Number of iterations | 4 | 3 |
| SIZE: NOR + NUC | 41 + 14 | 28 + 20 |
| Ratio of system testing | 27% of overall project effort | 28% of overall project effort |
| Test process | ISTQB process + defect taxonomy-supported testing; manual regression testing | ISTQB process; manual regression testing |
Data Collection Procedure
1. Requirements stored in Excel; priorities assigned by domain experts
2. Defect taxonomies stored in Excel; severities assigned by the test manager
3. Requirements assigned to defect categories in Excel
4. Test cases implemented in the test management tool TEMPPO
5. Defects stored in Bugzilla; severities assigned by testers
6. Defects exported to Excel and assigned to defect categories
Results 1/2
• Collected metrics for Project A and Project B:

| Metric | Project A | Project B |
|---|---|---|
| NOR | 41 | 28 |
| NUC | 14 | 20 |
| SIZE (NUC+NOR) | 55 | 48 |
| NOT | 148 | 170 |
| NOF | 169 | 114 |
| NOT/SIZE | 2.69 | 3.54 |
| NOF/NOT | 1.14 | 0.67 |

• The lower NOT/SIZE indicates a reduction of test cases with DTST (RQ1)
• The higher NOF/NOT indicates increased test effectiveness with DTST (RQ2); both ratios are recomputed in the sketch below
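The two metrics are simple ratios; recomputing them from the table confirms the reported values:

```python
# NOR, NUC, NOT, NOF per project, taken from the table above.
projects = {"A": dict(NOR=41, NUC=14, NOT=148, NOF=169),
            "B": dict(NOR=28, NUC=20, NOT=170, NOF=114)}

for name, m in projects.items():
    size = m["NOR"] + m["NUC"]                           # SIZE = NOR + NUC
    print(name, round(m["NOT"] / size, 2), round(m["NOF"] / m["NOT"], 2))
# A 2.69 1.14  -> with DTST: fewer tests per size unit, more failures per test
# B 3.54 0.67
```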
Results 2/2
• RQ1 is additionally supported by the estimated number of test cases without DTST versus the actual number of test cases with DTST:

| Module | NOT estimated | NOT DTST | Reduction |
|---|---|---|---|
| GUI | 32 | 29 | 9% |
| M1 | 9 | 8 | 11% |
| M2 | 58 | 41 | 29% |
| M3 | 43 | 34 | 21% |
| M4 | 14 | 13 | 7% |
| M5 | 26 | 23 | 12% |
| Total | 182 | 148 | 19% |

• A paired two-sample t-test on this table shows that the reduction of test cases is statistically significant (T=2.209, df=5, p=0.039); see the sketch below
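The reported t-test can be reproduced from the module data; a sketch using scipy (the one-sided alternative requires scipy >= 1.6):

```python
from scipy import stats

estimated = [32, 9, 58, 43, 14, 26]   # NOT estimated without DTST, per module
actual    = [29, 8, 41, 34, 13, 23]   # NOT with DTST

# One-sided paired t-test: does DTST reduce the number of test cases?
t, p = stats.ttest_rel(estimated, actual, alternative="greater")
print(round(t, 3), round(p, 3))       # 2.209 0.039, matching the slide
```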
Interpretation
• In the case study we quantitatively observed that
  • DTST reduces the number of test cases and increases test effectiveness
  • Test cases are more goal-oriented towards identifying failures
  • Fewer resources for test implementation, execution and evaluation are needed
• The results are supported qualitatively by stakeholder feedback
  • Testers noted that with DTST their test cases find more defects of high severity
  • Testers and managers noted less discussion on resource allocation
  • Test managers noted more realistic severity values and fewer unplanned releases
• Limitations of the case study
  • Public health insurance domain
  • Two projects
  • Beizer defect taxonomy
Validity of Results
• Construct validity
  • Application of standard measures on carefully selected projects
  • Data quality checks
• Internal validity
  • Triangulation of data: two independent projects, consideration of estimated values, qualitative feedback of stakeholders
• External validity
  • DTST is independent of the concrete defect taxonomy and based on a generic test process
  • Replication of the case study in other contexts as future work
• Reliability
  • DTST and the evaluation procedure are well-defined; data is available
Cost Comparison and Decision of Application
• The estimation is structured according to the phases and steps of the test process
• Comparison of the cost (measured in time) of DTST and the standard ISTQB-based test process
• Break-even and recommendation of application if the cost of DTST is smaller than the cost of the ISTQB process
• Return on investment yields the same qualitative result
• Adaptation and simulation with the following parameters:
  • Analysis and prioritization of a requirement
  • Linkage of a requirement to a defect category
  • Design and implementation of a test case
  • Execution time per test case
  • Maintenance time per test case
Cost Comparison for Project A
• Test planning costs (Ph = person-hours): ISTQB = analysis and prioritization of 40 requirements @ 0.5 Ph (20.00) + definition of the test strategy (30.00) = 50.00 Ph; DTST = Step 1: analysis and prioritization of 40 requirements @ 0.5 Ph (20.00) + Step 2: creation of a product-specific defect taxonomy (30.00) + Step 3: linkage of 40 requirements to defect categories @ 0.25 Ph (10.00) + Step 4: definition of a test strategy with test patterns (40.00) = 100.00 Ph
• Per test cycle (test execution, evaluation, maintenance): ISTQB 110.00 Ph, DTST 92.50 Ph
• Cumulative costs per phase (TP = Test Planning, TAD = Test Analysis and Design, TC1-TC8 = test cycles):

| Phase | ISTQB Ph | DTST Ph | Diff of costs Ph |
|---|---|---|---|
| TP | 50.00 | 100.00 | -50.00 |
| TAD | 250.00 | 266.00 | -16.00 |
| TC1 | 360.00 | 358.50 | 1.50 |
| TC2 | 470.00 | 451.00 | 19.00 |
| TC3 | 580.00 | 543.50 | 36.50 |
| TC4 | 690.00 | 636.00 | 54.00 |
| TC5 | 800.00 | 728.50 | 71.50 |
| TC6 | 910.00 | 821.00 | 89.00 |
| TC7 | 1020.00 | 913.50 | 106.50 |
| TC8 | 1130.00 | 1006.00 | 124.00 |

• DTST breaks even with the first test cycle (see the sketch below)
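The cumulative costs follow a simple linear model; a sketch that reproduces the table and finds the break-even cycle, with the parameter values read off the slide:

```python
# Fixed costs in person-hours (Ph) up to test analysis and design.
ISTQB_FIXED = 50.0 + 200.0    # test planning + test analysis and design
DTST_FIXED = 100.0 + 166.0    # Steps 1-4 + cheaper test analysis and design

# Incremental cost per test cycle (execution, evaluation, maintenance).
ISTQB_CYCLE, DTST_CYCLE = 110.0, 92.5

def break_even(max_cycles: int = 8):
    """First test cycle after which DTST is cheaper than the ISTQB process."""
    for tc in range(1, max_cycles + 1):
        istqb = ISTQB_FIXED + tc * ISTQB_CYCLE
        dtst = DTST_FIXED + tc * DTST_CYCLE
        if dtst < istqb:
            return tc, istqb, dtst
    return None

print(break_even())   # (1, 360.0, 358.5): break-even in the first test cycle
```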
Cost Comparison Scenarios for Project A
[Figure: two line charts of cumulative cost (Ph) over the test phases TP, TAD, TC1-TC8, each comparing DTST and ISTQB. Left: cost comparison with an average test execution time of 0.5h; right: cost comparison with an average test execution time of 0.1h.]
Summary and Future Work
• Summary
  • A standard-aligned, defect taxonomy-supported system test process
  • An industrial case study indicates more effective test cases
  • A procedure for deciding whether to apply the approach, based on cost comparison
• Future work
  • Impact of different defect taxonomy types
  • Automatic recommendations for assignments
  • Requirements validation with defect taxonomies
  • Regression testing with defect taxonomies
  • Application of defect taxonomy-supported testing for release planning
  • Investigating the value contribution of defect taxonomy-supported testing, e.g. reduced release planning uncertainty, severe bugs in operation
Questions or Suggestions?
[email protected]