Lean Six Sigma Green Belt Y = f(x)
Transactional/Service V. 1.0., 2015 © TACTEGRA 2015
Welcome!
Safety and Logistics
• Locations of facilities
• Building fire alarm and evacuation plans
• Building access and parking
• Session timings – lunch, breaks, etc.
Class Ground Rules
What ground rules should we include for our time together?
• Arrive on time each morning
• Return from breaks and lunch on time
• Turn off cell phones until break
• Please refrain from using computers for non-class-related activities (access / respond to e-mail during breaks)
• Absences
• Create a safe learning environment – what happens in class, stays in class
Class Introduction
Be prepared to share with the class:
– Name
– Role / Responsibility
– What are your expectations?
Course Agenda
Week 2:
• Week 1 Review
• Analyze
  – Failure and root cause identification
  – Identifying wastes
  – Validating root causes statistically
• Improve
  – Identifying, testing and implementing solutions
• Control
  – Measuring and sustaining the gain
• DFSS
• Lean
• Next Steps
DMAIC Question Checklist

Define
• What problem are we trying to solve? Why this project now? – Problem Statement; Goals/Business Case
• How will we measure success? – Project Scope; Metrics
• Who needs to be involved? – Project Team
• What risks does this process improvement present? – Process Risks; Project Risks
• Who cares about your progress? – Team Management; Stakeholder Management

Measure
• How does the current process work? – Process Mapping
• What kind of data can we use to understand the process inputs? – Data Basics
• How do we get that data, and what does it tell us? – Sampling; Graphical Tools
• Can we trust the measurement system? – Measurement System Analysis
• How can we factually describe the current process performance? – Statistical Process Control; Capability Assessment

Analyze
• Where are the problems coming from? – Root Cause Analysis; Cause and Effect Matrix; Risk Assessment/FMEA
• Can we narrow the inputs to critical root causes, with proof? – Practical Tools; Graphical Tools; Statistical Tools
• Where else can we remove roadblocks in the process? – Lean Fundamentals; Process Waste; Standard Work

Improve
• What alternatives should we consider for improvement? – Brainstorming; Benchmarking; Multi-Generational Planning
• Which alternative best meets our success criteria? – Voting; Benefit-Effort Matrix; Pugh Matrix
• How can the new process fail? – FMEA
• How can we make sure the solution will work? – Pilot Planning; Implementation Plan; Control Plan; Risk Assessment/FMEA

Control
• Do we have proof that the solution was implemented correctly? – Statistical Process Control
• What is the plan to transfer ownership to the process owner? – Control Plans; Management Routines; Dashboards
• What was learned from the project? Can we leverage the solution? – Lessons Learned; Replication
Analyze
Analyze
The purpose of the Analyze phase is to identify the potential root causes of the problem and seek to understand them, then to validate the vital root causes statistically.
We want to fix the validated root causes of the problem the first time, so that the problem will not return!
Analyze: Failure and Root Cause Identification
Qualitative techniques:
1. Brainstorming
2. Cause and Effect Diagram (Fishbone Chart)
3. Cause and Effect Matrix
4. FMEA
Analyze: Validating the Root Causes
Quantitative Techniques:
1. Measures of Center and Spread
2. Normal Distribution and Confidence Intervals
3. Hypothesis Testing
4. ANOVA
5. Regression and Correlation
Basic Lean tools:
1. Value Stream Mapping
2. Spaghetti Diagrams
3. 7 Wastes / 5S
Analyze: Improving the Flow
Basic Lean tools:
1. Value Stream Mapping
2. Spaghetti Diagrams
3. 7 Wastes / 5S
Failure and Root Cause Identification
At the end of the Measure phase it should be clear how the process is performing. The question now becomes: why is it performing that way? The first part of the Analyze phase is to identify as many potential reasons as possible why things could be going wrong. This section uses qualitative, team-engagement techniques; the activities that follow use quantitative methods to validate the vital few root causes.
Qualitative techniques lead to Quantitative techniques
Fishbone Diagram
Also called a Cause and Effect diagram or an Ishikawa diagram (named after its creator, Kaoru Ishikawa), it is a graphical tool to identify, explore and display all of the possible root causes related to a problem or condition.

Benefits
• Engages and focuses the team on the causes of the problem
• Creates a snapshot of the collective knowledge of the team
• Creates consensus on the causes of a problem
• Builds support for resulting solutions
• Focuses the team on causes, not symptoms

Weaknesses
• Lacks a quantitative dimension to prioritize the causes
• Assumes a representative and knowledgeable team of SMEs
The Layout of a Fishbone
These are the six most commonly used categories; however, they can be modified to better fit the process or organization. For example, instead of Machine, the name of a software system or computer program might be more appropriate.

6 Major Categories:
• People: anyone involved with the process
• Method: how the process is performed (policies, procedures, regulations, etc.)
• Material: materials, information, or resources used in the process
• Machine: any equipment required to achieve the final product
• Measure: data generated by the process for evaluation
• Environment: the conditions and culture in which the process operates

[Diagram: fishbone skeleton – cause text on branches under each major category, pointing to the defect or desired outcome at the head]
Sample Fishbone Layouts
[Diagram: fishbone layout with cause branches for Equipment, Method, Material, People, Measurement and Environment feeding the effect (defect or desired outcome)]
The head of the fish must link to the problem statement.
Sample Fishbone
[Diagram: sample fishbone with ORDER CYCLE TIME at the head; branch causes include duplication of efforts, system malfunctions, authorization levels, complicated form, untrained staff, inventory levels and credit report cycle time, spread across the Equipment, Method, Material, People, Measurement and Environment categories]
Developing a Fishbone
How to create a Fishbone:
1. Identify and assemble a team of subject matter experts and familiarize them with the project charter and problem statement.
2. Write the problem statement on a whiteboard or flipchart.
3. Using Post-it notes, have the team individually list as many potential root causes as possible, or use a round-robin brainstorming technique.
4. Begin to group the potential root causes within the categories, remembering that you can adjust the category titles if needed. Avoid the temptation to create an excessive number of new categories.
5. Use the 5 Whys technique to drive each potential root cause to a more detailed level. This is done by repeatedly asking "why does this problem exist?"
6. The lowest-level why should be measurable and is considered a potential X.
Cause & Effect Diagram Exercise
Purpose:
Practice Cause & Effect Diagram (Fishbone)
Grouping:
Teams
Exercise
1. Select a problem statement from one of the projects in the class, or from the process mapping exercise
2. Brainstorm at least 7 potential root causes of the problem statement
3. Organize the root causes and place them into the appropriate categories on the fishbone
Deliverable:
Complete partial Fishbone diagram
Time:
30 minutes
Cause and Effect Matrix
A simple matrix used to relate the inputs of a process to its CTQs, in order to emphasize the importance of understanding the customer requirements.

Benefits
• Uses the process map as the primary source of inputs
• CTQs are scored as to importance to the customer
• Inputs are scored as to their relationship to the outputs
• Results are prioritized key inputs, areas for "Quick Hit" improvements, and areas for enhanced risk assessment

Weaknesses
• Scoring can be subjective, particularly if the team is not knowledgeable about the problem
Cause and Effect Matrix
Cause and Effect Matrix - Example
1. List Outputs
Cause and Effect Matrix Order Fulfillment Example
Cause and Effect Matrix - Example
Cause and Effect Matrix – Order Fulfillment Example
2. Rank outputs by importance to the customer (1 to 10, or 1, 3, 9 scale)
This step should include your customer in the process. This customer can be internal or external.
Cause and Effect Matrix - Example
3. List Inputs
Cause and Effect Matrix Order Fulfillment Example
List inputs down the side of the matrix. This step uses the Process Map inputs.
Cause and Effect Matrix - Example
4. Correlate Inputs to Outputs
Cause and Effect Matrix – Order Fulfillment Example
Assign a value to the correlation between each input and output (0 = none, 1 = weak, 3 = moderate, 9 = strong)
This is a subjective estimate of how influential the inputs are on the CTQs (outputs).
Cause and Effect Matrix - Example
Cause and Effect Matrix Order Fulfillment Example
5. The template will create weighted scores and prioritize
Relationship scores are multiplied by the importance weighting value to determine the score totals. We now start getting a feel for which variables are most important in explaining performance in the outputs.
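The weighted-score arithmetic above can be sketched in a few lines of Python. The outputs, inputs, weights and correlation scores below are invented for illustration; they are not the workbook's order fulfillment data.

```python
# Hypothetical C&E matrix: importance weights for three outputs and
# correlation scores (0/1/3/9 scale) for each input against those outputs.
importance = [9, 3, 7]  # customer importance of each output (1-10 scale)

inputs = {
    "Order entry accuracy": [9, 3, 1],
    "Credit check time":    [3, 9, 3],
    "Inventory level":      [1, 1, 9],
}

# Weighted score = sum over outputs of (importance x correlation score)
weighted = {
    name: sum(w * s for w, s in zip(importance, scores))
    for name, scores in inputs.items()
}

# Sort descending so the most critical inputs come first
ranked = sorted(weighted.items(), key=lambda kv: kv[1], reverse=True)
```

Sorting the weighted-score column this way is exactly step 6 on the next slide: the top of the ranked list is where the improvement effort should focus first.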
Cause and Effect Matrix - Example
6. Sort the Weighted Score Column
Cause and Effect Matrix Order Fulfillment Example
Cause and Effect Matrix
The C&E matrix is a living document…
• Keep the matrix available for "add-ons" later
• Use as a source of inputs for:
  - Data analysis of Xs
  - Capability analysis
  - Control plans
  - Leverage for other potential improvement projects
C & E Matrix Exercise
Purpose:
Practice Cause and Effect Matrix
Grouping:
Teams
Exercise
1. Select 3 key process steps from one of the projects in the class, or from the process mapping exercise
2. For each selected process step, determine at least 4 outputs (Ys) and at least 5 inputs (Xs)
3. Assign weights for each output
4. Score the relationship for each combination of output and input
5. Be prepared to discuss the scoring results
Deliverable:
Complete C & E matrix template in GB workbook
Time:
20 minutes
Failure Modes and Effects Analysis
Failure Modes and Effects Analysis (FMEA) is:
• A structured approach for:
  - Identifying potential failures of a process or service
  - Prioritizing those failures/risks in a structured way
  - Identifying the actions that should be taken to reduce the risk
  - Formulating categories for performance measurement
  - Formulating the process control plan
  - Evaluating the design of a process or a service
• Primary directive: identify ways the product or process can fail, and eliminate or reduce the risk of failure
• A primary tool for understanding and assessing process risk
Why And When To Use An FMEA

Why Use an FMEA?
■ Captures the collective knowledge of a team.
■ Helps to identify every possible failure mode of a process or product.
■ Used to rank and prioritize possible causes of failures, as well as develop and implement preventative measures.
■ Improves the quality, reliability, and safety of the process.
■ Provides a logical, structured process for identifying process areas of concern.
■ Reduces process development time and cost.
■ Documents and tracks risk reduction activities.
■ Provides historical records; establishes a baseline.
■ Helps increase customer satisfaction and safety.

When To Use an FMEA
■ When a process, product or service is being designed or redesigned, after quality function deployment.
■ When an existing process, product or service is being applied in a new way.
■ Before developing control plans for a new or modified process.
■ When improvement goals are planned for an existing process, product or service.
■ When analyzing failures of an existing process, product or service.
■ Periodically throughout the life of the process, product or service.

Potential Targets for an FMEA Application
■ New processes being designed.
■ Existing processes being changed.
■ Carry-over processes for use in new applications or new environments.
■ After completing a problem-solving study (to prevent recurrence).
■ When preliminary understanding of the processes is available (for a Process FMEA).
■ After system functions are defined, but before specific hardware is selected (for a System FMEA).
■ After product functions are defined, but before the design is approved and released to manufacturing (for a Design FMEA).
Types of FMEA
The basic purpose of the FMEA is to identify failures or risks in a process or product. The concepts are often used in a variety of ways:
• Process FMEA – used to analyze transactional and service processes
• Design FMEA – used to analyze product designs or service delivery processes before they are implemented
• System FMEA – used to analyze complete systems and sub-systems in the early concept and design stages
The FMEA is a flexible tool that can be used in multiple applications and industries!
Sources for input to the FMEA
Inputs:
• Process Maps and Charts • Fishbone Diagram • Requirements Tree • Cause and Effect Matrix • Process or service history • Subject Matter Expert opinion or experience • Customer Surveys, KPIs and Kano studies
Building the FMEA
Let's understand how to build an FMEA by building one. We have the following inputs from a card activation process:
• Card activation process map: 1. Security Check → 2. Validate Account Info → 3. Confirm Customer Info → 4. Enter Approval
• Fishbone diagram
• Prioritized customer feedback:
  1. Process takes too long
  2. Incorrect address
  3. Card failed to work after activation
Building the FMEA
Step 1. List the process steps in the first column of the FMEA template.
Step 2. Enter all the potential failures for each process step in the Failure Mode column.
Note: You can have multiple failure modes at each process step. A failure mode is the specific way in which a failure occurs.
Building the FMEA
Step 3. Enter the effect of the failure. The effect is the impact of the failure.
Step 4. List possible causes for the failure. Note that there can be multiple causes for each failure.
Step 5. List any methods, processes or systems currently in place to detect the failure.
Building the FMEA
The next step is to assess the risk of the failure. This is done by calculating a Risk Priority Number (RPN).

RPN = Severity x Occurrence x Detection

Severity – How severe is the impact on the customer? This could be the certain loss of a customer in a service business, or loss of life in an industrial process! Use a scale of 1 through 10, with 10 being high and 1 being only a minor nuisance to the customer.
Occurrence – How often does the failure occur? Again, use a scale of 1 through 10, with 10 meaning the failure is inevitable.
Detection – Can you detect the failure before it impacts the customer? The higher the detection score, the higher the probability that the customer will experience the effect. Again, use a scale of 1 through 10.
Note: Ratings cannot be 0 or the calculations will not work.
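The RPN calculation can be sketched in a few lines of Python. The failure modes and ratings below are hypothetical, not taken from the card activation example.

```python
# Hypothetical FMEA rows: (failure mode, severity, occurrence, detection),
# each rated 1-10. RPN = S x O x D; a higher RPN means a higher-priority risk.
failures = [
    ("Card not activated", 9, 3, 4),
    ("Wrong address on file", 6, 5, 7),
    ("Duplicate security check", 3, 8, 2),
]

rpns = [(name, s * o * d) for name, s, o, d in failures]

# Rank descending by RPN to prioritize the failures
ranked = sorted(rpns, key=lambda kv: kv[1], reverse=True)
```

Note how a moderate-severity failure can outrank a severe one once occurrence and detection are factored in; that is the point of multiplying the three ratings.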
Building the FMEA
Step 6. Enter a rating in the Severity, Occurrence and Detection columns. The RPN value can now be used to prioritize the failures or process risks. From an improvement project perspective, the high RPN scores are your potential vital Xs, which you would confirm with the quantitative testing tools covered in this phase.
Analyzing the FMEA
In addition to high RPNs, the following items may also prompt useful actions:
• High occurrence and high detection ratings, regardless of the severity, may identify ongoing customer nuisance issues.
• High occurrence and low detection ratings often reveal the "hidden factory" in your processes.
• A high severity rating by itself is also a cause for alarm, and should prompt your team to gather evidence of whether the occurrence rating was objective and whether the appropriate controls are in place.
RPN Review
Once you calculate the RPN for each Failure Mode, review the results and look for insights:
• Do the gut check – does this make sense?
• If not, discuss and make the necessary rating changes

Determine potential next steps:
• Collect data
• Conduct experiments
• Improve the process
• Implement process controls
FMEA Exercise A
Purpose:
Practice FMEA
Grouping:
Teams
Exercise
1. Select 3 key process steps from one of the projects in the class, or from the process mapping exercise
2. For each selected process step, determine the failure modes, or ways in which the input can go wrong
3. For each failure mode associated with the inputs, determine the effects of the failures on the customer – remember the internal customers!
4. Identify potential causes of each failure mode
5. List the current controls for each cause or failure mode
6. Review the standard Severity, Occurrence, and Detection rating scales and consider revising them
7. Calculate RPNs for each failure mode
Deliverable:
Complete FMEA template in GB workbook through RPN column
Time:
30 minutes
Mitigating Risks & Action Planning
In some cases the risk is known, with data supporting its impact, and it is appropriate to develop a response plan. The right-hand side of the FMEA is used for risk mitigation and action planning.

The risk mitigation plan portion of the FMEA includes the following:
• The actions to diagnose/confirm the cause(s)
• The actions to fix any defects that occur
• The individuals responsible for those actions
• The results of those actions
FMEA Exercise B
Purpose:
Practice mitigation planning
Grouping:
Teams
Exercise
• For the process selected in FMEA exercise A continue to fill in the FMEA template through the final column • Save your work and prepare to report out
Deliverable:
Complete FMEA template in GB workbook through the final column
Time:
20 minutes
Analyze: Validating the Potential Vital Xs Statistically
Quantitative techniques:
1. PGA Approach Review
2. Commonly used distributions
3. What the CLT means
4. Confidence intervals
5. Hypothesis testing (means, variances and proportions)
6. ANOVA
7. Contingency tables and Chi-Square test
8. Correlation and regression
The PGA Approach
PGA stands for Practical, Graphical and Analytical. This is a commonly accepted three-phase approach to understanding data.
Practical – Review the raw data to make sure it makes sense. Check for any quality and quantity issues.
Graphical – Organize and display the raw data in a variety of tabular and graphical formats.
Analytical – Evaluate the data's shape, center and spread, and perform statistical tests in order to draw effective business conclusions.
The Practical Phase
Ask the following questions of the data:
• How was the data collected and calculated?
• Was the anticipated quantity of data received?
• What are the operational definitions of each type of data?
• Are there any issues with missing data, data entry or transcription errors?
• Are there any obvious abnormalities or anomalies in the data, i.e., data points or groups of data that stand out as very different from the others?
The Graphical Phase

Basic Graphical tools:
• Histograms
• Pareto Charts
• Boxplots
• Run Charts
• Scatter Plots

[Figures: example histogram, box plot and run chart of Overall Satisfaction data]

What are the most common ways to "show" data?
Histogram
The Histogram is a method to view the distribution of data. It allows you to focus on the shape (distribution), pattern (modes), spread and center.

[Figure: histogram of Overall Satisfaction. Count = 100, Mean = 3.801, StDev = 0.782, Minimum = 1.72, Maximum = 4.98; Q1 = 3.245, Median = 3.945, Q3 = 4.430; 95% CI Mean = 3.646 to 3.957; 95% CI Sigma = 0.687 to 0.909; Anderson-Darling Normality Test: A-Squared = 0.803, P-Value = 0.0363]
Note also that many statistical packages include much of this information, along with something called the Anderson-Darling Normality Test. This test assesses whether the shape is approximately normally distributed.
Pareto Chart
Pareto Analysis is:
• A method to identify and separate the "vital few" from the "trivial many" influences on a problem or outcome
• Pareto principle: 80% of the cost, value or trouble is accounted for by 20% of the items or categories
• Example: eighty percent of customer complaints involve 20% of the customers

[Figure: Pareto chart – bars show the count by group; the line shows the cumulative % of total]
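The vital-few calculation behind a Pareto chart can be sketched in Python. The complaint categories and counts below are invented for illustration.

```python
# Hypothetical complaint counts by category
counts = {"Billing error": 48, "Late delivery": 30, "Damaged item": 12,
          "Wrong item": 6, "Other": 4}

total = sum(counts.values())
cum = 0
pareto = []
# Sort categories descending by count, then accumulate the percentage
for cat, n in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
    cum += n
    pareto.append((cat, n, round(100 * cum / total, 1)))
```

Reading down the cumulative column shows where the 80% line falls: here the top two categories account for 78% of all complaints, so they are the "vital few" to attack first.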
Box-and-Whisker Plot
• A great method to visually compare data by a given grouping
• Shows the relative location and dispersion of each group

[Figure: box-and-whisker plot of Overall Satisfaction by Customer Type, annotated with the median, 25th and 75th percentiles, mean and outliers]
Run Chart
• Plots the data in chronological order (that is why it is called a run chart, not a scatter plot)
• Provides a visual understanding of the data over time
• Can be used to check outliers, stability (follow up with SPC), trend (follow up with regression), patterns (such as seasonality), distribution, variation, average (or median), and capability (against the desired outcome)

[Figure: run chart of Overall Satisfaction over time; Mean = 3.80]
Scatter Plot
• A great method to gain insight into the relationship between two continuous data types (variables data)
• Unlike the run chart, it is not dependent upon time order, because the data is paired
• Example: Overall Satisfaction vs. Average Order to Delivery Time

[Figure: scatter plot of Overall Satisfaction vs. Avg Days Order to Delivery Time; fitted line y = 0.042x + 1.742, R² = 0.103]
The Analytical Phase
Basic Analytical tools:
• Measures of Center and Spread
• Normal Distribution and Confidence Intervals
• Hypothesis testing (continuous and discrete)
• ANOVA
• Regression and Correlation

Basic Lean tools:
• Value Stream Mapping
• Spaghetti Diagrams
• 7 Wastes / 5S

[Figure: example spaghetti diagram of a laboratory layout]
Basic Statistical Tools – Measure of Center
Mean:
• Applies to normal and non-normal data
• Easily calculated as the average of the data
• Excel formula: =AVERAGE(data)

Median:
• The positional middle of the data
• Useful for working with non-normal or skewed data
• Found by ordering a set of numbers from lowest to highest and then selecting the middle value
• Excel formula: =MEDIAN(data)

Mode:
• The value in your data set that occurs most often
• Useful for determining if the shape is unimodal, bi-modal or multi-modal
• Excel formula: =MODE(data)
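As a cross-check on the Excel formulas, the same measures of center can be computed with Python's standard library (the sample data is made up for illustration):

```python
# Measures of center for a small hypothetical sample
from statistics import mean, median, mode

data = [2, 3, 3, 5, 8, 9, 12]

m = mean(data)      # arithmetic average
md = median(data)   # positional middle (robust to skew)
mo = mode(data)     # most frequent value
```

Note that the mean (6) sits above the median (5) here because the sample is right-skewed; that gap is itself a useful diagnostic.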
Basic Statistical Tools – Measure of Spread
Range:
• Applies to normal and non-normal data
• Easily calculated as the maximum value of your data minus the minimum value
• Excel formula: =MAX(data)-MIN(data)

Variance:
• The average squared distance of all of the points from the mean
• Is in units of the data squared; if your data is in ft, the variance is in ft²
• Excel formula: =VAR(data)
• Fortunately, this measure is rarely used…

Standard Deviation:
• The square root of the variance
• Is in the same units as the data (ft, min, mph, etc.)
• The best parameter for describing the spread of normal data
• Excel formula: =STDEV(data)
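The same measures of spread, computed on the same kind of hypothetical sample with Python's standard library:

```python
# Measures of spread for a small hypothetical sample
from statistics import stdev, variance

data = [2, 3, 3, 5, 8, 9, 12]

rng = max(data) - min(data)   # range: max minus min
var = variance(data)          # sample variance (in squared units)
sd = stdev(data)              # sample standard deviation (same units as data)
```

The standard deviation is simply the square root of the variance, which is why it is the preferred, interpretable measure.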
Basic Concepts of Distributions
Data commonly takes on a handful of recognizable shapes, and data sets can be categorized by those shapes. The categories are often called frequency or probability distributions. By understanding the type of distribution for a data set, the analyst can make decisions about the probability of a specific value occurring, as well as whether one group of data is different from another.

Normal Distribution
Exponential Distribution
Note: There are many, many different types of distributions. For the basic Green Belt, it is important to understand whether the data is Normal or not.
Why is Normality Important?
68% of the data will fall within ±1 standard deviation.
95% of the data will fall within ±2 standard deviations.
99.73% of the data will fall within ±3 standard deviations.
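This empirical rule can be checked on simulated normal data with a quick standard-library sketch (the seed just makes the run reproducible):

```python
# Draw a large seeded sample from a standard normal distribution and count
# the fraction of points within 1, 2 and 3 standard deviations of the mean.
import random

random.seed(42)
sample = [random.gauss(0, 1) for _ in range(100_000)]

def within(k):
    return sum(abs(x) <= k for x in sample) / len(sample)

p1, p2, p3 = within(1), within(2), within(3)  # near 0.68, 0.95, 0.997
```

With 100,000 points the observed fractions land very close to the textbook 68 / 95 / 99.73 percentages.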
Population versus Sample - Reminder
• A Population is the entire group of objects about which one wishes to gather information in a statistical study.
• A Sample is the group of objects on which one actually gathers data in a statistical study.
[Diagram: the sample as a subset of the entire population]
What is the Central Limit Theorem?
This is one of the most important and useful concepts in statistics. While it sounds theoretical, there are significant practical uses for the concept.
Simply put, the data is not always going to be normal. This required statisticians to find a way to use many of the important statistical techniques when the data was not normal. What they discovered was that if you take repeated samples, the distribution of the sample averages (means) becomes approximately normal as more samples of sufficient size are taken. So even if the original distribution of the raw data is uniform or skewed, the distribution of sample means from that raw data is approximately normal.
Common uses of this concept are embedded in the formulas used for control charts, confidence intervals and hypothesis testing, all of which are important for the Green Belt!
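A quick simulation illustrates the idea (standard library only; the exponential distribution stands in for any heavily skewed raw data):

```python
# CLT sketch: averages of samples drawn from a skewed (exponential)
# distribution still pile up symmetrically around the population mean.
import random
from statistics import mean

random.seed(1)
# expovariate(1.0) has population mean 1.0 and a strong right skew
sample_means = [mean(random.expovariate(1.0) for _ in range(30))
                for _ in range(2_000)]

grand_mean = mean(sample_means)
# Each entry is the mean of one sample of 30; a histogram of sample_means
# would look roughly normal even though the raw data is not.
```

The grand mean of the 2,000 sample means sits very close to the population mean of 1.0, and their spread shrinks roughly as sigma divided by the square root of the sample size.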
What are Confidence Intervals?
• A Confidence Interval is the estimated range where you can be confident your true population parameter of interest (mean, standard deviation, etc.) lies.
• A confidence interval is the range of values around the sample parameter that will contain the population parameter at a given level of risk.
• They are used extensively in hypothesis testing to see if two point estimates come from the same or different populations.
• They get smaller as the size of the sample increases or the sample variation decreases.
• They get larger as the sample size decreases or the sample variation increases.
Confidence Intervals
Why are Confidence Intervals important?
• Sample statistics, like the sample mean (x̄) or the sample standard deviation (s), are only estimates of the true population parameters, µ and σ.
• Due to the inherent sample‐to‐sample variability in these estimates (if two different people collect a sample, or if one person collects samples at different times, they will likely be different), we quantify our uncertainty using statistically based Confidence Intervals (CI’s).
• How confident do we want to be in our data? In most industries, we calculate 95% CI for our data. But in some businesses, like the medical industry, for example, 99.9% CI’s are more typical.
• The CI provides a way for us to investigate and quantify sample‐to‐ sample variation.
Point Estimate
• Whenever we estimate the population statistic with a sample statistic, we are making what’s called a point estimate.
• For example, x̄ (x-bar) is a point estimate of µ (mu).
• Of course, since we are using samples, this estimate is generally wrong:
  - Your particular sample mean comes from one sample.
  - Any other sample would yield a different sample mean.
  - This results from random variation in the samples.
Point estimates can be made for both continuous and discrete data.
Confidence Interval Width
Risk levels:

x̄ ± t(α/2, n−1) × s/√n

Risk = 1%, Risk = 5%, Risk = 10%, Risk = 25%

The more confidence you want to have that the true (population) mean will fall within your intervals, the wider the confidence intervals must be.
Confidence Interval Width
Sample size:

x̄ ± t(α/2, n−1) × s/√n

n = 5, n = 10, n = 100, n = 1000

The more samples you have, the better your estimate, and the tighter your confidence intervals.

How about an example?
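The interval formula above can also be sketched directly in Python. The cycle-time data is invented for illustration, and the t critical value is hard-coded (2.064 for 95% confidence with n = 25) because the standard library has no t-distribution; in practice you would look it up or use a statistics package.

```python
# 95% confidence interval for the mean: x-bar +/- t * s / sqrt(n)
from statistics import mean, stdev
from math import sqrt

data = [4.1, 3.9, 4.3, 4.0, 3.8, 4.2, 4.1, 3.7, 4.4, 4.0,
        3.9, 4.1, 4.2, 3.8, 4.0, 4.3, 3.9, 4.1, 4.0, 4.2,
        3.8, 4.1, 4.0, 3.9, 4.2]          # 25 hypothetical cycle times

n = len(data)
t_crit = 2.064                            # t(0.025, df = 24), hard-coded
half_width = t_crit * stdev(data) / sqrt(n)
ci = (mean(data) - half_width, mean(data) + half_width)
```

Rerunning this with a larger n (and the matching t value) shrinks `half_width`, which is exactly the sample-size effect shown on the slide.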
CI Example
Open the data sheet "analyze sample data", select the SigmaXL tab and the Statistical Tools drop-down menu, then select Descriptive Statistics.
The data has column headings; select Use Entire Data Table. Select all columns and move them into the Numeric Data Variables (Y) field.
CI Example Output Note that the confidence interval for the weight of football players does not overlap with the interval for the weight of the baseball players. What does this mean?
What is Hypothesis Testing?
• Hypothesis testing uses statistics to determine if sample data groups are statistically different; that is, do the samples come from the same or different populations or processes?
• This helps us understand which variables in the process are causing the variation.

Sometimes the difference in samples is clear… sometimes the difference in samples is less clear…

• Hypothesis tests are used in the Analyze phase to help make decisions about which Xs are the most critical – which are the valid root causes.
• Hypothesis testing is also used in the Improve phase to help determine if improvements (statistical differences) have really occurred.
A Note on Differences
Practical versus Statistical: • Getting this wrong is one of the most common errors in hypothesis testing:
‐ A Practical Difference results in a practical, economic or financial value to the organization
‐ A Statistical Difference is a difference or change that likely did not occur from chance (with some degree of confidence)
• Hypothesis testing deals specifically with statistical differences.
Is it possible to see a statistical difference without seeing a practical one, and vice versa?
Questions Lead Tools Follow!
The first and most important step in hypothesis testing is to form the critical question that you want to understand.

The Problem | The Question | The Parameter
Management is concerned about the differences in average call time between locations | Is there a difference in average call time between locations? | Means (averages) of two or more groups
The fishbone diagram identified on-time delivery rate as a potential vital X | Is there a difference in on-time delivery rate by time of day? | Proportion (percentage) in one group and proportion in another
Customers complain that the plastic cases for CDs are often too tight or too loose for their CDs compared to a competitor's product | Is there a difference in the spread (variation) in one group and the spread in another group? | Variance or standard deviation between one group and another
Management has asked HR to determine if there is a gap in income by gender (income is known to be skewed) | Is there a difference in median income between genders? | The median of two or more groups

What is similar in each of these questions?
Basic Terminology
• Ho: Null Hypothesis - Pronounced "H-naught", "H-not," or "H-oh"
- No difference exists between data sets
- No relationship or correlation exists between data
- The groups are equal or the same
• Ha: Alternative Hypothesis
- There is a difference between data sets
- There is a relationship or correlation between data
• p-value: Probability Value
- Likelihood that differences or relationships are due to chance
- The probability of being wrong if we accept the alternative hypothesis
- "When the p is low, the null must go!"
• Logic of Hypothesis Tests:
- Presume there is no difference; assume Ho (innocent until proven guilty; not different until proven otherwise)
- Test to see if the data provides evidence that there is a difference
Basic Symbols
Notation for the population parameter being tested:

Parameter | Notation
Mean | µ (mu)
Variance | σ² (sigma squared)
Median | η (eta)
Proportion | p

A mathematical symbol for the claim to be tested:

Hypothesis Claim | Symbol Used
No difference | =
Difference with 1 or 2 groups | ≠, <, >
Difference with more than 2 groups | At least 1 group is different
Examples of Hypothesis Tests
• Ho: there is no difference in average call time between locations
• Ha: there is a difference in average call time between locations
• Ho: there is no difference in on-time delivery rate by time of day
• Ha: there is a difference in on-time delivery rate by time of day
• Ho: there is no difference in the standard deviation of the size of CD cases between Company A and Company B
• Ha: there is a difference in the standard deviation of the size of CD cases between Company A and Company B
• Ho: there is no difference in median income between genders
• Ha: there is a difference in median income between genders
• Ho: _______________________________________
• Ha: _______________________________________
Hypothesis Testing Exercise
Convert the following statements into statistical hypotheses (Ho vs. Ha): What questions would you like to answer?
• The best order processing team has a current process cycle time of 30 seconds. Your team has developed a way to reduce the time to 25 seconds.
• There are 5 potential suppliers of an automated refill product. On‐time delivery rate is one of the key quality characteristics.
What are the Risks?
α (Type I) Risk - "producer's risk", a False Positive:
• Conclude samples are different when they are not
• Make process changes and/or investments that do not need to be made
β (Type II) Risk - "consumer's risk", a False Negative:
• Conclude samples are the same when they are different
• Maintain the current level of process performance when it should be changed
What are the Risks?
• Another term for α is “producer’s risk.” It’s the risk of making unnecessary changes and incurring unnecessary cost, time and effort because we find causes that don’t exist. Because we may make unnecessary changes to our processes, α risk is the risk of being too active and chasing unimportant x’s.
• Another term for β risk is “consumer’s risk” because the customer receives poor service or a defective product. We didn’t fix something we should have because our data said everything was okay. This is the risk of missing an important X. β is called the “lazy” risk; that is, we don’t find something we should have, and pass that risk to the consumer.
Note: the Green Belt should focus on the α risk. Black Belts are taught additional tools to understand the impact of β risk.
α Risks
• α risk is expressed relative to a distribution, such as the t, z, F or chi-squared distributions
• α risk is either placed entirely into one tail or the other, or spread evenly over both tails of the distribution. This changes our hypothesis statement slightly:
- Ho: µ ≤ x̄, Ha: µ > x̄ - α = 5%, with the entire 5% region of doubt in the upper tail
- Ho: µ ≥ x̄, Ha: µ < x̄ - α = 5%, with the entire 5% region of doubt in the lower tail
- Ho: µ = x̄, Ha: µ ≠ x̄ - α = 5%, split into a 2.5% region of doubt in each tail
The Role of the p-value in Understanding Risk
P-Value
• The p-value is the probability of seeing a difference at least this large if the null hypothesis is true - it is not, strictly speaking, the probability that the null hypothesis is true.
• In practice, the p-value is the probability of making a Type I (α) error if we reject the null - that is, of saying the two samples are different when they really are not.
• If you want to be 95% confident, your α is 5%.
• If your p-value is less than your α, reject the null.
If the p is low, the null must go! If the p is high, the null must apply!
β Risks
You can see that α risk is relatively easy to evaluate with the p-value. But how do you control β risk? Four things drive it:
• The size of the difference, δ - As δ increases, β risk decreases, all else held constant (AEHC)
• The average (pooled) standard deviation, σ (estimated by s), of the two populations - As σ decreases, β risk decreases, AEHC
• The sample size, n - As n increases, β risk decreases, AEHC
• The α risk - As α risk increases, β risk decreases, AEHC
Our main weapon against β risk (the one we have the most control over) is the sample size, n.
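The "n is our main weapon" point can be made concrete with a rough power calculation. This is a sketch using the common normal approximation (not a SigmaXL feature); δ, σ and the data values are invented for illustration:

```python
from scipy.stats import norm

def approx_power(delta, sigma, n, alpha=0.05):
    """Normal-approximation power of a two-sample test of means for a true
    difference delta. beta risk = 1 - power."""
    se = sigma * (2.0 / n) ** 0.5          # standard error of the difference
    z_crit = norm.ppf(1 - alpha / 2)       # two-sided critical value
    return norm.sf(z_crit - delta / se)    # P(we detect the difference)

# All else held constant, increasing n shrinks beta (raises power).
for n in (5, 10, 100):
    print(n, round(approx_power(delta=2.0, sigma=4.0, n=n), 3))
```

Black Belt training covers exact power and sample-size calculations; the approximation above is only meant to show the direction of each effect.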
The Hypothesis Testing Process
1. Practical Question
2. Define Ho & Ha
3. Choose the Risk Level (.05)
4. Choose Sample Size
5. Select & Run Test
6. Interpret Results
7. State Practical Answer
• This process is applicable to all hypothesis tests
• We have talked about Steps 1, 2, and 3 in this module
• We will learn more about Sample Sizes and β (Step 4) in Black Belt training
• The following slides outline the roadmap for Step 5
• Steps 6 and 7 will be covered by example as we look more closely at particular hypothesis tests
The Hypothesis Testing Road Maps
Is the data continuous or discrete?
• Continuous Data: Is the data normal or not normal? Are the shapes of the data the same or different? How many samples are you comparing? → Continuous Hypothesis Tests
• Discrete Data: How many samples are you comparing? → Discrete Hypothesis Tests
These are the critical questions to answer when using the Road Maps.
Continuous Normal Road Map
Normal Data:
• One sample → 1-Sample t
• Two or more samples → run a Test of Equal Variance first, then:
- Equal variances: 2 samples → 2-Sample t; >2 samples → One-Way ANOVA
- Unequal variances: 2 samples → 2-Sample t; >2 samples → Welch ANOVA
The 2-Sample t and the One-Way ANOVA will be covered here. Note that there are several additional concepts not covered here that are covered in Black Belt.
Continuous Non-Normal Road Map
Non-Normal Data:
• One sample → Median Test
• Two or more samples → run a Test of Equal Variance first, then:
- Equal variances → Kruskal-Wallis Test
- Unequal variances → Mood's Median Test
The Mood's Median test will be covered here. Note that there are several additional concepts not covered here that are covered in Black Belt.
Attribute Data Road Map
Attribute Data:
• One factor: one sample → 1-Sample Proportion; 2 samples → 2-Sample Proportion; >2 samples → Chi-Square Test
• Two factors → Chi-Square Test
The 2-Sample Proportion and Chi-Square Test will be covered here.
Hypothesis Test Examples - Continuous Normal
There are two ways to determine whether your data are Normally distributed in SigmaXL:
1. When you have only one column, use the Normal Probability Plot. This test will only provide graphical output.
2. When you have two columns of data, use the 2 Sample Comparison Test to obtain an Anderson-Darling test. The benefit of running the 2 Sample Comparison is that it also gives you the results of other 2-sample hypothesis tests at the same time.
The null and alternative hypotheses for a normality test are:
Ho: Data is Normal
Ha: Data is NOT Normal
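SigmaXL reports an Anderson-Darling p-value for this. As an offline sketch of the same idea, here is a normality check in Python using the Shapiro-Wilk test (a comparable test that also returns a p-value; the data is a rounded, illustrative subset of the weights shown later, not the actual workbook):

```python
from scipy import stats

# Illustrative football-player weights (rounded, hypothetical subset).
football = [216.7, 215.3, 220.2, 210.5, 214.2, 216.9, 232.9,
            207.0, 205.3, 208.2, 223.5, 211.7, 213.3, 219.8]

# Ho: data is Normal; Ha: data is NOT Normal.
stat, p = stats.shapiro(football)
print(round(p, 3), "fail to reject Ho (consistent with Normal)" if p > 0.05 else "reject Ho")
```

As the slides note, for a normality test we want a HIGH p-value (greater than 0.05) before proceeding down the "Normal" branch of the road map.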
Normal Probability Plot Data is in a single column
1. In the analyze sample data workbook open tab ‘Player Data’ then within SigmaXL select Graphical Tools>Normal Probability Plots 2. Select the Data to be tested
3. Select Numeric Data Variable (Y) and then select OK
Normal Probability Plot
You want to see the majority of your data grouped around the center line and located between the two red lines. This data appears to be from a Normal distribution.
Anderson-Darling Test Using 2 Sample Comparison Test
Data is in two columns:

Weight of Football Players | Weight of Baseball Players
216.6750722 | 184.9742346
215.2803161 | 198.7588157
220.1691898 | 188.9250426
210.4564796 | 186.5218944
214.164282 | 191.342654
216.9156617 | 196.6212829
232.8578239 | 188.4481802
207.0123875 | 200.022006
205.2701762 | 191.962958
208.2484559 | 195.2341331
223.4977632 | 198.3391137
211.732619 | 187.7786797
213.329143 | 197.6012536
219.7881031 | 192.8705148
1. In the analyze sample data workbook open tab ‘Player Data’ then within SigmaXL select Statistical Tools>2 Sample Comparison Test 2. Select the Data to be tested
3. Select Unstacked column format, select Numeric Data Variable (Y) for the first two data columns and then select OK
Anderson-Darling Test Using 2 Sample Comparison Test
Ho: Data is Normal; Ha: Data is NOT Normal. We're looking for HIGH p-values here - p-values greater than 0.05 - so these two data sets are consistent with Normal distributions.
Using the SigmaXL 2 sample Comparison test also provides results for a Test for Equal Variances, 2 Sample t‐Test for means and the 2 Sample Mann‐Whitney test for medians.
Hypothesis Test Examples - Continuous Normal
Now that we have determined that the data is Normal, the next question to answer is: how many samples do we have?
To compare means, we use t-tests. Specifically, there are three types of t-test:
1. 1-sample t-test to compare a sample average (mean) versus a target value
2. 2-sample t-test to compare the averages (means) from two different samples
3. Paired t-test to compare two dependent samples
A separate test called the One-Way ANOVA is used to compare more than two means.
For Green Belts, we will focus only on the 2-sample t-test and the One-Way ANOVA.
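The slides run the 2-sample t-test through SigmaXL's 2 Sample Comparison Test; as an illustration of the same test outside Excel, here is a minimal Python sketch with invented processing times (not the Claims Team data):

```python
from scipy import stats

# Hypothetical claim-processing hours under two team leaders.
old_leader = [85.1, 86.0, 85.7, 84.9, 86.3, 85.5, 85.9, 86.1]
new_leader = [88.2, 87.9, 88.6, 87.5, 88.1, 88.8, 87.7, 88.3]

# Ho: the two means are equal; Ha: they differ.
# equal_var=True is the pooled test used when the variance test passes.
t_stat, p = stats.ttest_ind(old_leader, new_leader, equal_var=True)
print(round(p, 4), "reject Ho" if p < 0.05 else "fail to reject Ho")
```

With clearly separated groups like these, the p-value falls below 0.05 and we reject the null, exactly the logic the following slides walk through on the real data.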
Hypothesis Test Examples - Continuous Normal
Once we know how many samples there are, the next question is whether or not the variances are equal.
The purpose of the test of equal variance at this point is to ensure that we select the correct test of the mean. This test will also provide us with useful information about our data. Visually, what would the two data sets look like if the means were equal but the variances were not?
(Figure: two distributions, A and B, with equal means but visibly different variances.)
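SigmaXL chooses between the F-test and Levene's test for this step; as a standalone sketch of an equal-variance test in Python, here is Levene's test on invented data (the groups and values are assumptions for illustration):

```python
from scipy import stats

# Hypothetical processing hours for two teams with similar spreads.
team_a = [85.1, 86.0, 85.7, 84.9, 86.3, 85.5]
team_b = [88.2, 87.9, 88.6, 87.5, 88.1, 88.8]

# Levene's test - Ho: the variances are equal (robust to non-Normality).
stat, p = stats.levene(team_a, team_b)
print(p > 0.05)  # True here: no evidence of unequal variances
```

A high p-value sends us down the "equal variance" branch of the road map (pooled 2-sample t or One-Way ANOVA); a low one sends us to the unequal-variance branch.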
2 Sample t-test
The SigmaXL software embeds the Test of Equal Variance with the 2-sample t-test in the 2 Sample Comparison Test of the means, so both are run simultaneously, which is extremely helpful. The 2-sample t-test in SigmaXL also calculates the mean difference for us. Either route is acceptable, since the 2-sample t is our choice for testing the means of two samples.
2 Sample t-test
For a 2 sample t‐test, the question that we are asking is whether the two samples are so similar to each other that they probably came from the same population. The alternative is whether they are more likely to have come from different populations. Example: Is the order approval cycle time longer for first time buyers than for return buyers? Order Approval Cycle Time versus Buyer Type
Notice the overlap in the two samples. Are these groups really different? We can answer that question with a test!
2 Sample t-test Example: Management believes that the processing time for two claims processing teams is different because of a recent leadership change. If the average processing time for the new team leader is more than two hours slower, corrective action will need to be taken!
1. In the analyze sample data workbook open tab ‘Claims Team Data’ and then select within SigmaXL: Statistical Tools>2 Sample Comparison Tests 2. Enter the 2 Samples You Want to Compare
What do the Results Tell Us? The 2 Sample Comparison Test actually works through the statistics and highlights the correct tests based on the information it finds. The p-values for the AD normality tests are both greater than 0.05, so we conclude that we have Normal data. Therefore we should use an F-test to compare the variances (otherwise, we would use Levene's test). The p-value (0.5880) for equal variances is greater than our alpha level of 0.05, so we fail to reject the null hypothesis, meaning the variances are equal. Finally we look at the test for the means. The p-value for this test (0.0000) is below 0.05. What is your conclusion?
What do the results tell us?
The p-value for comparing the means is lower than 0.05, so we can say that there is a statistical difference between the new team leader and the old team leader. But will we take corrective action? Before we did the study, what did we decide would be our criteria for making that decision? Is the new team leader more than 2 hours slower than the old team leader? From the menu, select the 2-sample t-test to drill deeper. The new team leader averaged 2.7 hours slower than the old team leader. The 95% confidence interval suggests that the new team leader could be as little as 1.7 hours slower or as much as 3.7 hours slower than the old team leader.
One Way ANOVA
Suppose we want to test for differences between three or more team leaders. Since there are more than two groups, we can't use a 2-sample t-test; a One-Way ANOVA can tell us this. What would the null hypothesis be? Ho: There is no difference in the mean delivery time between team leaders A, B and C. What would the alternative hypothesis be? Ha: There is a difference in the mean delivery time by team leader. "Analysis of variance (ANOVA) is a collection of statistical models used to analyze the differences among group means and their associated procedures (such as "variation" among and between groups), developed by statistician and evolutionary biologist Ronald Fisher. In its simplest form, ANOVA provides a statistical test of whether or not the means of several groups are equal, and therefore generalizes the t-test to more than two groups, since doing multiple two-sample t-tests would result in an increased chance of committing a statistical Type I error." - Wikipedia
How Does a One-Way ANOVA Work?
(Figure: three groups A, B and C, showing within-group variation and between-group variation.)
ANOVA tests a ratio (the F statistic):

F = (average variation between groups) / (average variation within groups)

Fortunately, SigmaXL does all of the heaviest lifting for us here!
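To show the same F-ratio logic outside SigmaXL, here is a minimal One-Way ANOVA sketch in Python on invented wash times (not the Car Wash Data worksheet):

```python
from scipy import stats

# Hypothetical wash times (minutes) at four carwashes.
wash1 = [10.2, 10.5, 9.9, 10.1, 10.4]
wash2 = [12.1, 12.4, 11.9, 12.2, 12.6]
wash3 = [12.0, 12.3, 12.5, 11.8, 12.2]
wash4 = [10.0, 10.3, 10.1, 9.8, 10.5]

# Ho: all four means are equal; Ha: at least one mean differs.
f_stat, p = stats.f_oneway(wash1, wash2, wash3, wash4)
print(p < 0.05)  # True: between-group variation dwarfs within-group variation
```

A large F means the between-group variation dominates the within-group variation, so the p-value is small and we reject the null, as in the carwash example that follows.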
One-Way ANOVA Examples
Which of four order processing centers has the fastest turnaround times?
Which postal service provides the fastest shipping times: USPS, UPS or FedEx?
Which of 3 Customer Contact Centers has the best handle times?
Which of three types of scanners produces an image faster?
Which of five stores has the best overall cashier wait times?
One-Way ANOVA Example
Practical Problem:
• You own 4 carwashes. You want to know if there is a difference in cycle time (time to wash a car) between the 4 facilities.
• Assume variances equal, and run the ANOVA. What are the null and alternative hypotheses?
One-Way ANOVA Example 1. In the analyze sample data workbook open tab 'Car Wash Data'. Within SigmaXL select Statistical Tools>One-Way ANOVA & Means Matrix.
2. Select stacked column format and follow example above. Be sure to display the ANOVA table details.
What Do The Results Tell Us?
The p-value (0.0000) is smaller than our α level, so we reject the null. But what does that mean? It means that at least one of our averages is statistically different from the others. But which one?
SigmaXL also provides a pairwise comparison of t-tests (p-values) to compare each level. Since the variances are equal (we typically test this before running an ANOVA), it uses the "variances equal" assumptions for the test. In this example, car washes 2 and 3 are different from 1 and 4. However, car washes 2 and 3 are not different from one another, nor are 1 and 4 different.
Continuous Non-Normal Road Map
Non-Normal Data:
• One sample → Median Test
• Two or more samples → run a Test of Equal Variance first, then:
- Equal variances → Kruskal-Wallis Test
- Unequal variances → Mood's Median Test
The Mood's Median test will be covered here. Note that there are several additional concepts not covered here that are covered in Black Belt.
Parametric vs. Non-Parametric
• When the underlying distribution is not known, non-parametric procedures are used. These procedures do not require or reference particular parameters the way parametric procedures do (which always reference µ and/or σ).
• These types of tests are the most appropriate and useful with small sample sizes.
1. In the analyze sample data workbook open tab 'Claims Team Data' and then select within SigmaXL: Statistical Tools>Nonparametric Tests>Mood's Median Test
2. Enter the 2 Samples You Want to Compare
What do the results tell us?
The p‐value (0.0001) is less than our α level, so we reject the null. We conclude that the medians are not equal. The new claims processing team leader is about 2.15 hours slower than the old team leader (87.85 – 85.7). Should we pursue changing team leaders?
Mood’s Median Test
Let’s take a look at an example of the Mood’s Median Test for more than two samples. Practical Problem: Look at cycle time data (in minutes per unit) for order processing at four different fulfillment sites. We wanted to compare the differences in cycle times for the four sites. 1. In the analyze sample data workbook open tab ‘Order Data’ and then select within SigmaXL: Statistical Tools>Nonparametric Tests>Mood’s Median Test 2. Use Cycle time for the Y and order processing Site for the X. Notice the data is stacked this time.
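scipy also implements Mood's median test directly, so the multi-site comparison can be sketched outside SigmaXL. The cycle times below are invented for illustration, not the Order Data worksheet:

```python
from scipy import stats

# Hypothetical cycle times (minutes per unit) at four fulfillment sites.
site1 = [30, 32, 31, 29, 33, 30, 31]
site2 = [31, 30, 32, 30, 29, 31, 32]
site3 = [29, 31, 30, 32, 30, 31, 29]
site4 = [32, 30, 31, 29, 30, 32, 31]

# Mood's median test - Ho: all groups share a common median.
stat, p, grand_median, table = stats.median_test(site1, site2, site3, site4)
print(p > 0.05)  # True for these similar sites: fail to reject Ho
```

The test counts how many values in each group fall above and below the grand median and runs a chi-square test on those counts, which is why its output includes a contingency table.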
What do the results tell us?
The p‐value (0.3618) is greater than our α level, so we fail to reject the null. We can assume equal medians for each site. Look at the bottom of the output. Does that look familiar? You might recall a similar plot from the One Way ANOVA. Just like the One Way ANOVA, you can perform a straight line test. You can see that you can easily pass a straight line through all four CI’s.
Attribute Data Road Map
Attribute Data:
• One factor: one sample → 1-Sample Proportion; 2 samples → 2-Sample Proportion; >2 samples → Chi-Square Test
• Two factors → Chi-Square Test
The 2-Sample Proportion and Chi-Square Test will be covered here.
Proportion Tests
To compare proportions (a fancy word for percentages), we use proportion tests. Specifically, there are three types of test:
• 1-Proportion to compare a sample percent versus a target
• 2-Proportion to compare the percentages from two different samples
• Chi-Square to compare more than 2 samples at once
For Green Belts, we focus only on the 2-Proportion and Chi-Square Tests - they are the most commonly used.
2 Proportion Test
Practical Problem: A supplier works with 2 different clients that have different payables systems. Slacker, Inc. seems to pay their invoices late more often than Company True. Given the good business each client brings, the client rep wants to be sure before he has to address the issue. Out of the last 100 invoices each paid, the following data was obtained:

Company | On-Time Payments | Late Payments
Company True | 97 | 3
Slacker, Inc. | 92 | 8

In the analyze sample data workbook open tab for 'Supplier Data' and then select within SigmaXL: Statistical Tools>Basic Statistical Templates>2 Proportions Test and Confidence Interval
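The SigmaXL template does this calculation for you; the underlying pooled two-proportion z-test is simple enough to sketch by hand in Python, using the invoice counts from the table above:

```python
import math
from scipy.stats import norm

def two_proportion_z(x1, n1, x2, n2):
    """Pooled two-proportion z-test; returns (z, two-sided p-value)."""
    p1, p2 = x1 / n1, x2 / n2
    p_pool = (x1 + x2) / (n1 + n2)                     # pooled proportion under Ho
    se = math.sqrt(p_pool * (1 - p_pool) * (1 / n1 + 1 / n2))
    z = (p1 - p2) / se
    return z, 2 * norm.sf(abs(z))

# Late payments: Company True 3/100 vs Slacker, Inc. 8/100.
z, p = two_proportion_z(3, 100, 8, 100)
print(round(p, 3))  # about 0.121: fail to reject Ho at alpha = 0.05
```

This reproduces the p-value of roughly 0.121 shown on the next slide, so even though 8% looks worse than 3%, the samples are too small to call the difference statistically significant.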
2 Proportion Test Enter summarized Sample Data, Null Hypothesis and Confidence Level in cells with yellow highlight. Do not modify any other part of this worksheet.
The p‐value (0.121) is greater than our α level, so we fail to reject the null. We must assume they are equally late payers.
Chi-Square Test
Practical Problem: 5 different call centers are measured on the number of calls they answer in 30 seconds or less. The following data was sampled from each call center, and the management team hopes to identify the strongest centers and why their performance is so good. Which call center is the best? The worst? How do you know?
In the analyze sample data workbook open tab ‘On Time Calls’ and then select within SigmaXL: Statistical Tools>Chi‐ Square Test – Two‐Way Table Data
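The same two-way-table test can be sketched in Python with `chi2_contingency`. The counts below are made up (the slide's actual call-center table is not reproduced here), with one center deliberately better to mirror the Hoboken result:

```python
from scipy.stats import chi2_contingency

# Hypothetical counts: rows = call centers, cols = [answered <=30s, missed].
observed = [
    [180, 20],   # a stand-out center (illustrative)
    [150, 50],
    [145, 55],
    [155, 45],
    [148, 52],
]

# Ho: answer rate is independent of call center (all centers perform alike).
chi2, p, dof, expected = chi2_contingency(observed)
print(p < 0.05)  # True: at least one center differs from expectation
```

Comparing `observed` with `expected` cell by cell shows which center drives the large chi-square statistic, which is how the next slide identifies Hoboken.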
Chi-Square Test
Which one(s) are better or worse? The Chi-Square Test uses the chi-square test statistic to compare the actual observed counts with the expected counts. The larger the test statistic, the larger the difference between the counts we observed and the expected counts we hypothesized. The p-value (0.0000) is far less than our α level, so we reject the null: at least one call center is statistically better or worse than the others.
Hoboken had fewer bad calls than expected and more good calls than expected compared to the other locations.
The Role of Correlation and Regression
So far we have looked at how to analyze our project data when you have a Continuous or Discrete Y and a Discrete X. Let's add another branch onto the hypothesis testing Road Map. What happens when you have a continuous input (x)? The bigger picture of our Roadmap actually starts with the following matrix:

Potential Vital Few Input (x) (from the Measure phase) | Project Y: Discrete Data | Project Y: Continuous Data
Discrete x | Discrete Hypothesis Tests | Continuous Hypothesis Tests
Continuous x | Logistic Regression | Correlation and Regression

Correlation and Regression help us when we have two groups of continuous data.
Correlation Defined
Correlation measures the direction and strength of the (linear) relationship between two (continuous) variables.
Direction:
• Correlation has two relationship directions, positive and negative.
• Correlation is positive when two variables increase or decrease together (example: height and age).
• Correlation is negative when one variable increases as the other decreases (example: speed and reaction time).
• Both positive and negative relationships can be meaningful. Negative relationships are not "bad".
Strength:
• Correlation between variables is often described as being strong or weak and is measured on a scale between -1 and 1.
• When the correlation is strong, the data points (on a scatter plot) are closer together. The closer the correlation is to -1 or 1, the stronger the relationship.
• When the correlation is weak, the data points (on a scatter plot) are farther apart. The closer the correlation is to 0, the weaker the relationship.
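A correlation coefficient and its p-value can be computed directly in Python; this sketch uses invented monthly ice-cream-sales and shark-attack numbers in the spirit of the example the following slides build (not the workbook's Correlation tab):

```python
from scipy.stats import pearsonr

# Hypothetical monthly data: ice cream sales (x) and shark attacks (y).
ice_cream = [1200, 1500, 1800, 2500, 3200, 3900, 4100, 3600, 2800, 2000, 1400, 1100]
attacks   = [   2,    3,    4,    8,   11,   13,   14,   12,    9,    6,    3,    2]

r, p = pearsonr(ice_cream, attacks)
print(r > 0.8, p < 0.05)  # strong positive, statistically significant
```

Both rise and fall together across the year, so r is strongly positive - which, as the deck is about to stress, still says nothing about causation.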
Correlation Coefficient
(Figure: example scatter plots for r = 1.0, 0.8, 0.4, 0.0, -0.4, -0.8 and -1.0.)
When analyzing a scatter plot, there are four areas you should examine:
• Form: Does the plot show a linear trend? Or does it take some other form?
• Direction: Are the points sloping upward (positive) or downward (negative)?
• Strength of Association: How tightly does the data fit an (imaginary) trend line?
• Outliers: Are there any points that seem way out of place?
Correlation Coefficient
(Figure: scatter plots for r from 1.0 down to -1.0. Near r = 1: strong, positive linear relationship - upward slope, points close together. Near r = 0: no linear relationship - random scatter. Near r = -1: strong, negative linear relationship - downward slope, points close together.)
As the relationship between variables gets weaker, the points in the scatter plot get farther apart, eventually becoming randomly distributed when there is no relationship (r = 0).
A Word about “Strength”
• A frequently asked question about correlation is “what is a good strong correlation?”
• While we want to deal in the “objective” as much as possible, strength is often subjective and dependent upon the application and any risk or cost associated with making a bad decision.
• For example, if you are developing a new airplane what correlation might you want when predicting engine life?
• However, in behavioral psychology, finding a correlation of 0.5 might be considered high!
“0” Correlation
• Stop! A word of caution! A correlation close to 0 does not mean that the two variables aren't related - just that they are not linearly related.
• (Figure: several scatter plots, each with correlation r = 0.0, that nevertheless show clear curved or patterned relationships.)
• Analyzing this type of data involves techniques discussed in advanced training.
Start with Drawing a Scatter Plot
Remember our PGA approach; Practical, Graphical and then Analytical! Start with your y and x data arranged in columns. In this example, the y is shark attacks, and the x is Ice Cream Sales. 1. In the analyze sample data workbook open tab ‘Correlation’ and then select within SigmaXL: Graphical Tools > Scatter Plots 2. Select the data to be plotted.
How to Draw a Scatter Plot…
3. Highlight a column and click the button to place it in the appropriate box. You can see that we designated "Shark Attacks" as the y and "Ice Cream Sales" as the x. Note: uncheck "Trendline" - we will use it later.
Let's evaluate our scatter plot. What does it tell you about the relationship between Ice Cream Sales and Shark Attacks? What do you think the correlation coefficient might be?
How to Calculate Correlation…
Based on the scatter plot, there appears to be a linear relationship that is both positive and moderately strong. There do not appear to be any obvious outliers.
1. Return to your original data sheet. In SigmaXL choose Statistical Tools > Correlation Matrix
2. Select each column you want to test for correlation (minimum of 2) and add them to the correlation matrix.
How to Calculate Correlation…
This is your correlation coefficient: 0.8129, or 81.29%. The correlation coefficient confirms our hunch: 0.8129 is positive, and suggests a moderately strong relationship between Shark Attacks and Ice Cream Sales. What does the p-value tell us?
The p-value tells us that this relationship is statistically significant. So, we can say that Ice Cream Sales cause Shark Attacks, right?
What is Correlation?
WRONG! While Correlation measures the direction and strength of the (linear) relationship between two (continuous) variables, it does not imply Causation
Other Possible Relationships
There could be many other relationships in play here, for example:
• There is a "bi-causal" relationship where x and y cause each other. An example would be coffee causes nervousness and nervous people drink more coffee.
• A relationship based on the effects of an outside variable on both x and y. Higher temperatures cause more people to buy ice cream and more people to get into shark-infested water!
• Both variables could be part of a complex system. Economic variables often fall into this category. A simple example might be a student's grades in high school being correlated to grades in college. The cause of both could be one (or a combination) of IQ, parent involvement, hours studied, student motivation, etc.
• The relationship may be purely coincidental; that is, there is no discernible common cause. An example might be a relationship between the number of crimes committed and the number of people who exercise regularly.
• Two variables may be correlated simply because each is trending with time. An example might be the relationship between housing sales and emergency room visits both increasing month over month.
• Of course, there could be a direct cause and effect relationship; that is, x causes y.
Simple Linear Regression
Correlation helps objectively determine the direction and strength of a relationship (if any) between two variables. However, it does nothing to describe the SIZE of that relationship. That's where simple linear regression comes in: it gives us the ability to create models that capture the randomness inherent in the relationship between x and y, and mathematically estimate and/or predict y for any given x.
Simply put, linear regression creates a simple formula to help predict a continuous output given a continuous input.
In this example the linear regression formula is: Sales = 10.07(Contact Time) + 4710.5
Ordinary Least Squares (OLS)
Simple Linear Regression uses the Ordinary Least Squares, or OLS, method to find the best fit line. Least squares regression squares each error - the vertical distance between each point and the candidate line (squaring makes them all positive) - and sums the result. The line that minimizes this sum is the "best fit line." Simply put, OLS chooses the line that minimizes the Sum of Squared Errors.
(Figure: y versus x with a fitted line - looks like a scatter plot, doesn't it?)
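The OLS slope and intercept have a closed form, so the "heaviest lifting" SigmaXL does can be sketched in a few lines of plain Python. The tiny data set below is invented for illustration:

```python
def ols_fit(xs, ys):
    """Slope and intercept of the line minimizing the sum of squared errors."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    # slope = Sxy / Sxx; the best-fit line always passes through (mx, my).
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    slope = sxy / sxx
    return slope, my - slope * mx

# Tiny illustrative data set: y is roughly 2x + 1 plus noise.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.1, 4.9, 7.2, 9.0, 10.8]
slope, intercept = ols_fit(xs, ys)
print(round(slope, 2), round(intercept, 2))  # slope ~ 1.95, intercept ~ 1.15
```

Predicting y for a new x is then just `slope * x + intercept`, which is exactly what the Shark Attacks regression on the following slides does.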
Back to our Scatter Plot…
1. In the analyze sample data workbook open tab ‘Correlation’ and then select within SigmaXL: Graphical Tools > Scatter Plots 2. Select the data to be plotted.
Adding the Trendline…
3. Highlight a column and click the button to place it in the appropriate box. You can see that we designated "Shark Attacks" as the y and "Ice Cream Sales" as the x. This time leave "Trendline" checked! Let's evaluate our regression. You have added a fitted line to the scatter plot this time - this is the OLS best fit line. What does it tell you about the relationship between Ice Cream Sales and Shark Attacks? What do you think R² means?
Simple Linear Regression
So, basically, in most months, you are going to see about 2 (1.9793) shark attacks. And, in general, you will see 0.0029 shark attacks per ice cream sold (or 2.9 attacks per 1000 ice creams).
Simple Linear Regression
Now about that R²… where have we seen r before? That's right - correlation! In simple linear regression, R² is literally r². Recall that our original r = 0.8129: r × r = 0.8129 × 0.8129 = 0.6608, so R² is 0.6608. Because it is squared, R² will always be between 0 and 1.
R² is the percentage of variation explained by the regression equation. So, in our case, the regression equation explains 66.08% of the variation in our data. Is this a "good" model based on the R²? Why or why not?
Simple Linear Regression
We have a linear equation, we know the amount of variation explained by that equation, we know how to estimate and/or predict…What’s missing?
The p-value! Let’s build the formal regression model using SigmaXL
The Formal Regression Model
In SigmaXL choose Statistical Tools>Regression>Multiple Regression I know, but trust me, it works for Simple Regression as well…
Identifying the Model
Highlight a column and click the button to place it in the appropriate box. You can see that we designated “Shark Attacks” as the y and “Ice Cream Sales” as the x. Leave the remaining defaults.
Evaluating the Model
Here is the regression output. At this stage (Green Belt), we are only worried about the significance of the model. What does our p‐value tell us?