Six Sigma - Green Belt

Author: Godfrey Miles

Copyright© 2012

Jasneet Singh

All rights reserved.

This book is provided on the condition that it shall not, by way of trade or otherwise, be lent, resold, hired out or otherwise circulated without the publisher’s prior consent in any form of binding or cover other than that in which it is published, and without a similar condition, including this condition, being imposed on the subsequent purchaser. Without limiting the rights under the copyright reserved above, no part of this publication may be reproduced, stored in or introduced into a retrieval system, or transmitted in any form or by any means (electronic, mechanical, photocopying, recording or otherwise) without the prior permission of the copyright owner and publisher of the book.

Disclaimer: Due care and diligence have been taken while editing and printing this book. Neither the author, the publisher nor the printer of the book holds any responsibility for any mistake that may have crept in inadvertently. Cubezoid Solutions Private Limited, the publisher, will be free from any liability for damages and loss of any nature arising out of or related to the content. All disputes are subject to the jurisdiction of the competent courts in Delhi.


TABLE OF CONTENTS

1. Six Sigma and Organization
   1.1. Six Sigma and Organizational Goal
   1.2. Lean Principles
   1.3. Design for Six Sigma (DFSS)

2. Define
   2.1. Process Management
   2.2. Project Management
   2.3. Management and Planning Tools
   2.4. Business Results
   2.5. Team Dynamics and Performance

3. Measure
   3.1. Process Analysis and Documentation
   3.2. Statistics and Probability
   3.3. Collecting and Summarizing Data
   3.4. Probability Distributions
   3.5. Measurement System Analysis
   3.6. Control Chart
   3.7. Process Capability and Performance

4. Analyze
   4.1. Exploratory Data Analysis
   4.2. Hypothesis Testing

5. Improve and Control
   5.1. Design of Experiments (DOE)
   5.2. Statistical Process Control (SPC)
   5.3. Implement and Validate
   5.4. Control Plan


1. SIX SIGMA AND ORGANIZATION

Six Sigma is a results-focused quality method. It is also a measurement technique: lower defect rates convert into cost savings and competitive advantage. Sigma (σ) is the mathematical symbol for one standard deviation from the average, or mean. Most control charts set their limits at ±3σ, but Six Sigma extends three more standard deviations. At the Six Sigma level there are only 3.4 defective parts per million (PPM), so a Six Sigma process operates at a 99.9997% quality level.

1.1. Six Sigma and Organizational Goal

Six Sigma is defined as a methodology that aims at a near-perfect production process. It is also defined as a methodology that aims at a rate of 3.4 defects per million opportunities (DPMO). In the design phase of any process, the customers’ needs and expectations are identified and translated into Critical-To-Quality (CTQ) characteristics. These characteristics are built into the product’s design so that it can be manufactured or delivered consistently and economically. Variability arises during manufacture or delivery, so tolerance levels are specified, and the company must measure and control the variation. Process performance is then measured against the specified limits in two ways: process capability, the ability of the process to generate products that are within the specified limits, and process stability, the company’s ability to predict process performance based on past experience. Usually statistical process control (SPC) is used: samples are tested at specified intervals, and the number of defects in the whole output is estimated from them.
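The DPMO figure mentioned above can be illustrated with a short calculation. This is a minimal sketch using the usual definition (defects divided by total opportunities, scaled to one million); the function name and the sample figures are hypothetical and are not taken from this book.

```python
def dpmo(defects, units, opportunities_per_unit):
    """Defects per million opportunities for an inspected sample."""
    total_opportunities = units * opportunities_per_unit
    if total_opportunities <= 0:
        raise ValueError("units and opportunities_per_unit must be positive")
    return defects / total_opportunities * 1_000_000


# Hypothetical sample: 500 units inspected, 4 CTQ checks per unit, 7 defects found.
sample_dpmo = dpmo(defects=7, units=500, opportunities_per_unit=4)
print(f"{sample_dpmo:.0f} DPMO")
```

The result is an estimate for the whole output only to the extent that the sample is representative of it.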

Continuous Improvement

Continuous improvement involves constantly identifying and eliminating the causes that prevent a system or process from functioning at its optimum level. The concept of continuous improvement originated in Japan in the 1970s. It was adopted in many countries, including the U.S.A., in the early 1980s. Continuous improvement, and the customer satisfaction that follows from it, is the principle on which Lean manufacturing is built: when this principle is combined with the just-in-time technique, the result is Lean manufacturing. Continuous improvement helps an organization add value to its products and services by reducing defects, mistakes, etc., and maximize its potential. Because continuous improvement requires constant, ongoing effort, it is essential that top management takes a long-term view and commits itself to implementing it. Continuous improvement enables organizations to identify and rectify problems as and when they occur, and so ensures smooth functioning of the processes. Many modern quality improvement models and tools, such as control charts, sampling methods, process capability measures, value analysis and design of experiments, have been influenced by the concept of continuous improvement.

Six Sigma History

The history of Six Sigma encompasses various events that shaped its formation and spread. Six Sigma has evolved over time, and it is more than just a quality system like TQM or ISO. The main events in its evolution are as follows.

Carl Friedrich Gauss (1777-1855) introduced the concept of the normal curve. In the 1920s, Walter Shewhart showed that three sigma from the mean is the point where a process requires correction.

Following the defeat of Japan in World War II, America sent leading experts, including Dr. W. Edwards Deming, to encourage the nation to rebuild. Leveraging his experience in reducing waste in U.S. war manufacture, he offered his advice to struggling emerging industries. By the mid-1950s, he was a regular visitor to Japan. He taught Japanese businesses to concentrate their attention on processes rather than results, and to concentrate the efforts of everyone in the organization on continually reducing imperfection at every stage of the process. By the 1970s many Japanese organizations had embraced Deming's advice. Most notable is Toyota, which spawned several improvement practices including JIT and TQM.

Western firms showed little interest until the late 1970s and early 1980s. By then the success of Japanese companies had caused other firms to re-examine their own approaches, and Kaizen began to emerge in the U.S. Many measurement standards (Zero Defects, etc.) later came on the scene, but credit for coining the term “Six Sigma” goes to a Motorola engineer named Bill Smith. (“Six Sigma” is also a registered trademark of Motorola.) Bill Smith, along with Mikel Harry of Motorola, wrote and codified a research report on the new quality management system that emphasized the interdependence between a product’s performance in the market and the adjustments required at the manufacturing point.

The models and tools that emerged include:
- Kaizen: any improvement, one-time or continuous, large or small.
- TQM: Total Quality Management, the organization-wide management of quality, consisting of 14 principles.
- PDCA Cycle: W. Edwards Deming’s Plan-Do-Check-Act cycle.
- Lean Manufacturing: focuses on the elimination of waste, or “muda”, and includes tools such as Value Stream Mapping, the Five S’s, Kanban and Poka-Yoke.
- JIT: Just in Time, or catering to the needs of the customer as they occur.
- Six Sigma: designed to improve processes and eliminate defects; includes the DMAIC and DMADV models inspired by PDCA.

Quality Pioneers

Various pioneers helped shape quality principles and laid the foundations for Six Sigma. They include the following.

Walter A. Shewhart: The pioneer of modern quality control, who recognized the need to separate variation into assignable and unassignable causes. He is the founder of the control chart and the originator of the plan-do-check-act cycle. He was the first to successfully integrate statistics, engineering and economics, and he defined quality in terms of objective and subjective quality.

Dr. W. Edwards Deming: He studied under Shewhart at Bell Laboratories. His major contributions include the 14 points on quality management, a core concept in implementing total quality management: a set of management practices to help companies increase their quality and productivity. The 14 points are:
1. Create constancy of purpose for improving products and services.
2. Adopt the new philosophy.
3. Cease dependence on inspection to achieve quality.
4. End the practice of awarding business on price alone; instead, minimize total cost by working with a single supplier.
5. Improve constantly and forever every process for planning, production and service.
6. Institute training on the job.
7. Adopt and institute leadership.
8. Drive out fear.
9. Break down barriers between staff areas.
10. Eliminate slogans, exhortations and targets for the workforce.
11. Eliminate numerical quotas for the workforce and numerical goals for management.
12. Remove barriers that rob people of pride of workmanship, and eliminate the annual rating or merit system.
13. Institute a vigorous program of education and self-improvement for everyone.
14. Put everybody in the company to work accomplishing the transformation.

Joseph Juran: He directed most of his work at executives and the field of quality management, and developed the “Juran Trilogy” for managing quality: quality planning, quality control and quality improvement. He also enlightened the world on the concept of the “vital few, trivial many”, which is the foundation of Pareto charts.

Philip Crosby: He stressed quality management and the four absolutes of quality:
- Quality is defined by conformance to requirements.
- The system for causing quality is prevention, not appraisal.
- The performance standard is zero defects, not “close enough”.
- The measurement of quality is the cost of nonconformance.

Armand Feigenbaum: He developed a systems approach to quality (all organizations must be focused on quality), emphasizing that the costs of quality may be separated into costs for prevention, appraisal and failures (scrap, warranty, etc.).

Kaoru Ishikawa: He developed the concept of true and substitute quality characteristics:
- True characteristics are the customer’s view.
- Substitute characteristics are the producer’s view.
- The degree of match between true and substitute characteristics ultimately determines customer satisfaction.
He also advocated the use of the 7 tools and advanced the use of quality circles, or worker quality teams. He also developed the concept of Japanese Total Quality Control:
- Quality first, not short-term profits.
- The next process is the customer.
- Use facts and data to make presentations.
- Respect for humanity as a management philosophy of full participation.

Genichi Taguchi: He developed the quality loss function (deviation from target is a loss to society) and promoted the use of parameter design (an application of design of experiments), or robust engineering. The goal is to develop products and processes that perform on target with the smallest variation and are insensitive to environmental conditions; the focus is on engineering the design.

Value of Six Sigma

The Six Sigma concept was developed at Motorola in the 1980s. Six Sigma can be viewed as a philosophy, a technique or a goal:
- Philosophy: customer-focused breakthrough improvement in processes.
- Technique: a comprehensive set of statistical tools and methodologies.
- Goal: reduce variation, minimize defects, shorten the cycle time, improve yield, enhance customer satisfaction and boost the bottom line.

Six Sigma is not about quality for the sake of quality; it is about providing better value to customers, investors and employees. Six Sigma is a process of asking questions that lead to tangible and quantifiable answers that ultimately produce profitable results. There are four groups of quality costs:
- External failure cost: warranty claims, service cost.
- Internal failure cost: the cost of labor and material associated with scrapped parts and rework.
- Cost of appraisal and inspection: materials for samples, test equipment, inspection labor cost, quality audits, etc.
- Prevention cost (cost related to improving poor quality): quality planning, process planning, process control and training.

Companies are usually at the 3 Sigma level, which translates to 25-40% of annual revenue being consumed by the cost of quality. If a company can improve its quality by one sigma level, its net income will increase substantially, by approximately 10 percent. Furthermore, when process complexity increases (e.g. the output of one sub-process feeds the input of another sub-process), the rolled throughput yield of the process decreases, the final outgoing quality level declines and the cost of quality increases. Project teams with well-defined projects improve the company's profits.
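The effect of process complexity on rolled throughput yield can be sketched numerically. The example below assumes the usual definition of rolled throughput yield as the product of the first-pass yields of the chained sub-processes; the function name and the 95% yield figure are hypothetical.

```python
from math import prod


def rolled_throughput_yield(step_yields):
    """Probability that a unit passes every sub-process defect-free."""
    for y in step_yields:
        if not 0.0 <= y <= 1.0:
            raise ValueError("each yield must be between 0 and 1")
    return prod(step_yields)


# Hypothetical chains of identical sub-processes, each with a 95% first-pass yield.
for steps in (1, 5, 10, 20):
    rty = rolled_throughput_yield([0.95] * steps)
    print(f"{steps:2d} steps at 95% each -> RTY = {rty:.3f}")
```

Even with every step performing at the same level, the overall yield falls as more steps are chained together, which is why longer processes need higher per-step quality.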

Mathematical Six Sigma

The term ‘Six Sigma’ is drawn from the statistical discipline of ‘process capability studies’. Sigma, represented by the Greek letter ‘σ’, stands for the standard deviation from the ‘mean’. ‘Six Sigma’ represents six standard deviations from the ‘mean’. This implies that if a company produces 1,000,000 parts/units and its processes are at the Six Sigma level, no more than 3.4 defects will result. However, if the processes are at the three sigma level, the company ends up with as many as 66,807 defects for every 1,000,000 parts/units produced. The table below shows the number of defects observed for every 1,000,000 parts produced (also referred to as defects per million opportunities, or DPMO).

Sigma Level     Defects per million opportunities
Two Sigma       308,538 DPMO
Three Sigma     66,807 DPMO
Four Sigma      6,210 DPMO
Five Sigma      233 DPMO
Six Sigma       3.4 DPMO
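The DPMO values for each sigma level can be reproduced from the standard normal distribution. The sketch below assumes the conventional 1.5σ long-term shift of the process mean and counts only the nearer tail; that shift is not stated in this passage, and the function name is hypothetical.

```python
from statistics import NormalDist


def dpmo_for_sigma_level(sigma_level, shift=1.5):
    """One-sided defects per million for a sigma level, assuming a mean shift."""
    # Assumption: the process mean drifts by `shift` standard deviations,
    # so the nearer specification limit is (sigma_level - shift) sigmas away.
    tail_probability = 1.0 - NormalDist().cdf(sigma_level - shift)
    return tail_probability * 1_000_000


for level in (2, 3, 4, 5, 6):
    print(f"{level} sigma: {dpmo_for_sigma_level(level):,.1f} DPMO")
```

Without the shift (shift=0), the tail at six standard deviations would be far smaller than 3.4 per million, so the assumption matters when comparing published tables.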

The process standard deviation (σ) should be so small that 12σ of process spread (six standard deviations on either side of the mean) fits within the customer-specified limits. Then, however the process varies around the target, it still delivers results that meet the customer requirements. A few of the terms used are:
- USL: the upper specification limit for a performance standard. Any deviation beyond this is a defect.
- LSL: the lower specification limit for a performance standard. Any deviation below this is a defect.
- Target: ideally, the middle point between USL and LSL.
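The relationship between the specification limits and the process spread can be checked with a few lines of arithmetic. This is a sketch only: it uses the standard capability indices Cp and Cpk, which are not defined in this passage, and the function name and all numbers are hypothetical.

```python
def spec_summary(usl, lsl, sigma, mean=None):
    """Summarize how a process spread compares with its specification limits."""
    if sigma <= 0 or usl <= lsl:
        raise ValueError("need sigma > 0 and usl > lsl")
    target = (usl + lsl) / 2          # midpoint between USL and LSL
    if mean is None:
        mean = target                 # assume a centered process by default
    cp = (usl - lsl) / (6 * sigma)    # potential capability
    cpk = min(usl - mean, mean - lsl) / (3 * sigma)  # capability allowing for off-center mean
    return {
        "target": target,
        "cp": cp,
        "cpk": cpk,
        "fits_12_sigma": (usl - lsl) >= 12 * sigma,
    }


# Hypothetical process: limits 94 to 106, standard deviation 1.0.
print(spec_summary(usl=106, lsl=94, sigma=1.0))
# Same limits, but the mean has drifted to 101.5.
print(spec_summary(usl=106, lsl=94, sigma=1.0, mean=101.5))
```

A centered process whose limits span exactly 12σ has Cp = 2; when the mean drifts, Cp is unchanged but Cpk drops, which is why both are usually reported.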

The Six Sigma approach is to find the root causes of the problem, symbolically represented by Y = f(X). Here, Y represents the problem that occurs due to cause(s) X.

Y                          x1, x2, x3, ..., xn
Dependent                  Independent
Customer related output    Input-process
Effect                     Cause
Symptom                    Problem
Monitor                    Control

Benefits of Six Sigma
- Continuous defect reduction in products and services
- Enhanced customer satisfaction
- Performance dashboards and metrics
- Process sustenance
- Project-based improvement, with visible milestones
- Sustainable competitive edge
- Helpful in making the right decisions

Business processes

A business process, or simply a process, is a group of tasks that results in a specific service or product for customers. Put another way, it is a collection of related activities that produce something of value to the organization, its stakeholders or its customers. It can be visualized with a flowchart or a process matrix. Business processes are fundamental to every company’s performance and implement the business strategy. Every process has three elements that affect its function: inputs, the process itself and outputs.

Understanding and optimizing the business process is the crux of Six Sigma. Organizations frequently treat the symptoms of a process performance issue without truly understanding the root cause or impact of the issue. Dissecting and truly understanding the root cause of process performance is critical to effective process improvement, and this can be accomplished with Six Sigma. Having a standard model such as DMAIC (Define-Measure-Analyze-Improve-Control) makes process improvement and optimization much easier by providing teams with an easy roadmap. This disciplined, structured, rigorous approach consists of steps, each linked logically to the previous step and to the next. It is not enough for organizations to treat process improvement as a one-time or periodic event; a sustained focus on process management and continuous improvement is the key.

Types of Processes: Processes can be classified as management processes, operational processes and supporting processes.
- Management processes: These processes administer the operation of a system. Examples are planning and corporate governance.
- Operational processes: These processes create the primary value stream for the customers, so they are also called ‘core business processes’. Examples are purchasing of raw materials, manufacturing of goods, rendering of services and marketing.
- Supporting processes: These processes support the core business processes of the organization. Examples are accounting and technical support.
Each of these processes can be divided into many sub-processes that play their intended roles in successfully completing the parent process.

Business System

A business system is a group of business processes that combine to form a single, identifiable unit of business. It is composed of processes, which in turn are composed of sub-processes, which are further composed of individual tasks. A business system implements a process or a set of processes and ensures that all the processes operate smoothly, without delays or lack of resources. Six Sigma directs business systems to ensure that processes, products and services are subjected to continuous improvement, for which the collection and analysis of process data is initiated. It is important to have an appropriate business system in place and to ensure that the relevant processes under the system are well documented. The processes must be documented in such a way that every task and activity, and their sequence, is taken into account so that execution proceeds as planned in the business system.

Process Control

Feedback received from the process is used for process control, so data collection focuses on the inputs and outputs of the process. Every sub-process or task acts as an input to the next task and as an output of the previous one. Optimum resource usage can be achieved while maintaining output quality by:
- Applying a feedback loop to collect data from the various process stages so that improvements can be made.
- Redesigning the process so that data collection, analysis and improvement are part of the process.

Real-time feedback allows improvements to be initiated quickly. Tools such as control charts also help in data collection and analysis.
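The feedback idea above can be sketched as a simple three-sigma check on process data. This is a deliberately simplified illustration: it sets the limits at the baseline mean plus or minus three sample standard deviations, whereas standard control charts usually estimate sigma from subgroup ranges or moving ranges. The function names and measurements are hypothetical.

```python
from statistics import mean, stdev


def three_sigma_limits(baseline):
    """Return (lower limit, center line, upper limit) from baseline data."""
    center = mean(baseline)
    spread = stdev(baseline)  # simplification: plain sample standard deviation
    return center - 3 * spread, center, center + 3 * spread


def out_of_control(points, lcl, ucl):
    """Return (index, value) pairs that fall outside the limits."""
    return [(i, x) for i, x in enumerate(points) if x < lcl or x > ucl]


# Hypothetical baseline measurements from a stable period.
baseline = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9]
lcl, center, ucl = three_sigma_limits(baseline)

# Hypothetical new measurements fed back from the process.
new_points = [10.0, 10.3, 11.2, 9.9]
print(f"limits: {lcl:.3f} to {ucl:.3f}")
print("signals:", out_of_control(new_points, lcl, ucl))
```

A point outside the limits is a prompt to investigate the process at that stage, not proof of a specific cause.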

Six Sigma Green Belt’s Responsibilities

A Six Sigma Green Belt has nearly the same project responsibilities as a Black Belt, but works on less complex challenges or problems than Black Belt professionals do. Organizations typically have no dedicated Green Belt positions: most Green Belts retain the positions they held before being trained in Six Sigma and use the new skills to improve their working environment and performance. The responsibilities of a Six Sigma Green Belt include:
- Project management: defining the project scope, marshalling resources, setting goals, timelines and milestones, and reporting to or updating stakeholders and executives.
- Task management: establishing the team’s Lean Sigma roadmap, leading the implementation of Six Sigma tools, managing team meetings, and tracking and reporting team progress.
- Team management: selecting team members, managing the team’s organizational interfaces, and ensuring the team is trained and equipped for its work.

DMAIC Methodology

The Six Sigma methodology is conceptually based on a five-phase project. Each phase has a specific purpose, and specific tools and techniques that help achieve the phase objectives and lead the Six Sigma professional to significant conclusions. The five phases of the Six Sigma methodology are known as DMAIC: the Define, Measure, Analyze, Improve and Control phases. All five phases are discussed below.

Define Phase: The goal of Define is to establish the project's foundation, and it is the most important aspect of the Six Sigma project. A project starts with a current-state challenge, which is articulated in a quantifiable manner, and the goal to be achieved is also determined. After the problems and goals are specified, the remaining tasks of valuation, team, scope, project planning, timeline, stakeholders, VOC/VOB, etc. are completed. The tools used in the Define phase are: Project Charter, Problem Statement, Business Case, Objective, High-Level Timeline, Project Scope, Project Team, Stakeholder Assessment, Pareto Charts, SIPOC, VOC/VOB and CTQs, and High-Level Process Map.

Measure Phase: In this phase baseline information about the process or product is gathered, with the following objectives:
- Gather all possible x's
- Analyze the measurement system and data collection requirements
- Validate assumptions and improvement goals
- Determine COPQ
- Refine process understanding
- Determine process capability
- Determine process stability
This phase uses the following tools: Process Maps, Value Stream Mapping, Failure Modes and Effects Analysis (FMEA), Cause and Effect Diagram, XY Matrix, Basic Control Charts, Six Sigma Statistics, Basic Statistics, Descriptive Statistics, Normal Distributions, Graphical Analysis, Measurement Systems Analysis, Variable Gage R&R, Attribute Gage R&R, Gage Linearity and Accuracy, Gage Stability, Process Capability (Cpk, Ppk) and Sigma, and Data Collection Plan.

Analyze Phase: This phase establishes verified drivers by using statistics and higher-order analytics to discover the fact-based relationship between process performance and the x's, that is, the root causes or drivers of the improvement effort. The result is a set of hypotheses for improvement. This phase establishes the transfer function Y = f(X) and validates the list of critical X's and their impacts. The Analyze phase also produces a beta improvement plan, such as a pilot plan. The tools used in this phase include Hypothesis Testing, Simple Linear Regression and Multiple Regression.

Improve Phase: This phase is aimed solely at making the improvement, that is, designing, testing and implementing the solution. It involves listing statistically proven results from the active study or pilot, creating the improvement plan, updating the stakeholder assessment, revising the business case with investment ROI, assessing risk and adding the new process capability. The tools used in this phase include Design of Experiment (DOE), Implementation Plan, Change Plan and Communication Plan.

Control Phase: This is the last phase of the Six Sigma methodology. It establishes automated and managed mechanisms to maintain and sustain the improvements in the process. A successful control plan also results in a reaction and mitigation plan with an accountability structure. The tools used include control plans, training plans, poka-yoke and/or audit plans.

The Six Sigma methodology is a complete system with built-in tools and techniques that help the Six Sigma practitioner achieve success.

Cost of Quality (COQ)

Cost of quality is the sum of appraisal costs, prevention costs, external failure costs and internal failure costs. It is generally believed that investing in the prevention of failure will decrease the cost of quality, because failure costs and appraisal costs will be reduced. Understanding the cost of quality helps organizations develop quality conformance as a useful strategic business tool that improves their products, services and brand image. This is vital in achieving the objectives of a successful organisation. COQ is primarily used to understand, analyze and improve quality performance. It can be used by shop floor personnel as well as a management measure. It can also be used as a standard measure to compare an organization’s performance with that of a similar organisation, and so as a benchmarking index.

The costs that constitute cost of quality are:
- Appraisal cost: the cost incurred in inspecting the processes, that is, the cost of checking and testing to find out whether the work has been done right the first time.
- Prevention cost: the cost incurred in carrying out activities to prevent failures, that is, the cost of the planning and training needed to do the work right the first time.
- External failure cost: the cost incurred because of failures that occur when the customer uses the product.
- Internal failure cost: the cost incurred because of failures within the organization.

Examples of the various costs are:
- Prevention: Training Programme, Preventive Maintenance
- Appraisal: Depreciation of Test/Measuring Equipment, Inspection Contracts
- Internal Failure: Scrap, Rework, Downtime, Overtime
- External Failure: Warranty, Allowances, Customer Returns, Customer Complaints, Product Liability, Lawsuits, Lost Sales

Identifying COQ has several benefits:
- It provides a standard measure within the organisation and between organisations.
- It builds awareness of the importance of quality.
- It identifies improvement opportunities.
- Being a cost measure, it is useful at the shop floor as well as at management level.
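Since cost of quality is a sum of four categories, it can be tallied directly. The sketch below adds the four costs and expresses the total as a share of revenue; the function name and all figures are hypothetical and chosen only to make the arithmetic easy to follow.

```python
def cost_of_quality(prevention, appraisal, internal_failure, external_failure):
    """Total cost of quality as the sum of its four components."""
    return prevention + appraisal + internal_failure + external_failure


# Hypothetical annual figures, in the same currency unit as revenue.
costs = {
    "prevention": 40_000,
    "appraisal": 60_000,
    "internal_failure": 150_000,
    "external_failure": 250_000,
}
revenue = 2_000_000

total = cost_of_quality(**costs)
share_of_revenue = total / revenue
failure_share = (costs["internal_failure"] + costs["external_failure"]) / total

print(f"COQ = {total:,} ({share_of_revenue:.0%} of revenue)")
print(f"Failure costs make up {failure_share:.0%} of COQ")
```

Tracking the split between failure costs and prevention or appraisal costs over time shows whether investment in prevention is reducing the total.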

Organizational Drivers and Metrics

Key Drivers: Performance measurement and analysis is the primary way to reduce waste and maintain higher-quality products or services. Various internal and external entities act as the key drivers for improvement. Internal key drivers include operational, workforce, governance and compliance performance; external key drivers include customer, service, competitive and financial performance. Many performance measures are available, but only those metrics should be considered that represent the factors for improvement in the selected area of performance, such as financial or customer performance.

Voice of the Customer (VOC): This term describes the stated and unstated needs or requirements of the customer. VOC describes customers’ feedback about their experiences with, and expectations for, the products or services. It helps in listing the relative importance of the features and benefits associated with the product or service, and so shows which expectations and promises are fulfilled and which are unfulfilled by the product or service. VOC information can be gathered by:
- Direct interviews with customers, such as site intercepts, personal interviews, focus groups, customer feedback forms or structured online surveys.
- Indirect interviews with representatives, such as sales people or customer service representatives, who interface with the customer and report on their needs.

Conducting VOC helps by:
- Customizing products, services, add-ons and features to meet the needs and wants of customers. No one becomes an industry leader without listening to the customer.
- Improving customer-perceived quality, which is the leading driver of business success.
- Maximizing the company’s profit. Companies with higher market share have higher profits.

The Balanced Scorecard: This is one of the most widely used business performance measurement frameworks, introduced by Robert S. Kaplan and David P. Norton in 1992. Balanced scorecards initially focused on finding a way to report on leading indicators of a business’s health; they were later refocused on measures that relate directly to the firm’s strategy. The balanced scorecard is usually broken down into four sections, called perspectives:
- The financial perspective: the strategy for growth, profitability and risk from the shareholder’s perspective. It focuses on the ability to provide financial profitability and stability for private organizations, or cost-efficiency/effectiveness for public organizations.
- The customer perspective: the strategy for creating value and differentiation from the perspective of the customer. It focuses on the ability to provide quality goods and services, delivery effectiveness and customer satisfaction.
- The internal business perspective: the strategic priorities for the business processes that create customer and shareholder satisfaction. It aims for internal processes that lead to “financial” goals.
- The learning and growth perspective: the priorities for creating a climate that supports organizational change, innovation and growth. It targets the ability of employees, technology tools and the effects of change to support organizational goals.

The Balanced Scorecard is needed for several reasons:
- A focus on traditional financial accounting measures such as ROA, ROE and EPS gives misleading signals to executives with regard to quality and innovation. It is important to look at the means used to achieve outcomes such as ROA, not just at the outcomes themselves.
- Executive performance needs to be judged on success at meeting a mix of financial and non-financial measures in order to operate a business effectively.
- Some non-financial measures are drivers of financial outcome measures, which gives managers more control to take corrective action quickly.
- Too many measures, such as hundreds of possible cost accounting index measures, can confuse an executive and distract from important strategic priorities. The balanced scorecard disciplines an executive to focus on the several important measures that drive the strategy.

Organizational Goals

Before a Six Sigma project can be executed, organizational strategic planning goals and objectives must be defined. Selecting appropriate projects and choosing an effective improvement model are crucial tasks that help to ensure the company is pointed in the right direction. The broad objectives of the organization must be aligned with its long-term strategies. One technique that an organization can use to align its objectives with long-term strategies is ‘hoshin planning’. Hoshin planning helps an organization to develop its business plan and deploy it across the organization in order to reach the set goals.

Project selection is a testimony to a leader’s role in successfully aligning the broad objectives of the organization with its long-term strategies. A project selection committee or group can be formed to screen and select projects. It can include Champions, Master Black Belts, Black Belts, and important executive supporters. The project selection committee sets the criteria for selecting projects. These criteria are framed on the basis of the key factors that define the business case and business need of the organization. After selecting the projects, the committee matches them with the teams assigned to execute them.

1.2. Lean Principles

Lean manufacturing is built on the lean philosophy, which is about eliminating waste in all forms at the workplace. Specific lean methods include just-in-time inventory management, Kanban scheduling systems and 5S workplace organization. Many of these concepts were developed by Toyota, the Japanese automobile manufacturer, beginning in the 1940s. They became widespread as a means of removing waste and have since been adopted as best practices in many industries beyond automotive. Applying these principles to production has the potential for both improved profitability and increased complexity.

Origins

Lean manufacturing has evolved over time. In the 1890s, Frederick W. Taylor began to look at individual workers and work methods. Frank Gilbreth added Motion Study and invented Process Charting. Lillian Gilbreth introduced psychology by studying the motivations of workers and how attitudes affected the outcome of a process. These ideas led to waste elimination, a key component of JIT and lean manufacturing. In 1910, Henry Ford developed and implemented the first comprehensive manufacturing strategy by arranging all the elements of a manufacturing system, such as people, machines, tooling and products, in a continuous system, or assembly line, for manufacturing the Model T automobile.

Toyota Production System

Between 1949 and 1975, Taiichi Ohno and Shigeo Shingo at Toyota Motor Company began to incorporate Ford production and other techniques into an approach called the Toyota Production System or Just In Time. However, they found flaws in the Ford system, especially in its treatment of employees, as Ford used employees only for muscle power. The Toyota Production System (TPS) focuses on muri and muda. Muri focuses on the preparation and planning of the process, or what work can be eliminated in the design process. Muda are those waste steps and processes that add cost. Muri is used in new product design and muda is used to improve existing operations.

Concept and Tools

Lean manufacturing is not just the use of a few techniques or processes but a journey in itself, which takes a holistic view of the organization and involves several phases, each making use of various techniques and processes. The process for lean manufacturing involves the following steps:

Define value from the customer’s perspective
Map the value stream
Create flow by removing causes of waste
Create pull if flow is difficult to achieve
Measure and validate
Practice continuous improvement

Mudas - Muda is a Japanese term meaning "waste". Because lean manufacturing is a Japanese management philosophy, Japanese terms and concepts are used extensively. There are seven mudas, or seven types of waste, found in a manufacturing process:

Overproduction - Producing more than the customer requires is waste, and it causes other wastes such as inventory costs, manpower and conveyance to deal with the excess product.
Needless Inventory - Inventory at any point adds no value, as it ties up the financial resources of the company and is exposed to the risk of damage, obsolescence, spoilage and quality issues. It also needs space and other resources for proper management and tracking.
Defects - Defects and broken equipment result in defective products and subsequently customer dissatisfaction, which take more resources to resolve.
Non-value Processing - Also called over-processing, it wastes resources, movement and time in production. Any processing that does not add value to the product is waste, such as in-process protective packaging needed because of extra manufacturing steps.
Excess Motion - Unnecessary motion due to poor workflow, poor layout, poor housekeeping, inconsistent work methods or lack of standardized procedures is a waste.
Transport and Handling - It covers shipping damage and inefficient handling, such as pallets that are not properly stretch wrapped (wasted material) or a truck that is not loaded to use floor space efficiently.
Waiting - This is time wasted due to broken machinery, lack of trained staff, shortages of materials, inefficient planning and waiting for material.

Waste Elimination Techniques - Various waste elimination techniques used in lean manufacturing are listed below:

Pull System - It is the technique for producing parts as per the customer’s demand. It contrasts with a push system, in which a company builds products to stock as per the sales forecast, without firm customer orders.
Kanban - It is a method for maintaining an orderly flow of material. Kanban cards are used to indicate material order points, how much material is needed, from where the material is ordered, and to where it should be delivered.
Total Quality Management - It is a management system for continuous improvement in all areas of a company's operation. It is applicable to every operation of the organization and involves employees.

Quick Changeover (or SMED - Single Minute Exchange of Dies) - It is the technique for reducing the changeover time needed to switch a process from manufacturing one product to manufacturing another. It enables flexibility in final product offerings and also makes smaller batch sizes practical.
5S or Workplace Organization - It is a systematic method for organizing and standardizing the workplace and is applicable to every function in an organization.
Total Productive Maintenance - It focuses on proactive and progressive maintenance of equipment by utilizing the knowledge of operators, equipment vendors, engineering and support persons to optimize machine performance. This drastically reduces breakdowns and both unscheduled and scheduled downtime, which results in improved utilization, higher throughput and better product quality.
Takt Time - It is a measure of customer demand expressed in units of time and is calculated as
Takt time = Available time per shift / Demand per shift
or Cycle time / Number of people
Visual Controls - They provide an immediate understanding (usually within thirty seconds) of a condition or situation, such as what is happening with regard to production schedule, backlog, workflow, inventory levels, resource utilization and quality. They include kanban cards, lights, color-coded tools, lines delineating work areas and product flow, etc.
Poka Yoke or Mistake Proofing - Poka Yoke is a quality management concept developed by a Matsushita manufacturing engineer named Shigeo Shingo to prevent human errors from occurring in the production line, as extensive automation and computerization is expensive. Poka yoke is implemented by using simple objects like fixtures, jigs, gadgets, warning devices, paper systems and the like to prevent people from committing mistakes.
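
The takt time formula above can be illustrated with a short calculation. The following is a minimal sketch in Python, not a prescribed method: the shift length, break times, demand figure and the 210 seconds of work content are hypothetical values chosen for illustration, and the helper `min_operators` assumes the work can be split evenly among operators.

```python
import math


def takt_time(available_time_per_shift: float, demand_per_shift: float) -> float:
    """Takt time = available time per shift / demand per shift."""
    if demand_per_shift <= 0:
        raise ValueError("demand_per_shift must be positive")
    return available_time_per_shift / demand_per_shift


def min_operators(total_cycle_time: float, takt: float) -> int:
    """Smallest whole number of people needed to keep pace with takt.

    Assumption: the total work content can be divided evenly among operators.
    """
    return math.ceil(total_cycle_time / takt)


# Hypothetical shift: 8 hours minus two 15-minute breaks, 450 units demanded.
available_seconds = 8 * 3600 - 2 * 15 * 60  # 27,000 seconds of working time
takt = takt_time(available_seconds, 450)

print(f"Takt time: {takt} seconds per unit")  # 27000 / 450 = 60.0
print(f"Operators needed: {min_operators(210, takt)}")  # ceil(210 / 60) = 4
```

In this example one unit must be completed every 60 seconds to meet demand, so a product with 210 seconds of total work content needs at least four people.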

Value-Added and Non-Value-Added Activities

Value refers to an activity that the customer will pay for or that is valued by the customer; all other activities are non-value activities. The value stream is the sequence of activities from the customer’s request to its fulfillment, and VSM records these activities as icons or symbols.

Value Stream Mapping (VSM) is a visualization tool for understanding and streamlining work processes, using icons and symbols to depict various elements and improve the flow of material and information. It helps in identifying and decreasing waste, or non-value addition, in the process. Besides being a communication tool, it can also be used as a strategic planning tool and a change management tool. A few of the icons used for mapping and developing a VSM include

Icon name - Description
Inventory - A queue of material or products that are not being processed. It represents storage of raw materials as well as finished goods. The time period may be listed below the icon.
Supermarket - An inventory “supermarket” that contains some inventory available to downstream customers, enabling them to select what they need. The next process or customer would pull from this inventory.
Go See Scheduling - Glasses represent collecting information visually. The icon can also indicate informal scheduling.
Kanban Post - A location for kanban signal pickup.

Developing the VSM - VSM mapping involves step-by-step development of the state map, whether a present state or a future state map, through the following steps:

Draw customer, supplier and production control icons.
Enter customer requirements and calculate daily production required.
Draw outbound shipping icon and truck with delivery frequency.
Draw inbound shipping icon, truck and delivery frequency.
Add process boxes, in sequence, left to right, and data boxes below.
Add communication arrows with methods and frequencies.
Obtain process attributes and add data boxes.
Add operator symbols, inventory locations and levels in days of demand, and the graph at the bottom.
Add push, pull and FIFO icons.
Add working hours, cycle times (CT) and lead times.
Calculate total cycle time and total lead time.
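
The arithmetic behind the daily production requirement and the final totals can be sketched as follows. This is an illustrative Python example only: the process names, cycle times, inventory counts and demand figures are hypothetical, and lead time is approximated here as inventory expressed in days of demand, which is a simplifying assumption.

```python
# Hypothetical customer requirement and working calendar.
monthly_demand = 9200   # units per month
working_days = 20       # working days per month
daily_demand = monthly_demand / working_days  # daily production required

# Each process box: (name, cycle time in seconds, units of inventory waiting before it).
process_boxes = [
    ("Stamping", 1, 2300),
    ("Welding", 40, 1380),
    ("Assembly", 60, 920),
]

# Total cycle time is the sum of the cycle times in the data boxes.
total_cycle_time = sum(cycle_time for _, cycle_time, _ in process_boxes)

# Inventory levels are converted into days of demand and summed into a lead time.
inventory_days = [units / daily_demand for _, _, units in process_boxes]
total_lead_time_days = sum(inventory_days)

print(f"Daily production required: {daily_demand} units")
print(f"Total cycle time: {total_cycle_time} seconds")
print(f"Total lead time: {total_lead_time_days} days")
```

The gap between the two totals (about a hundred seconds of processing against several days of waiting) is what the map is meant to make visible.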

5S

5S is a discipline for creating and maintaining a clutter-free, clean, organized, safe and high-performance workplace in 5 steps, which are seiri, seiton, seiso, seiketsu and shitsuke.

Seiri - Sorting out: Clean out the work area, keeping what is necessary in the work area, relocating or discarding what is not
Seiton - Systematic arrangement / Set limits and locations: Arrange needed items so they are easy to find, use and return
Seiso - Shine and Sweep: Clean and care for the equipment area
Seiketsu - Standardization: Make all work areas similar
Shitsuke - Self-Discipline / Sustain: Make these rules natural and instinctual

Theory of Constraints (TOC)

It is a methodology for identifying the most important limiting factor (i.e. the constraint) that stands in the way of achieving a goal and then systematically improving that constraint until it is no longer the limiting factor. It was first published in The Goal by Eliyahu M. Goldratt and Jeff Cox in 1984. TOC conceptually models the manufacturing system as a chain and advocates focusing on its weakest link. Goldratt defines a five-step process that a change agent can use to strengthen the weakest link, or links:

1. Identify the System Constraint - The part of a system that constitutes its weakest link can be either physical or a policy.
2. Decide How to Exploit the Constraint - Goldratt instructs the change agent to obtain as much capability as possible from a constraining component, without undergoing expensive changes.
3. Subordinate Everything Else - The non-constraint components of the system must be adjusted to a "setting" that will enable the constraint to operate at maximum effectiveness. Once this has been done, the overall system is evaluated to determine if the constraint has shifted to another component. If the constraint has been eliminated, the change agent jumps to step five.
4. Elevate the Constraint - "Elevating" the constraint refers to taking whatever action is necessary to eliminate the constraint. This step is only considered if steps two and three have not been successful. Major changes to the existing system are considered at this step.
5. Return to Step One, But Beware of "Inertia"
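
The chain analogy can be made concrete with a small sketch. The Python example below treats each step of a hypothetical production line as having an hourly capacity and takes the constraint to be the step with the lowest capacity. That is a simplification made for illustration, since a real constraint may also be a policy, and the step names and numbers are invented.

```python
def find_constraint(capacities: dict[str, float]) -> tuple[str, float]:
    """Return the weakest link: the step with the lowest capacity.

    The throughput of the whole chain cannot exceed this value.
    """
    name = min(capacities, key=capacities.get)
    return name, capacities[name]


# Hypothetical capacities in units per hour.
capacities = {"Cutting": 120, "Welding": 75, "Painting": 90, "Assembly": 110}

step, limit = find_constraint(capacities)
print(f"Constraint: {step} at {limit} units/hour")

# After the constraint is elevated, the analysis returns to step one,
# because the weakest link may have moved to another component.
elevated = dict(capacities, Welding=100)
new_step, new_limit = find_constraint(elevated)
print(f"New constraint: {new_step} at {new_limit} units/hour")
```

Raising welding capacity from 75 to 100 lifts system throughput only to 90, because painting becomes the new limiting step.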

1.3. Design for Six Sigma (DFSS)

Design for Six Sigma can be seen as a subset of Six Sigma that focuses on preventing problems by going upstream, recognizing that decisions made during the design phase profoundly affect the quality and cost of all subsequent activities to build and deliver the product. Early investments of time and effort pay off in getting the product right the first time. DFSS adds a new, more predictive front end to Six Sigma. It describes the application of Six Sigma tools to product development and process design efforts with the goal of “designing in” Six Sigma performance capability. The intention of DFSS is to bring such new products and/or services to market with a process performance of around 4.5 sigma or better for every customer requirement.
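
To give a feel for what a 4.5 sigma performance level implies, the sketch below converts a sigma level into defects per million opportunities (DPMO). It is a rough Python illustration resting on two assumptions that are not stated in the text: the process output is normally distributed, and only a single one-sided specification limit is considered.

```python
from statistics import NormalDist


def dpmo_from_sigma(sigma_level: float) -> float:
    """Defects per million opportunities beyond a one-sided limit.

    Assumption: normally distributed output, with the specification
    limit located sigma_level standard deviations from the mean.
    """
    tail_probability = 1.0 - NormalDist().cdf(sigma_level)
    return tail_probability * 1_000_000


for level in (3.0, 4.5):
    print(f"{level} sigma -> {dpmo_from_sigma(level):.1f} DPMO")
# 4.5 sigma works out to roughly 3.4 DPMO under these assumptions.
```

The figure of about 3.4 defects per million is the value commonly quoted for Six Sigma performance.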

Quality Function Deployment (QFD)

Quality Function Deployment is a method for prioritizing and translating customer inputs into designs and specifications for a product, service, and/or process. While the detail of the work involved in QFD can be both complex and exhaustive, the essentials of the QFD method are based on common-sense ideas and tools. QFD is a planning tool that relates a list of delights, wants and needs of customers to design technical functional requirements. With the application of QFD, possible relationships are explored between quality characteristics as expressed by customers and substitute quality requirements expressed in engineering terms. In the context of DFSS, these requirements are called critical-to characteristics, which include subsets such as critical-to-quality (CTQ) and critical-to-delivery (CTD).

In the QFD methodology, customers define the product using their own expressions, which rarely carry any significant technical terminology. The voice of the customer can be distilled into a list of needs used later as input to a relationship diagram, which is called QFD’s house of quality. One major advantage of QFD is the attainment of the shortest development cycle, which is gained by companies with the ability and desire to satisfy customer expectations. The other significant advantage is the improvement gained in the design family of the company, resulting in increased customer satisfaction.

QFD is a robust method with many variations in application, such as

Prioritize and select improvement projects based on customer needs and current performance
Assess a process’s or product’s performance versus competitors
Translate customer requirements into performance measures
Design, test, and refine new processes, products, and services

To work well, QFD draws on various other methods, ranging from Voice of the Customer input to Design of Experiments. A special multidimensional matrix, also called the “House of Quality,” is the best-known element of the QFD method. A full QFD product design project will involve a series of these matrices, translating from customer and competitive needs to detailed process specifications. QFD involves two core concepts.

The QFD Cycle - An iterative effort to develop operational designs and plans in four phases:

Translate customer input and competitor analysis into product or service features.
Translate product/service features into product/service specifications and measures.
Translate product/service specifications and measures into process design features.
Translate process design features into process performance specifications and measures.

QFD is accomplished by multidisciplinary DFSS teams using a series of charts to deploy critical customer attributes throughout the phases of design development. QFD is usually deployed over four phases: phase 1 (CTS planning), phase 2 (functional requirements), phase 3 (design parameters planning) and phase 4 (process variables planning), as shown in the figure below.


Prioritization and Correlation - Detailed analysis of the relationships among specific needs, features, requirements, and measures. Matrices like the House of Quality or the simple L-Matrix keep this analysis organized and document the rationale behind the design effort. The QFD Cycle develops the links from downstream Ys (Customer Requirements and Product Specifications) back to upstream Xs (Process Specifications) in the design process itself. With an existing process or product, it can be used to clarify and document those relationships if they’ve never been investigated before. Another benefit of the House of Quality is a “diagonal” relationship test afforded by the matrix, testing combinations that may not have been considered by our standard human “linear” thought processes. An example is shown below

QFD analysis is conducted in six steps:

It starts with the articulation of customer requirements. Techniques used could be interviewing, observation, prototyping, conceptual modeling, etc. Data from marketing research are also used. These requirements are also known as the "What's".
In the second step, the company's current product is ranked against the competitors.
Next, the team looks at Product/Process Characteristics, in other words, the "How's" of meeting the customer requirements. Candidate CCR's are listed across the top, and each is considered and ranked by how relevant it is to addressing customer needs.
Then, the team relates customer and technical requirements with ratings such as "high", "moderate", "low", and "no" correlation. The team evaluates the degree to which customer wants and needs are addressed by the product or process characteristics.
In the fifth step, the roof of the "House" focuses on relationships among product/process characteristics. It shows whether the "How’s" reinforce or conflict with one another.
Finally, the team summarizes the key conclusions. It ranks the relevance of product or process characteristics to the attainment of customers' wants or needs.
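
The relationship ratings and the final ranking in the steps above can be sketched numerically. In the Python example below, the mapping of "high", "moderate", "low" and "no" onto weights of 9, 3, 1 and 0 is an assumed scoring convention rather than something the text specifies, and the customer needs, importance ratings and characteristics are hypothetical.

```python
# Assumed numeric weights for the relationship ratings.
RATING_WEIGHTS = {"high": 9, "moderate": 3, "low": 1, "no": 0}

# Hypothetical "What's" with customer importance on a 1 to 5 scale.
customer_importance = {"Easy to use": 5, "Fast delivery": 3, "Low price": 4}

# Hypothetical relationship matrix: each "What" rated against each "How".
relationships = {
    "Easy to use":   {"Setup time": "high", "Lead time": "no",       "Unit cost": "low"},
    "Fast delivery": {"Setup time": "no",   "Lead time": "high",     "Unit cost": "moderate"},
    "Low price":     {"Setup time": "low",  "Lead time": "moderate", "Unit cost": "high"},
}


def technical_importance(importance, matrix):
    """Score each "How" as the sum of importance x relationship weight."""
    scores = {}
    for what, row in matrix.items():
        for how, rating in row.items():
            scores[how] = scores.get(how, 0) + importance[what] * RATING_WEIGHTS[rating]
    return scores


scores = technical_importance(customer_importance, relationships)
ranked = sorted(scores, key=scores.get, reverse=True)

for how in ranked:
    print(f"{how}: {scores[how]}")
```

With these inputs, unit cost scores 50, setup time 49 and lead time 39, so the team would rank unit cost as the characteristic most relevant to the stated customer needs.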

Design And Process Failure Mode And Effects Analysis (DFMEA and PFMEA)

FMEA is a systematic, proactive method for evaluating a process to identify where and how it might fail and to assess the relative impact of different failures, in order to identify the parts of the process that are most in need of change. FMEA includes review of the following:

Steps in the process
Failure modes (What could go wrong?)
Failure causes (Why would the failure happen?)
Failure effects (What would be the consequences of each failure?)

FMEA is used to evaluate processes for possible failures and to prevent them by correcting the processes proactively rather than reacting to adverse events after failures have occurred. FMEA is also useful in evaluating a new process prior to implementation and in assessing the impact of changes to an existing process. FMEA usually involves the following steps:

Select a process to evaluate with FMEA - FMEA works best on processes that do not have too many sub-processes; avoid doing a single FMEA on a large and complex process.
Recruit a multidisciplinary team - Be sure to include everyone who is involved at any point in the process.
Have the team meet together to list all of the steps in the process - Number every step of the process, and be as specific as possible. It may take several meetings for the team to complete this part of the FMEA, depending on the number of steps and the complexity of the process. Flowcharting can be a helpful tool for outlining the steps. When finished, be sure to obtain consensus from the group. The team should agree that the steps enumerated in the FMEA accurately describe the process.
Have the team list failure modes and causes - For each step in the process, list all possible failure modes, meaning anything that could go wrong, including minor and rare problems. Then, for each failure mode listed, identify all possible causes.
For each failure mode, have the team assign numeric scores for likelihood of occurrence, likelihood of detection, and severity - The product of the three scores is known as the Risk Priority Number (RPN). Assigning RPNs helps the team prioritize areas to focus on and can also help in assessing opportunities for improvement. For every failure mode identified, the team should answer the following questions as a group, reaching consensus on all values assigned:
Likelihood of occurrence: How likely is it that this failure mode will occur? - Assign a score between 1 and 10, with 1 meaning “very unlikely to occur” and 10 meaning “very likely to occur.”
Likelihood of detection: If this failure mode occurs, how likely is it that the failure will be detected? - Assign a score between 1 and 10, with 1 meaning “very likely to be detected” and 10 meaning “very unlikely to be detected.”

Severity: If this failure mode occurs, how likely is it that harm will occur? - Assign a score between 1 and 10, with 1 meaning “very unlikely that harm will occur” and 10 meaning “very likely that severe harm will occur.” In patient care examples, a score of 10 for harm often denotes death.
Evaluate the results - To calculate the Risk Priority Number (RPN) for each failure mode, multiply the three scores obtained (the 1 to 10 score for each of likelihood of occurrence, detection, and severity). The lowest possible score will be 1 and the highest 1,000. Identify the failure modes with the top 10 highest RPNs. These are the ones the team should consider first as improvement opportunities. To calculate the RPN for the entire process, simply add up all of the individual RPNs for each failure mode.
Use RPNs to plan improvement efforts - Failure modes with high RPNs are probably the most important parts of the process on which to focus improvement efforts. Failure modes with very low RPNs are not likely to affect the overall process very much, even if eliminated completely, and they should therefore be at the bottom of the list of priorities.

Failure Mode: What could go wrong?
Failure Causes: Why would the failure happen?
Failure Effects: What would be the consequences of failure?
Likelihood of Occurrence: 1 to 10, 10 = very likely to occur
Likelihood of Detection: 1 to 10, 10 = very unlikely to detect
Severity: 1 to 10, 10 = most severe effect
Risk Priority Number (RPN): Likelihood of Occurrence × Likelihood of Detection × Severity
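
The RPN calculation summarized above can be sketched in a few lines. The Python example below is illustrative only: the failure modes and their scores are hypothetical, and the `FailureMode` class is a name introduced here for the example.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    occurrence: int  # 1 to 10, 10 = very likely to occur
    detection: int   # 1 to 10, 10 = very unlikely to be detected
    severity: int    # 1 to 10, 10 = most severe effect

    def __post_init__(self):
        # Reject scores outside the 1 to 10 scale.
        for score in (self.occurrence, self.detection, self.severity):
            if not 1 <= score <= 10:
                raise ValueError("scores must be between 1 and 10")

    @property
    def rpn(self) -> int:
        """Risk Priority Number = occurrence x detection x severity."""
        return self.occurrence * self.detection * self.severity


# Hypothetical failure modes for an order-fulfillment process.
modes = [
    FailureMode("Order entered twice", 7, 3, 4),
    FailureMode("Wrong label applied", 4, 6, 8),
    FailureMode("Shipment delayed", 5, 2, 3),
]

# Highest RPN first: these are considered first as improvement opportunities.
ranked = sorted(modes, key=lambda mode: mode.rpn, reverse=True)
for mode in ranked:
    print(f"{mode.name}: RPN {mode.rpn}")

# RPN for the entire process is the sum of the individual RPNs.
process_rpn = sum(mode.rpn for mode in modes)
print(f"Process RPN: {process_rpn}")
```

Here the wrong-label failure mode ranks first with an RPN of 192, even though the duplicate order is more likely to occur, because it is both harder to detect and more severe.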

Design FMEA (DFMEA) - It is used to analyze designs before they are released to production. In the DFSS algorithm, a DFMEA should always be completed well in advance of a prototype build. The input to DFMEA is the array of functional requirements. The outputs are a list of actions to prevent causes or to detect failure modes, and a history of actions taken and future activity. The DFMEA helps the DFSS team in

Estimating the effects on all customer segments
Assessing and selecting design alternatives
Developing an efficient validation phase within the DFSS algorithm
Inputting the needed information for Design for X (DFMA, DFS, DFR, DFE, etc.)
Prioritizing the list of corrective actions using strategies such as mitigation, transferring, ignoring, or preventing the failure modes
Identifying the potential special design parameters (DPs) in terms of failure
Documenting the findings for future reference

Process FMEA (PFMEA) - It is used to analyze manufacturing, assembly, or any other processes, such as those identified as transactional DFSS projects. The focus is on process inputs. Software FMEA documents and addresses failure modes associated with software functions. The PFMEA is a valuable tool available to the concurrent DFSS team to help them in

Identifying potential manufacturing/assembly or production process causes in order to place controls on increasing detection, reducing occurrence, or both
Prioritizing the list of corrective actions using strategies such as mitigation, transferring, ignoring, or preventing the failure modes
Documenting the results of their processes
Identifying the special potential process variables (PVs), from a failure standpoint, which need special controls

DFSS Roadmap

IDOV and DMADV help in improving and extending DFSS. Both are discussed below.

IDOV - IDOV stands for Identify, Design, Optimize and Verify. It is a variant of DFSS (Design For Six Sigma) and differs from DMAIC (Define, Measure, Analyze, Improve and Control). It consists of four phases:

Identify Phase - It identifies the specific customer needs on which the design of a product or business process will be based. It is essential for launching a new product or service and involves activities such as defining VOC, developing a team and team charter, performing competitive analysis and identifying CTQs. Other crucial steps in this phase are the identification of customer and product requirements, establishment of an appropriate business model, identification of technical requirements such as CTQs, and allocation of roles and responsibilities. Some of the tools used are QFD, FMEA, target costing and benchmarking.

Design Phase - It focuses on functional requirements, development of alternate business processes, evaluation of available options and selection of the most appropriate business process based on the CTQs identified earlier. It includes the formulation of the concept design, identification of probable risk elements, identification of design parameters using advanced simulation tools, and formulation of procurement plans and manufacturing plans. Tools used in this phase include risk assessment, FMEA, engineering analysis and Design of Experiments.
Optimize Phase - This phase uses CTQs to calculate the tolerance level of a selected business process with simulation tools. It predicts the performance capability of a business process, optimizing the existing design and developing alternative design elements. This phase may involve assessment of process capabilities, optimization of design parameters, development of design for robust performance and reliability, error proofing and establishment of tolerance measurement objectives. Tools usually used are manufacturing database and flow-back tools, design for manufacturability, process capability models, Monte Carlo methods, tolerance measurement tools and Six Sigma tools.
Verify (Validate) Phase - As the last phase, it focuses on testing and validating the selected design. Any changes to the design can be made in this phase. This phase involves prototype test and validation, assessment of performance, failure modes, reliability and risks, design iteration and final phase review.

DMADV - DMADV refers to Define, Measure, Analyze, Design and Verify. DMADV is one aspect of Design for Six Sigma (DFSS), which has evolved from the earlier approaches of continuous quality improvement and the Six Sigma approach to reducing variation. A key component of the DMADV approach is an active ‘toll gate’ check sheet review of the outcomes of each of the five steps of DMADV. It is depicted as

The application of DMADV is aimed at creating a high-quality product while keeping customer requirements in mind at every stage. In general, the phases of DMADV are

Define phase - In this phase, the wants and needs believed to be most important to customers are identified from historical information, customer feedback and other information sources. Teams are assembled to drive the process. Metrics and other tests are developed in alignment with customer information. The key deliverables are the team charter, project plan, project team, critical customer requirements and design goals.
Measure phase - The defined metrics are used to collect data and record specifications for the remaining process. All the processes needed to successfully manufacture the product or service are assigned metrics for later evaluation. Technology teams test metrics and then apply them. The key deliverables are qualified measurement systems, data collection plan, capability analysis, refined metrics and functional requirements.

Analyze phase - The result of the manufacturing process (i.e. the finished product or service) is tested by internal teams to create a baseline for improvement. Leaders use data to identify areas of adjustment within the processes that will deliver improvement to either the quality or the manufacturing process of a finished product or service. Teams set final processes in place and make adjustments as needed. The deliverables are data analysis, initial models developed, prioritized X's, variability quantified, CTQ flow-down and documented design alternatives.
Design phase - The results of internal tests are compared with customer wants and needs. Any additional adjustments needed are made. The improved manufacturing process is tested, and test groups of customers provide feedback before the final product or service is widely released. The deliverables include validated and refined models, feasible solutions, tradeoffs quantified, tolerances set and predicted impact.
Verify phase - The last stage in the methodology is ongoing. While the product or service is being released and customer reviews are coming in, the processes may be adjusted. Metrics are further developed to keep track of ongoing customer feedback on the product or service. New data may lead to other changes that need to be addressed, so the initial process may lead to new applications of DMADV in subsequent areas. The key deliverables are detailed design, validated predictions, pilot / prototype, FMEA's, capability flow-up, and standards and procedures.

These methodologies are generally rolled out over the course of many months, or even years. The end result is a product or service that is completely aligned with customer wants and needs.


2. DEFINE

In the define phase of a Six Sigma DMAIC project, the project leaders are responsible for clarifying the purpose and scope of the project or the process to be improved and for knowing the quality expectations of the customer. This phase also involves establishing realistic estimates for timeline and costs, thus ensuring that stakeholders and the project team are on the same page about the project's implementation, evaluation, progress and success. The project needs to be assessed for suitability for DMAIC, which involves answering the following questions:

Is data available or easy to obtain?
Does leadership support exist for improving this process?
Is DMAIC really needed, or is this a “just do it”: a problem with a known solution that should just be implemented?
Is the team trying to boil the ocean, or is the scope reasonable for chartering as a DMAIC project?
Is the process directly related to a key outcome such as profitability, customer satisfaction, or employee satisfaction?

The define phase achieves a number of purposes, which include assessing the current project against the strategic objectives and confirming its potential. The phase also results in identification of the project scope, objectives, sponsors, schedule, deliverables and team members, along with team formation. Before the project is initiated, confirming the availability of resources is paramount.

2.1. Process Management

Business Process Basics

A business process is a group of tasks which result in a specific service or product for customers. It can be visualized with a flowchart or a process matrix. Business processes are fundamental to every company’s performance and implement the business strategy. Understanding and optimizing the business process is the crux of Six Sigma.

Flowchart

Process Matrix

Dissecting and truly understanding the root causes of process performance is critical to effective process improvement, and Six Sigma provides the means to do so. Each process has three elements that affect its function: inputs, process and outputs. A business process is a collection of related activities that produce something of value to the organization, its stakeholders or its customers.

Process Elements - Every process starts from the state or resources it needs and ends at the state it needs to reach. The process logic lies between the two and makes the transition possible.

Process Identification - A process to be improved or optimized needs to be identified by its process boundaries, which indicate the influence and involvement of the process and its resources. SIPOC diagrams are usually used for process identification as they provide a top-level view of the process. SIPOC stands for Suppliers, Inputs, Process, Outputs and Customers. A SIPOC enables the team to quickly develop a common understanding of the process and its key customers and suppliers. The steps to create a SIPOC are:
Name the process.
Define the starting point and the ending point of the process as listed in the scope section of the team charter.
List the key outputs of the process.
Identify the entities receiving those outputs, whether internal or external.
State the top-level process steps without any decision points or feedback loops.
Identify the inputs to the process and the entities supplying those inputs.

Systems Thinking - Systems thinking entails observing a system as a whole. A system is defined as a whole consisting of parts, each of which can affect the properties of the others. The performance of the system depends on how the parts interrelate. For a business, for example, the manner in which sales, procurement, manufacturing and distribution relate to each other determines business performance, rather than the individual performance of each.
Systems thinking can be applied in various ways in a Six Sigma project. It can be used to launch a high-impact initiative aimed at the real root cause areas instead of the symptoms of high-level problems. It can be used to map out the system dynamics around a mission-critical Big Y to optimize, and then to identify the various high-leverage daughter projects. During the define phase, it identifies the possible negative consequences of optimizing the project Y, so that the project team can put avoidance or elimination strategies in place. During the measure or analyze phases, it can identify the system dynamics of the critical Xs that affect the project Y the team has been tasked to optimize. By using systems thinking, Six Sigma programs can avoid irrelevant issues and address the real ones. It helps in integrating successful management processes into a single management system which uses resources wisely while focusing on what is important for customers, shareholders and employees.

Owners and Stakeholders
Stakeholders are the entities which have an interest in the process or the business, and they include suppliers, customers, employees and investors. Similarly, process stakeholders include the process operators, executives, managers, suppliers, customers and supporting staff such as logistics personnel. The interest of stakeholders may also vary with time.

Process owners are the individuals within an organization who are responsible for coordinating and managing the workflow and activities at every stage of a process. They are also responsible for the performance of the process against the listed goals, as measured by key process indicators. They have the authority to make necessary changes to the process and its stages in order to achieve the listed goals.

Project teams that include stakeholders and process owners are more effective in achieving results. Stakeholder involvement is very helpful because stakeholders have detailed knowledge of the process, so they come up with innovative and impactful process improvements while considering their consequences and feasibility.

Customer Identification
Customer identification is a crucial task in any Six Sigma project. Various tools like brainstorming, SIPOC and marketing analysis data are useful for the purpose. Customer identification should be carried out even if the customers are known, so as to understand them better and to reveal any hidden customers. Customers can be categorized as internal or external, or on the basis of location, demography, sex, etc. The classification criteria depend on the results to be achieved.

Customer Data Collection and Analysis
Data about the customers identified in the earlier step can be captured with various tools like VOC, surveys, etc. Customer data analysis is the next step after customer data collection. Analysis helps in understanding and prioritizing customer needs. Various analysis tools are used for identifying and addressing issues, like Pareto diagrams, FMEA, affinity diagrams, interrelationship digraphs, matrix diagrams and priority matrices.

Customer Requirement Mapping
Customer requirement mapping involves identifying the processes that need improvement when measured against customer requirements. Quality Function Deployment (QFD) is an effective tool for the purpose, as QFD is a structured method to identify and prioritize customers' expectations.

2.2. Project Management

Project management refers to the process of completing the project effectively and efficiently within the available resources and the designated timeframe. It includes the crucial elements described below.

Project Charter and Plan - The project charter is a statement of the objectives of a project which also sets out detailed project goals, roles and responsibilities. It also identifies the main stakeholders. The project charter therefore consists of the problem statement for which the project is initiated, the purpose outlining the goals to be achieved by the project, the scope of the project listing the resource requirements, and the results to be achieved in quantifiable terms. The project charter also contains the likely benefits to the stakeholders from taking up the project and justifies its feasibility.

Project plan development involves setting up the timelines and milestones to be achieved as the project progresses. It acts as the basis on which resource requirements are computed. Various project planning tools are used for the purpose like Gantt charts, CPM/PERT charts, project schedules, etc.

Project risk analysis is conducted during project planning to work out the feasibility of the project as well as to develop countermeasures that mitigate the risks involved and their impact. The aspects of the project usually analyzed are safety, reliability, serviceability, etc. Risk analysis involves identification and mitigation of risks. Various analysis tools are used, such as:
SWOT (Strengths, Weaknesses, Opportunities and Threats) Matrix - It involves a scan of the internal and external environment. Internal factors are classified as strengths (S) or weaknesses (W), and factors external to the firm are classified as opportunities (O) or threats (T).
Risk Priority Number (RPN) - It is a measure of risk whose values range from 1 (absolute best) to 1000 (absolute worst), and it is used to identify the critical failure modes of a project.
Failure modes and effects analysis (FMEA) - It identifies failures in a project by studying the impact of all possible failures, which are prioritized according to severity, frequency of occurrence and ease of detection.
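The RPN range of 1 to 1000 can be illustrated with a minimal sketch. It assumes the common convention, not spelled out in the text, that severity, occurrence and detection are each rated from 1 to 10 and multiplied together. The class name, the failure modes and their ratings are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class FailureMode:
    name: str
    severity: int    # assumed scale: 1 (negligible) to 10 (catastrophic)
    occurrence: int  # assumed scale: 1 (rare) to 10 (almost certain)
    detection: int   # assumed scale: 1 (almost always caught) to 10 (almost never caught)

    def rpn(self) -> int:
        """Risk Priority Number as the product of the three ratings."""
        for score in (self.severity, self.occurrence, self.detection):
            if not 1 <= score <= 10:
                raise ValueError("each rating must be between 1 and 10")
        return self.severity * self.occurrence * self.detection


# Hypothetical failure modes for a project risk review.
modes = [
    FailureMode("Key supplier misses delivery", 7, 4, 3),
    FailureMode("Measurement system not calibrated", 5, 6, 8),
    FailureMode("Sponsor reassigned mid-project", 8, 2, 2),
]

# Highest RPN first: these are the failure modes to address first.
ranked = sorted(modes, key=lambda m: m.rpn(), reverse=True)
for mode in ranked:
    print(f"{mode.rpn():4d}  {mode.name}")
```

Under these assumed scales the smallest possible product is 1 and the largest is 1000, which matches the range quoted above.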
Risk mitigation involves continuous review of the risk identification and mitigation plans, because the environment changes as the project progresses and new risks are identified whenever a step changes midway. A risk management system is therefore embedded during project planning.

Project Scope - After the project charter and plan are defined, the project scope is finalized, thus defining the resource requirements and listing the departments affected during project execution. Project managers use various tools during this step like SIPOC, Pareto charts, brainstorming, etc. to define and document the project scope.

Project Metrics - They are an essential component of project management and show the status of the project. Selecting and updating them is necessary for proper monitoring of the project's progress. Project metrics are tactical and are used by the project manager to adapt the project work flow and technical activities, i.e. to guide adjustments to the work schedule to avoid delays and to assess product quality on an ongoing basis. The project metrics usually applied measure the consumption of time, budget and other resources, and the quality of output.

Project Documentation - It involves documenting all objectives, milestones, activities, processes and blueprints of the project, or in short all documents from the conception of the project to its implementation, so as to provide an accurate measure of project success. Large projects need more detailed documentation to cover all aspects of the project. Various graphical tools and techniques are used like state mapping and storyboards. Six Sigma projects implement the DMAIC methodology, so documentation is done accordingly, with figures and charts showing the activity at each stage.

Project Closure - It is the last phase of the project, which confirms that the objectives laid down for the project have been achieved and that the required documentation is complete. It also involves a discussion with the project sponsors to agree on project completion, which involves comparison with the project charter.

2.3. Management and Planning Tools

The management and planning tools in common use are described below.

Flowchart
It is used to develop a process map. A process map is a graphical representation of a process which displays the sequence of tasks using flowcharting symbols. It shows the inputs, actions and outputs of a given system.
Inputs are the factors of production like land, materials, labor, equipment and management.
Actions are the ways in which the inputs are processed and value is added to the product, like procedures, handling, storage, transportation and processing.
Outputs are the finished goods or delivered services given to the customer, but outputs also include unplanned and undesirable entities like scrap, rework, pollution, etc.

Flowchart symbols are standardized by ANSI. The functions of the common symbols are (symbol images not reproduced):
Process flow
Terminator or start/stop of process
Decision or branching
Data input or output
Process or action step

The flowchart shows a high-level view of a process and supports its capability analysis. The flowchart can be drawn in more or less detail as needed.

Check Sheets
They consist of lists of items and indicate how often each item on the list occurs. They are also called confirmation check sheets. They make the data collection process easier by providing prewritten descriptions of events likely to occur, such as "Have all inspections been performed?", "How often does a particular problem occur?" and "Are problems more common with part X than with part Y?"


The check sheet is a simple tool for process improvement and problem solving. It can also highlight items of importance during data collection. Check sheets are an effective tool for quality improvement when used with histograms and Pareto analysis. A check sheet is not a checklist, which is used to ensure that all important steps or actions have been taken; it is a tally sheet for collecting data on the frequency of occurrence of defects or errors. It is of two types:
Location or concentration diagram - The marking is done on a diagram. For example, before a car is handed over to a service center, a car diagram is used to record existing defects by marking and writing on the diagram. Online application forms that highlight the sections containing errors before submission are another example of this type.
Graphical or distribution check sheet - It is commonly used for collecting frequencies by making tally marks, so that the distribution of the data can be visualized as shown in the diagram below.
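The tally behind a distribution check sheet can be sketched in a few lines. This is only an illustration; the defect names and observations are hypothetical, and the standard-library `Counter` stands in for the paper sheet.

```python
from collections import Counter

# Hypothetical observations, recorded one at a time as defects are found.
observations = [
    "scratch", "dent", "scratch", "misalignment",
    "scratch", "dent", "scratch",
]

# Each new observation adds one tally mark to its row.
tally = Counter(observations)

# Print the sheet with the most frequent defect first.
for defect, count in tally.most_common():
    print(f"{defect:<14}{'|' * count}  ({count})")
```

The resulting counts are the input a histogram or Pareto analysis would start from.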

Pareto Charts
A Pareto chart is a type of bar chart in which the horizontal axis represents categories, which are usually defects, errors or sources (causes) of defects or errors. The height of the bars can represent a count or percentage of errors or defects, or their impact in terms of delays, rework, cost, etc. By arranging the bars from largest to smallest, a Pareto chart shows which categories will yield the biggest gains if addressed and which are only minor contributors to the problem. Pareto analysis is the process of ranking opportunities to determine which of many potential opportunities should be pursued first. It is used at various stages in a quality improvement program to determine which step to take next.

Pareto Chart Development - It involves the following steps:
Collect data on the different types or categories of problems.
Tabulate the scores. Determine the total number of problems observed and/or the total impact. Also determine the counts or impact for each category.
For small or infrequent problems, add them together into an "other" category.
Sort the problems by frequency or by level of impact.
Draw a vertical axis and scale it up to the total number observed. Do not make the vertical axis only as tall as the tallest bar, which can overemphasize the importance of the tall bars and lead to false conclusions.
Draw bars for each category, starting with the largest and working down.
The "other" category always goes last, even if it is not the shortest bar.
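The tabulation steps above can be sketched as follows. The function name, the `min_count` threshold used to decide what counts as a small problem, and the complaint data are all hypothetical; the cumulative percentage column is a common addition to Pareto tables rather than something the steps require.

```python
def pareto_table(counts, min_count=0):
    """Return (category, count, cumulative percent) rows in Pareto order.

    Categories with fewer than `min_count` occurrences are pooled into
    "Other", which is placed last regardless of its size.
    """
    major = {k: v for k, v in counts.items() if v >= min_count}
    other = sum(v for v in counts.values() if v < min_count)

    # Largest bar first.
    ordered = sorted(major.items(), key=lambda kv: kv[1], reverse=True)
    if other:
        ordered.append(("Other", other))

    total = sum(count for _, count in ordered)
    rows, running = [], 0
    for category, count in ordered:
        running += count
        rows.append((category, count, round(100 * running / total, 1)))
    return rows


# Hypothetical complaint counts from a check sheet.
defects = {
    "Wrong address": 42,
    "Late pickup": 27,
    "Damaged box": 15,
    "Missing label": 3,
    "Wrong invoice": 2,
    "Torn seal": 1,
}

rows = pareto_table(defects, min_count=5)
for category, count, cumulative in rows:
    print(f"{category:<15}{count:>4}{cumulative:>8}%")
```

Because the vertical axis runs to the total (90 in this made-up data set), the cumulative column ends at 100 percent.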

Cause and Effect Diagram
It helps teams uncover potential root causes by providing structure to the cause identification effort. It is also called a fishbone or Ishikawa diagram. It helps ensure that new ideas are generated during brainstorming and that no major possible cause is overlooked. It should be used for cause identification after the problem has been clearly defined. It is also useful as a cause-prevention tool, by brainstorming ways to maintain a result or prevent future problems.

Developing a Cause and Effect Diagram - It involves the following steps:
Name the problem or effect of interest. Be as specific as possible. Write the problem at the head of a fishbone "skeleton".
Decide the major categories for causes and create the basic diagram on a flip chart or whiteboard. Typical categories include manpower, machines, materials, methods, measurements and environment.
Brainstorm for more detailed causes and create the diagram, either by working through each category or by open brainstorming for any new input.
Write suggestions onto self-stick notes and arrange them in the fishbone format, placing each idea under the appropriate category.
Review the diagram for completeness. Eliminate causes that do not apply.
Brainstorm for more ideas in categories that contain fewer items.
Discuss the final diagram. Identify the causes which are most critical for follow-up investigation.


Tree Diagram
Tree diagrams are similar to cause and effect diagrams, but a tree diagram breaks a problem down progressively in detail by partitioning a bigger problem into smaller ones. This partitioning continues until a level is reached at which the problem seems easy to solve. It is made by starting from the right and going towards the left. It is used by quality improvement programs. Sometimes goals are placed on the left and resources on the right, and then both are linked to show how the goal is achieved. It starts with a single entity which branches into two or more, each of which branches into two or more, and so on. It looks like a tree, with a trunk and multiple branches. It is used for known issues whose specific details are to be addressed in order to achieve an objective. It also assists in listing other solutions, detailing processes and probing the root cause of a problem. It is also known as a systematic diagram, tree analysis, analytical tree or hierarchy diagram.

Affinity Diagram
The word affinity means a "natural attraction" or kinship. The affinity diagram organizes ideas into meaningful categories by recognizing their underlying similarity. It reduces data by organizing a large number of inputs into a smaller number of major dimensions, constructs or categories. It organizes facts, opinions and issues into natural groups to help diagnose a complex situation or find themes. It is useful when information about a problem is not well organized and a solution beyond traditional thinking is needed. It can organize ideas from a brainstorming session in any phase of DMAIC, and it can find themes and messages in customer statements gleaned from interviews, surveys or focus groups.


Developing an Affinity Diagram
Gather inputs from a brainstorming session or from customer feedback.
Write each input on a card and place the cards randomly.
Allow people to silently start grouping the cards.
When the clustering is done, create a "header" label for each group. Write the theme on a larger self-stick note or card (the "header") and place it at the top of the cluster.
Continue until all clusters are labeled.
Complete the diagram and discuss the results.

Matrix Diagram
It is also known as a matrix or matrix chart, as it uses a matrix to display information. The matrix diagram displays the relationships amongst two, three or four groups of information, like the strength of the relationship amongst the groups, the roles played by the various groups, etc. It enables systematic analysis of the correlations between groups of information. Six differently shaped matrix diagrams are possible: L, T, Y, X, C and roof-shaped, depending on how many groups must be compared. The relationship amongst two groups of entities is shown by an L-shaped or roof-shaped matrix. A T-shaped, Y-shaped or C-shaped matrix is used to show the relationships amongst three groups, and an X-shaped matrix is used for four groups. The various matrix types are shown below.

L-shape

T-shape

Y-shape

X-shape

Interrelationship Digraph
Interrelationship digraphs help in organizing disparate information, usually ideas generated during brainstorming sessions. An interrelationship digraph defines the ways in which ideas influence one another, instead of arranging ideas into groups as an affinity diagram does. Like affinity diagrams, interrelationship digraphs are developed by writing down the ideas or information on paper such as Post-it notes, which are then placed on a large sheet of paper, and arrows are drawn between related ideas. An idea that has arrows leaving it but none entering it is a root idea. Evaluating the relationships between ideas makes the functioning of the system clear, and usually the root idea is the key to improving the system.
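The rule for spotting a root idea, arrows leaving but none entering, can be expressed as a small sketch. The ideas and arrows below are hypothetical, and each arrow is stored as a (cause, effect) pair.

```python
# Each arrow points from the influencing idea to the idea it affects.
# These ideas and links are made up for illustration.
arrows = [
    ("Unclear requirements", "Rework"),
    ("Unclear requirements", "Schedule slips"),
    ("Rework", "Schedule slips"),
    ("Schedule slips", "Customer complaints"),
    ("Staff turnover", "Rework"),
]


def root_ideas(arrows):
    """Return ideas that have outgoing arrows but no incoming arrows."""
    sources = {cause for cause, _ in arrows}
    targets = {effect for _, effect in arrows}
    return sorted(sources - targets)


roots = root_ideas(arrows)
print(roots)
```

In this made-up digraph two ideas qualify as roots, which suggests the team would discuss both before choosing where to act.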

Benchmarking
Benchmarks are measures (of quality, time or cost) that have already been achieved by others. They indicate what level of performance is possible and so help in setting goals for the organization's own operations. Benchmarking is also helpful for bringing new ideas into the process, even though they are borrowed from others. Benchmarking data is usually sourced from surveys or interviews with industry experts, trade or professional organizations, published articles, company tours, the prior experience of current staff, or conversations.

Types of Benchmarks
Internal/Company - It establishes a baseline for external benchmarking, identifies differences within the company and provides rapid and easy-to-adapt improvements, though the opportunities for improvement are limited to the company's own practices.
Direct Competition - It prioritizes areas of improvement according to the competition and is of interest to most companies, but it often involves a limited pool of participants. Opportunities for improvement are therefore limited to "known" competitive practices, and it may lead to potential antitrust issues.
Industry - It provides industry trend information and is a conventional basis for quantitative and process-based comparison, though the opportunities for improvement may be limited by industry paradigms.
Best-in-Class - It examines multiple industries and, by building a brand new perspective, provides the best opportunity for identifying radically innovative practices and processes. However, it is usually difficult to identify best-in-class companies and get them to participate.

Prioritization Matrix
To prioritize is to arrange or deal with items in order of importance. A prioritization matrix is a combination of a tree diagram and a matrix chart, and it is used to help decision makers determine the order of importance of activities. It narrows down options by systematically comparing choices through the selection, weighting and application of criteria. It quickly surfaces basic disagreements, forces the team to narrow down from all solutions to the best solutions, limits "hidden agendas" by bringing decision criteria to the forefront of a choice, and increases follow-through by asking for consensus after each step of the process.

Developing a prioritization matrix - It involves five simple steps:
Determine criteria and rating scale - Determine the factors used to assess the importance of each entity. Choose factors that clearly differentiate important from unimportant entities; these are the criteria, like the value the entity brings to the customer. Then, for each criterion, establish a rating scale to use in assessing how well a particular entity satisfies that criterion.
Establish criteria weight - Place the criteria in descending order of importance and assign each a weight.
Create the matrix - List the criteria down the left column, and the weight and the names of potential entities across the top, in an L-shaped matrix to judge the relative importance of each criterion.
Work in teams to score entities - Review each entity and rate it on each of the criteria. Next, multiply the rating for each criterion by its weight and record the weighted value. After evaluating the entity against all of the criteria, add up the weighted values to determine the entity's total score.
Discuss results and prioritize list - After the entities have been scored, hold a discussion to compare notes on the results and develop a master list of prioritized entities that everyone agrees upon.

An example of a prioritization matrix, where 10 is much less expensive, 5 is less expensive, 1 is same cost, 0.2 is more expensive and 0.1 is much more expensive
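The scoring step (multiply each rating by its criterion weight, then add up) can be sketched as below. The criteria, weights, options and ratings are hypothetical and are not taken from the example matrix in the text.

```python
def weighted_scores(weights, ratings):
    """Total weighted score per option, highest score first.

    weights: {criterion: weight}
    ratings: {option: {criterion: rating}}
    """
    totals = {
        option: sum(weights[c] * option_ratings[c] for c in weights)
        for option, option_ratings in ratings.items()
    }
    return dict(sorted(totals.items(), key=lambda kv: kv[1], reverse=True))


# Hypothetical criteria weights, most important first.
weights = {"customer value": 5, "cost": 3, "ease of implementation": 2}

# Hypothetical ratings on a 1 to 10 scale (higher is better).
ratings = {
    "Automate order entry": {"customer value": 8, "cost": 4, "ease of implementation": 3},
    "Retrain staff": {"customer value": 5, "cost": 8, "ease of implementation": 9},
    "Redesign form": {"customer value": 6, "cost": 9, "ease of implementation": 7},
}

scores = weighted_scores(weights, ratings)
for option, score in scores.items():
    print(f"{score:>4}  {option}")
```

The sorted totals are only a starting point; the final step in the text still calls for the team to discuss the results and agree on the master list.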


Focus Group
Focus groups are facilitated discussion sessions with customers that help an organization understand the Voice of the Customer (VOC). Usually they are sessions of 1-3 hours with a maximum of 20 customers. They facilitate a better understanding of the voice of the customer and help organize the gathered data. They also enable evaluation of the feedback and channel it for further action. Usually two types of focus group are applied. The first is the explorative focus group, which explores the collective needs of customers and develops and evaluates concepts for new product development as sensed or demanded by the voice of the customer. The second, the experiential focus group, observes the usage of products in the market and studies what customers feel and experience about the products, learning their reasons and motivations for using them. Online focus groups have gained importance in recent times due to access to the internet; the discussion takes place on the internet instead of at an interview site. Online focus groups are more suited to younger age groups.

Gantt Chart
It is a graphical chart showing the relationships amongst the project tasks, along with time constraints. The horizontal axis of a Gantt chart shows the units of time (days, weeks, months, etc.). The vertical axis shows the activities to be completed. Bars show the estimated start time and duration of the various activities. A Gantt chart shows what has to be done (the activities) and when (the schedule), as shown in the figure below.

Milestone Charts - Gantt charts are often modified in a variety of ways to provide additional information. One common variation is the milestone chart. The milestone symbol represents an event rather than an activity; it does not consume time or resources.

CPM/PERT Chart
CPM or "Critical Path Method" - It is a tool to analyze a project and determine its duration, based on identification of the "critical path" through an activity network. Knowing the critical path permits project managers to manage the project duration. It is a project modeling technique developed in the 1950s and is used with all forms of projects. It displays activities as nodes or circles with known activity times. A CPM chart is a diagram showing every step of the project as letters, with lines to each letter representing the sequence in which the project steps take place. It requires a list of the activities needed to complete the project and the time (duration) that each activity will take, along with the sequence of and dependencies between activities. CPM lays out the longest path of planned activities to the end of the project, as well as the earliest and latest times at which each activity can start and finish without delaying other steps in the project. The project manager can then determine which activities in the project need to be completed before others and how long those activities can take before they delay other parts of the project. The manager also gets to know which set of activities is likely to take the longest. This set is called the critical path, and its length is also the shortest possible time in which the project can be completed.
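The earliest and latest start and finish times described above can be computed with a forward and a backward pass over the activity network. The following is a minimal sketch under stated assumptions: durations are deterministic, every predecessor is itself listed as an activity, and the five activities and their durations are hypothetical.

```python
def critical_path(activities):
    """Forward/backward pass over an activity network.

    activities: {name: (duration, [predecessor names])}
    Returns (project duration, {name: (ES, EF, LS, LF)}, critical activities).
    """
    # Order activities so that every predecessor comes before its successors.
    order, placed = [], set()
    while len(order) < len(activities):
        progressed = False
        for name, (_, preds) in activities.items():
            if name not in placed and all(p in placed for p in preds):
                order.append(name)
                placed.add(name)
                progressed = True
        if not progressed:
            raise ValueError("dependencies contain a cycle or unknown activity")

    # Forward pass: earliest start (ES) and earliest finish (EF).
    es, ef = {}, {}
    for name in order:
        duration, preds = activities[name]
        es[name] = max((ef[p] for p in preds), default=0)
        ef[name] = es[name] + duration
    project_duration = max(ef.values())

    # Backward pass: latest finish (LF) and latest start (LS).
    ls, lf = {}, {}
    for name in reversed(order):
        successors = [s for s, (_, preds) in activities.items() if name in preds]
        lf[name] = min((ls[s] for s in successors), default=project_duration)
        ls[name] = lf[name] - activities[name][0]

    # Activities with zero slack (LS == ES) lie on the critical path.
    critical = [name for name in order if ls[name] == es[name]]
    schedule = {n: (es[n], ef[n], ls[n], lf[n]) for n in order}
    return project_duration, schedule, critical


# Hypothetical project: durations in days.
activities = {
    "A": (3, []),
    "B": (2, []),
    "C": (4, ["A"]),
    "D": (1, ["B"]),
    "E": (2, ["C", "D"]),
}

duration, schedule, critical = critical_path(activities)
print("Project duration:", duration)
print("Critical path:", " -> ".join(critical))
```

In this made-up network, activity B can start as late as day 4 without delaying the project, while any delay to A, C or E delays the finish.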

PERT Chart - A PERT chart (program evaluation review technique) is a form of diagram for CPM that shows activity on an arrow diagram. PERT charts are simpler than CPM charts because they only show the timing of each step of the project and the sequence of the activities. In PERT the estimates are uncertain, so ranges of duration and the probability that an activity's duration will fall into a range are used, whereas CPM is deterministic. A PERT chart is a graphic representation of a project's schedule, showing the sequence of tasks, which tasks can be performed simultaneously, and the critical path of tasks that must be completed on time in order for the project to meet its completion deadline. The chart can be constructed with a variety of attributes, such as the earliest and latest start dates for each task, the earliest and latest finish dates for each task, and the slack time between tasks. A PERT chart can document an entire project or a key phase of a project. The chart allows a team to avoid unrealistic timetables and schedule expectations, to identify and shorten tasks that are bottlenecks, and to focus attention on the most critical tasks. It is most useful for planning and tracking entire projects, or for scheduling and tracking the implementation phase of a planning or improvement effort.


Developing a PERT Chart
Identify all tasks or project components - Ensure the team has knowledge of the project so that during the brainstorming session all component tasks needed to complete the project are captured. Document the tasks on small note cards.
Identify the first task that must be completed - Place the appropriate card at the extreme left of the working surface.
Identify any other tasks that can be started simultaneously with task #1 - Align these tasks either above or below task #1 on the working surface.
Identify the next task that must be completed - Select a task that must wait to begin until task #1 (or a task that starts simultaneously with task #1) is completed. Place the appropriate card to the right of the card showing the preceding task.
Identify any other tasks that can be started simultaneously with task #2 - Align these tasks either above or below task #2 on the working surface. Continue this process until all component tasks are sequenced.
Identify task durations - Reach a consensus on the most likely amount of time each task will require for completion. Duration is usually considered to be the elapsed time for the task, rather than the actual number of hours or days spent doing the work. Document this duration on the appropriate task cards.
Construct the PERT chart - Number each task, draw connecting arrows, and add task characteristics such as duration, anticipated start date and anticipated end date.
Determine the critical path - The project's critical path includes those tasks that must start or finish on time to avoid delays to the total project. Critical paths are typically displayed in red.

Activity Network Diagram
It charts the flow of activity between separate tasks and graphically displays the interdependent relationships between groups, steps and tasks as they all impact a project. Bubbles, boxes and arrows are used to depict these activities and the links between them. It shows the sequential relationships of activities using arrows and nodes to identify a project's critical path. It is similar to the CPM/PERT chart and is also called an arrow diagram.

Developing an Activity Network Diagram - Development starts with compiling a list of the tasks essential for completion of the project. These tasks are then arranged in chronological order for the project, taking inter-task dependencies into account. All tasks are placed along a progressing line: tasks that can be done simultaneously are placed on parallel paths, whereas tasks that depend on one another are placed in a chronological line. Apply a realistic time estimate to each task, then identify the critical path.


2.4. Business Results

Business results are the measured outcomes, identified during the planning stage, that show the impact of the project on the organization. They involve performance measures for the business and for the processes involved.

Business Performance
It is the crucial performance measure, and the balanced scorecard is used for it. The balanced scorecard was developed by Robert S. Kaplan and David P. Norton and focuses on four perspectives:
Financial - It focuses on relevant high-level financial measures and involves measuring cash flow, sales growth, operating income and return on equity.
Customer - It identifies measures which are customer facing, like percent of sales from new products, on-time delivery, share of important customers' purchases and ranking by important customers.
Internal business processes - These measures answer the question "What must we excel at?" and include cycle time, unit cost, yield and new product introductions.
Learning and growth - It looks at the continuing ability to improve, create value and innovate, and thus involves measures like time to develop a new generation of products, life cycle to product maturity, and time to market versus the competition.

Project Performance
It usually includes performance indexes on cost, schedule, defects per project, response time, etc. Two common measures are:
Cost Performance Index (CPI) - It is a measure of the efficiency of the expenses spent on a project. It is expressed as the ratio of the budgeted cost of work performed (BCWP) to the actual cost of work performed (ACWP).
Schedule Performance Index (SPI) - It measures the success of project management in completing work on time. It is expressed as the ratio of the budgeted cost of work performed (BCWP) to the budgeted cost of work scheduled (BCWS).
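The two ratios can be worked through with a small sketch, taking CPI as BCWP divided by the actual cost of work performed and SPI as BCWP divided by BCWS. The function names and the cost figures are hypothetical.

```python
def cost_performance_index(bcwp, acwp):
    """CPI: budgeted cost of work performed / actual cost of work performed."""
    return bcwp / acwp


def schedule_performance_index(bcwp, bcws):
    """SPI: budgeted cost of work performed / budgeted cost of work scheduled."""
    return bcwp / bcws


# Hypothetical status at a project review (same currency for all figures).
bcws = 50_000  # value of the work that was planned to be done by now
bcwp = 45_000  # budgeted value of the work actually completed
acwp = 48_000  # what the completed work actually cost

cpi = cost_performance_index(bcwp, acwp)
spi = schedule_performance_index(bcwp, bcws)
print(f"CPI = {cpi:.4f}, SPI = {spi:.2f}")
```

On the usual reading, an index below 1 signals a problem: here the made-up project is both over budget (CPI below 1) and behind schedule (SPI below 1).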

Process Performance It is a measure of an organization's activities and performance and includes metrics as Percentage Defective - This is defined as the (Total number of defective parts)/(Total number of parts) X 100. So if there are 1,000 parts and 10 of those are defective, the percentage of defective parts is (10/1000) X 100 = 1% PPM – It is same as the ratio defined in percentage defective, but multiplied by 1,000,000 and PPM for above example is 10,000. It indicates of presence of one or more defects only. Defects per Unit (DPU) – It finds the average number of defects per unit which also needs categorization of the units into number of defects from 0, 1, 2, up to the maximum number. As an example, the below chart shows defect count for 100 units with maximum of 5 defects. Defects 0 1 2 3 4 5 # of Units 70 20 5 4 9 1 The average number of defects is DPU = [Sum of all (D * U)]/100 = [(0 * 70) + (1 * 20) + (2 * 5) + (3 * 4) + (4 * 9) + (5 * 1)]/100 = 47/100 = 0.47 Defects per Opportunity (DPO) – It focus on number of ways of a defect occurrence or the defect “opportunity”, similar to a failure mode in FMEA. As an example from previous data considering that each unit can have a defect occurrence in one of 6 possible ways. Then the number of opportunities for a defect in each unit is 6 and DPO = DPU/O = 0.47/6 = 0.078333 Defects per Million Opportunities (DPMO) - It is obtained by multiplying DPO by 1,000,000 as DPMO = DPO * 1,000,000 = 0.078333 * 1,000,000 = 78,333 Rolled Through Yield (RTY) - A yield measures the probability of a unit passing a step defect-free, and the rolled throughput yield (RTY) measures the probability of a unit passing a set of processes defect-free. This takes the percentage of units that pass through several sub-processes of an entire process without a defect. The number of units without a defect is equal to the number of units that enter a process minus the number of defective units. 
For illustration, if the number of units given as input to a process is P and the number of defective units is D, then the first-pass yield (FPY) of each sub-process is equal to (P - D)/P. After getting the FPY for each sub-process, multiply them together to obtain the RTY. For example, if the yields of 4 sub-processes are 0.994, 0.987, 0.951 and 0.990, then RTY = (0.994)(0.987)(0.951)(0.990) = 0.924 or 92.4%.
Sigma Level - A Six Sigma process is one whose output, standardized to a mean of 0 and a standard deviation of 1, has its upper specification limit (USL) and lower specification limit (LSL) set at +6 and -6. However, there is also the matter of the 1.5-sigma shift, which occurs over the long term. After computing DPMO and RTY, the sigma level can also be determined as


Yield        DPMO       Sigma Level
30.9%        690,000    1.0
69.1%        308,000    2.0
93.3%        66,800     3.0
99.4%        6,210      4.0
99.98%       233        5.0
99.9997%     3.4        6.0
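The DPU, DPO, DPMO and RTY figures above can be reproduced with a short Python sketch. It assumes the unit counts total 100, with no units having exactly 4 defects, which is what the worked result of 47/100 implies. The `sigma_level` helper is an illustrative assumption: it converts DPMO to a sigma level through the inverse standard normal distribution plus the 1.5-sigma shift, which approximates the lookup table rather than replacing it.

```python
from math import prod
from statistics import NormalDist

# Defects per unit -> number of units (counts total 100 units, 47 defects).
units_by_defects = {0: 70, 1: 20, 2: 5, 3: 4, 4: 0, 5: 1}
opportunities_per_unit = 6

total_units = sum(units_by_defects.values())
total_defects = sum(d * u for d, u in units_by_defects.items())

dpu = total_defects / total_units      # defects per unit
dpo = dpu / opportunities_per_unit     # defects per opportunity
dpmo = dpo * 1_000_000                 # defects per million opportunities

# Rolled throughput yield: product of the first-pass yields of the sub-processes.
first_pass_yields = [0.994, 0.987, 0.951, 0.990]
rty = prod(first_pass_yields)


def sigma_level(dpmo_value: float, shift: float = 1.5) -> float:
    """Approximate sigma level for a DPMO strictly between 0 and 1,000,000.

    Assumes normally distributed output and the conventional 1.5-sigma shift.
    """
    defect_rate = dpmo_value / 1_000_000
    return NormalDist().inv_cdf(1 - defect_rate) + shift


print(f"DPU = {dpu:.2f}, DPO = {dpo:.6f}, DPMO = {dpmo:,.0f}")
print(f"RTY = {rty:.3f}")
print(f"Sigma level at 3.4 DPMO is about {sigma_level(3.4):.1f}")
```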

Cost of poor quality - It is also called the cost of nonconformance and includes the cost of all defects, classified as
Internal defects - Found before the product leaves the organization; the costs include scrapping, repairing, or reworking the parts.
External defects - Found after the product leaves the organization; the costs include warranty, returned merchandise, or product liability claims and lawsuits.
The external cost is difficult to calculate because it can be delayed by months or even years after the products are sold; thus, it is usually the internal costs of poor quality that are computed.
Process Capability - It compares the output of an in-control (stable) process to the specification limits by using capability indices, and so makes a statement about how well the process meets specification. The comparison is made by forming the ratio of the spread between the process specifications (the specification "width") to the spread of the process values, as measured by 6 process standard deviation units (the process "width"). Several statistical measures are used to measure the capability of a process, such as Cp, Cpk and Cr.

Index   Description
Cp      Estimates what the process is capable of producing if the process mean were to be centered between the specification limits. Assumes process output is approximately normally distributed.
Cpk     Estimates what the process is capable of producing, considering that the process mean may not be centered between the specification limits.
Cr      It is 1 divided by Cp.
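As a minimal sketch, the three indices can be computed from sample measurements in Python. The text gives Cp as the specification width over 6 standard deviations and Cr as 1/Cp; the Cpk formula used here (distance from the mean to the nearer specification limit, divided by 3 standard deviations) is the commonly used definition and is an assumption, since the text does not state it. The measurements and limits are hypothetical.

```python
from statistics import mean, stdev


def capability_indices(samples, lsl, usl):
    """Return (Cp, Cpk, Cr) for measurements from a stable process.

    Uses the sample standard deviation as the estimate of process sigma.
    """
    mu = mean(samples)
    sigma = stdev(samples)
    cp = (usl - lsl) / (6 * sigma)               # specification width / process width
    cpk = min(usl - mu, mu - lsl) / (3 * sigma)  # accounts for an off-center mean
    cr = 1 / cp                                  # capability ratio
    return cp, cpk, cr


# Hypothetical measurements and specification limits, for illustration only.
measurements = [9.8, 10.1, 10.0, 10.2, 9.9, 10.0, 10.1, 9.9]
cp, cpk, cr = capability_indices(measurements, lsl=9.4, usl=10.6)
print(f"Cp = {cp:.2f}, Cpk = {cpk:.2f}, Cr = {cr:.2f}")
```

Because Cpk uses the nearer limit, it can never exceed Cp; the two are equal only when the process mean sits exactly midway between the limits.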

FMEA and RPN
Failure Mode and Effects Analysis (FMEA) is carried out for failure analysis and involves computing the Risk Priority Number (RPN). Both have been explained earlier.

2.5. Team Dynamics and Performance
A team is a group of people, but not every group is a team. A team differs from a group in that it is usually small and exists for a relatively long period of time, until the objective for which it was formed is accomplished. Ideally, a team consists of members who possess varied skills, so that it can efficiently handle the different tasks that its job responsibilities require. The purpose of forming a team is to improve the internal and external efficiencies of the company. This is done through the efforts of the team members to improve quality, methods, and productivity. Management supports the team process by
Ensuring a constancy of purpose
Reinforcing positive results
Sharing business results
Giving people a sense of mission
Developing a realistic and integrated plan
Providing direction and support

Team Types
A team can generally be classified as ‘formal’ or ‘informal’.
Formal team - It is a team formed to accomplish a particular objective or a particular set of objectives. The objective of the team formation is called its ‘mission’ or ‘statement of purpose’. A formal team may have a charter, a list of team members, and a letter of authorization and support from the management.
Informal team - This type of team does not have the documents that a formal team has. However, an informal team has a versatile membership, as its members can be changed as per the requirements of the task at hand.
A team can also be classified into the following types, depending on the situation and on constraints that prohibit the formation of either formal or informal teams
Virtual team - A virtual team is usually formed to overcome the constraint of geographical locations which separate members. It consists of members who live in different places and who may never meet one another while the team works towards its goal. The members make use of different technologies like telephone, internet, etc. to coordinate within the team for the achievement of the common goal.
Process improvement team - It is formed to discover the modifications required in a particular process in order to improve it. It consists of members who belong to the various groups that will be affected by the proposed changes, thus making it cross-functional in nature.
Self-directed and work group teams - These have wide-ranging goals that are ongoing and repetitive, which requires the team to carry out activities on a daily basis. They are usually formed to make decisions on matters such as safety, personnel, maintenance, quality, etc.

Team Roles and Member Selection
A team performs optimally when all the members are assigned appropriate roles and understand their roles in terms of the overall functioning of the team. Some of the major team roles and responsibilities are as follows

Champion
- Sets and maintains broad goals for improvement projects in area of responsibility
- Owns the process
- Coaches and approves changes, if needed, in direction or scope of a project
- Finds (and negotiates) resources for projects
- Represents the team to the Leadership group and serves as its advocate
- Helps smooth out issues and overlaps
- Works with Process Owners to ensure a smooth handoff at the conclusion of the project
- Holds regular reviews with Process Owner on key process inputs and outputs
- Uses DMAIC tools in everyday problem solving

Process Owner
- Maximizes high level process performance
- Launches and sponsors improvement efforts
- Tracks financial benefit of project
- Understands key process inputs and outputs and their relationship to other processes
- Key driver to achieve Six Sigma levels of quality, efficiency and flexibility for this process
- Uses DMAIC tools in everyday problem solving
- Participates on GB/BB teams

Team Member
- Participates with project leader (GB or BB)
- Provides expertise on the process being addressed
- Performs action items and tasks as identified
- Uses DMAIC tools in everyday problem solving
- Subject matter expert (SME)

Green Belt (GB)
- Leads and/or participates on Six Sigma project teams
- Identifies project opportunities within their organization
- Knows and applies Six Sigma methodologies and tools appropriately

Black Belt (BB)
- Proficient in Six Sigma tools and their application
- Leads/supports high impact projects to bottom line full-time
- Directly supports MBB’s culture change activities
- Mentors and coaches Green Belts to optimize functioning of Six Sigma teams
- Facilitates, communicates, and teaches
- Looks for applicability of tools and methods to areas outside of current focus
- Supports Process Owners and Champions

Master Black Belt (MBB)
- Owns Six Sigma deployment plan and project results for their organization
- Responsible for BB certification
- Supervisor for DMAIC BBs; may be supervisor for DFSS BBs
- Influences senior management and Champions to support organizational engagement
- Leads culture change by communicating Six Sigma methodology and tools
- Supports Champions in managing project and project prioritization
- Ensures that project progress check, gate review, and closing processes meet corporate requirements and meet division needs
- Communicates, teaches, and coaches

Coach
Some businesses have coaches who support the GBs, and others have coaches for the BBs.
- Trains Green Belts with help from BBs and MBB
- Coaches BBs and GBs in proper use of tools for project success
- Is a consulting resource for project teams

Team Stages
Most teams go through four development stages before they become productive - forming, storming, norming, and performing. Bruce W. Tuckman first identified these four development stages; a final transitioning phase follows when the project ends. The stages are
Forming - Expectations are unclear. When a team forms, its members typically start out by exploring the boundaries of acceptable group behavior while the leader directs the team. Members try to please each other and take pride in being part of the new team. This period is also called the honeymoon period.
Storming - Consists of conflict and resistance to the group’s task and structure. Conflict often occurs and disagreements slow down the team as every team member asserts his or her position. However, if dealt with appropriately, these stumbling blocks can be turned into performance later. This is the most difficult stage for any team to work through.
Norming - A sense of group cohesion develops and team members resolve conflicts by settling on mutually agreeable ideas. Team members spend more energy on data collection and analysis as they begin to test theories and identify root causes. The team develops a routine and trust amongst members.
Performing - The team begins to work effectively and cohesively, as each team member works independently with a clear responsibility and function.
Transitioning - In this last phase, the team is split up as the project ends. If the project’s scope is increased, selected team members continue as the new scope requires and the rest go back to other work.
Various conflicts arise amongst members of the team, or between members and the leader, during team formation, relating to the group objectives, structure, or procedures. Several ways to resolve them include
Do not tighten control or try to force members to conform to the procedures or rules established in the earlier stage.
If disputes over procedures crop up, opt for a group consensus.
Investigate the reasons behind the conflict and negotiate an acceptable solution.
In inter-member conflict, act as a mediator between team members.
Dissuade any counter-productive behavior.
Focus on working together efficiently.
Group norms are enforced on the group by the group itself. Common problems faced by a team include
Floundering - It can be resolved by reviewing the plan and developing a new plan for movement.
Reluctant or Dominating participants - Its resolution is to structure the members' participation and balance it so that it is not tilted towards a few members of the team. The leader also acts as gate-keeper.
Feuds - These are resolved by talking to the offending parties in private and developing ground rules for engagement and behavior.

Team Tools
Various tools are used by team members and leaders during team formation and its different phases, which include
Brainstorming - The brainstorming technique was introduced by Alex Faickney Osborn in his book Applied Imagination in 1953. It is used as a tool to create ideas about a particular topic and to find creative solutions to a problem.
Brainstorming Procedure - The first step in conducting brainstorming is to review the rules of brainstorming. Some of the rules are: all ideas should be recorded, and there is no scope for criticism, evaluation or discussion of ideas. The second step is to examine the problem that has to be discussed. Ensure that all the team members understand the theme of the brainstorming. Give enough time (e.g., one or two minutes) for the team members to think about the problem. Ask the team members to think creatively and generate as many ideas as possible. Record the ideas generated by the members so that everyone can review them. Proper care has to be taken to ensure that there is no criticism of any of the ideas and that everyone is allowed to be creative.
Brainstorming Rules - Rules to be followed for brainstorming are
Ensure that all the team members participate in the brainstorming session, because the more ideas that are produced, the greater will be the effect of the solution.
As the brainstorming session is a discussion among various people, no distinction should be made between them. The ideas generated by other people should not be condemned.
When building on people’s ideas, consider each person’s ideas as the best, because the ideas generated by one individual may be superior to another's.
While generating ideas, always put more trust in the quantity of ideas than in their quality.
As a facilitator, tally the generated ideas with the team’s performance.
Nominal Group Technique (NGT) - The nominal group technique was introduced by Delbecq, Van de Ven, and Gustafson in 1971. It is a kind of brainstorming that encourages every participant to express his/her views. This technique is used to create a ranked list of ideas. In this technique, all the participants are requested to write their ideas anonymously; the moderator collects the written ideas and each is voted on by the group. It helps in decision-making and organizational planning where creative solutions are sought. It is generally carried out on a Six Sigma project to get feedback from the team members.
NGT Procedure - All the members of the team are asked to create ideas and write them down without discussing them with others. The inputs from all members are openly displayed and each person is asked to explain his/her input further. Each idea is then discussed for clarification and evaluation. This is usually a repetitive process. Each person is allowed to vote individually on the priority of ideas and a group decision is made based on these ratings.
Multi-voting - Multivoting, which is also called NGT voting or nominal prioritization, is a simple technique used by teams to choose the most significant or highest priority items from a list with limited discussion and difficulty. Generally it follows the brainstorming technique. Multivoting is used when the group has a lengthy list of possibilities and wants to narrow it down to a short list for later analysis and discussion.
Multivoting Procedure - The procedure to be followed for conducting multivoting is
Conduct a brainstorming process to create a list of ideas and record the ideas that are created during this process.
After completing this, clarify the ideas and combine them so that everyone can easily understand them. The group should not discuss the ideas at this time.
Participants vote for the ideas that merit more discussion. Here the participants are free to vote for as many ideas as they desire.
Tally the votes for each item. If an item gets a majority of votes, it goes forward to the next round.
In the next round of voting, the participants cast their votes for the remaining items in the list.
Participants continue voting until a suitable number of ideas remains for the group to examine as part of the decision-making or problem-solving process.
The remaining ideas are discussed when the group holds a discussion about the pros and cons of the project. This discussion may be carried out by the group as a whole.
Follow up with proper actions by choosing the best option or identifying the top priorities.
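One round of the tallying step can be sketched in Python as follows. This is only an illustration under stated assumptions: "majority" is taken to mean votes from more than half of the participants, each participant may vote for any number of ideas but only once per idea, and the ballots and the `multivote_round` helper are hypothetical.

```python
from collections import Counter


def multivote_round(ballots):
    """Return the ideas voted for by more than half of the participants.

    Each ballot is the set of ideas one participant voted for. The result is
    ordered by vote count (highest first), then alphabetically for ties.
    """
    tally = Counter(idea for ballot in ballots for idea in set(ballot))
    majority = len(ballots) / 2
    kept = [idea for idea, votes in tally.items() if votes > majority]
    return sorted(kept, key=lambda idea: (-tally[idea], idea))


# Hypothetical first-round ballots from four participants.
ballots = [
    {"reduce setup time", "cross-train staff", "new supplier"},
    {"reduce setup time", "new supplier"},
    {"reduce setup time", "automate inspection"},
    {"cross-train staff", "new supplier"},
]

shortlist = multivote_round(ballots)
print(shortlist)  # ['new supplier', 'reduce setup time']
```

Further rounds would repeat the same call with ballots restricted to the shortlist, until the list is short enough for the group to discuss.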

Team Communication
Communication is the exchange of information, ideas and knowledge between sender and receiver through an accepted code of symbols. It is a two-way process. The process consists of
an information source or sender, which produces a message
a transmitter or encoder, which encodes the message into signals
a channel, to which signals are adapted for transmission
a receiver or decoder, which decodes the message from the signal
a destination, where the message arrives
noise, which is any interference with the message traveling along the channel
Communication Types - Communication is either verbal or non-verbal.

Verbal communication - It uses a verbal medium like words, speeches, presentations etc., and the sender shares his/her thoughts in the form of words. The tone of the speaker, the pitch and the quality of words play a crucial role in verbal communication. The speaker has to be loud and clear enough for everyone to understand, and the content has to be properly defined and designed keeping the target audience in mind. In verbal communication it is the responsibility of the sender to cross-check with the receiver whether he/she has got the correct information, and the receiver must give the required response.
Non-verbal communication - It involves facial expressions, gestures, hand and hair movements and body postures. Any communication made between two people without words, simply through facial movements, gestures or hand movements, is called non-verbal communication. In other words, it is speechless communication, where content is not put into words but expressed through expressions. If one has a headache, one would put a hand on the forehead to communicate the discomfort, which is a form of non-verbal communication. Non-verbal communication is vital in offices and meetings.
Communication Barriers - Barriers block communication, so that the information being communicated is not absorbed correctly by the audience. Various barriers which affect communication are
Noise - Sounds present during communication, such as high-volume music at a wedding.
Cultural barriers - Differences in culture between the persons communicating, as when dealing with a foreigner.
Emotions - The receiver is emotionally charged, for example grieving the death of someone close.
Poor retention - The receiver is unable to recall or remember the information.
Poor Timing - A last-moment communication with a deadline may put too much pressure on the receiver and may result in resentment.
Inappropriate Channel - A poor choice of communication channel can also contribute to the misunderstanding of the message.
Network Breakdown - Sometimes staff may forget to forward a letter, or there may be professional jealousy resulting in a closed channel.
Barrier Removal can be done by taking effective steps as per the barrier type. Barrier-solving steps usually include
Effective listening
Convey the emotional contents of the message
Use appropriate language
Use the proper channel
Encourage open communication
Ensure two-way communication
Make best use of body language
Communication techniques for project success - The environment in which conflict is managed is important. It is essential to manage communications to overcome the barriers and foster a supportive climate, marked by emphasis on
Presenting ideas or opinions.
Problem orientation - focusing attention on the task.
Spontaneity - communicating openly and honestly.
Empathy - understanding another person's thoughts.
Equality - asking for opinions.
A willingness to listen to the ideas of others.

3. MEASURE
The measure phase has the following objectives
Defining and identifying the specific processes under investigation.
Defining metrics for measurement of the processes against project objectives.
Establishing the process baseline for validating the present outcomes against defined business needs and for demonstrating improvement results.
Evaluating the measurement system to validate the reliability of data for drawing meaningful conclusions.

3.1. Process Analysis and Documentation
A process is a group of resources and activities that transforms inputs into outputs, adding value by executing repeatable tasks in a specific order. These activities and resource inputs should be documented and controlled, at least for crucial processes.

Process Modeling
Flow charts, process maps, written procedures and work instructions are tools used for process modeling and documentation.
Flow Charts - A flow chart or process map is a simple graphical tool for documenting the process flow; it is comprehensible to users because it depicts the process sequence. A flow chart examines each step in detail, as each task is represented by a symbol. ANSI standards list the various symbol types used in a flowchart. A flowchart helps in identifying improvements and can be used to compare the present process with the desired process. Diamond-shaped symbols are used for decisions with only two outcomes (yes or no).
[Figure: Common flow chart symbols]

Process Mapping - It depicts a process in schematic format, thus providing the ability to visualize the process under review. Process mapping is similar to flow charting, as it describes a process with symbols, arrows and words, thus avoiding lengthy explanations. Process maps are used to outline new procedures and to review old procedures for improvements. Many of the symbols used are standardized under ANSI Y15.3. Process maps are usually used to analyze and document top-level processes. Process mapping consists of different kinds of maps
Relationship Maps show the overall view. They show the departments of an organization and how they interact with suppliers and customers.
Cross-functional Maps or Swim Lane Charts show which department performs each step and the inputs and outputs of each step. These maps have more detail than a relationship map but less than a flowchart.

Relationship Map

Cross Functional Map

Written Procedures - Written procedures help in standardizing processes and thereby also open up avenues for improving them. They are developed by process owners or those responsible for the process. Written procedures are used to explain complex or lengthy processes and routine complex tasks, which makes them crucial for consistency. They must be written so as to be comprehensible to the user. Documenting the process in the form of a procedure facilitates consistency in the process and avoids mistakes. A written procedure describes the process at a general level, whereas work instructions are more specific.
Work Instructions - Work instructions are a list or sequence of steps to be undertaken, usually by the operational staff, to accomplish a specific task. These instructions are specific to a task, which is usually routine, and list the step-by-step sequence of activities. Flow charts may also be used with work instructions to show the relationships between process steps. Controlled copies of work instructions are kept in the area where the activities are performed.

Process Inputs and Outputs
Before a process can be improved, it must first be measured. This is accomplished by identifying process input variables and process output variables, and documenting their relationships through cause and effect diagrams, SIPOC and other similar tools. Inputs (Xs) are causes, or independent variables, which result in specific outputs (Ys), the effects or dependent variables. Process maps are therefore expanded by SIPOC to cover the customer and the supplier, so that measurable data can be gathered from all of them.
Cause and Effect (Fishbone) Diagrams - They are also called Ishikawa diagrams and have been discussed earlier. They break problems down into small pieces and display the possible causes in a graphical manner. They show how various causes interact with each other, and brainstorming rules are used when generating ideas for them. Developing a fishbone diagram consists of brainstorming, prioritizing and developing an action plan.

It highlights the potential causes of a particular problem, or effect. During the measure phase, it is used to brainstorm potential X data. The selected CTQ or CTP is placed in the head of the fish. Each bone is labeled with a category, usually people, machine, materials, environment and methods. The team reviews each category by brainstorming the input and process data to collect. Developing a fishbone diagram in the measure phase involves reviewing the process maps developed in the define phase, reviewing the categories, putting the CTQ or CTP in the head of the fish, brainstorming, and reviewing the diagram, with revision as needed.
SIPOC - SIPOC stands for Suppliers-Inputs-Process-Outputs-Customers and has been discussed earlier. SIPOC addresses questions about the inputs, outputs, suppliers and customers, such as what output the process produces, who provides inputs to the process, what the inputs are, what resources the process uses, which steps add value, etc. These questions apply to all processes, and SIPOC addresses them by putting in place a standard format. SIPOC development is initiated with persons having knowledge of the process, followed by a brainstorming session to describe the problems and garner consensus for resolution.

Development of SIPOC involves identifying the process steps, then the outputs of the process, followed by the customers receiving those outputs, then the inputs and the suppliers of the required inputs.
Relational Matrices - A relational matrix is used to assess the effect of each input (X) on each output (Y) of a process. It helps the team to identify and agree upon the outputs critical to the product and/or customer, with a level of importance assigned to each output variable as a numerical rating. It also highlights the relationship between inputs and outputs (Y = f(X)), and the relative importance of the inputs is computed. The procedure for developing a relational matrix is
Review the process map.
List the output variables (Ys) along the horizontal axis.
Rate each output in terms of its overall importance (for example, on a scale of 1 for low importance to 5 for high importance).
Identify the potential inputs (Xs) which influence the outputs (Ys).
Rate the effect of each X on each Y.
For each relationship, multiply the customer importance rating of the Y by the association rating of the X on that Y, and add the weighted ratings together to compute the importance score of each X.
Prioritize the inputs by importance score to focus on specific parts.
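The scoring step of a relational matrix can be sketched in Python as below. All outputs, inputs and ratings are hypothetical; the 1-to-5 importance scale follows the text, while the 0-to-9 scale for the effect ratings is an assumed convention.

```python
# Hypothetical outputs (Ys) with their importance ratings (1 = low, 5 = high).
output_importance = {"delivery time": 5, "order accuracy": 4, "cost": 2}

# Hypothetical inputs (Xs) with the rated effect of each X on each Y
# (assumed scale: 0 = no effect, 9 = strong effect).
effect_ratings = {
    "staff training": {"delivery time": 3, "order accuracy": 9, "cost": 1},
    "routing software": {"delivery time": 9, "order accuracy": 1, "cost": 3},
    "packaging supplier": {"delivery time": 1, "order accuracy": 3, "cost": 9},
}


def importance_scores(importance, ratings):
    """Weighted sum per input: sum over Ys of (importance of Y) * (effect of X on Y)."""
    return {
        x: sum(importance[y] * effect for y, effect in effects.items())
        for x, effects in ratings.items()
    }


scores = importance_scores(output_importance, effect_ratings)

# Prioritize: highest importance score first.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
for name, score in ranked:
    print(f"{name}: {score}")
```

With these ratings the scores are 55 for routing software, 53 for staff training and 35 for the packaging supplier, so the team would look at routing software first.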

3.2. Statistics and Probability
Drawing valid statistical conclusions
Drawing statistical conclusions involves the use of enumerative and analytical studies, which are
Enumerative (Descriptive) Studies - They describe data using math and graphs and focus on the current situation. For example, a tailor taking a measurement of length is obtaining quantifiable information, which is an enumerative approach. Enumerative data is data that can be counted. These studies are used to explain data, usually sample data, in terms of central tendency (median, mean and mode), variation (range and variance) and graphs of data (histograms, box plots and dot plots). Measures calculated from a sample are called statistics, while the corresponding measures that describe a population are called parameters. A statistic is a quantity derived from a sample of data for forming an opinion of a specified parameter of the target population. A sample is used because collecting data on every member of the population is impossible or too costly. A population is an entire group of objects that contains the characteristic of interest. A population parameter is a constant or coefficient that describes some characteristic of a target population, like the mean or variance.
Analytical (Inferential) Studies - The objective of statistical inference is to draw conclusions about population characteristics based on the information contained in a sample. It uses sample data to predict or estimate what a population will do in the future. For example, a doctor taking a measurement like blood pressure or heart beat to obtain a causal explanation for some observed phenomenon is using an analytic approach. It entails defining the problem objective precisely, deciding whether it will be evaluated by a one-tail or two-tail test, formulating a null and an alternate hypothesis, selecting a test distribution and a critical value of the test statistic reflecting the degree of uncertainty that can be tolerated (the alpha and beta risks), calculating a test statistic value from the sample, and comparing the calculated value to the critical value to determine whether the null hypothesis is to be rejected or not. If the null is rejected, the alternate is accepted. Thus, it involves testing hypotheses to determine the differences in population means, medians or variances between two or more groups of data, or between data and a standard, and calculating confidence intervals or prediction intervals.
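The sequence of steps in an analytical study can be sketched with a simple test in Python. The example below is a two-tail, one-sample z-test, which assumes the population standard deviation is known and the data are approximately normal; the test choice, sample values, hypothesized mean and sigma are all hypothetical and serve only to illustrate the steps.

```python
from math import sqrt
from statistics import NormalDist, mean


def one_sample_z_test(sample, mu0, sigma, alpha=0.05):
    """Two-tail z-test of H0: population mean == mu0, with known sigma.

    Returns the test statistic, the critical value, the p-value and whether
    H0 is rejected at the chosen alpha risk.
    """
    n = len(sample)
    z = (mean(sample) - mu0) / (sigma / sqrt(n))   # test statistic from the sample
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)   # critical value for two tails
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return z, z_crit, p_value, abs(z) > z_crit


# Hypothetical data: H0 says the mean fill weight is 10.0, sigma is known to be 0.3.
sample = [10.2, 10.4, 10.1, 10.5, 10.3, 10.6, 10.2, 10.4, 10.3]
z, z_crit, p_value, reject = one_sample_z_test(sample, mu0=10.0, sigma=0.3)

print(f"z = {z:.2f}, critical value = {z_crit:.2f}, p = {p_value:.4f}")
print("Reject H0" if reject else "Do not reject H0")
```

Here the calculated statistic (about 3.33) exceeds the critical value (about 1.96), so the null hypothesis is rejected at alpha = 0.05.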

Statistical Basic Terms - Various statistics terminologies which are used extensively are
Data - Facts, observations, and information that come from investigations.
Measurement data - Sometimes called quantitative data; the result of using some instrument to measure something (e.g., test score, weight).
Categorical data - Also referred to as frequency or qualitative data. Things are grouped according to some common property (or properties) and the number of members of each group is recorded (e.g., males/females, vehicle type).
Variable - A property of an object or event that can take on different values. For example, college major is a variable that takes on values like mathematics, computer science, etc.
Discrete Variable - A variable with a limited number of values (e.g., gender (male/female)).
Continuous Variable - A variable that can take on many different values; in theory, any value between the lowest and highest points on the measurement scale.
Independent Variable - A variable that is manipulated, measured, or selected by the user as an antecedent condition to an observed behavior. In a hypothesized cause-and-effect relationship, the independent variable is the cause and the dependent variable is the effect.
Dependent Variable - A variable that is not under the user's control. It is the variable that is observed and measured in response to the independent variable.

Descriptive Statistics
Central Tendencies - Central tendency is a measure that characterizes the central value of a collection of data, which tends to cluster somewhere between the high and low values in the data. It refers to measurements like the mean, median and mode, which are also called measures of center. It involves plotting the data in a frequency distribution, which shows the general shape of the distribution and gives a general sense of how the numbers are grouped. Several statistics can be used to represent the "center" of the distribution.
Mean - The mean is the most common measure of central tendency. It is the ratio of the sum of the scores to the number of the scores. For ungrouped data, which has not been grouped in intervals, the arithmetic mean is the sum of all the values in the population divided by the number of values in the population as

$\mu = \frac{1}{N}\sum_{i=1}^{N} X_i$

where µ is the arithmetic mean of the population, Xi is the ith value observed, N is the number of items in the observed population and ∑ is the sum of the values. For example, if the production of an item for 5 days is 500, 750, 600, 450 and 775, then the arithmetic mean is µ = (500 + 750 + 600 + 450 + 775)/5 = 3075/5 = 615. It gives the distribution's arithmetic average and provides a reference point for relating all other data points. For grouped data, an approximation is made using the midpoints of the intervals and the frequencies of the distribution as

$\mu \approx \frac{\sum_{j=1}^{k} f_j m_j}{\sum_{j=1}^{k} f_j}$

where m_j is the midpoint of the jth interval, f_j is the frequency of that interval and k is the number of intervals.

Median - The median divides the distribution into halves: when the data are arranged in numerical order, half of the values are above it and half are below it. It is also the score at the 50th percentile of the distribution. The median location of N numbers is given by (N + 1) / 2. When N is odd, the formula yields an integer that identifies the position of the median in the numerically ordered distribution. For example, for the numbers (3 1 5 4 9 9 8) the median location is (7 + 1) / 2 = 4; in the ordered distribution (1 3 4 5 8 9 9), the 4th value, 5, is the median. If there were only 6 values (1 3 4 5 8 9), the median location would be (6 + 1) / 2 = 3.5, so the median is half-way between the 3rd and 4th scores (4 and 5), or 4.5. The median is the distribution's center point or middle value, with an equal number of data points on either side. It is useful when the data set has extreme high or low values and is used with non-normal data.

Mode - The mode is the most frequent or common score in the distribution, or the value of X that corresponds to the highest point on the distribution. If the highest frequency is shared by more than one value, the distribution is said to be multimodal; with two such values it is bimodal, peaking at two different points in the distribution. For example, in the measurements 75, 60, 65, 75, 80, 90, 75, 80, 67, the value 75 appears most frequently, so it is the mode.
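The three measures of center can be checked with a short Python sketch. It reuses the example values from the text; the variable names are illustrative only, and the standard library `statistics` module is assumed to be available (Python 3.8 or later for `multimode`).

```python
from statistics import mean, median, multimode

# Example values taken from the text above.
production = [500, 750, 600, 450, 775]
scores_odd = [3, 1, 5, 4, 9, 9, 8]
scores_even = [1, 3, 4, 5, 8, 9]
measurements = [75, 60, 65, 75, 80, 90, 75, 80, 67]

mean_production = mean(production)      # 615
median_odd = median(scores_odd)         # 5
median_even = median(scores_even)       # 4.5
modes = multimode(measurements)         # [75]

print(mean_production, median_odd, median_even, modes)
```

`multimode` returns a list, so a bimodal or multimodal data set would simply yield more than one value.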

Measures of Spread - Although the average value in a distribution is informative about how scores are centered, the mean, median and mode alone lack context for interpretation. Measures of variability provide information about the degree to which individual scores are clustered about, or deviate from, the average value in a distribution.

Range - The simplest measure of variability to compute and understand is the range, the difference between the highest and lowest score in a distribution. Although it is easy to compute, it is rarely used as the sole measure of variability because of its instability: it is based solely on the most extreme scores and does not reflect the pattern of variation within the distribution.

Inter-quartile Range (IQR) - The IQR measures the spread of the middle 50% of the scores and is defined as the 75th percentile minus the 25th percentile. It plays an important role in the graphical method known as the boxplot. Its advantage is that it is easy to compute and extreme scores have much less impact on it, but this strength is also a weakness, because it discards a large part of the data. Researchers want to study variability while setting aside scores that are likely to be accidents; the boxplot allows for this distinction and is an important tool for exploring data.

Variance (σ²) - The variance is a measure based on the deviations of individual scores from the mean. Because simply summing the deviations gives 0, the variance is based on the squared deviations of scores about the mean. Squaring preserves the rank order and relative distance of scores in the distribution while eliminating negative values. To control for the number of subjects in the distribution, the sum of the squared deviations is divided by N (population) or by N - 1 (sample). The result, the average of the squared deviations, is called the variance. The variance is often a large number and is difficult to interpret because it is expressed in squared units.

Standard deviation (σ) - The standard deviation is defined as the positive square root of the variance and is a measure of variability expressed in the same units as the data. It is very much like a mean or "average" of the deviations. In a normal (symmetric and mound-shaped) distribution, about two-thirds of the scores fall within ±1 standard deviation of the mean. As a rule of thumb, the standard deviation is approximately 1/4 of the range in small samples (N < 30) and 1/5 to 1/6 of the range in large samples (N > 100).
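As a rough illustration of the range, variance and standard deviation, the sketch below reuses the five production values from the mean example. The variable names are assumptions made for this example; `pvariance` and `pstdev` divide by N (population), while `variance` and `stdev` divide by N - 1 (sample).

```python
from statistics import mean, pvariance, pstdev, variance, stdev

data = [500, 750, 600, 450, 775]   # production values from the mean example

data_range = max(data) - min(data)            # 775 - 450 = 325
deviations = [x - mean(data) for x in data]   # these always sum to 0
squared_sum = sum(d * d for d in deviations)  # 84500

pop_var = pvariance(data)    # 84500 / 5 = 16900
pop_sd = pstdev(data)        # square root of 16900 = 130.0
sample_var = variance(data)  # 84500 / 4 = 21125
sample_sd = stdev(data)      # square root of 21125

print(data_range, sum(deviations), pop_var, pop_sd, sample_var, sample_sd)
```

The population standard deviation (130) is back in the same units as the data, which is why it is easier to interpret than the variance (16900).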

Coefficient of variation (cv) - Standard deviations of different quantities cannot be compared directly, for example the standard deviation of the production of bolts with that of the availability of parts. If the standard deviation for bolt production is 5 and for availability of parts is 7 over a given time frame, it cannot be concluded that, because 7 is greater than 5, variability is greater for the parts. Hence a relative measure called the coefficient of variation is used. It is the ratio of the standard deviation to the mean: cv = σ / µ for a population and cv = s / x̄ for a sample, where s is the sample standard deviation and x̄ is the sample mean.

Measures of Shape - For distributions summarizing data from continuous measurement scales, statistics can be used to describe how the distribution rises and drops.

Symmetric - Distributions that have the same shape on both sides of the center are called symmetric. The normal distribution is the best-known example of a symmetric distribution with a single peak.

Skewness - It refers to the degree of asymmetry in a distribution. Asymmetry often reflects extreme scores in a distribution. A positively skewed distribution has a tail extending out to the right (larger numbers), so the mean is greater than the median; the mean is sensitive to each score in the distribution and is subject to large shifts when the sample is small and contains extreme scores. A negatively skewed distribution has an extended tail pointing to the left (smaller numbers) and reflects bunching of numbers in the upper part of the distribution, with fewer scores at the lower end of the measurement scale.

Measures of Association - These provide information about the relatedness between variables, helping to estimate whether a relationship exists between variables and how strong it is. They are:

Covariance - It shows how the variable y reacts to a variation of the variable x. Its formula for a population is cov(X, Y) = ∑ (xi − µx)(yi − µy) / N.

Correlation coefficient (r) - It is a number that ranges between −1 and +1. The sign of r is the same as the sign of the covariance. When r equals −1, there is a perfect negative relationship between the variations of x and y, so an increase in x leads to a proportional decrease in y. Similarly, when r equals +1, there is a perfect positive relationship: the changes in x and the changes in y are in the same direction and in the same proportion. If r is zero, there is no linear relation between the variations of the two. For any other value of r, the strength of the relationship is judged by how close r is to −1, 0 or +1. The formula for the correlation coefficient for a population is ρ = cov(X, Y) / (σx σy).
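The population covariance and correlation formulas can be sketched in a few lines of Python. The paired values below are made up purely for illustration, and the function names are assumptions of this example rather than anything defined in the text.

```python
from math import sqrt

def population_covariance(xs, ys):
    """cov(X, Y) = sum((xi - mean_x) * (yi - mean_y)) / N."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n

def population_correlation(xs, ys):
    """rho = cov(X, Y) / (sigma_x * sigma_y)."""
    sigma_x = sqrt(population_covariance(xs, xs))  # cov(X, X) is the variance of X
    sigma_y = sqrt(population_covariance(ys, ys))
    return population_covariance(xs, ys) / (sigma_x * sigma_y)

# Hypothetical paired observations (not from the text).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

cov_xy = population_covariance(x, y)   # 1.2
r = population_correlation(x, y)       # about 0.775
print(cov_xy, r, r ** 2)
```

Squaring `r` gives the share of the variation in y that is accounted for by x, here about 0.6.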

Coefficient of determination (r²) - It measures the proportion of the changes in the dependent variable y that is explained by the independent variable x. It is the square of the correlation coefficient r, so it is never negative and takes values between zero and one. If it is zero, the variations of y are not explained by the variations of x; if it is one, the changes in y are explained fully by the changes in x. Other values are interpreted according to their closeness to zero or one.

Frequency Distributions - A distribution is the amount of potential variation in the outputs of a process, usually expressed by its shape, mean or variance. A frequency distribution graphically summarizes and displays the distribution of a process data set. The shape is judged by how closely it resembles the bell curve, or whether it is flatter or skewed to the right or left. The frequency distribution's centrality shows the degree to which the data center on a specific value, and its spread shows the amount of variation, in range or variance, from the center. A frequency distribution groups data into categories, each category representing a subset of the total range of the data population or sample. Frequency distributions are usually displayed in a histogram. Size is shown on the horizontal axis (x-axis) and the frequency of each size is shown on the vertical axis (y-axis) as a bar graph. The length of the bars is proportional to the relative frequencies of the data falling into each category, and the width is the range of the category. It is used to ascertain information about the data, such as the type of distribution.

A histogram is developed by segmenting the range of the data into equal-sized intervals (bars), labeling the vertical axis with the frequency (the count for each bar) and the horizontal axis with the range of the response variable, then counting the number of data points that fall within each bar and drawing the histogram.

Cumulative Frequency Distribution - It is created from a frequency distribution by adding a column to the table called cumulative frequency. For each value, the cumulative frequency is the sum of the frequencies up to and including the frequency for that value, so it shows the number of data points at or below a particular value.
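The steps above (equal-sized intervals, counts per interval, running total) can be sketched as follows. The data values and the choice of four intervals are hypothetical, and `frequency_table` is a name invented for this example; it assumes the values are not all identical.

```python
from itertools import accumulate

def frequency_table(values, n_bins):
    """Split the data range into n_bins equal intervals and count values per interval.

    Returns rows of (lower_bound, upper_bound, frequency, cumulative_frequency).
    The last interval includes its upper bound so the maximum value is counted.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / n_bins          # assumes hi > lo
    counts = [0] * n_bins
    for v in values:
        index = min(int((v - lo) / width), n_bins - 1)
        counts[index] += 1
    cumulative = list(accumulate(counts))
    return [
        (lo + i * width, lo + (i + 1) * width, counts[i], cumulative[i])
        for i in range(n_bins)
    ]

# Hypothetical measurements (not from the text).
data = [62, 65, 67, 70, 71, 72, 73, 75, 75, 76, 78, 79, 81, 84, 90]
table = frequency_table(data, n_bins=4)
for lower, upper, freq, cum in table:
    print(f"{lower:5.1f} to {upper:5.1f}: frequency {freq:2d}, cumulative {cum:2d}")
```

The final cumulative frequency always equals the number of data points, which is a quick check that no value was dropped.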


The cumulative distribution function, F(x), denotes the area beneath the probability density function to the left of x.

Central limit theorem and sampling distribution of the mean

The central limit theorem is the basis of many statistical procedures. The theorem states that for sufficiently large sample sizes (n ≥ 30), regardless of the shape of the population distribution, if samples of size n are randomly drawn from a population that has a mean µ and a standard deviation σ, the sample means x̄ are approximately normally distributed. If the population is normally distributed, the sample means are normally distributed regardless of the sample size. Hence, for sufficiently large sample sizes, the normal distribution can be used to analyze samples drawn from populations that are not normally distributed, or whose distribution characteristics are unknown. The theorem states that the distribution of sample means will have the same mean as the original distribution, that its variability will be smaller than that of the original distribution, and that it will tend to be normally distributed. When means are used as estimators to make inferences about a population's parameters and n ≥ 30, the estimator will be approximately normally distributed in repeated sampling. The mean and standard deviation of that sampling distribution are given as µx̄ = µ and σx̄ = σ/√n.

The theorem is applicable to controlled or predictable processes. Most points on a chart of sample means tend to be near the average, the curve is bell-shaped and its sides tend to be symmetrical. With ± 3 sigma control limits, the central limit theorem is the basis of the prediction that, if the process has not changed, a sample mean falls outside the control limits only 0.27% of the time on average. The theorem enables the use of averages of small samples to evaluate any process, because distributions of sample means tend to form a normal distribution.
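A small simulation can make the theorem more concrete. The sketch below is only an illustration under stated assumptions: the population is a made-up, strongly skewed (exponential) one, the seed and the sizes are arbitrary, and n = 30 is used because the text names it as the usual threshold.

```python
import random
from math import sqrt
from statistics import fmean, pstdev

random.seed(42)  # arbitrary seed so the run is repeatable

# Hypothetical, clearly non-normal population (exponential, skewed to the right).
population = [random.expovariate(1.0) for _ in range(20_000)]
mu = fmean(population)
sigma = pstdev(population)

n = 30  # sample size
sample_means = [fmean(random.sample(population, n)) for _ in range(5_000)]

mean_of_means = fmean(sample_means)   # should be close to mu
sd_of_means = pstdev(sample_means)    # should be close to sigma / sqrt(n)

print(f"population mean {mu:.3f}, mean of sample means {mean_of_means:.3f}")
print(f"sigma / sqrt(n) {sigma / sqrt(n):.3f}, sd of sample means {sd_of_means:.3f}")
```

Plotting `sample_means` as a histogram would show a roughly bell-shaped curve even though the population itself is far from normal.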

Basic Probability

Basic probability concepts and terminology are discussed below.

Probability - It is the chance that something will occur. It is expressed as a decimal fraction or a percentage. It is the ratio of the chances favoring an event to the total number of chances for and against the event. The probability of getting a 4 when rolling a die is 1 (the number of faces showing 4) / 6 = 0.1667. Probability can therefore be taken as the number of successes divided by the total number of possible occurrences. Pr(A) is the probability of event A. The probability of any event (E) varies between 0 (no probability) and 1 (perfect probability).

Sample Space - It is the set of possible outcomes of an experiment or the set of conditions. The sample space is often denoted by the capital letter S. Sample space outcomes are denoted using lower-case letters (a, b, c . . .) or the actual values; for a die, S = {1,2,3,4,5,6}.

Event - An event is a subset of a sample space. It is denoted by a capital letter such as A, B, C, etc. Events have outcomes, which are denoted by lower-case letters (a, b, c . . .) or the actual values if given. For example, in rolling a die, S = {1,2,3,4,5,6}, and if event A is that the die shows 5, then A = {5}. The sum of the probabilities of all possible simple events (multiple E's) in the total sample space (S) is equal to 1.

Independent Events - Each event is not affected by any other event. For example, if a coin is tossed three times and comes up "Heads" each time, the chance that the next toss will also be a "Head" is still 1/2, as every toss is independent of the earlier ones.

Dependent Events - These are events that are affected by previous events. For example, drawing 2 cards from a deck without replacement reduces the population for the second card and hence changes its probability: the probability of getting a King on the 1st draw is 4 out of 52, but on the 2nd draw, after a King has been taken, it is 3 out of 51.
Simple Events - An event that cannot be decomposed is a simple event (E). The set of all sample points for an experiment is called the sample space (S).

Compound Events - Compound events are formed by a composition of two or more events. The two most important probability theorems are the additive and multiplicative laws.

Union of events - The union of two events is the event consisting of all outcomes contained in either of the two events. The union is denoted by the symbol U placed between the letters indicating the two events. For example, for event A = {1,2} and event B = {2,3}, i.e. the outcome of event A can be either 1 or 2 and that of event B either 2 or 3, AUB = {1,2,3}.

Intersection of events - The intersection of two events is the event consisting of all outcomes that the two events have in common. It can also be referred to as the joint occurrence of events. The intersection is denoted by the symbol ∩ placed between the letters indicating the two events. For example, for event A = {1,2} and event B = {2,3}, A∩B = {2}.

Complement - The complement of an event is the set of outcomes in the sample space that are not in the event itself. The complement is shown by the symbol ` placed after the letter indicating the event. For example, for event A = {1,2} and sample space S = {1,2,3,4,5,6}, A` = {3,4,5,6}.

Mutually Exclusive - Mutually exclusive events have no outcomes in common; for instance, the intersection of an event and its complement contains no outcomes, i.e. it is the empty set Ø. For example, if A = {1,2} and B = {3,4}, then A∩B = Ø.

Equally Likely Outcomes - When a sample space consists of N possible outcomes, all equally likely to occur, the probability of each outcome is 1/N. For example, the sample space of all possible outcomes in rolling a die is S = {1, 2, 3, 4, 5, 6}; all are equally likely, so each outcome has a probability of 1/6 of occurring, and the probability of getting a 3, 4 or 6 is 3/6 = 0.5.

Probabilities for Independent Events or Multiplication Rule - The occurrence of an independent event does not depend on other events in the sample space, so the probability of two independent events A and B both occurring is P(A∩B) = P(A) x P(B). The rule extends to many independent events as P(A∩B∩C∩. . .) = P(A) x P(B) x P(C) . . . and is also called the multiplication rule. For example, the probability of getting a 6 three times in three rolls of a die is 1/6 x 1/6 x 1/6 = 0.00463.

Probabilities for Mutually Exclusive Events or Addition Rule - Mutually exclusive events cannot occur at the same time and do not have any outcomes in common. Thus, for two mutually exclusive events A and B, A∩B = Ø and the probability of both occurring is zero, P(A∩B) = 0. For any two events A and B, the probability of either or both of the events occurring is P(AUB) = P(A) + P(B) – P(A∩B), which is called the addition rule; for mutually exclusive events it reduces to P(AUB) = P(A) + P(B). For example, let P(A) = 0.2, P(B) = 0.4 and P(A∩B) = 0.1; then P(AUB) = P(A) + P(B) - P(A∩B) = 0.2 + 0.4 - 0.1 = 0.5.

Conditional probability - It is the probability of an event given that the sample space has been restricted by another event. The conditional probability of event A occurring, given that event B has already occurred, is found as

P(A|B) = P(A∩B) / P(B), with P(B) > 0

For example, in a sample of 100 items, 60 items were received from supplier 1 (of which 4 are rejects) and 40 items from supplier 2. Let A be the event that an item is a reject and B the event that an item came from supplier 1. Then the probability that an item is a reject, given that it came from supplier 1, is P(A|B) = P(A∩B) / P(B). With P(A∩B) = 4/100 and P(B) = 60/100, P(A|B) = (4/100) / (60/100) = 4/60 = 1/15 ≈ 0.067.
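The set operations and probability rules above can be reproduced with Python sets and exact fractions. This is a minimal sketch that assumes equally likely outcomes on a single die; `prob` is a helper name invented for the example.

```python
from fractions import Fraction

S = {1, 2, 3, 4, 5, 6}   # sample space for one roll of a die
A = {1, 2}
B = {2, 3}

def prob(event, space=S):
    """Probability of an event when all outcomes in the space are equally likely."""
    return Fraction(len(event & space), len(space))

union = A | B            # {1, 2, 3}
intersection = A & B     # {2}
complement_A = S - A     # {3, 4, 5, 6}

# Addition rule: P(A U B) = P(A) + P(B) - P(A n B)
p_union = prob(A) + prob(B) - prob(A & B)    # 1/2

# Multiplication rule for independent events: three sixes in three rolls.
p_three_sixes = prob({6}) ** 3               # 1/216, about 0.00463

# Conditional probability, supplier example from the text.
p_reject_and_supplier1 = Fraction(4, 100)
p_supplier1 = Fraction(60, 100)
p_reject_given_supplier1 = p_reject_and_supplier1 / p_supplier1   # 1/15

print(union, intersection, complement_A)
print(p_union, p_three_sixes, p_reject_given_supplier1)
```

Using `Fraction` keeps the results exact, so 1/15 is not blurred by floating-point rounding.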

3.3. Collecting and Summarizing Data

Process improvement must be measurable, which makes data collection critical to any improvement effort.

Types of data and Measurement scales

Data is objective information. The types of data and the measurement scales are discussed next.

Types of data - There are two types, discrete and continuous.

Attribute or discrete data - It is based on counting, like the number of processing errors, the count of customer complaints, etc. Discrete data values can only be non-negative integers such as 0, 1, 2, 3, etc. and can be expressed as a proportion or percent (e.g., percent of x, percent good, percent bad). It includes:
Count or percentage - Counts of errors or the percentage of output with errors.
Binomial data - Data can have only one of two values, like yes/no or pass/fail.
Attribute-Nominal - The "data" are names or labels, like Dept A, Dept B, Dept C in a company, or Machine 1, Machine 2, Machine 3 in a shop.
Attribute-Ordinal - The names or labels represent some value inherent in the object or item, so there is an order to the labels, like performance (excellent, very good, good, fair, poor) or taste (mild, hot, very hot).

Variable or continuous data - They are measured on a continuum or scale that can be infinitely divided. Data values for continuous data can be any real number: 2, 3.4691, -14.21, etc. Continuous data can be recorded at many different points and are typically physical measurements like volume, length, size, width, time, temperature, cost, etc. It is more powerful than attribute data because it is more precise, the decimal places indicating the level of accuracy and specificity.

Data are said to be discrete when they take on only a finite number of points that can be represented by the non-negative integers. An example of discrete data is the number of defects in a sample. Data are said to be continuous when they exist on an interval, or on several intervals. An example of continuous data is the measurement of pH. Quality methods based on probability functions exist for both discrete and continuous data. Attribute data can often be presented as variables data instead; for example, 10 scratches could be reported as a total scratch length of 8.37 inches. The ultimate purpose of the data collection and the type of data are the most significant factors in the decision to collect attribute or variables data.

Converting Data Types - Continuous data tend to be more precise due to their decimal places, but sometimes need to be converted into discrete data. Because continuous data contain more information than discrete data, information is lost during the conversion. Discrete data cannot be converted back to continuous data, because the amount of deviation from a standard was never measured; users may nevertheless choose to keep discrete data because it is easier to use. Converting variable data to attribute data may allow a quicker assessment, but the risk is that information will be lost when the conversion is made.

Measurement - A measurement is the assignment of a numerical value to something, usually a continuous element.
Measurement is a mapping from an empirical system to a selected numerical system. The numerical system is manipulated and the results of the manipulation are studied to help the manager better understand the empirical system. Measured data is regarded as better than counted data, as it is more precise and contains more information. Sometimes data will only occur as counted data. If the information can be obtained as either attribute or variables data, it is generally preferable to collect variables data. The information content of a number depends on the scale of measurement used, which also determines the types of statistical analyses that can be applied. Hence, the validity of an analysis also depends on the scale of measurement. The four measurement scales employed are nominal, ordinal, interval and ratio, and are summarized below.

Scale: Nominal
Definition: Only the presence/absence of an attribute. It can only count items. Data consists of names or categories only. No ordering scheme is possible. Central location is given by the mode, and only information is available for dispersion.
Example: go/no-go, success/fail, accept/reject
Statistics: percent, proportion, chi-square tests

Scale: Ordinal
Definition: Data is arranged in some order, but differences between values cannot be determined or are meaningless. It can say that one item has more or less of an attribute than another item. It can order a set of items. Central location is given by the median, and percentages are used for dispersion.
Example: taste, attractiveness
Statistics: rank-order correlation, sign or run test

Scale: Interval
Definition: Data is arranged in order and differences can be found. However, there is no inherent starting point and ratios are meaningless. The difference between any two successive points is equal; it is often treated as a ratio scale even if the assumption of equal intervals is incorrect. It can add, subtract and order objects. Central location is given by the arithmetic mean, and the standard deviation is used for dispersion.
Example: calendar time, temperature
Statistics: correlations, t-tests, F-tests, multiple regression

Scale: Ratio
Definition: An extension of the interval level that includes an inherent zero starting point. Both differences and ratios are meaningful. A true zero point indicates absence of an attribute. It can add, subtract, multiply and divide. Central location is given by the geometric mean, and percent variation is used for dispersion.
Example: elapsed time, distance, weight
Statistics: t-test, F-test, correlations, multiple regression

Data collection methods

Data collection is based on three crucial questions: what to know, from whom to know it and what to do with the data. Factors that ensure the data is relevant to the project include:
Person collecting the data, like a team member, associate, subject matter expert, etc.
Type of data to collect, like cost, errors, ratings, etc.
Time duration, like hourly, daily, batch-wise, etc.
Data source, like reports, observations, surveys, etc.
Cost of collection

A few types of data collection methods are described below.

Check sheets - A check sheet is a structured, well-prepared form for collecting and analyzing data, consisting of a list of items and some indication of how often each item occurs. There are several types of check sheets: confirmation check sheets for confirming whether all steps in a process have been completed, process check sheets to record the frequency of observations within a range of measurement, defect check sheets to record the observed frequency of defects, and stratified check sheets to record the observed frequency of defects by defect type and one other criterion. A check sheet is easy to use, provides a choice of observations and is good for determining frequency over time. It should be used to collect observable data from a process when the collection is managed by the same person or at the same location.

Coded data - Coding is used when measurements with too many digits have to be recorded in small blocks, when long sequences of digits are captured from a single observation, or when rounding-off errors are observed while recording numbers with many digits. It is also used when numeric data represents attribute data, or when the quantity of data is not enough for statistical significance in the sample size. The various types of coded data collection are:
Truncation coding - Storing only the digits that vary, like 3, 2 and 9 for 1.0003, 1.0002 and 1.0009.
Substitution coding - Storing fractional observations as integers, like expressing 32-3/8 inches as 259 with 1/8 inch as the base (32-3/8 inches is 259 eighths of an inch).
Category coding - Using a code for a category, like "S" for scratch.
Adding/subtracting a constant or multiplying/dividing by a factor - Usually used for encoding or decoding.

Automatic measurements - A computer or electronic equipment performs the data gathering without human intervention, like monitoring the radioactivity level in a nuclear reactor. The equipment observes and records data for analysis and action.
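Truncation and substitution coding, as described above, might be sketched as follows. The function names, the base of 1.0 and the scale of 10,000 are assumptions chosen to match the example values in the text, not a prescribed coding scheme.

```python
from fractions import Fraction

def truncation_encode(value, base=1.0, scale=10_000):
    """Keep only the digits that vary, e.g. 1.0003 becomes 3."""
    return round((value - base) * scale)

def truncation_decode(code, base=1.0, scale=10_000):
    """Recover the original measurement from its code."""
    return base + code / scale

def substitution_encode(length, unit=Fraction(1, 8)):
    """Express a fractional length as a whole number of base units (1/8 inch here)."""
    count = Fraction(length) / unit
    if count.denominator != 1:
        raise ValueError("length is not a whole number of base units")
    return int(count)

codes = [truncation_encode(v) for v in (1.0003, 1.0002, 1.0009)]    # [3, 2, 9]
eighths = substitution_encode(Fraction(32) + Fraction(3, 8))         # 259

print(codes, eighths, truncation_decode(codes[0]))
```

Decoding is only possible when the base, scale or unit is recorded with the data, so these should be noted on the collection form.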

Techniques for Assuring Data Accuracy and Integrity

Data integrity and accuracy are crucial in the data collection process, as they ensure the usefulness of the data being collected. Data integrity determines whether the information being measured truly represents the desired attribute, and data accuracy determines the degree to which individual or average measurements agree with an accepted standard or reference value. Data integrity is doubtful if the data collected does not fulfill its purpose; for example, data on finished goods departure should be gathered from truck departures, so if it is instead recorded on a computing device in the warehouse, its integrity is doubtful. Similarly, data accuracy is doubtful if the measurement device does not conform to the laid-down device standards. Bad data can be avoided by following a few precautions, like avoiding emotional bias relative to tolerances, avoiding unnecessary rounding and screening data to detect and remove data entry errors.

Sampling - In practice, not all items of a population can be measured, because doing so is too costly or impractical; hence sampling is used to get a representative group of items to measure. The various sampling strategies are:

Random Sampling - The use of a sampling plan requires randomness in sample selection, which means giving every part an equal chance of being selected for the sample. The sampling sequence must be based on an independent random plan. It is the least biased of all sampling techniques: there is no subjectivity, as each member of the total population has an equal chance of being selected. Random samples can also be obtained using random number tables.

Sequential or Systematic Sampling - In it, every nth record is selected from a list of the population. Usually, these plans are ended after the number inspected has exceeded the sample size of a sampling plan. It is used for costly or destructive testing. If the list does not contain any hidden order, this strategy is just as random as random sampling.
Stratified Sampling - It selects random samples from each group or process that is different. If the population has identifiable categories, or strata, that have a common characteristic, random sampling is used to select a sufficient number of units from each stratum. Stratified sampling is often used to reduce sampling error. The resulting mix of samples can be biased if the proportion of the samples does not reflect the relative frequency of the groups.

Sample Homogeneity - It occurs when the data chosen for a sample have similar characteristics; it focuses on how similar the data are in a given sample. If data come from a variety of sources, such as several production streams or several geographical areas, the results will reflect these combined sources. The aim is homogeneous data, relating to a single source as far as possible, so that the influence of an input of concern on the data can be evaluated and determined. Non-homogeneous data result in errors, and a lack of homogeneity in the data will hide the sources and make root cause analysis difficult.

Sampling Distribution of Means - If the means of all possible samples are obtained and organized, the sampling distribution of the means can be derived. The mean of the sampling distribution of the mean is the mean of the population from which the scores were sampled. Therefore, if a population has a mean µ, the mean of the sampling distribution of the mean is also µ.

Sampling Error - The sample statistics may not always be exactly the same as their corresponding population parameters. The difference is known as the sampling error.
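The random, systematic and stratified strategies described earlier in this section can be compared with a short sketch. The population of 100 numbered items, the two supplier strata and the 10% sampling fraction are hypothetical, and the function names are illustrative.

```python
import random

def simple_random_sample(items, k, rng):
    """Every item has an equal chance of being selected."""
    return rng.sample(items, k)

def systematic_sample(items, step, rng):
    """Pick a random starting point, then take every step-th item."""
    start = rng.randrange(step)
    return items[start::step]

def stratified_sample(strata, fraction, rng):
    """Draw the same fraction at random from each stratum (proportional allocation)."""
    return {
        name: rng.sample(units, max(1, round(len(units) * fraction)))
        for name, units in strata.items()
    }

rng = random.Random(7)               # arbitrary seed for repeatability
population = list(range(1, 101))     # hypothetical item numbers 1..100
strata = {"supplier_1": population[:60], "supplier_2": population[60:]}

random_items = simple_random_sample(population, 10, rng)
systematic_items = systematic_sample(population, 10, rng)
stratified_items = stratified_sample(strata, 0.10, rng)

print(sorted(random_items))
print(systematic_items)
print({name: sorted(units) for name, units in stratified_items.items()})
```

Sampling the same fraction from each stratum keeps the mix of the sample in line with the relative size of the groups, which addresses the bias mentioned above.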

Graphical methods

Graphs showing the relationship between variables are effective tools for the visual evaluation of data. They provide a visual image of the data, complementing numerical methods for identifying patterns in the data. They include box plots, stem and leaf plots, scatter diagrams, pattern and trend analysis, histograms, normal probability plots and Weibull distributions.

Box plot - It is also called a box-and-whisker plot or "five number summary". It has five points of interest, which are the two quartiles, the median, and the highest and lowest values, and it shows how the data are scattered within those ranges. It shows the location, spread and shape of the data. It is used for graphically showing the variation between multiple variables and the variations within the ranges. In it, the upper and lower quartiles of the data form the ends of the box, the median forms the centerline dividing the box, and the minimum and maximum data points are drawn as the end points of lines that extend from the box (the whiskers). Outlier data are represented by asterisks or diamonds beyond the ends of the whiskers. Notches indicate the variability of the median, and widths are proportional to the log of the sample size.

The box plot is used when comparing two or more sets of data or determining the significance of an apparent difference. It is useful with a large number of data sets, as it provides a graphic summary of each data set, visually showing the center, the spread and the overall range, and indicating the skewness of the distribution. It is usually used in the early stages of data analysis.

Developing a Box plot involves:
Listing the data in numerical order and computing the median.
Finding the lower and upper quartiles as the medians of the lower and upper halves of the data.
Computing the inter-quartile range and plotting the 5 points on a number line (the three medians, the lowest value and the highest value).
Drawing a box through the upper and lower quartile points and a vertical line through the median point.
Drawing the whiskers from each end of the box to the smallest and largest values.

Stem and Leaf Plot - It separates each number into a stem (all digits but the last) and a leaf (the last digit); for the numbers 45 and 59, the stems are 4 and 5, while the leaves are 5 and 9. It is easy to make and shows shape and distribution quickly. It is a compact depiction of data showing both variable and categorical data sets. It resembles a histogram and is used to visualize the spread of a distribution and to indicate around what values the data are mainly concentrated. It is essentially composed of two parts, the stem on the left side of the graph and the leaf on the right. Data can be read directly from the diagram. It is useful for classifying and organizing data as it is collected, but all numbers should be whole numbers or of the same precision. In the accompanying figure, most of the data lie between 70 and 79.
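The five points needed for a box plot can be computed as in the following sketch, which follows the box plot steps listed above (quartiles taken as the medians of the lower and upper halves). Other quartile conventions exist and may give slightly different values; `five_number_summary` is a name invented for this example, and the data are the seven values used earlier for the median.

```python
from statistics import median

def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum).

    Quartiles are the medians of the lower and upper halves of the sorted data;
    when the number of values is odd, the middle value is left out of both halves.
    Assumes at least two values.
    """
    data = sorted(values)
    n = len(data)
    half = n // 2
    lower_half = data[:half]
    upper_half = data[half + (n % 2):]
    return (data[0], median(lower_half), median(data), median(upper_half), data[-1])

summary = five_number_summary([3, 1, 5, 4, 9, 9, 8])   # (1, 3, 5, 9, 9)
iqr = summary[3] - summary[1]                           # 9 - 3 = 6
print(summary, iqr)
```

The box would run from 3 to 9 with the median line at 5, and the whiskers would reach down to 1 and up to 9.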

Developing a Stem and Leaf Plot involves:
Sorting the given data in numerical (ascending) order.
Separating the numbers into stems and leaves.
Grouping the numbers with the same stems.

Histograms - A histogram shows frequencies in data as adjacent rectangles, erected over intervals, with an area proportional to the frequency of the observations in the interval. Histograms are frequency column graphs that display a static picture of process behavior and require a minimum of 50-100 data points. A histogram is characterized by the number of data points that fall within a given bar or interval, i.e. the frequency. It enables the user to see how the data points spread and skew, and to detect the presence of outliers. A stable, predictable process usually shows a bell-shaped histogram, which an unstable process does not, although other shapes such as exponential, lognormal, gamma, beta, Poisson, binomial and geometric can also come from stable processes. The construction of a histogram starts with the division of a frequency distribution into equal classes, and then each class is represented by a vertical bar. Histograms are used to plot the density of data, especially continuous data like weight or height.

Run Charts - A run chart displays how a process performs over time: data points are plotted in chronological order and connected as a line graph. It is useful for detecting variation, problem trends or patterns, because shifts become evident on the chart; for that reason it is also called a trend chart. It displays sequential data for spotting patterns and abnormalities, and it is used for monitoring and communicating process performance. It is usually used for displaying performance data over time or for showing tabulations. A pattern observed on the run chart might not signify a real deviation, as it may lie within normal limits, but it usually indicates a trend, a shift or a cycle. When a run chart exhibits seven or eight points successively up or down, a trend is clearly present in the data.

Developing a Run Chart involves:
Sequencing the input data against time, and ordering the data from lowest to highest to find the median.
Calculating the median and the range.
Making the Y-axis scale 1.5 to 2 times the range, and the X-axis 2 to 3 times as long as the Y-axis.
Depicting the median by a dotted line.
Plotting the points in time order and connecting them to form a line graph.
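The trend rule mentioned above (seven or eight points successively up or down) can be checked programmatically. The sketch below is one possible reading of that rule, counting the points in the longest strictly rising or strictly falling stretch; the readings are hypothetical and the threshold of 7 is taken from the text.

```python
from statistics import median

def longest_monotone_run(values):
    """Return the number of points in the longest strictly rising or falling stretch."""
    if not values:
        return 0
    longest = current = 1
    direction = 0                      # +1 rising, -1 falling, 0 flat or unknown
    for previous, value in zip(values, values[1:]):
        step = (value > previous) - (value < previous)
        if step != 0 and step == direction:
            current += 1               # the stretch continues in the same direction
        elif step != 0:
            current = 2                # a new stretch starts with these two points
        else:
            current = 1                # equal values break the stretch
        direction = step
        longest = max(longest, current)
    return longest

# Hypothetical readings in time order (not from the text).
readings = [12, 11, 13, 14, 15, 16, 17, 18, 19, 15]

center_line = median(readings)                 # 15.0, drawn as the dotted line
data_range = max(readings) - min(readings)     # 19 - 11 = 8
run_length = longest_monotone_run(readings)    # 8 points rising from 11 to 19
trend_present = run_length >= 7

print(center_line, data_range, run_length, trend_present)
```

A flagged run is a prompt to investigate the process, not proof on its own that the process has changed.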


Scatter Diagram - A scatter diagram displays multiple XY coordinate data points that represent the relationship between two different variables plotted on the X and Y axes. It is also called a scatter plot, X-Y graph or correlation chart. It depicts the strength of the relationship between an independent variable on the horizontal axis and a dependent variable on the vertical axis, and it enables strategizing on how to control the effect of the relationship on the process. It graphs pairs of continuous data, with one variable on each axis, showing what happens to one variable when the other variable changes. If the relationship is understood, then the dependent variable may be controlled. The relationship may show a correlation between the two variables, though correlation does not always imply a cause and effect relationship. The correlation is positive when both variables move in the same direction and negative when they move in opposite directions. A correlation may be present due to a cause-effect relationship, a relationship between one cause and another cause, or a relationship between one cause and two or more other causes.

A scatter diagram is used when two variables are thought to be related or when evaluating paired continuous data. It is also helpful for identifying potential root causes of a problem by relating two variables. The tighter the data points cluster along the line, the stronger the relationship between them, and the direction of the line indicates whether the relationship is positive or negative. The degree of association between the two variables is calculated by the correlation coefficient. If the points show no significant clustering, there is probably no correlation.
Developing Scatter Diagram
Collect data for both variables.
Draw a graph with the independent variable on the horizontal axis (x) and the dependent variable on the vertical axis (y).
For each pair of data, plot a dot (or symbol) where the x-axis value intersects the y-axis value.

Normal Probability Plots - A normal probability plot is used to detect the presence of a normal bell curve or Gaussian distribution in the process data. A normal distribution is defined by its mean and variance, and for normally distributed data the mean and median are very close and may be identical. The normal probability plot shows whether or not the data are normally distributed, since normally distributed data will follow a linear pattern. It is also called a normal test plot. It is used when making predictions or taking decisions based on the data distribution and to test the assumption of normality. In a normal distribution most of the data concentrate around or on the centerline, which divides the curve into two equal halves. The data is plotted against a theoretical normal distribution in such a way that the points should form an approximate straight line. Departures from this straight line indicate departures from normality.
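
The coordinates of a normal probability plot can be sketched in Python as below. The function name `normal_plot_points`, the sample data and the plotting position (rank - 0.5) / n are assumptions chosen for illustration; other plotting positions are also in use.

```python
from statistics import NormalDist


def normal_plot_points(data):
    """Pair each sorted observation with a theoretical standard normal quantile."""
    ordered = sorted(data)
    n = len(ordered)
    standard_normal = NormalDist()  # mean 0, standard deviation 1
    points = []
    for rank, value in enumerate(ordered, start=1):
        # Plotting position (rank - 0.5) / n is one common convention (an assumption here).
        probability = (rank - 0.5) / n
        points.append((standard_normal.inv_cdf(probability), value))
    return points


# Hypothetical measurements.
sample = [9.8, 10.1, 10.0, 9.9, 10.3, 10.2, 9.7, 10.0, 10.1, 9.9]
for z, value in normal_plot_points(sample):
    print(f"{z:6.2f}  {value}")
```

If the data are roughly normal, plotting the second column against the first gives points close to a straight line.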

Weibull Plots - It is usually used to estimate the cumulative probability that a given sample will fail under certain conditions. The data can be used to determine a point at which a certain number of samples will fail. Once it is known, this information can help design a process such that no part of the sample approaches the stress limitations. It provides reasonably accurate failure analysis and forecasts with extremely small samples by providing a simple and useful graphical plot of the failure data.

The Weibull plot has special scales designed so that the data points will be almost linear if they follow a Weibull distribution. The Weibull distribution has three parameters, but only two are needed if the third is assumed
α is the shape parameter
θ is the scale parameter
γ is the location parameter
Weibull plots usually chart data on the probable life of a product or process, measured in hours, miles, or any other metric that describes the time-to-failure. If complete data is available, the exact time-to-failure is known. For suspended or right-censored data, the unit operated successfully for a known period of time and could have continued for an additional period of time that is not known. For interval or left-censored data, the time-to-failure is known only within a certain range of time.

3.4. Probability Distributions
Distribution - Prediction and decision-making need data to be fitted to distributions (like normal, binomial, or Poisson). A probability distribution gives the probability that a value will occur within a given range: the probability that a value lesser or greater than x will occur, or the probability that a value between x and y will occur. A distribution describes the variation in the outputs of a process, expressed by shape (symmetry, skewness and kurtosis), average and standard deviation. For symmetrical distributions the mean represents the central tendency of the data, but for skewed distributions the median is the better indicator. The standard deviation provides a measure of variation from the mean. Similarly, skewness is a measure of the location of the mode relative to the mean: if the mode is to the left of the mean, the skewness is positive (the long tail is on the right); if the mode is to the right of the mean, the skewness is negative; and for a symmetrical distribution the skewness is zero. Kurtosis measures the peakedness or relative flatness of the distribution, and the kurtosis is higher for a higher and narrower peak.

Probability Distribution
It is a mathematical formula relating the values of a characteristic or attribute with their probability of occurrence in the population. It depicts the possible events and the associated probability for each of these events to occur. Probability distributions are divided as
Discrete - It describes a finite (or countable) set of possible occurrences for the data, like rolling a die, where the random variable can take the value 1, 2, 3, 4, 5 or 6. The most used discrete probability distributions are the binomial, the Poisson, the geometric, and the hypergeometric distribution.
Continuous - It describes an unbroken continuum of possible occurrences, like the distribution of body weight, which is a random variable with an infinite number of possible data points.

Probability Density Function
Probability distributions for continuous variables use probability density functions (or PDF), which mathematically model the probability density shown in a histogram, whereas discrete variables have a probability mass function. PDFs use integrals to sum the area between two points. If a histogram shows the relative frequencies of a series of output ranges of a random variable, then the histogram also depicts the shape of the probability density for the random variable; hence, the shape of the probability density function is also described as the shape of the distribution. An example illustrates it.
Example: A fast-food chain advertises a burger weighing a quarter-kg, but the weight is not exactly 0.25 kg. One randomly selected burger might weigh 0.23 kg or 0.27 kg. What is the probability that a randomly selected burger weighs between 0.20 and 0.30 kg? That is, if we let X denote the weight in kg of a randomly selected quarter-kg burger, what is P(0.20 < X < 0.30)? This problem is solved by using a probability density function. Imagine randomly selecting 100 burgers advertised to weigh a quarter-kg. If we weighed the 100 burgers and created a density histogram of the resulting weights, the histogram might be

In this case, the histogram illustrates that most of the sampled burgers do indeed weigh close to 0.25 kg, but some are a bit more and some a bit less. Now, if we decreased the length of the class interval on that density histogram, it would be as

Now, if this is pushed further and the interval is decreased again, the intervals would eventually get so small that we could represent the probability distribution of X not as a density histogram, but rather as a curve (by connecting the "dots" at the tops of the tiny rectangles) as

Such a curve is denoted f(x) and is called a (continuous) probability density function. A density histogram is defined so that the area of each rectangle equals the relative frequency of the corresponding class, and the area of the entire histogram equals 1. Thus, finding the probability that a continuous random variable X falls in some interval of values involves finding the area under the curve f(x) sandwiched by the endpoints of the interval. In the case of this example, the probability that a randomly selected burger weighs between 0.20 and 0.30 kg is then this area, as
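
To make the area calculation concrete, the sketch below assumes that burger weights follow a normal distribution with mean 0.25 kg and standard deviation 0.02 kg. Both the choice of distribution and the standard deviation are assumptions made only for this illustration; the example above does not state them.

```python
from statistics import NormalDist

# Hypothetical model: burger weights ~ Normal(mean 0.25 kg, standard deviation 0.02 kg).
burger_weight = NormalDist(mu=0.25, sigma=0.02)


def probability_between(dist, low, high):
    """Area under the density curve f(x) between low and high."""
    return dist.cdf(high) - dist.cdf(low)


p = probability_between(burger_weight, 0.20, 0.30)
print(f"P(0.20 < X < 0.30) = {p:.4f}")
```

The cumulative distribution function does the integration, so the area between the two endpoints is a simple difference.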

Distributions Types
Various distributions are
Binomial - It is used in finite sampling problems when each observation has only one of two possible outcomes, such as pass/fail.
Poisson - It is used for attribute data when each sample can have multiple defects or failures.
Normal - It is characterized by the traditional "bell-shaped" curve. The normal distribution is applied to many situations with continuous data that is roughly symmetrical around the mean.
Chi-square - It is used in many situations when an inference is drawn on a single variance or when testing for goodness of fit or independence. Examples of use of this distribution include determining the confidence interval for the standard deviation of a population or comparing the frequency of variables.
Student's t - It is used when inferences are drawn about a single mean or the comparison of two means without the variance being known.
F - It is used in situations when inferences are drawn from two variances, such as whether two population variances are different in magnitude.
Hypergeometric - It is the "true" distribution. It is used in a similar manner to the binomial distribution except that the sample size is larger relative to the population. This distribution should be considered whenever the sample size is larger than 10% of the population. The hypergeometric distribution is the appropriate probability model for selecting a random sample of n items from a population without replacement and is useful in the design of acceptance-sampling plans.
Bivariate - It is created with the joint frequency distributions of modeled variables.
Exponential - It is used for examining the time between failures.
Lognormal - It is used when raw data is skewed and the log of the data follows a normal distribution. This distribution is often used for understanding failure rates or repair times.
Weibull - It is used when modeling failure rates, particularly when the response of interest is percent of failures as a function of usage (time).

Binomial Distribution
It is used to model discrete data having exactly two mutually exclusive outcomes, like pass or fail, yes or no. It may be used to find the proportion of defective units produced by a process and is used when the population is large (N > 50) and the sample is small compared to the population. The ideal situation is when the sample size (n) is less than 10% of the population (N), or n < 0.1N. The binomial distribution is useful to find the number of defective products if the product either passes or fails a given test. The mean, variance, and standard deviation for a binomial distribution are µ = np, σ² = npq and σ = √(npq), where q = 1 − p. The essential conditions for a binomial random variable are a fixed number of observations (n) which are independent of each other, every trial resulting in one of the two possible outcomes, and a probability of success p and probability of failure 1 − p that are the same for every trial. The binomial probability distribution equation gives the probability of getting x defectives (number of defectives or occurrences) in a sample of n units (sample size) when p is the probability of a defective, as

$$P(x) = \frac{n!}{x!\,(n-x)!}\, p^{x} (1-p)^{n-x}$$

As an example, if a product with a 1% defect rate is tested with ten sample units from the process, then n = 10, x = 0 and p = 0.01, and the probability that there will be 0 defective products is

$$P(0) = \frac{10!}{0!\,10!}\,(0.01)^{0}\,(0.99)^{10} \approx 0.904$$
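
The same example can be checked with a short Python sketch. It uses the standard binomial formula P(x) = C(n, x) p^x (1 − p)^(n − x); the function name `binomial_pmf` is only illustrative.

```python
from math import comb, sqrt


def binomial_pmf(x, n, p):
    """Probability of exactly x defectives in a sample of n units."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 10, 0.01  # ten sample units, 1% defect rate (values from the example)
p_zero = binomial_pmf(0, n, p)
print(f"P(0 defectives) = {p_zero:.4f}")
print(f"mean = {n * p:.2f}, standard deviation = {sqrt(n * p * (1 - p)):.4f}")
```

Summing `binomial_pmf(x, n, p)` over x = 0 to n gives 1, which is a quick check that the probabilities cover every outcome.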


Poisson Distribution
It estimates the number of instances a condition of interest occurs in a process or population. It focuses on the probability of a number of events occurring over some interval or continuum where µ, the average number of such events, is known; for example, a project team may want to know the probability of finding a defective part on a manufactured circuit board. Most frequently, this distribution is used when the condition may occur multiple times in one sample unit and the user is interested in knowing the number of individual characteristics found. For example, a critical attribute of a manufactured part is measured in a random sampling of the production process, with nonconforming conditions being recorded for each sample. The collective number of failures from the sampling may be modeled using the Poisson distribution. It can also be used to project the number of accidents for the following year and their probable locations. The essential conditions for a random variable to follow a Poisson distribution are that counts are independent of each other and that the probability that a count occurs in an interval is the same for all intervals. The mean and the variance of the Poisson distribution are the same, and the standard deviation is the square root of the mean; hence µ = σ² and σ = √µ. The Poisson distribution can be used as an approximation to the binomial when p is equal to or less than 0.1 and the sample size n is fairly large (generally, n >= 16), by using np as the mean of the Poisson distribution. Considering f(x) as the probability of x occurrences in the sample/interval, λ as the mean number of counts in an interval (the mean µ above, with λ > 0), x as the number of defects/counts in the sample/interval and e as a constant approximately equal to 2.71828, the equation for the Poisson distribution is

$$f(x) = \frac{e^{-\lambda}\,\lambda^{x}}{x!}$$
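
A minimal Python sketch of the Poisson probability, using the standard formula f(x) = e^(−λ) λ^x / x!, is given below together with a comparison against the binomial. The values n = 50 and p = 0.02 are hypothetical and were chosen only because they satisfy the approximation conditions stated above.

```python
from math import comb, exp, factorial


def poisson_pmf(x, lam):
    """Probability of x occurrences when the mean number of occurrences is lam."""
    return exp(-lam) * lam**x / factorial(x)


def binomial_pmf(x, n, p):
    """Probability of exactly x defectives in a sample of n units."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)


n, p = 50, 0.02  # hypothetical sample size and defect rate
lam = n * p      # np is used as the mean of the Poisson distribution
for x in range(4):
    print(f"x={x}  binomial={binomial_pmf(x, n, p):.4f}  poisson={poisson_pmf(x, lam):.4f}")
```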

Normal Distribution A distribution is said to be normal when most of the observations are clustered around the mean. It charts a data set of which most of the data points are concentrated around the average (mean) in a symmetrical manner, thus forming a bell-shaped curve. The normal distribution’s shape is unique in that the most frequently occurring value is in the middle of the range and other probabilities tail off symmetrically in both directions. The normal distribution is used for continuous (measurement) data that is symmetric about the mean. The graph of the normal distribution depends on the mean and the variance. When the variance is large, the curve is short and wide and when the variance is small, the curve is tall and narrow.

The normal distribution is also called the Gaussian or bell-shaped distribution. In the standard normal distribution the population mean µ is zero and the population variance σ² equals one, as in the figure, and σ is the standard deviation. The normal probability density function is

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}}\, e^{-\frac{(x-\mu)^{2}}{2\sigma^{2}}}$$


For a normal distribution, about 68.27% of the area under the curve lies between µ − σ and µ + σ.
Z-transformation - The shape of the normal distribution depends on two factors, the mean and the standard deviation. Every combination of µ and σ represents a unique shape of a normal distribution. Using the mean and the standard deviation, any normal distribution can be converted into the simpler z-distribution. This process leads to the standardized normal distribution, Z = (X − µ)/σ, which has mean 0 and standard deviation 1. Because of the complexity of the normal distribution, the standardized normal distribution is often used instead.
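
The z-transformation can be illustrated with the short Python sketch below. The process mean of 50 and standard deviation of 2 are hypothetical values, and `z_score` is an illustrative name.

```python
from statistics import NormalDist


def z_score(x, mu, sigma):
    """Standardize x: Z = (X - mu) / sigma."""
    return (x - mu) / sigma


standard_normal = NormalDist()  # mean 0, standard deviation 1

# Share of the area within one standard deviation of the mean.
within_one_sigma = standard_normal.cdf(1) - standard_normal.cdf(-1)
print(f"Area between mu - sigma and mu + sigma = {within_one_sigma:.4f}")

# Hypothetical process: mean 50, standard deviation 2. Probability of a value below 53.
z = z_score(53, mu=50, sigma=2)
print(f"z = {z}, P(X < 53) = {standard_normal.cdf(z):.4f}")
```

Once a value is converted to z, a single standard normal table (or function) serves every combination of µ and σ.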

Chi-Square Distribution
The chi-square (χ²) distribution is used when testing a population variance against a known or assumed value of the population variance. It is skewed to the right, with a long tail toward the large values of the distribution. The overall shape of the distribution depends on the number of degrees of freedom in a given problem. When testing a variance from a sample, the degrees of freedom are 1 less than the sample size. The distribution is formed by adding the squares of standard normal random variables. For example, if z1, z2, ..., zn are independent standard normal random variables, then the following is a chi-square random variable (statistic) with n degrees of freedom

$$\chi^{2} = z_{1}^{2} + z_{2}^{2} + \cdots + z_{n}^{2}$$

The chi-square probability density function, where ν is the degrees of freedom and Γ(x) is the gamma function, is

$$f(x) = \frac{x^{\nu/2 - 1}\, e^{-x/2}}{2^{\nu/2}\,\Gamma(\nu/2)}, \quad x > 0$$

An example of a χ2 distribution with 6 degrees of freedom is as
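
The shape of the curve for 6 degrees of freedom can be tabulated with the sketch below, which evaluates the standard chi-square density f(x) = x^(ν/2 − 1) e^(−x/2) / (2^(ν/2) Γ(ν/2)). The function name `chi_square_pdf` and the chosen x values are illustrative.

```python
from math import exp, gamma


def chi_square_pdf(x, dof):
    """Chi-square probability density at x for dof degrees of freedom."""
    if x <= 0:
        return 0.0
    half = dof / 2
    return x ** (half - 1) * exp(-x / 2) / (2**half * gamma(half))


for x in (1, 2, 4, 6, 8, 12, 16):
    density = chi_square_pdf(x, 6)
    print(f"x={x:2d}  f(x)={density:.4f}  {'#' * round(density * 200)}")
```

The rows rise to a peak and then fall away slowly, which is the right-skewed shape described above.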

Student t Distribution
It was developed by W.S. Gosset. The t distribution is used to determine the confidence interval of the population mean and confidence statistics when comparing the means of sample populations, but the degrees of freedom for the problem must be known. The degrees of freedom are 1 less than the sample size. The Student's t distribution is a symmetrical continuous distribution similar to the normal distribution, but the extreme tail probabilities are larger than for the normal distribution for sample sizes of less than 31. The shape and area of the t distribution approach those of the normal distribution as the sample size increases. The t distribution can be used whenever samples are drawn from populations possessing a normal, bell-shaped distribution. There is a family of curves, one for each sample size from n = 2 to n = 31.

F Distribution
The F distribution or F-test is a tool used for assessing the ratio of independent variances or the equality of variances from two normal populations. It is used in the Analysis of Variance (ANOVA, a technique frequently used in the Design of Experiments to test for significant differences in variance within and between test runs). If U and V are the variances of independent random samples of size n and m taken from normally distributed populations with variances of w and z, then

$$F = \frac{U/w}{V/z}$$

is a random variable with an F distribution with ν1 = n − 1 and ν2 = m − 1 degrees of freedom. The F distribution is represented by

$$F = \frac{s_{1}^{2}}{s_{2}^{2}}$$

where s1² is the variance of the first sample (n1 − 1 degrees of freedom in the numerator) and s2² is the variance of the second sample (n2 − 1 degrees of freedom in the denominator), given two random samples drawn from a normal distribution. The shape of the F distribution is non-symmetrical and depends on the number of degrees of freedom associated with s1² and s2². The distribution for the ratio of sample variances is skewed to the right (toward the large values).
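
The ratio of two sample variances and its degrees of freedom can be computed as in the sketch below. The two samples are hypothetical, and the sketch stops at the F ratio: comparing it with a critical value from an F table is a separate step not shown here.

```python
from statistics import variance


def f_ratio(sample_1, sample_2):
    """Return the ratio of sample variances with numerator and denominator degrees of freedom."""
    s1_squared = variance(sample_1)  # sample variance, n1 - 1 in the divisor
    s2_squared = variance(sample_2)  # sample variance, n2 - 1 in the divisor
    return s1_squared / s2_squared, len(sample_1) - 1, len(sample_2) - 1


# Hypothetical measurements from two production lines.
line_a = [10.2, 9.8, 10.5, 10.1, 9.6, 10.4]
line_b = [10.0, 10.1, 9.9, 10.0, 10.2, 9.8]
f_value, dof_numerator, dof_denominator = f_ratio(line_a, line_b)
print(f"F = {f_value:.2f} with {dof_numerator} and {dof_denominator} degrees of freedom")
```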

Geometric distribution
It addresses the number of trials necessary before the first success. If the trials are repeated k times until the first success, we would have k − 1 failures. If p is the probability of a success and q = 1 − p the probability of a failure, the probability of the first success occurring at the kth trial is $P(k, p) = p\,q^{k-1}$, with mean $\mu = 1/p$ and standard deviation $\sigma = \sqrt{q}/p$.

Hypergeometric Distribution
The hypergeometric distribution applies when the sample (n) is a relatively large proportion of the population (n > 0.1N). The hypergeometric distribution is used when items are drawn from a population without replacement. That is, the items are not returned to the population before the next item is drawn out. The items must fall into one of two categories, such as good/bad or conforming/nonconforming. The hypergeometric distribution is similar in nature to the binomial distribution, except the sample size is large compared to the population. The hypergeometric distribution determines the probability of exactly x defects when n items are sampled from a population of N items containing D defects. The equation is

$$P(x) = \frac{\binom{D}{x}\binom{N-D}{n-x}}{\binom{N}{n}}$$

Here, x is the number of nonconforming units in the sample (r is sometimes used here if dealing with occurrences), D is the number of nonconforming units in the population, N is the finite population size and n is the sample size.
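
A small Python sketch of the hypergeometric probability, using the standard formula P(x) = C(D, x) C(N − D, n − x) / C(N, n), is shown below. The lot of 50 items with 5 nonconforming units and a sample of 10 is a hypothetical case in which the sample is 20% of the population.

```python
from math import comb


def hypergeometric_pmf(x, N, D, n):
    """Probability of exactly x nonconforming units in a sample of n drawn without replacement."""
    return comb(D, x) * comb(N - D, n - x) / comb(N, n)


# Hypothetical lot: N = 50 items, D = 5 nonconforming, sample of n = 10.
N, D, n = 50, 5, 10
for x in range(min(D, n) + 1):
    print(f"P({x} nonconforming) = {hypergeometric_pmf(x, N, D, n):.4f}")
```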

Bivariate Distribution
When two variables are distributed jointly the resulting distribution is a bivariate distribution. Bivariate distributions may be used with either discrete or continuous data. The variables may be completely independent or a covariance may exist between them. The bivariate normal distribution is a commonly used version of the bivariate distribution which may be used when there are two random variables. The equation, as given by Freund in 1962, is

$$f(x, y) = \frac{1}{2\pi\sigma_{1}\sigma_{2}\sqrt{1-\rho^{2}}}\, \exp\!\left\{-\frac{1}{2(1-\rho^{2})}\left[\left(\frac{x-\mu_{1}}{\sigma_{1}}\right)^{2} - 2\rho\left(\frac{x-\mu_{1}}{\sigma_{1}}\right)\left(\frac{y-\mu_{2}}{\sigma_{2}}\right) + \left(\frac{y-\mu_{2}}{\sigma_{2}}\right)^{2}\right]\right\}$$

With
−∞ < x < ∞
−∞ < y < ∞
−∞ < µ1 < ∞
−∞ < µ2 < ∞
σ1 > 0, σ2 > 0
µ1 and µ2 are the two population means
σ1² and σ2² are the two variances
ρ is the correlation coefficient of the random variables

Exponential Distribution
It is used to analyze reliability and to model items with a constant failure rate. The exponential distribution is related to the Poisson distribution and is used to determine the average time between failures or the average time between a number of occurrences. The mean and the standard deviation are µ = 1/λ and σ = 1/λ. For example, if there is an average of 0.50 failures per hour (discrete data - Poisson distribution), then the mean time between failures (MTBF) is 1 / 0.50 = 2 hours (continuous data - exponential distribution). If the number of occurrences per unit of time follows a Poisson distribution with rate λ, then the time between successive occurrences is exponentially distributed with mean 1/λ. The opposite is also true: if the times between occurrences are independent and exponentially distributed, then the number of occurrences in an interval follows a Poisson distribution. The exponential distribution equation is

$$f(x) = \lambda e^{-\lambda x} = \frac{1}{\mu}\, e^{-x/\mu}, \quad x \ge 0$$

Here µ is the mean (also sometimes referred to as θ), λ is the failure rate, which is the same as 1/µ, and x is the x-axis value. When this equation is integrated, it results in cumulative probabilities as

$$F(x) = P(X \le x) = 1 - e^{-\lambda x} = 1 - e^{-x/\mu}$$
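
The MTBF example can be carried a step further with the sketch below, which uses the standard exponential forms f(x) = λ e^(−λx) and F(x) = 1 − e^(−λx). The failure rate of 0.50 per hour comes from the example; the function names and the time points are illustrative.

```python
from math import exp

failure_rate = 0.50      # failures per hour, from the example
mtbf = 1 / failure_rate  # mean time between failures, in hours


def exponential_pdf(x, rate):
    """Exponential density at x for a constant failure rate."""
    return rate * exp(-rate * x) if x >= 0 else 0.0


def exponential_cdf(x, rate):
    """Probability that a failure occurs by time x."""
    return 1 - exp(-rate * x) if x >= 0 else 0.0


print(f"MTBF = {mtbf} hours")
for hours in (1, 2, 4):
    print(f"P(failure within {hours} h) = {exponential_cdf(hours, failure_rate):.4f}")
```

Note that the probability of a failure before the MTBF itself is about 63%, not 50%, because the distribution is skewed.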


Lognormal Distribution
The lognormal distribution is a right-skewed distribution (most of the data lie toward the left, with a long tail to the right) and is the distribution of a random variable whose logarithm follows the normal distribution. The lognormal distribution assumes only positive values. It is used to model various situations such as response time, time-to-failure data, and time-to-repair data. When the data follows a lognormal distribution, a transformation of data can be done to make the data follow a normal distribution. The most common transformation is made by taking the natural logarithm, but a logarithm of any base, such as base 10 or base 2, may be used. Then probabilities, confidence intervals and tests of hypothesis can be conducted on the transformed data (once it follows a normal distribution). The lognormal probability density function is

$$f(x) = \frac{1}{x\,\sigma\sqrt{2\pi}}\, e^{-\frac{(\ln x - \mu)^{2}}{2\sigma^{2}}}, \quad x > 0$$

Here µ is the location parameter or log mean and σ is the scale (or shape) parameter, the standard deviation of the natural logarithms of the individual values.

Figure: Lognormal distribution, and plotting the natural logarithm

Weibull Distribution
The Weibull distribution is one of the most widely used distributions in reliability and statistical applications and is similar in appearance to the lognormal. It can be used to measure time to fail, time to repair, and material strength. The shape and dispersion of the Weibull distribution depend on two parameters, β, the shape parameter, and θ, the scale parameter, both of which are greater than zero. The two-parameter and three-parameter Weibull are the common versions. The difference is that the three-parameter Weibull distribution has a location parameter, used when there is some non-zero time to first failure. In general, the probabilities from a Weibull distribution can be found from the cumulative Weibull function as

$$F(x) = P(X \le x) = 1 - e^{-(x/\theta)^{\beta}}$$

Here X is a random variable and x is an actual observation. The shape parameter (β) provides the Weibull distribution with its flexibility as
If β = 1, the Weibull distribution is identical to the exponential distribution.
If β = 2, the Weibull distribution is identical to the Rayleigh distribution.
If 3 < β < 4, then the Weibull distribution approximates a normal distribution.
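
The effect of the shape parameter can be explored with the sketch below, which uses the standard two-parameter cumulative form F(x) = 1 − e^(−(x/θ)^β). The scale of 1000 hours and the list of shape values are hypothetical.

```python
from math import exp


def weibull_cdf(x, shape, scale):
    """P(X <= x) for a two-parameter Weibull distribution."""
    if x <= 0:
        return 0.0
    return 1 - exp(-((x / scale) ** shape))


def exponential_cdf(x, mean):
    """P(X <= x) for an exponential distribution with the given mean."""
    return 1 - exp(-x / mean)


theta = 1000  # hypothetical scale parameter, in hours
for beta in (0.5, 1, 2, 3.5):
    print(f"beta={beta}: P(failure by 500 h) = {weibull_cdf(500, beta, theta):.4f}")

# With beta = 1 the Weibull and exponential probabilities coincide.
print(weibull_cdf(500, 1, theta), exponential_cdf(500, theta))
```

Whatever the shape, the probability of failure by x = θ is 1 − e^(−1), about 63.2%, which is why θ is also called the characteristic life.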

3.5. Measurement System Analysis
Measurement
Attribute Screens - Attribute screens use two categories for determining data outcomes: acceptable or not acceptable, go or no go, pass or fail. This screen is typically used when the percentage of nonconforming material is high or not known. A screen should evaluate the attributes that are most helpful in identifying major problems with a product or process.
Gauge Blocks - Gauge blocks are used in manufacturing to set a length dimension for transfer or for tool calibration. Sets of these blocks usually come in groups of eight to eighty-one. Gauge blocks are accurate to within a few millionths of an inch.

Measuring Tools
Various measurement tools are
Calipers – They measure distance, depth, height, or length from either an inside or outside perspective. Most calipers capture physical measurements which are transferred to a scale to determine the data. Calipers are of types like
Spring calipers – The two sides are connected by a spring to measure difficult-to-reach areas. They are accurate to a tenth of an inch and use a steel ruler to transfer the measurement.
Vernier calipers - They use a vernier scale and are accurate to one thousandth of an inch.
Digital calipers - They use an electronic readout and are accurate to five thousandths of an inch.
Optical Comparators – They compare a part to a form that represents the desired dimensions by projecting a beam of light so that a shadow of the object, magnified by a lens, can be checked against tolerance levels.
Micrometers – Also called "mics", they are handheld measuring devices with a C frame, with the measurement occurring between a fixed anvil and a movable spindle. They are similar to calipers, with a finely threaded screw and a head that shows the amount of screw movement during use. They measure items by a combination of readings on a barrel and thimble, with accuracy to one thousandth of an inch.

Measurement System
In order to ensure a measurement method is accurate and producing quality results, a method must be defined to test the measurement process as well as ensure that the process yields data that is statistically stable. Measurement Systems Analysis (MSA) refers to the analysis of precision and accuracy of measurement methods. It is an experimental and mathematical method of determining how much the variation within the measurement process contributes to overall process variability. The characteristics that contribute to the effectiveness of a measurement method are
Accuracy - It is the nearness of the measured result to the reference value; an unbiased true value is normally reported. It has different components as
Bias - It is the systematic difference between the average measured value and a reference value. The reference value is an agreed standard, such as a standard traceable to a national standards body. When applied to attribute inspection, bias refers to the ability of the attribute inspection system to produce agreement on inspection standards. Bias is controlled by calibration, which is the process of comparing measurements to standards.
Linearity – It is the difference in bias through the range of measurements. How does the size of the part affect the accuracy of the measurement method?
Stability – It is the change of bias over time and usage. How accurately does the measurement method perform over time?
Sensitivity - The gage should be sensitive enough to detect differences in measurement as slight as one-tenth of the total tolerance specification or process spread.
Precision - It is the ability to repeat the same measurement by the same operator at or near the same time, that is, the nearness of repeated measurements to one another. Its components are
Reproducibility - The reproducibility of a single gage is customarily checked by comparing the results of different operators taken at different times. It is the variation in the average of the measurements made by different appraisers using the same measuring instrument when measuring the identical characteristic on the same part.
Repeatability - It is the variation in measurements obtained with one measurement instrument when used several times by one appraiser, while measuring the identical characteristic on the same part. Variation obtained when the measurement system is applied repeatedly under the same conditions is usually caused by conditions inherent in the measurement system.
Repeatability serves as the foundation that must be present in order to achieve reproducibility. Reproducibility must be present before achieving accuracy. Precision requires that the same measurement results are achieved for the condition of interest with the selected measurement method. A measurement method must first be repeatable: a user of the method must be able to repeat the same results given multiple opportunities with the same conditions. The method must then be reproducible: several different users must be able to use it and achieve the same measurement results. Finally, the measurement method must be accurate: the results the method produces must hold up to an external standard or a true value given the condition of interest.


Gauge R and R Studies - Assuming that a gauge is determined to be accurate (that is, the measurements generated by the gauge are the same as those of a recognized standard), the measurements produced must be repeatable and reproducible. A study must be conducted to understand how much of the variance (if any) observed in the process is due to variation in the measurement system. The most widely used methods to quantify measurement errors are
Range Method - The range method is a simple way to quantify the combined repeatability and reproducibility of a measurement system.
Average and Range Method - The average and range method computes the total measurement system variability, and allows the total measurement system variability to be separated into repeatability, reproducibility, and part variation. It is outlined by AIAG and is a control chart model using averages and ranges to study variability in measurement methods. This model requires two or three replications (r), by two or three appraisers (k), on 10 parts (n). The average range value is computed as

The average range value is proportionate to the standard deviation of the process. The average range provides another source of understanding the variation using a specific measurement method.
Analysis of Variance Method - ANOVA is the most accurate method for quantifying repeatability and reproducibility and allows the variability of the interaction between the appraisers and the parts to be determined. It separates the total variability found within a data set into random and systematic factors. The random factors do not have any statistical influence on the given data set, while the systematic factors do. It is used mainly to compare the means of two or more samples, though estimates of variance are the key intermediate statistics calculated. For example, to check the quality of packing boxes being manufactured at different factories of a company with the same manufacturing setup and output, box samples must be inspected before they reach the customer. Even though there is low variation in the size or other characteristics of the boxes, ANOVA answers the question of whether the differences (variance) in the means for the boxes made at the different factories are "large" compared to the differences (variance) in the boxes made within each factory. Hence, an ANOVA computation compares the variances among the means to the variances within the samples. What it takes to be "large enough" for the difference to be statistically significant depends on the sample sizes and the amount of certainty that is needed in testing. ANOVA can also report on the interaction between those involved in looking at the measurement method and the attributes/parts themselves. ANOVA partitions the total variation. The steps of the study are
Choose a small number of parts (usually ten or fewer) in a random manner.
Select a characteristic to be measured.
Number the parts to identify each part specifically.
Select a few technicians or inspectors - usually five or fewer.
Require technicians or inspectors to measure the parts using the same measuring device.
Repeat above step to obtain two complete sets of data.
Conduct an ANOVA analysis beginning with the construction of an ANOVA table.
The observed value using an ANOVA study is
Observed Value = Part Mean + Bias + Part Effect + Appraiser Effect + Replication Error
or
Observed Value = Reference Value + Deviation
and in equation format

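$$x_{ijm} = \mu + b + \alpha_i + \beta_j + \varepsilon_{ijm}$$

where $x_{ijm}$ is the mth measurement taken by appraiser j on the ith part, $\mu$ is the part mean, $b$ is the bias, $\alpha_i$ is the part effect, $\beta_j$ is the appraiser effect and $\varepsilon_{ijm}$ is the replication error. Assuming that all of the $\alpha_i$, $\beta_j$ and $\varepsilon_{ijm}$ are independent and normally distributed with mean zero and variances $\sigma_\alpha^2$, $\sigma_\beta^2$ and $\sigma_\varepsilon^2$, the total variance is given by

$$\sigma_{total}^2 = \sigma_\alpha^2 + \sigma_\beta^2 + \sigma_\varepsilon^2$$

where $\sigma_\alpha^2$, $\sigma_\beta^2$ and $\sigma_\varepsilon^2$ are the variances due to the part effect, the appraiser effect, and the replication error.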
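The ANOVA study described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a fully crossed study in which every appraiser measures every part the same number of times, the usual expected-mean-square estimators for a two-way random-effects model (including a part-appraiser interaction term), and negative estimates truncated at zero. The function name gauge_rr_variance_components and the data layout are hypothetical.

```python
from statistics import mean


def gauge_rr_variance_components(data):
    """Estimate variance components from a crossed gauge study.

    data[i][j] is the list of repeated readings of part i by appraiser j.
    Needs at least 2 parts, 2 appraisers and 2 readings per cell, and
    every cell must hold the same number of readings.
    """
    p = len(data)        # number of parts
    a = len(data[0])     # number of appraisers
    r = len(data[0][0])  # replications per part/appraiser cell

    grand = mean(x for part in data for cell in part for x in cell)
    part_means = [mean(x for cell in part for x in cell) for part in data]
    appr_means = [mean(x for part in data for x in part[j]) for j in range(a)]
    cell_means = [[mean(cell) for cell in part] for part in data]

    # Sums of squares of the two-way ANOVA table.
    ss_part = a * r * sum((m - grand) ** 2 for m in part_means)
    ss_appr = p * r * sum((m - grand) ** 2 for m in appr_means)
    ss_cells = r * sum((cell_means[i][j] - grand) ** 2
                       for i in range(p) for j in range(a))
    ss_inter = ss_cells - ss_part - ss_appr
    ss_error = sum((x - cell_means[i][j]) ** 2
                   for i in range(p) for j in range(a) for x in data[i][j])

    # Mean squares (sum of squares divided by degrees of freedom).
    ms_part = ss_part / (p - 1)
    ms_appr = ss_appr / (a - 1)
    ms_inter = ss_inter / ((p - 1) * (a - 1))
    ms_error = ss_error / (p * a * (r - 1))

    # Variance components; negative estimates are truncated at zero.
    components = {
        "repeatability": ms_error,
        "interaction": max(0.0, (ms_inter - ms_error) / r),
        "appraiser": max(0.0, (ms_appr - ms_inter) / (p * r)),
        "part": max(0.0, (ms_part - ms_inter) / (a * r)),
    }
    components["total"] = sum(components.values())
    return components
```

The repeatability component corresponds to the replication error, and the appraiser and interaction components together describe reproducibility.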

Measurement Correlation
Measurement correlation is the comparison of the measurement values from one measurement system with the corresponding values reported by one or more different measurement systems. A measurement system or device can be compared against a known standard. It may also be compared against the mean and standard deviation of multiple other similar devices, all reporting measurements of the same or similar artifacts; this is often referred to as proficiency testing or round robin testing. Measurement correlation can also mean the comparison of values obtained using different measurement methods that measure different properties. Examples are the correlation of hardness and strength of a metal, temperature and linear expansion of an item being heated, and weight and piece count of small parts. Such comparisons may also identify issues with the measuring device that can be corrected. Besides repeatability and reproducibility, other components whose combined effect explains measurement correlation are bias, linearity and P/T variation.

Bias
Bias is often due to human error. Whether intentional or not, bias can cause inaccurate or misleading results. In other words, bias causes a difference between the output of the measurement method and the true value. Types of bias include
Participants tend to remember their previous assessments. To counter this, collect assessment sheets immediately after each trial, change the order of the inputs, transactions or questions, and include an adequate waiting period after the initial trial to make remembering details of the trial less likely.
Participants spend extra time when they know they are being evaluated, so give specific time frames.
Equipment is set wrong.
If an instrument underestimates, the bias is negative. If an instrument overestimates, the bias is positive. The equation for bias is

$$\text{Bias} = \frac{1}{n}\sum_{i=1}^{n} X_i - T$$

where n is the number of times the standard is measured, $X_i$ is the ith measurement and T is the value of the standard.

Linearity
Linearity is the variation in accuracy (bias) against known standards throughout the operating range of the gauge. The purpose of measuring linearity is to determine the reliability of a measuring instrument by indicating any linearity error, that is, any change in the accuracy of the instrument across its range. When measuring linearity, fit a "best fit" straight line through the data points (bias plotted against the reference value) and read off its slope, b. Typically, the lower the absolute value of the slope, the better the linearity. The percent linearity is given by the slope of the best-fit line, expressed as a percentage, and the linearity is equal to the slope multiplied by the process variation Vp (tolerance or spread), with the slope taken in absolute value, as

$$\text{Linearity} = |b| \times V_p, \qquad \%\,\text{Linearity} = 100 \times |b|$$

The bias at any point can be estimated from the slope, b, and the y-intercept, a, of the best-fit line, as

$$\text{Bias} = a + b \times (\text{reference value})$$

If the gauge linearity error is relatively high, possible causes are that the gauge is not calibrated properly at both the lower and upper ends of its operating range, that there are errors in the minimum or maximum master, that the gauge is worn, or that the gauge has faulty internal design characteristics.
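The bias and linearity calculations can be illustrated with a short sketch. It assumes that the bias at each reference value has already been computed as the average reading minus the standard, and that the best-fit line is an ordinary least-squares line; the function names (mean_bias, gauge_linearity, estimated_bias) are hypothetical.

```python
def mean_bias(measurements, standard):
    """Average of repeated readings of a standard minus its known value."""
    return sum(measurements) / len(measurements) - standard


def fit_line(x, y):
    """Ordinary least-squares line y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def gauge_linearity(reference_values, biases, process_variation):
    """Linearity = |slope| * process variation; percent linearity = 100 * |slope|."""
    intercept, slope = fit_line(reference_values, biases)
    return {
        "intercept": intercept,
        "slope": slope,
        "linearity": abs(slope) * process_variation,
        "percent_linearity": 100.0 * abs(slope),
    }


def estimated_bias(intercept, slope, reference_value):
    """Bias predicted by the best-fit line at a given reference value."""
    return intercept + slope * reference_value
```

For instance, calling gauge_linearity with five reference values spread across the operating range, the bias found at each, and the process variation returns the slope together with the linearity figures discussed above.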

Percent Agreement
Percent agreement between the measurement system and either reference values or the true value of a variable being measured can be estimated using a correlation coefficient, "r". If r = ±1.0, there is 100 percent agreement, and if r = 0, there is 0 percent agreement between the measurement system values and the reference or true values.

Precision-Tolerance Ratio
Precision/Tolerance (P/T) is the ratio between the estimated measurement error (precision) and the tolerance of the characteristic being measured, as

$$P/T = \frac{6\sigma_E}{\text{Tolerance}}$$

where $\sigma_E$ is the standard deviation of the measurement system variability, so that $6\sigma_E$ is the spread of the measurement error.

The P/T ratio needs to be small to minimize the effect of measurement error. As the P/T ratio becomes larger, the measurement method loses its ability to indicate a real change in the process. Values of the estimated P/T ratio of 0.1 or less are often taken to imply adequate gauge capacity. This is based on the generally used rule that a measurement device should be calibrated in units one-tenth as large as the accuracy required in the final measurement. The rule is not applicable every time; the essential requirement is that the gauge be accurate and precise enough for the analyst to make the correct decision about the product. The formula for the P/T ratio assumes that measurement errors are independent, that measurement errors are normally distributed and that measurement error is independent of the magnitude of the measurement.
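As a small illustration of the P/T calculation, the sketch below computes the ratio and applies the 0.1 rule of thumb. The function names and the default limit argument are hypothetical, and the tolerance is taken as the full width between the specification limits.

```python
def precision_to_tolerance(sigma_e, tolerance):
    """P/T ratio: six measurement-error standard deviations over the tolerance."""
    if tolerance <= 0:
        raise ValueError("tolerance must be positive")
    return 6.0 * sigma_e / tolerance


def gauge_is_adequate(sigma_e, tolerance, limit=0.1):
    """Apply the rule of thumb that a P/T ratio of 0.1 or less is adequate."""
    return precision_to_tolerance(sigma_e, tolerance) <= limit
```

The limit is exposed as a parameter because, as noted above, the one-tenth rule does not apply in every situation.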

Metrology
Metrology is the science of measurement. The word metrology derives from two Greek words: metron (meaning measure) and logos (meaning logic). Metrology involves the following
The establishment of measurement standards that are both internationally accepted and definable
The use of measuring equipment to correlate the extent to which product and process data conform to specification
The regular calibration of measuring equipment, traceable to established international standards

Measurement Error
Measurement error is the degree to which the reading of a measuring instrument differs from the true value. The error of a measuring instrument is the indication of the measuring instrument minus the true value, as

$$\text{Error} = \text{Indicated value} - \text{True value}$$

Measurement error is due to factors such as
Operator variation - This occurs when the same operator obtains varying results when using the same equipment with the same standards.
Operator to operator variation - This occurs when two or more operators obtain different results while using the same equipment with the same standards.
Equipment variation - The equipment exhibits erratic measurement results.
Process variation - This occurs when there are two or more methods for using the measurement equipment and those methods yield different results.
Other variation - It includes material variation, software variation, etc.
According to the central limit theorem, the confidence interval for the mean of measurements is narrowed by obtaining multiple readings, using the following relationship

$$\sigma_{\bar{x}} = \frac{\sigma}{\sqrt{n}}$$

where $\sigma$ is the standard deviation of individual readings and n is the number of readings averaged.

Total Product Variability
The total variability observed in a product includes the variability of the measurement process, as

$$\sigma_{total}^2 = \sigma_{product}^2 + \sigma_{measurement}^2$$
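The two relationships above, the standard error of an averaged reading and the way product and measurement variability add, can be sketched as follows. The sketch assumes that product and measurement variation are independent so that their variances add; the function names are hypothetical.

```python
import math


def standard_error(sigma, n):
    """Standard deviation of the mean of n independent readings."""
    return sigma / math.sqrt(n)


def total_sigma(sigma_product, sigma_measurement):
    """Observed standard deviation when product and measurement variances add."""
    return math.sqrt(sigma_product ** 2 + sigma_measurement ** 2)


def product_sigma(sigma_total, sigma_measurement):
    """Back out the product standard deviation from the observed total."""
    return math.sqrt(sigma_total ** 2 - sigma_measurement ** 2)
```

Averaging four readings, for example, halves the standard error of the reported value.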

Calibration
Calibration is the comparison of a measurement standard or instrument of known accuracy with another standard or instrument to detect, correlate, report or eliminate by adjustment any variation in the accuracy of the item being compared. The elimination of measurement error is the primary goal of calibration systems. A calibration system
Ensures that products and services meet the tolerance range and quality specifications. A well-maintained calibration system has a positive impact on the quality of products and services offered to the customer.
Ensures that measuring equipment is recalled from use when it is time to be recalibrated. Periodic recalibration of measuring and test equipment is necessary for measurement accuracy.
Ensures that measuring equipment is removed from use when it is incapable of performing its function with an agreed level of accuracy.
Calibration achieves the following goals
Reduce quality costs through the early detection of nonconforming products and processes with the use of measuring equipment of known accuracy
Provide customers with an indication of a supplier's calibration capabilities

Calibration Schedule - Measuring equipment should be calibrated before initial use and periodically recalibrated as often as necessary to maintain prescribed accuracies. When production is continuous, a frequency (or interval) is usually established. When production is sporadic, calibration is often done on a "prior to use" basis. The recalibration interval will depend on variables such as historical information, stability, purpose, extent of use, tendency to wear or drift, how critical the measurement is, the cost of an inaccurate measurement, the environment in which the equipment is used, etc. Measuring and test equipment should be traceable to records that indicate the date of the last calibration, by whom it was calibrated and when the next calibration is due. Coding is sometimes used. It is generally accepted that the interval of calibration of measuring equipment be based on stability, purpose and degree of usage. The stability of a measuring instrument refers to its ability to consistently maintain its metrological characteristics over time. The purpose is important: in general, critical applications increase the calibration frequency and minor applications decrease it. The degree of usage refers to how often an instrument is utilized and to what environmental conditions it is exposed.
Calibration Standards - In the SI system, most of the fundamental units are defined in terms of natural phenomena that are unchangeable. This recognized true value is called the standard. Primary reference standards consist of copies of the international kilogram plus measuring systems which are responsive to the definitions of the fundamental units and to the derived units of the SI table. National standards are taken as the central authority for measurement accuracy, and all levels of working standards are traceable to this "grand" standard.

3.6. Control Chart
Control charts can be either univariate, when they monitor a single CTQ characteristic of a product or service, or multivariate, when they monitor more than one CTQ. Univariate control charts are further classified according to whether they monitor attribute data or variable data. A typical control chart plots sample statistics and is made up of at least four lines: a vertical axis that measures the level of the sample means, the two outermost horizontal lines for the UCL and the LCL, and the center line, which represents the mean of the process. If all of the points plot between the UCL and the LCL in a random manner, the process is considered to be in control. This means that the variations are random and stay within the control limits, so the process trends can be predicted because the variations are strictly due to common causes. Control charts help prevent the process from going out of control by detecting assignable causes of variation in time, and they discourage unnecessary adjustments. They also determine the natural range (control limits) of a process so that the range can be compared with the specified limits. Control charts also inform about process capability and stability. As a tool for constant process monitoring, they facilitate the planning of production resource allocation. Control limits on a control chart are readjusted every time a significant shift in the process occurs.
As per the Western Electric (WECO) rules, a process is said to be out of control if one of the following occurs
A single point falls outside the 3σ limit
Two out of three successive points fall beyond the 2σ limits
Four out of five successive points fall beyond 1σ from the mean
Eight successive points fall on one side of the center line
Attribute data univariate charts - The characteristics of attribute data resemble binary data: they can only take one of two given forms, like conforming or not conforming, good or bad, etc. Attribute data must be transformed into discrete data (counts) to be meaningful. The types of charts used for attribute data are
The p-chart - The p-chart is used when dealing with ratios, proportions, or percentages of conforming or nonconforming parts in a given sample. A good example for a p-chart is the inspection of products on a production line. They are either conforming or nonconforming. The probability distribution used in this context is the binomial distribution with p for the nonconforming proportion and q (which is equal to 1 − p) for the proportion of conforming items. Because the products are only inspected once, the experiments are independent from one another. The first step when creating a p-chart is to calculate the proportion of nonconformity for each sample as p = m/b, where m represents the number of nonconforming items, b is the number of items in the sample, and p is the proportion of nonconformity. The mean proportion is computed as

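$$\bar{p} = \frac{p_1 + p_2 + \cdots + p_k}{k}$$

where k is the number of samples audited and $p_k$ is the kth proportion obtained. The control limits of a p-chart are

$$UCL = \bar{p} + 3\sqrt{\frac{\bar{p}(1-\bar{p})}{b}}, \qquad CL = \bar{p}, \qquad LCL = \bar{p} - 3\sqrt{\frac{\bar{p}(1-\bar{p})}{b}}$$

where b is the number of items in the sample.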

The benefit of the p-chart is that its control limits adapt to the sizes of the samples, so the process can still be monitored when the sample sizes or the numbers of defects found vary from sample to sample.
The np-chart - The np-chart is one of the easiest to build. While the p-chart tracks the proportion of nonconformities per sample, the np-chart plots the number of nonconforming items per sample. The audit process of the samples follows a binomial distribution; in other words, the expected outcome is "good" or "bad," and therefore the mean number of successes is np. With n the sample size and $\bar{p}$ the mean proportion nonconforming, the control limits for an np-chart are

$$UCL = n\bar{p} + 3\sqrt{n\bar{p}(1-\bar{p})}, \qquad CL = n\bar{p}, \qquad LCL = n\bar{p} - 3\sqrt{n\bar{p}(1-\bar{p})}$$
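The standard three-sigma limits for p-charts and np-charts can be sketched as below. The sketch assumes a constant sample size, and it clips a negative lower limit to zero (and an upper p-chart limit to one), which is common practice but an assumption here. The function names are hypothetical.

```python
import math


def p_chart_limits(nonconforming_counts, sample_size):
    """Return (LCL, CL, UCL) for a p-chart with a constant sample size."""
    proportions = [m / sample_size for m in nonconforming_counts]
    p_bar = sum(proportions) / len(proportions)
    spread = 3.0 * math.sqrt(p_bar * (1.0 - p_bar) / sample_size)
    return max(0.0, p_bar - spread), p_bar, min(1.0, p_bar + spread)


def np_chart_limits(nonconforming_counts, sample_size):
    """Return (LCL, CL, UCL) for an np-chart with a constant sample size."""
    p_bar = sum(nonconforming_counts) / (len(nonconforming_counts) * sample_size)
    center = sample_size * p_bar
    spread = 3.0 * math.sqrt(center * (1.0 - p_bar))
    return max(0.0, center - spread), center, center + spread
```

Both functions take the number of nonconforming items found in each sample; the np-chart limits are simply the p-chart limits multiplied by the sample size (before clipping).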

The c-chart - The c-chart monitors the process variations due to the fluctuations of defects per item or group of items. The c-chart is useful when the process engineer needs to know not just how many items are not conforming but how many defects there are per item. Knowing how many defects there are on a given part produced on a line might in some cases be as important as knowing how many parts are defective. Here, nonconformities must be distinguished from defective items because there can be several nonconformities on a single defective item. The probability for a nonconformity to be found on an item in this case follows a Poisson distribution. If the sample size does not change and the defects on the items are fairly easy to count, the c-chart becomes an effective tool to monitor the quality of the production process. If $\bar{c}$ is the average number of nonconformities on a sample, the UCL and the LCL are given as

$$UCL = \bar{c} + 3\sqrt{\bar{c}}, \qquad CL = \bar{c}, \qquad LCL = \bar{c} - 3\sqrt{\bar{c}}$$

The u-chart - One of the premises for a c-chart is that the sample sizes have to be the same. The sample sizes can vary when a u-chart is being used to monitor the quality of the production process, and the u-chart does not require any limit to the number of potential defects. Further, for a p-chart or an np-chart the number of nonconformities cannot exceed the number of items in a sample, but for a u-chart this is conceivable because what is being addressed is not the number of defective items but the number of defects in the sample. The first step in creating a u-chart is to calculate the number of defects per unit for each sample as u = c/n, where u represents the average number of defects per unit in the sample, c is the total number of defects in the sample, and n is the sample size. Once all the averages are determined, a distribution of the means is created and the mean of the distribution is computed as

$$\bar{u} = \frac{u_1 + u_2 + \cdots + u_k}{k}$$

where k is the number of samples. The control limits are determined based on $\bar{u}$ and the mean sample size, $\bar{n}$, as

$$UCL = \bar{u} + 3\sqrt{\frac{\bar{u}}{\bar{n}}}, \qquad CL = \bar{u}, \qquad LCL = \bar{u} - 3\sqrt{\frac{\bar{u}}{\bar{n}}}$$
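A minimal sketch of the c-chart and u-chart limits follows. It takes the u-chart center line as the mean of the per-sample rates, as described in the text; many references instead divide the total number of defects by the total number of units, which gives the same result when all sample sizes are equal. Negative lower limits are clipped to zero as an assumption, and the function names are hypothetical.

```python
import math


def c_chart_limits(defect_counts):
    """Return (LCL, CL, UCL) for a c-chart (constant sample size)."""
    c_bar = sum(defect_counts) / len(defect_counts)
    spread = 3.0 * math.sqrt(c_bar)
    return max(0.0, c_bar - spread), c_bar, c_bar + spread


def u_chart_limits(defect_counts, sample_sizes):
    """Return (LCL, CL, UCL) for a u-chart using the mean sample size."""
    rates = [c / n for c, n in zip(defect_counts, sample_sizes)]
    u_bar = sum(rates) / len(rates)               # mean defects per unit
    n_bar = sum(sample_sizes) / len(sample_sizes)  # mean sample size
    spread = 3.0 * math.sqrt(u_bar / n_bar)
    return max(0.0, u_bar - spread), u_bar, u_bar + spread
```

When sample sizes differ widely, limits are often computed separately for each sample size rather than from the mean sample size used here.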

Variable control charts - Control charts monitor not only the means of the samples for CTQ characteristics but also the variability of those characteristics. When the characteristics are measured as variable data (length, weight, diameter, and so on), the $\bar{X}$-charts, S-charts, and R-charts are used. These control charts are used more often and they are more efficient in providing feedback about the process performance. The principle underlying the building of the control charts for variables is the same as that of the attribute control charts. The whole idea is to determine the mean, the standard deviation, and the distance between the mean and the control limits based on the standard deviation.

$\bar{X}$-charts and R-charts - These are similar to attribute control charts, but quantitative measurements are considered for the CTQ characteristics instead of qualitative attributes. The $\bar{X}$- and R-charts together monitor the sample means and the variation through their spread. Samples are taken, and the mean $\bar{X}$ and the range R of each sample are derived and plotted on two separate charts. The CL is determined by averaging the $\bar{X}$s as $\bar{\bar{X}} = (\bar{X}_1 + \bar{X}_2 + \cdots + \bar{X}_k)/k$, where k is the number of samples. The UCL and the LCL are $UCL = \bar{\bar{X}} + 3\sigma_{\bar{X}}$, $CL = \bar{\bar{X}}$ and $LCL = \bar{\bar{X}} - 3\sigma_{\bar{X}}$, where $\sigma_{\bar{X}}$ is the standard deviation of the sample means. The mean range and the standard deviation for normally distributed data are linked as $\sigma = \bar{R}/d_2$, where the constant $d_2$ is a function of the sample size n.

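Standard error-based $\bar{X}$-chart - Based on the Central Limit Theorem, the standard deviation used for the control limits is the standard deviation of the process divided by the square root of the sample size n, as

$$UCL = \bar{\bar{X}} + 3\frac{\sigma}{\sqrt{n}}, \qquad CL = \bar{\bar{X}}, \qquad LCL = \bar{\bar{X}} - 3\frac{\sigma}{\sqrt{n}}$$

Mean range-based $\bar{X}$-chart - With small sample sizes (n ≤ 10), the range can be used in place of the standard deviation when constructing a control chart. The relative range is $W = R/\sigma$, whose expected value is the constant $d_2$, and the mean range is $\bar{R} = (R_1 + R_2 + \cdots + R_k)/k$, where $R_k$ is the range of the kth sample. Therefore, the estimator of σ is $\hat{\sigma} = \bar{R}/d_2$. The formulas for the control limits are

$$UCL = \bar{\bar{X}} + \frac{3\bar{R}}{d_2\sqrt{n}}, \qquad CL = \bar{\bar{X}}, \qquad LCL = \bar{\bar{X}} - \frac{3\bar{R}}{d_2\sqrt{n}}$$

R-chart - The center line is $\bar{R}$ and the estimator of the standard deviation of the range is $\sigma_R = d_3\hat{\sigma}$. The control limits are

$$UCL = \bar{R} + 3d_3\frac{\bar{R}}{d_2}, \qquad CL = \bar{R}, \qquad LCL = \bar{R} - 3d_3\frac{\bar{R}}{d_2}$$

with the LCL set to zero whenever the formula gives a negative value.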
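The mean range-based limits for the X-bar chart and the R-chart can be sketched as follows. The d2 and d3 values are the standard tabulated control chart constants, listed here only for subgroup sizes 2 to 5; the function name xbar_r_limits and the returned dictionary layout are hypothetical, and all subgroups are assumed to have the same size.

```python
import math

# Standard control chart constants, indexed by subgroup size n.
D2 = {2: 1.128, 3: 1.693, 4: 2.059, 5: 2.326}
D3 = {2: 0.853, 3: 0.888, 4: 0.880, 5: 0.864}


def xbar_r_limits(subgroups):
    """Return (LCL, CL, UCL) tuples for the X-bar chart and the R-chart."""
    n = len(subgroups[0])  # subgroup size (2 to 5 in this sketch)
    means = [sum(s) / n for s in subgroups]
    ranges = [max(s) - min(s) for s in subgroups]

    grand_mean = sum(means) / len(means)  # center line of the X-bar chart
    r_bar = sum(ranges) / len(ranges)     # center line of the R-chart
    sigma_hat = r_bar / D2[n]             # estimate of the process sigma

    x_spread = 3.0 * sigma_hat / math.sqrt(n)
    r_spread = 3.0 * D3[n] * sigma_hat    # three standard deviations of the range

    return {
        "sigma_hat": sigma_hat,
        "xbar": (grand_mean - x_spread, grand_mean, grand_mean + x_spread),
        "r": (max(0.0, r_bar - r_spread), r_bar, r_bar + r_spread),
    }
```

These limits agree, up to rounding of the tabulated constants, with the A2, D3 and D4 factor form often used in practice.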

3.7. Process Capability and Performance
Process capability is a predictable pattern of statistically stable behavior in which the chance causes of variation are compared to the engineering specifications. A capable process is a process whose spread on the bell-shaped curve is narrower than the tolerance range.

Process Capability Studies
A process capability study attempts to quantify whether a process can consistently meet the standards set by internal or external customers. Since this study yields a prediction, and predictions should be made from relatively stable processes, a process capability study should only be used in a relatively controlled and stable process environment. Measuring capability can be challenging because it is, by definition, a point estimate. Every process has unpredictable instability, which creates an inherent risk of estimation errors. Since no confidence interval is attached to the mean and standard deviation, there is no confidence interval for capability, and therefore the risk cannot be quantified. The user must accept the risk of variability related to instability. If the variation is due to a common cause, the output will still form a distribution that is relatively stable, as the variation is constant. In this case, a process capability study may be completed. If the variation is the result of a special cause, the output is not as stable and not as predictable, and a process capability study may have problems with its accuracy.

The objective of a process capability study is to establish a state of control over the manufacturing process and then to maintain that state of control through time.
Study Procedure - It includes the following steps
Select a process to study. The process should be critical and can be selected using techniques like a Pareto analysis or a cause-and-effect diagram.
Verify or define the process parameters. Verify what the process entails and its boundaries, and gain agreement on the definition of the process. Many of these steps are completed when developing a process map.
Conduct a measurement systems analysis to ensure that the measurement methods produce sound data.
Select a process capability analysis method like Cpk, Cp, Ppk and Pp.
Obtain the data and conduct an analysis.
Develop an estimate of the process capability. This estimate can be compared to the standards set by internal or external customers.
After completing a process capability study, address any special causes of variation that can be isolated. If possible, eliminate the special causes that are not desirable. In some cases, a special cause of variation may be desirable if it produces a better product or output. In that circumstance, if possible, attempt to make the special cause a common cause to ensure the benefit is achieved equally on all output.
Identifying Characteristics - Characteristics selected to be part of a process capability study should meet certain requirements
The characteristic should be important relative to the quality of the product or process. A process may have 15 characteristics, but only one or two should be selected for inclusion in the process capability study. The characteristics are Ys, or outcomes of process steps, that meet customer requirements. The Ys are changed by changing the Xs, or inputs.
The characteristic's value should be adjustable.
The operating parameters that influence the characteristic should be able to be determined and controlled.
Sometimes, the characteristic selected has a history of being the most difficult item to control.
Identifying Specifications/Tolerances - The process specifications or tolerances are determined by customer requirements, industry standards, or the organization's engineering department.
Developing Sampling Plans - If the process fits a normal distribution and is in statistical control, then the standard deviation can be estimated from

$$\hat{\sigma} = \frac{\bar{R}}{d_2}$$

For new processes, for example for a project proposal, a pilot run may be used to estimate the process capability.
Specification Limits - Specification limits are set by the customer, and result from either customer requirements or industry standards. The amount of variance (process spread) the customer is willing to accept sets the specification limits. For example, a customer wants a supplier to produce 12-inch rulers. Specifications call for an acceptable variation of +/- 0.03 inches on each side of the target (12.00 inches). The customer is saying acceptable rulers will be from 11.97 to 12.03 inches. If the process is not meeting the customer's specification limits, two choices exist to correct the situation
Change the process's behavior.
Change the customer's specification (requires customer approval).
Examples of Specification Limits - Specification limits are commonly found in
Blueprints
Engineering drawings and specs
Industry standards
Self-imposed standards within a shop
Federally mandated standards (e.g., emissions controls)
Verifying Stability and Normality - If only common causes of variation are present in a process, then the output of the process forms a distribution that is stable over time and is predictable. If special causes of variation are present, the process output is not stable over time. Three situations call for action
Capable but not stable - While the process is currently capable, stability may need to be improved to assure continued capability.
Stable but not capable - Since the process is stable, we can be reasonably sure that the estimated lack of capability is correct. The process must be improved to become capable.
Neither stable nor capable - The lack of stability makes it difficult to estimate the level of capability with any certainty. First, we need to reduce variation and remove special causes of variation to improve stability so that we have reasonable estimates of the centering of the process. Following that, we may need to re-center the process and/or further reduce process variation.

Process performance vs. specification

The performance metric indices are used to establish a controlled process and then to maintain that process over time. The index values are a shortcut method of indicating the quality level of a process in parts per million (ppm). Once the status of the process is determined, the causes of variation (based on statistical significance) may be identified. Courses of action might be to
Do nothing.
Change the specifications.
Center the process.
Reduce the variation in the Six Sigma process spread.
Accept the losses.
Process Limits - A stable process can be monitored to determine if changes that occur are due to factors other than random variation. Such observation determines whether changes are necessary and if any corrective actions are required. Process limits are the voice of the process, based on the variation of the products produced. The supplier collects data over time to determine the variation in the units against the customer's specification. These data points collected over time establish the process curve. Having a predictable process producing 100 percent conformance is the ideal state. Day-to-day control charts help identify assignable causes of any variations that occur. Control charts are special types of time series charts in which control limits are calculated around the central location, or mean, of the variable being plotted.

A process capability diagram displays both the voice of the process and the voice of the customer. To draw one of these diagrams
Locate the mean of the distribution ($\bar{X}$) and draw a normal curve that reflects the upper and lower process limits (UPL, LPL) of the data.
Draw the customer specifications with the upper and lower limits for those specifications as appropriate (USL, LSL). Note that a customer may have only a lower limit or only an upper limit.
Process Performance Metric - It is a measure of an organization's activities and performance and includes metrics like percentage defective, which is defined as (Total number of defective parts)/(Total number of parts) x 100. So if there are 1,000 parts and 10 of those are defective, the percentage of defective parts is (10/1000) x 100 = 1%. Other metrics have been discussed earlier and are summarized as

Performance Metric - Description
1. Percentage Defective - What percentage of parts contain one or more defects?
2. Parts per Million (PPM) - What is the average number of defective parts per million? This is the proportion defective from metric 1 above (expressed as a fraction rather than a percentage) multiplied by 1,000,000.
3. Defects per Unit (DPU) - What is the average number of defects per unit?
4. Defects per Opportunity (DPO) - What is the average number of defects per opportunity? (where opportunity = number of different ways a defect can occur in a single part)
5. Defects per Million Opportunities (DPMO) - The same figure as metric 4 above, defects per opportunity, multiplied by 1,000,000.
6. Rolled Throughput Yield (RTY) - The yield stated as a percentage of the number of parts that go through a multi-stage process without a defect.
7. Process Sigma - The sigma level associated with either the DPMO or PPM level found in metric 2 or 5 above.
8. Cost of Poor Quality - The cost of defects: either internal (rework/scrap) or external (warranty/product).
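The metrics summarized above can be computed with a short sketch such as the following. The function names and argument names are hypothetical. The sigma_level function uses the conventional 1.5 sigma long-term shift, under which 3.4 DPMO corresponds to a sigma level of about 6; that shift is an assumption of this sketch and can be set to zero.

```python
from math import prod
from statistics import NormalDist


def defect_metrics(units, defective_units, defects, opportunities_per_unit):
    """Percentage defective, PPM, DPU, DPO and DPMO from inspection counts."""
    total_opportunities = units * opportunities_per_unit
    return {
        "percent_defective": 100.0 * defective_units / units,
        "ppm": 1_000_000 * defective_units / units,
        "dpu": defects / units,
        "dpo": defects / total_opportunities,
        "dpmo": 1_000_000 * defects / total_opportunities,
    }


def rolled_throughput_yield(step_yields):
    """Probability of passing every step defect-free (yields as fractions)."""
    return prod(step_yields)


def sigma_level(dpmo, shift=1.5):
    """Sigma level for a DPMO (or PPM) figure, with the assumed shift added."""
    return NormalDist().inv_cdf(1.0 - dpmo / 1_000_000) + shift
```

With 1,000 parts of which 10 are defective, the first function reproduces the 1% figure from the example in the text.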

Process capability indices
Process capability indices include Cp and Cpk, which identify the current state of the process and provide statistical evidence for comparing after-adjustment results to the starting point.
Cp - It measures the ratio between the specification tolerance (USL − LSL) and the process spread. A normally distributed process that is centered exactly midway between the specification limits yields a Cp of 1 when the specification limits lie at +/- 3 standard deviations from the mean. The usual accepted minimum value for Cp is 1.33. Its major limitations are that it requires both an upper and a lower specification and that it is meaningful only after the process is centered. It is computed as

$$C_p = \frac{USL - LSL}{6\sigma}$$

It is used to identify the process's current state and measures the capability of a process to operate within customer-defined specification limits; hence, it should be used when the data set is from a controlled, continuous process. It needs the standard deviation (sigma) together with the USL and LSL specifications. Cp indicates the amount of variation in the process but not the process's ability to align with the target.
Cpk - It measures the distance from the process mean to the nearest specification limit, relative to three standard deviations. Usually a Cpk of at least 1, and preferably 1.33 or higher, is desired. As with Cp, the process should first be centered as far as possible. Along with Cp, Cpk provides a common measurement for assigning an initial process capability to center on specification limits. It is computed as

$$C_{pk} = \min\left(\frac{USL - \mu}{3\sigma}, \frac{\mu - LSL}{3\sigma}\right)$$

Cp measures "can it fit" while Cpk measures "does it fit." If Cp = Cpk, then the process is centered.
Cpm - It is also referred to as the Taguchi index. It is often regarded as more informative than the other indices because it focuses on reducing the variation from a target value (T). Variation from the target T is expressed as process variability, σ², and process centering, (µ − T), where µ is the process average. Cpm provides a common measurement for assigning an initial process capability to a process for aligning the mean of the sample to the target. It is computed as

$$C_{pm} = \frac{USL - LSL}{6\sqrt{\sigma^2 + (\mu - T)^2}}$$

where T is the target value, µ is the expected value and σ is the standard deviation. It is applied if the target is not the center or mean of USL − LSL, or when establishing an initial process capability during the Measure phase. A higher Cpm value indicates that the output of the process is more likely to meet the specs and the target.
Sigma and Process Capability - When means and variances wander over time, the standard deviation (symbolized by the Greek letter σ) is the most common way to describe how data in a sample varies from its mean. A Six Sigma goal is to have 99.99966% error-free work (reducing the defects to 3.4 per million). By computing sigma and relating it to a process capability index such as Ppk, the number of non-conformances (or failure rate) produced by the process can be determined. To compute sigma (σ), use the following equation for a population

$$\sigma = \sqrt{\frac{\sum (x - \mu)^2}{N}}$$

where N is the number of items in the population, µ is the mean of the population data and x is each data point.
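The capability indices and the population sigma can be illustrated with the following sketch. It uses the standard textbook definitions of Cp, Cpk and Cpm and assumes that the process mean and standard deviation are already known or estimated; the function names are hypothetical.

```python
import math


def population_sigma(data):
    """Population standard deviation: sqrt(sum((x - mu)^2) / N)."""
    mu = sum(data) / len(data)
    return math.sqrt(sum((x - mu) ** 2 for x in data) / len(data))


def cp(usl, lsl, sigma):
    """Specification width divided by the six-sigma process spread."""
    return (usl - lsl) / (6.0 * sigma)


def cpk(usl, lsl, mu, sigma):
    """Distance from the mean to the nearest limit, in units of 3 sigma."""
    return min(usl - mu, mu - lsl) / (3.0 * sigma)


def cpm(usl, lsl, mu, sigma, target):
    """Taguchi index: penalizes deviation of the mean from the target."""
    return (usl - lsl) / (6.0 * math.sqrt(sigma ** 2 + (mu - target) ** 2))
```

For the 12-inch ruler specification of 11.97 to 12.03 inches, a hypothetical process with σ = 0.01 centered at 12.00 gives Cp = Cpk = 1; if the mean drifts to 12.01, Cp is unchanged while Cpk and Cpm both fall below 1.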

Process Performance Indices
The most used process performance indices are Pp, Ppk, and Cpm, which depict the present status of the process and also act as an important tool for improvement of the process. These metrics have a common purpose with the process capability indices, but they differ in their approach.
Pp - It measures the ratio between the specification tolerance and the process spread. It helps to measure improvement over time, as it signals where the process is in comparison to the customer's specifications. It is computed as

$$P_p = \frac{USL - LSL}{6s}$$

where s is the overall standard deviation of all the data collected. It is used when continuous data are collected and the process is not in control. It depicts the amount of variation, but not alignment to the target. For a process to be in control, it must have only common causes for each of the data points (no data points existing beyond the UCL or LCL).
Ppk - It measures the distance from the mean to the nearest specification limit, relative to three standard deviations. It provides an initial measurement to center on specification limits. It also examines variation within and between subgroups. It is computed as

$$P_{pk} = \min\left(\frac{USL - \bar{x}}{3s}, \frac{\bar{x} - LSL}{3s}\right)$$

It is used with continuous data when the process is not in control. Unlike Pp, it indicates how well the process is aligned between the USL and the LSL.
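Pp and Ppk can be computed directly from raw data, as in the sketch below. It assumes that the overall (long-term) variation is estimated by the sample standard deviation of all observations pooled together, which is the usual convention; the function names are hypothetical.

```python
from statistics import mean, stdev


def pp(data, usl, lsl):
    """Specification width over six overall sample standard deviations."""
    return (usl - lsl) / (6.0 * stdev(data))


def ppk(data, usl, lsl):
    """Distance from the overall mean to the nearest limit, in units of 3 s."""
    x_bar = mean(data)
    s = stdev(data)
    return min(usl - x_bar, x_bar - lsl) / (3.0 * s)
```

When the specification limits are not symmetric about the data mean, ppk comes out smaller than pp, which is the signal that the process is off-center.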

Short-term vs. long-term capability
Short-term capability is measured over a very short time period, since it focuses on the machine's ability based on its design and quality of construction. By focusing on one machine with one operator during one shift, it limits the influence of outside long-term factors, including the operator, environmental conditions such as temperature and humidity, machine wear and different material lots. Thus, short-term capability can measure the machine's ability to produce parts with a specific variability based on the customer's requirements. Short-term capability uses a limited amount of data, relative to a short time and the number of pieces produced, to remove the effects of long-term components. If the machines are not capable of meeting the customer's requirements, changes may have a limited impact on the machine's ability to produce acceptable parts. Remember, though, that short-term capability only provides a snapshot of the situation. Since short-term data does not contain any special cause variation (such as that found in long-term data), short-term capability is typically rated higher. When a process capability is determined using one operator on one shift, with one piece of equipment, the process variation is relatively small. Control limits based on a short-term process evaluation are closer together than control limits based on the long-term process.

A modified X-bar and R chart can be used for short runs, based on an initial 3 to 10 pieces, using a calculated value compared with a critical value. Inflated D4 and A2 values are used to establish control limits. Control limits are recalculated after additional groups are run. The X and MR chart can also be used for small runs with a limited amount of data. The X represents individual data values, and the MR is the moving range, a measure of piece-to-piece variability. Process capability or Cpk values determined from either of these methods must be considered preliminary information. As the number of data points increases, the calculated process capability will approach the true capability.
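To make the X and MR chart concrete, here is a minimal sketch that computes the moving ranges and control limits for a short run. It assumes the standard constants for a moving range of two consecutive values (2.66 for the individuals chart and 3.267 for the MR chart), not the inflated short-run factors mentioned above; the data and the function name are hypothetical.

```python
def individuals_mr_limits(values):
    """Center lines and control limits for an X (individuals) and MR chart."""
    # Moving range: absolute difference between consecutive pieces
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    x_bar = sum(values) / len(values)
    mr_bar = sum(moving_ranges) / len(moving_ranges)
    return {
        "x_center": x_bar,
        "x_ucl": x_bar + 2.66 * mr_bar,  # standard factor for n = 2
        "x_lcl": x_bar - 2.66 * mr_bar,
        "mr_center": mr_bar,
        "mr_ucl": 3.267 * mr_bar,  # D4 for n = 2; the MR chart LCL is 0
    }


# Hypothetical short run of five pieces
readings = [10.0, 10.2, 9.9, 10.1, 10.3]
limits = individuals_mr_limits(readings)
for name, value in limits.items():
    print(f"{name}: {value:.3f}")
```

With so few points the limits are only preliminary, which is the caution the text gives for capability values from short runs.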

Process capability for attributes data
The control chart represents the process capability once special causes have been identified and removed from the process. For attribute charts, capability is defined as the average proportion or rate of nonconforming product.

For p charts, the process capability is the process average nonconforming, $\bar{p}$. The proportion conforming to specification, $1 - \bar{p}$, may be used.

For np charts, the process capability is the process average nonconforming, $\bar{p}$. For c charts, the process capability is the average number of nonconformities, $\bar{c}$, in a sample of fixed size n. For u charts, the process capability is the average number of nonconformities per reporting unit, $\bar{u}$. The average proportion of nonconforming may be reported on a defects per million opportunities scale by multiplying $\bar{p}$ times 1,000,000.
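The attribute capability figures can be sketched in a few lines. The example below assumes a p chart with equal sample sizes and hypothetical counts; it computes the average proportion nonconforming, the proportion conforming, and the per-million scaling described above.

```python
def attribute_capability(nonconforming_counts, sample_sizes):
    """Return (average proportion nonconforming, proportion conforming, per-million rate)."""
    p_bar = sum(nonconforming_counts) / sum(sample_sizes)
    return p_bar, 1 - p_bar, p_bar * 1_000_000


# Hypothetical in-control samples: nonconforming units found in each sample of 200
counts = [3, 5, 2, 4]
sizes = [200, 200, 200, 200]
p_bar, conforming, per_million = attribute_capability(counts, sizes)
print(f"p-bar = {p_bar:.4f}, conforming = {conforming:.4f}, per million = {per_million:.0f}")
```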


4. ANALYZE
This phase is the start of the statistical analysis of the problem. It statistically reviews the families of variation to determine which are significant contributors to the output. The statistical analysis begins with the development of a theory, the null hypothesis. The analysis will either "reject" or "fail to reject" the theory. The families of variation and their contributions are quantified, and relationships between variables are shown graphically and numerically to give the team direction for improvements. The main objectives of this phase are
- Reduce the number of inputs (X’s) to a manageable number
- Determine the presence of noise variables through multi-vari studies
- Plan the first improvement activities

4.1. Exploratory Data Analysis
Exploratory data analysis, or EDA, is the important first step in analyzing the data from an experiment. It is used for
- Detection of mistakes
- Checking of assumptions
- Preliminary selection of appropriate models
- Determining relationships among the explanatory variables
- Assessing the direction and rough size of relationships between explanatory and outcome variables

EDA does not include any formal statistical modeling or inference. The four types of EDA are univariate non-graphical, multivariate non-graphical, univariate graphical, and multivariate graphical.

Multi-vari studies
Often the variation is within piece, and the source of this variation is different from that of piece-to-piece and time-to-time variation. The multi-vari chart is a very useful tool for analyzing all three types of variation. Multi-vari charts are used to investigate the stability or consistency of a process. The chart consists of a series of vertical lines, or other appropriate schematics, along a time scale. The length of each line or schematic shape represents the range of values found in each sample set. The multi-vari chart presents an analysis of the variation in a process, thereby differentiating between three main sources
- Intra-piece, the variation within a piece, batch, lot, etc.
- Inter-piece, the additional variation between pieces.
- Temporal variation, the variation which is related to time.


Data can be grouped in terms of sources of variation to help define the way measurements are partitioned. These sources describe characteristics of populations, and a few common types are classifications (by category), geography (of a distribution center or a plant), geometry (chapters of a book or locations within buildings), people (tenure, job function or education) and time (deadlines, cycle time or delivery time). We can stratify the data to help us understand the way our processes work by categorizing the individual measurements. This helps us understand the variation of the components as it relates to the whole process. For example, suppose errors are being tracked in a process. The variation could be within a subgroup (within a certain batch), between subgroups (from one batch to another batch) or over time (time of day, day of week, shift or even season of the year). Interpretation of the chart is apparent once the values are plotted.

The advantages of multi-vari charts are
- It can dramatize the variation within the piece (positional).
- It can dramatize the variation from piece to piece (cyclical).
- It helps to track any time-related changes (temporal).
- It helps minimize variation by identifying areas to look for excessive variation.
- It also identifies areas not to look for excessive variation.

Sources of variation in multi-vari analysis can be
- Within Individual Sample - Variation is present upon repeat measurements within the same sample.
- Piece to Piece - Variation is present upon measurements of different samples collected within a short time frame.
- Time to Time - Variation is present upon measurements collected with a significant amount of time between samples.
Multi-vari analysis is applicable to both products and services, as it can be used to study variation in either. Typical examples are

Within Individual Sample
- Product: Measurement Accuracy, Out of Round, Irregularities in Part
- Service: Measurement Accuracy, Line Item Complexity

Piece to Piece
- Product: Machine Fixturing, Mold Cavity Differences
- Service: Customer Differences, Order Editor, Sales Office, Sales Rep

Time to Time
- Product: Material Changes, Setup Differences, Tool Wear, Calibration Drift, Operator Influence
- Service: Seasonal Variation, Management Changes, Economic Shifts, Interest Rate


Steps to develop multi-vari chart
1. Plot the first sample range with a point for the maximum reading obtained and a point for the minimum reading. Connect the points and plot a third point at the average of the within-sample readings.
2. Plot the sample ranges for the remaining “piece to piece” data. Connect the averages of the within-sample readings.
3. Plot the “time to time” groups similarly.
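The plotting steps reduce to three numbers per sample: the minimum, the maximum and the within-sample average. The sketch below prepares those values for hypothetical readings grouped by time and piece; the data layout and the function name `multi_vari_summary` are assumptions made for illustration, and the actual plotting is left to whatever charting tool is in use.

```python
def multi_vari_summary(groups):
    """groups maps a time label to a list of pieces, each a list of readings."""
    summary = []
    for time_label, pieces in groups.items():
        for piece_no, readings in enumerate(pieces, start=1):
            summary.append({
                "time": time_label,
                "piece": piece_no,
                "min": min(readings),  # bottom of the vertical line
                "max": max(readings),  # top of the vertical line
                "mean": sum(readings) / len(readings),  # point joined piece to piece
            })
    return summary


# Hypothetical readings: two time groups, two pieces each, three positions per piece
groups = {
    "08:00": [[5.1, 5.3, 5.2], [5.0, 5.4, 5.2]],
    "12:00": [[5.6, 5.8, 5.7], [5.5, 5.9, 5.7]],
}
summary = multi_vari_summary(groups)
for row in summary:
    print(row)
```

In this made-up data the averages shift between the two time groups while the within-piece ranges stay similar, which is the pattern the chart would show as time-to-time variation.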

Interpreting the multi-vari chart
Within Piece - It is characterized by large variation in readings taken of the same single sample, often from different positions within the sample, as shown below

Piece to Piece - It is characterized by large variation in readings taken between samples taken within a short time frame, as shown below

Time to Time - It is characterized by large variation in readings taken between samples taken in groups with a significant amount of time elapsed between groups, as shown below

Simple linear correlation and regression
Correlation
Correlation is a tool that is used with a continuous x and a continuous y. The Pearson correlation coefficient (r) measures the linear relationship between x and y, as discussed earlier. Causation is different from correlation: correlation is the mutual relation that exists between two or more things, while causation is the fact that something causes an effect. A correlation between two variables does not imply that one is the result of the other. The correlation value ranges from -1 to 1. A value close to 1 signifies a positive relationship, with x and y moving in the same direction; a value close to -1 signifies a negative relationship, with x and y moving in opposite directions; and a value of zero means there is no linear relationship between x and y.
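A small sketch can show how the Pearson coefficient behaves at its extremes. The function below uses the usual definition (sum of cross-products of deviations divided by the square root of the product of the sums of squared deviations); the data sets are hypothetical and chosen so that the relationships are perfectly linear.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient for paired values."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: y rises with x in one case and falls with x in the other
xs = [1, 2, 3, 4, 5]
r_positive = pearson_r(xs, [2, 4, 6, 8, 10])
r_negative = pearson_r(xs, [10, 8, 6, 4, 2])
print(r_positive, r_negative)
```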


Confidence in a relationship is determined both by the correlation coefficient and by the number of pairs in the data. If there are very few pairs, the coefficient needs to be very close to 1 or –1 for it to be deemed ‘statistically significant’, but if there are many pairs, a coefficient closer to 0 can still be considered ‘highly significant’. The standard method used to measure the ‘significance’ of an analysis is the p-value.

For example, to know whether the relationship between height and intelligence of people is significant, the analysis starts with the ‘null hypothesis’, which is the statement ‘height and intelligence of people are unrelated’. The p-value is a number between 0 and 1 representing the probability that this data would have arisen if the null hypothesis were true. In medical trials the null hypothesis is typically of the form that the use of drug X to treat disease Y is no better than not using any drug. The p-value is the probability of obtaining a test statistic result at least as extreme as the one that was actually observed, assuming that the null hypothesis is true. A project team usually "rejects the null hypothesis" when the p-value turns out to be less than a certain significance level, often 0.05. For Pearson's correlation coefficient (r) computed from N pairs, the p-value is obtained from the test statistic

$$t = \frac{r}{\sqrt{(1 - r^2)/(N - 2)}}$$

which is referred to a t-distribution with N - 2 degrees of freedom.
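As a hedged illustration of the significance check, the sketch below computes the standard test statistic for a Pearson correlation, t = r·sqrt(N - 2)/sqrt(1 - r²), and compares it with a critical value instead of computing the p-value itself. The values r = 0.627 and N = 5 are taken from the regression example used later in this section; the critical value 3.182 is the two-sided 5% point of the t-distribution with 3 degrees of freedom, read from a t-table.

```python
import math


def correlation_t_statistic(r, n):
    """t statistic for testing whether a Pearson correlation differs from zero."""
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


r, n = 0.627, 5  # correlation and number of pairs from the regression example
t = correlation_t_statistic(r, n)
T_CRITICAL = 3.182  # two-sided 5% critical value for n - 2 = 3 degrees of freedom
significant = abs(t) > T_CRITICAL
print(f"t = {t:.3f}, significant at 0.05: {significant}")
```

With only five pairs, even a correlation of 0.627 does not reach significance, which matches the point that few pairs require a coefficient very close to 1 or -1.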

Linear Regression
When the input and output variables are both continuous, regression and correlation are used to examine the relationship between the two variables. Determining how the predicted or dependent variable (the response variable, the variable to be estimated) reacts to variations of the predictor or independent variable (the variable that explains the change) involves first determining whether any relationship exists between them and how important it is. Regression analysis builds a mathematical model that helps in making predictions about the impact of variations in a variable. Usually, more than one independent variable causes variation in a dependent variable; for example, changes in the volume of cars sold depend on the price of the cars, the gas mileage, the warranty, etc. But these factors do not contribute equally to the variation of the dependent variable (the number of cars sold). Hence, the project team should concentrate on one important factor instead of analyzing all the competing factors.


In simple linear regression, scores on one variable are predicted from the scores on a second variable. The variable to predict is called the criterion variable and is referred to as Y. The variable to base predictions on is called the predictor variable and is referred to as X. When there is only one predictor variable, the prediction method is called simple regression. In simple linear regression, the predictions of Y when plotted as a function of X form a straight line. As an example, the data for X and Y listed below have a positive relationship between X and Y. For predicting Y from X, the higher the value of X, the higher the prediction of Y.

X      Y
1.00   1.00
2.00   2.00
3.00   1.30
4.00   3.75
5.00   2.25

Linear regression consists of finding the best-fitting straight line through the points. The best-fitting line is called a regression line. The diagonal line in the figure is the regression line and consists of the predicted score on Y for each possible value of X. The vertical lines from the points to the regression line represent the errors of prediction. The point with Y = 1.00 is very near the regression line, so its error of prediction is small; the point with Y = 3.75 is much higher than the regression line, so its error of prediction is large. The error of prediction for a point is the value of the point minus the predicted value (the value on the line). The table below shows the predicted values (Y') and the errors of prediction (Y-Y'). For example, the first point has a Y of 1.00 and a predicted Y (called Y') of 1.21; hence, its error of prediction is -0.21.

X      Y      Y'      Y-Y'     (Y-Y')^2
1.00   1.00   1.210   -0.210   0.044
2.00   2.00   1.635    0.365   0.133
3.00   1.30   2.060   -0.760   0.578
4.00   3.75   2.485    1.265   1.600
5.00   2.25   2.910   -0.660   0.436

The most commonly-used criterion for the best-fitting line is the line that minimizes the sum of the squared errors of prediction. That is the criterion that was used to find the line in the figure. The last column in the above table shows the squared errors of prediction. The sum of the squared errors of prediction shown in the above table is lower than it would be for any other regression line.
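The least squares criterion can be checked numerically. The sketch below recomputes the sum of squared errors for the regression line Y' = 0.785 + 0.425X using the X and Y data from the example, then compares it with a few arbitrarily chosen alternative lines; the alternative intercepts and slopes are hypothetical and serve only as a comparison.

```python
xs = [1.00, 2.00, 3.00, 4.00, 5.00]
ys = [1.00, 2.00, 1.30, 3.75, 2.25]


def sum_squared_errors(intercept, slope):
    """Sum of squared errors of prediction for the line Y' = intercept + slope * X."""
    return sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))


best = sum_squared_errors(0.785, 0.425)  # regression line from the example

# Arbitrary alternative lines, chosen only for comparison
alternatives = [(0.785, 0.500), (1.000, 0.425), (0.600, 0.425)]
alternative_sse = [sum_squared_errors(a, b) for a, b in alternatives]

print(f"regression line SSE = {best:.3f}")
for (a, b), sse in zip(alternatives, alternative_sse):
    print(f"intercept {a}, slope {b}: SSE = {sse:.3f}")
```

Each alternative gives a larger sum than the regression line, in keeping with the statement that no other line has a lower sum of squared errors for these data.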


The regression equation is calculated with the mathematical equation for a straight line, y = b0 + b1 X, where b0 is the y intercept (the value of y when X = 0) and b1 is the slope of the line. The assumption is that, for any given value of X, the observed value of Y varies in a random manner and possesses a normal probability distribution. The calculations are based on the following statistics: MX is the mean of X, MY is the mean of Y, sX is the standard deviation of X, sY is the standard deviation of Y, and r is the correlation between X and Y. For the sample data these are

MX     MY     sX      sY      r
3      2.06   1.581   1.072   0.627

The slope (b) can be calculated as b = r sY/sX and the intercept (A) as A = MY - bMX. For the above data, b = (0.627)(1.072)/1.581 = 0.425 and A = 2.06 - (0.425)(3) = 0.785. The calculations have all been shown in terms of sample statistics rather than population parameters. The formulas are the same for a population but use the parameter values for the means, standard deviations, and correlation.

Least Squares Method – In this method, the values of b1 and b0 are computed from the vertical distance between each point and the line, called the error of prediction. The line with the smallest sum of squared errors of prediction is the least squares regression line. The values of b1 and b0 are computed as

$$b_1 = \frac{\sum (x_i - \bar{x})(y_i - \bar{y})}{\sum (x_i - \bar{x})^2}, \qquad b_0 = \bar{y} - b_1\bar{x}$$

The P-value is determined by referring to a t-distribution with n-2 degrees of freedom.
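The two routes to the regression coefficients can be compared directly. This sketch applies the standard least squares formulas (slope as the sum of cross-products of deviations over the sum of squared X deviations, intercept as MY minus slope times MX) to the example data, and separately computes the slope as r·sY/sX; the function names are hypothetical.

```python
import math

xs = [1.00, 2.00, 3.00, 4.00, 5.00]
ys = [1.00, 2.00, 1.30, 3.75, 2.25]


def least_squares_line(xs, ys):
    """Return (intercept, slope) of the least squares regression line."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def slope_from_summary(xs, ys):
    """Slope computed as r * sY / sX from the sample summary statistics."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / (n - 1))
    s_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / (n - 1))
    r = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / ((n - 1) * s_x * s_y)
    return r * s_y / s_x


b0, b1 = least_squares_line(xs, ys)
b_alt = slope_from_summary(xs, ys)
print(f"b0 = {b0:.3f}, b1 = {b1:.3f}, slope via r*sY/sX = {b_alt:.3f}")
```

Both routes give the same slope because r·sY/sX simplifies algebraically to the least squares formula.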

Simple Linear Regression Hypothesis Testing
Hypothesis tests can be applied to determine whether the independent variable (x) is useful as a predictor for the dependent variable (y). The following are the steps, using the cost per transaction example, for hypothesis testing in simple regression
1. Determine if the conditions for the application of the test are met.
   - There is a population regression equation Y = β0 + β1x, so that for a given value of x the prediction equation is ŷ = b0 + b1x.
   - Given a particular value for x, the distribution of y-values is normal.
   - The distributions of y-values have equal standard deviations.
   - The y-values are independent.
2. Establish hypotheses.
   Ho: β1 = 0 (the equation is not useful as a predictor of y - cost per transaction)
   Ha: β1 ≠ 0 (the equation is useful as a predictor of y - cost per transaction)
3. Decide on a value of alpha.
4. Find the critical t values. Use the t-table and find the critical values ±tα/2 with n – 2 df.
5. Calculate the value of the test statistic t. It is the estimated slope divided by its standard error, the same standard error that appears in the confidence interval formula for the slope

$$t = \frac{b_1}{s_{b_1}}$$

6. Interpret the results. If the test statistic is beyond one of the critical values (greater than tα/2 or less than -tα/2), reject the null hypothesis; otherwise, do not reject.
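The test steps can be walked through on a small data set. Since the cost per transaction data is not reproduced here, this sketch assumes the five X and Y pairs from the earlier regression example instead. It uses the usual standard error of the slope, the residual standard deviation divided by the square root of the sum of squared X deviations, and a critical value of 3.182 (two-sided, alpha = 0.05, 3 degrees of freedom) read from a t-table.

```python
import math

xs = [1.00, 2.00, 3.00, 4.00, 5.00]
ys = [1.00, 2.00, 1.30, 3.75, 2.25]


def slope_t_statistic(xs, ys):
    """t statistic for Ho: slope = 0 in simple linear regression."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = mean_y - b1 * mean_x
    sse = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    s_e = math.sqrt(sse / (n - 2))  # residual standard deviation
    s_b1 = s_e / math.sqrt(sxx)  # standard error of the slope
    return b1 / s_b1


t_stat = slope_t_statistic(xs, ys)
T_CRITICAL = 3.182  # t-table value for alpha/2 = 0.025 and n - 2 = 3 df
reject_null = abs(t_stat) > T_CRITICAL
print(f"t = {t_stat:.3f}, reject Ho: {reject_null}")
```

For these five points the test statistic falls inside the critical values, so the null hypothesis is not rejected, even though the fitted slope is positive.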

Multiple Linear Regression
Multiple linear regression expands on the simple linear regression model to allow for more than one independent or predictor variable. The general form for the equation is y = b0 + b1x1 + b2x2 + ... + bnxn + e, where b0, b1, b2, ... are the coefficients, referred to as partial regression coefficients. Each coefficient may be interpreted as the amount of change in y for each unit increase in the corresponding x variable when all other x's are held constant. The hypotheses for multiple regression are
Ho: b1 = b2 = ... = bn = 0
Ha: bi ≠ 0 for at least one i.
It is an extension of linear regression to more than one independent variable, so a higher proportion of the variation in Y may be explained, as in the first-order linear model (shown here for two predictors)

$$y = b_0 + b_1 x_1 + b_2 x_2 + e$$

and the second-order linear model

$$y = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_1 x_2 + b_4 x_1^2 + b_5 x_2^2 + e$$

R2 the multiple coefficient of determination has values in the interval 0