BUSINESS CONTINUITY. Good Practice Guide

Contents

1. Introduction
2. The BCM life cycle
3. Understanding the organisation
4. Determining strategy
5. Developing a response
6. Exercise, maintain, review
7. Embedding BCM
APPENDIX 1: Incident Response
APPENDIX 2: Glossary
APPENDIX 3: Further Information


ISBN: 978-1-909761-17-9
Edition: Second
Date: September 2015
Authors: Steve Dance, Managing Partner, Risk Centric and BIFM Risk and Business Continuity SIG Committee Member
Peer reviewer: Mike Cronin, BIFM Risk and Business Continuity SIG Committee Member and Group Facilities Director, Haymarket Media Group

BIFM
Number One Building
The Causeway
Bishop’s Stortford
Hertfordshire CM23 2EN
T: +44 (0) 1279 712620
E: [email protected]
www.bifm.org.uk
Advertising T: +44 (0) 1279 712620

© The Good Practice Guides series is published by the British Institute of Facilities Management (BIFM). The guides do not necessarily reflect the views of BIFM nor should such opinions be relied upon as statements of fact. All rights reserved. This publication may not be reproduced, transmitted or stored in any print or electronic format, including but not limited to any online service, any database or any part of the internet, or in any other format in whole or in part in any media whatsoever, without the prior written permission of the publisher. While all due care is taken in writing and producing this Good Practice Guide, BIFM does not accept any liability for the accuracy of the contents or any opinions expressed herein.

Business Continuity GPG 1

1. Introduction

Business continuity is the ability to continue essential business functions at all times, under all circumstances and as far as humanly possible. The purpose of this Guide is to describe the general principles and the practical application of business continuity management (BCM), to enable facilities managers to develop an understanding of the issues. The Guide is aimed at those with little or no previous knowledge of business continuity, although familiarity with the working environment and the culture in which business continuity is to be implemented is assumed.

This Guide follows the British Standard, ISO22301, the current business continuity management code of practice, with input from the Business Continuity Institute’s Good Practice Guidelines 2013 (BCI GPG 2013), where further, more detailed information can be found: www.thebci.org

BCM and its skills and disciplines should be seen as “common sense applied in a structured manner”.

2. The BCM life cycle

The British Standard, ISO22301, introduced a life cycle model of a Business Continuity Management (BCM) programme.

The model begins with a setting-up procedure and then becomes an iterative process in four areas.

[Figure: the BCM life cycle. “BCM programme management” sits at the centre, surrounded by four stages: “Understanding the organisation”, “Determining BCM strategy”, “Developing and implementing BCM response” and “Exercising, maintaining and reviewing”. The outer ring reads “Embedding BCM in the organisation’s culture”.]

The table “BCM life cycle – the key stages” gives a more detailed overview of the key stages of the life cycle.

BCM needs to be embedded in the corporate culture or it becomes overlooked, forgotten or taken for granted. If this happens, it will lack buy-in and essential support from senior management in terms of funding, resources, exercises, etc.

Managing the Programme

BCM programme management is first concerned with managing the introduction and maintenance of business continuity principles into the organisation. It should be based on a formal policy with defined responsibilities and processes, all documented as auditable evidence. Subsequently it will provide the impetus to promote, maintain and assure the implemented programme.

BCM programme management requires:
> A team to manage the programme with the authority to define and implement policies and standards and influence the prioritisation of business continuity activities. The team must operate with the board’s support and endorsement, otherwise essential support is unlikely to be available.
> Policies, standards and guidelines that define the framework of the programme. The BCM policy is a document, issued by senior management, that communicates the organisation’s BCM framework, together with the responsibilities and expectations of those involved with managing and maintaining the organisation’s business continuity arrangements.

The policy sets out what needs to be done and by whom. Typically it would cover:
> Corporate business continuity organisation and responsibilities:
– Senior management team
– Steering committee
– Business continuity co-ordinators
> Criteria to determine which parts of the organisation will need plans
> Requirements for review and exercising
> Requirements for awareness and training
> Audit review and reporting
> Requirements for record keeping.

Standards and guidelines would include:
> Templates to provide a consistent format
> Guidelines on key activities and template completion.

BCM life cycle – the key stages

Life cycle stage: BCM programme management
Main activities:
> Establish management organisation
Outcome:
> Committee structure and project staffing
> BCM policy
> Standards and guidelines

Life cycle stage: Understanding the organisation
Main activities:
> Business impact analysis
> Risk assessment
> Continuity requirements analysis
Outcome:
> Product and service exposure map showing types of exposure causing operational disruption (see, for example, Table 1)
> Maximum tolerable period of disruption and recovery time objective for each operation
> Prioritised risks that could cause operational disruption (see, for example, Table 2)
> Recovery point objective and matrix showing minimum resources to maintain each operation (see, for example, Table 3)

Life cycle stage: Determining BC strategy
Main activities:
> Identify countermeasures in order to achieve resumption of operations
Outcome:
> Recovery strategies for each operation (see, for example, Table 4)

Life cycle stage: Developing and implementing a BCM response
Main activities:
> Identify detailed actions necessary and resources required to manage an interruption and maintain effective communications with all affected parties
Outcome:
> Incident management plan for notification, escalation and management of an incident (see, for example, Table 5)
> BCP to resume operations within a predefined timescale (see, for example, Table 6)
> Activity resumption plan to resume individual activities (see, for example, Table 7)

Life cycle stage: Exercising, maintaining and reviewing
Main activities:
> Establish a framework and organisation to support oversight, evaluation, maintenance and testing of BC arrangements
Outcome:
> BCP testing document (see, for example, Table 8)

Life cycle stage: Embedding BCM in the organisation’s culture
Main activities: ongoing initiatives to:
> Provide access to details of BC arrangements
> Create quick reference resources and materials
> Implement an enforceable policy
> Measure levels of awareness
Outcome:
> Regular communication programme

3. Understanding the organisation

To begin the BCM life cycle you must understand the organisation within which the strategy is to be implemented. Three principal tools are used in this context:

Business Impact Analysis (BIA)
A means of identifying, quantifying and qualifying the consequences of a loss, interruption or disruption of business activities over time. A BIA can be used at any level on any activity in the organisation.

Risk Assessment (RA)
Estimates the likelihood of loss, interruption or disruption from known threats.

Continuity Requirements Analysis (CRA)
Assesses the resources required for a resumption of activities.

Business Impact Analysis (BIA)

A BIA needs the following information:
> What resources and services are critical to the core business activities
> The potential impact of a disruption to the provision of those resources and services
> The stage at which, in terms of the duration of the disruption, the impact on the business would become unacceptable.

Deciding the scope of the analysis may limit the maximum extent over which a disruption is considered. This could be determined by geographical considerations, regulations or statutes, products, markets or specific customer requirements.

BIA methods
Collecting information from staff responsible for core business activities and their dependencies aids the choice of continuity strategies. Collection methods include:
> Workshops, which provide rapid results and engagement with the BCM programme
> Questionnaires, which give a lot of data although the quality varies
> Interviews, which offer good information but are time consuming.
Combinations of the above can give excellent results.

Table 1 Product and services exposure map

No. | Product | From | MTPDs recorded (across legal contract, financial, reputation and regulatory exposures) | RTO | Responsible manager (example names)
2 | Vision Opticals – Direct to customer
3 | Manufactured | Operations | 3 months; 14 days; None | 12 days | John Priestly
4 | Factored | Logistics | 3 months; 14 days; None | 12 days | John Simmonds
5 | 3rd party | Logistics | 30 days; None | 25 days | Eric Dickens
6 | Vision Opticals – Wholesale
7 | Manufactured | Operations | 3 months; 30 days; None | 25 days | Jane Ross
8 | Factored | Logistics | 3 months; 30 days; None | 25 days | Tom Mickleson
9 | Opticals – Export
10 | Manufactured | Operations | 3 months; 30 days; None | 25 days | Toby Rice
11 | Factored | Logistics | 3 months; 30 days; None | 25 days | Anna Austin
12 | Solar Opticals – Techstrap
13 | Manufactured | Operations | 3 months; 14 days; None | 12 days | Jack Prince
14 | Factored | Logistics | 3 months; 14 days; None | 12 days | Mike Reason
15 | Solar Opticals – OneVision
16 | Manufactured | Operations | 3 months; 14 days; None | 12 days | Joel Kent
17 | ETC.

A standard reporting format will improve the consistency of recording and analysing information across multiple functions. The types of questions and the objectives are the same whichever approach is chosen. They include:
> Location of activities
> The impact of losing the activity
> How long the organisation can last without the activity
> Timeframes for activity resumption
> Influences, such as peak periods or regulatory reporting
> What the alternatives are.

Factors to consider include:
> Volumes, eg, calls per hour, output on production line
> Contractual, regulatory or legal requirements
> Key tools to achieving continuity of the activity: buildings, processes, suppliers (how many, where and when)
> People: staff (skill set), customers
> Equipment: IT, telecommunications, manufacturing/industrial, plant
> Data: paper and electronic
> Dependencies: internal and external to the organisation
> Public/media/brand implications.

The main outputs from a BIA are:
> The Maximum Tolerable Period of Disruption (MTPD), leading to the Recovery Time Objective (RTO) – the timescale within which a function must be restored to enable continuity of the business to be maintained or resumed.
> Recovery Point Objective (RPO) – the condition to which the situation is to be restored to enable business activities to resume effectively.

The output from the first stage of the BIA process would look similar to the information shown in Table 1. The main products of the company are listed vertically; below each are broad sub-headings of their required resources and services. For each, the MTPD is the period that can elapse before the company is exposed to a particular business continuity concern, such as financial or reputational damage. The person responsible for recovery management is also identified.

Several products have an MTPD of 14 days before there is a risk to the company’s reputation. The company must therefore focus on RTOs for these products that provide a minimum acceptable level of service, and on the RPO, within this timeframe to avoid damaging its customer relationships. For these products the RTO in Table 1 is 12 days in order to give some margin.
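As a rough illustration of the arithmetic behind the 14-day MTPD and 12-day RTO, the sketch below takes the shortest MTPD across the exposure categories and subtracts a safety margin. The function name, the dictionary layout, the category labels and the two-day margin are assumptions made for this example; the Guide does not prescribe a formula.

```python
from typing import Dict, Optional


def suggest_rto_days(mtpd_days: Dict[str, Optional[int]], margin_days: int) -> int:
    """Return an RTO (in days) that sits inside the shortest MTPD.

    mtpd_days maps an exposure category to its MTPD in days; None means
    no exposure was identified for that category.
    """
    limits = [days for days in mtpd_days.values() if days is not None]
    if not limits:
        raise ValueError("no MTPD has been defined for this activity")
    shortest = min(limits)
    if margin_days < 0 or margin_days >= shortest:
        raise ValueError("margin must be non-negative and smaller than the shortest MTPD")
    return shortest - margin_days


# Illustrative figures only: a 3-month (about 90 days) exposure, a 14-day
# reputation exposure and one category with no exposure. The category
# labels and the 2-day margin are assumptions.
example = {"financial": 90, "reputation": 14, "regulatory": None}
print(suggest_rto_days(example, margin_days=2))  # 12
```

The shortest MTPD drives the result, so tightening any one exposure category shortens the RTO for the whole activity.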

DOs and DON’Ts
> DO ensure that business interruption risks are expressed in categories such as reputation, contractual/legal obligations, regulatory compliance and financial impact
> DO ensure that the MTPD and RTO thresholds have been considered for all of the above risk categories
> DO ensure that interruption threats at different stages in the business cycle have been fully enumerated
> DO ensure you have documented the outcome of the “Understand the Organisation” activities
> DON’T get bogged down in unnecessary detail

Risk Assessment (RA)

In the BCM context, RA highlights specific threats that could cause a significant business interruption to the broad categories of resources and services identified as most crucial. In large or complex organisations it is desirable to carry out the exercise in manageable sections. The RA can be used to inform the decision about where to concentrate BIA efforts. The objective is to:
> identify internal and external threats that could cause disruption and to assess their probability and impact
> prioritise those threats according to an agreed formula
> supply input to a risk management action plan.

The key stages in an RA are:
> Agree a scoring system for impacts and probabilities with the project sponsor
> Calculate a risk from each threat using the list of vulnerabilities from the BIA
> Prioritise these risks, taking account of the ability to control them
> Obtain the sponsor’s approval and sign-off on these risk priorities
> Review existing control strategies, noting where the risk level is out of step with the current strategies for that threat
> Consider appropriate measures to:
– avoid the risk, eg, remove the cause of the threat
– reduce the risk, eg, introduce further controls
– transfer the risk, eg, through insurance (but note that although insurance can provide financial compensation, it may not provide cover for the full expense of the incident or damage)
– accept the risk, eg, low impact or probability.

Ensure planned risk measures do not increase other risks. For example, outsourcing may decrease some types of risk but increase others.
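The scoring and prioritising stages of an RA can be sketched in a few lines. The three-point scale and the likelihood-times-impact formula below are assumptions chosen for illustration; the actual scoring system is whatever is agreed with the project sponsor. The sample threats are loosely based on Table 2.

```python
from dataclasses import dataclass
from typing import List

# Assumed three-point scale; the real scale is agreed with the sponsor.
SCALE = {"Low": 1, "Medium": 2, "High": 3}


@dataclass
class Threat:
    resource: str
    name: str
    likelihood: str
    impact: str

    @property
    def score(self) -> int:
        # Likelihood x impact is one common convention, assumed here.
        return SCALE[self.likelihood] * SCALE[self.impact]


def prioritise(threats: List[Threat]) -> List[Threat]:
    """Order threats from highest to lowest score (ties keep input order)."""
    return sorted(threats, key=lambda t: t.score, reverse=True)


threats = [
    Threat("People", "Industrial action", "Low", "Low"),
    Threat("Premises", "Loss of utilities", "Medium", "High"),
    Threat("Equipment", "Lack of spares", "Low", "Medium"),
]
for t in prioritise(threats):
    print(f"{t.score} {t.resource}: {t.name}")
```

A numeric score is only a starting point for discussion: the prioritised list still needs the sponsor's sign-off and a check against the ability to control each risk.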

Table 2 Suggested prioritised risks

Resource | Threat | Likelihood | Impact | Risk response | Risk treatment
People | Pandemic among staff | Low | High | Accept | Contingency plan
People | Industrial action | Low | Low | Reduce | Maintain good staff links; Robust HR policy
Premises | Extreme weather | Medium | High | Reduce | Multiple locations; Co-location
Premises | Loss of utilities | Medium | High | Avoid | Install backup systems
Equipment | Fire | Medium | Medium | Reduce | Sprinkler system; Co-location
Equipment | Lack of spares | Low | Medium | Reduce | Stock essential spares; Multiple equipment
Technology | Deliberate damage | Low | High | Reduce | Install physical access controls; IT Disaster Recovery plan
Technology | Virus infected IT equipment | Medium | High | Accept | Install security software
Supplies | Transport disruption | Medium | Medium | Reduce | Contingency plan; Contract private transport
Supplies | Sub-contract default | Medium | Medium | Reduce | Multiple sub-contractors
Stock | Manufacturing fault | Low | Low | Avoid | Quality control procedures; Multiple supply
ETC.

The outcomes from an RA include the identification and documentation of:
> single points of failure
> prioritised list of threats to the organisation or specific business processes
> input to the risk control management strategy and action plan to address the risks
> documented acceptance of identified risks that are not to be addressed.

This activity should result in an understanding of:
> how and why an incident could have an adverse impact on your business
> time thresholds for key activities that must be re-established
> the internal and external dependencies they rely on.

It should be remembered that:
> it is impossible to identify all threats
> estimates of probability are only estimates
> impacts increase over time at different rates
> numeric scales may distort the perceived impact of minor events.

Unacceptable concentrations of risk or “single points of failure” should be brought to the attention of the business continuity sponsor with options for addressing the issue. The decision to avoid, reduce, transfer or accept the risk should be formally documented and signed off (see Table 2).

Continuity Requirements Analysis (CRA)

The next step is the CRA. The aim is to quantify the resources (eg, people, technology, telephony) that are required over time to resume and continue business activities to a satisfactory level. In other words, the aim is to operate at an acceptable level (the RPO) within an acceptable time (the RTO). This is usually done simultaneously with the BIA. Its purpose is to:
> provide resource information to develop the recovery strategy to support agreed service levels
> identify resource requirements resulting from dependencies between internal activities and external suppliers.

It is important to explore whether systems must be recovered to the status they had when the failure occurred. The RPO for IT systems will be derived from the information restoration needs. The RPO is sometimes seen as “the amount of data we could afford to lose”.

It is also necessary to take account of additional activities generated by the interruption and clearing of backlogs. For example, a call centre may have to cope with extra calls following an interruption.

This information feeds into the business continuity strategy. Resource requirements help us to evaluate alternative recovery solutions in terms of capacity and performance.
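To show why backlogs matter when sizing recovery resources, the sketch below estimates how long a recovered operation needs to clear the work that built up during an outage. It assumes that all demand during the outage is queued rather than lost, and that demand continues at the normal rate after resumption. The function name and the call centre figures are hypothetical.

```python
import math


def days_to_clear_backlog(daily_demand: float, outage_days: float,
                          recovery_capacity: float) -> int:
    """Estimate whole days needed to clear work that built up during an outage.

    Assumes demand continues at daily_demand after resumption and that the
    recovered operation can process recovery_capacity units per day.
    """
    backlog = daily_demand * outage_days
    if backlog == 0:
        return 0
    spare = recovery_capacity - daily_demand
    if spare <= 0:
        raise ValueError("capacity must exceed daily demand or the backlog never clears")
    return math.ceil(backlog / spare)


# Hypothetical call centre: 1,000 calls a day, 3 days lost,
# 1,200 calls a day handled after recovery.
print(days_to_clear_backlog(1000, 3, 1200))  # 15
```

A recovery solution sized only for normal demand would never clear the backlog, which is why the CRA should state the extra capacity needed after resumption.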

Table 3 Matrix of resource requirements

Vision Opticals – Direct to customer, third party products
RTO:
RPO:

Activity/Product Dependencies | People | Premises | Equipment | Technology | Supplies | Stock
Current provision | 20 | 1,000 m² office; 10,000 m² warehouse | Racking; Packing; Fork lifts | ERP; Email; File server | 2,500 units/day | 5,000 units on hand
Minimum requirement | 10 | 5,000 m² warehouse | Shrink wrapping equipment to process 2,000 units per day | Five users with server access within 24 hours | 2,000 units/day | 2,000 units to handle pending delivery

The dependencies would generally be mapped on a separate matrix (see Table 3), showing the product or service and the services, processes and resources that support it. Typically, there are six main areas of dependence:
> People, including partners, customers and contractors, to provide skills, knowledge and manpower
> Premises, to provide a working environment, accommodating equipment and stock
> Equipment to perform specialised tasks
> Technology to communicate and to store, manipulate and present data
> Supplies for manufacturing processes
> Stock to fulfil orders.

All activities depend on such “enablers” and will require them to be available within a given timeframe to avoid the associated risk exposures. In Table 3, one of the products identified on the BIA has been analysed to reveal its dependencies.
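A matrix like Table 3 can also be held as data and compared against a proposed recovery option. The sketch below uses the numeric minimum requirements from Table 3; the dictionary keys, the function name and the recovery site figures are invented for illustration.

```python
from typing import Dict

# Minimum requirements taken from the numeric cells of Table 3.
MINIMUM = {
    "people": 10,
    "warehouse_m2": 5000,
    "supplies_units_per_day": 2000,
    "stock_units": 2000,
}


def shortfalls(available: Dict[str, float], minimum: Dict[str, float]) -> Dict[str, float]:
    """Return, per resource, how far a recovery option falls short of the minimum."""
    gaps = {}
    for resource, needed in minimum.items():
        have = available.get(resource, 0)
        if have < needed:
            gaps[resource] = needed - have
    return gaps


# Hypothetical recovery site: these figures are invented for illustration.
recovery_site = {"people": 10, "warehouse_m2": 4000,
                 "supplies_units_per_day": 2500, "stock_units": 500}
print(shortfalls(recovery_site, MINIMUM))
# {'warehouse_m2': 1000, 'stock_units': 1500}
```

An empty result means the option covers every quantified dependency; qualitative entries such as equipment types still need a manual check.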

4. Determining strategy

In the BIA, the MTPD for key activities will have been determined, together with an RTO and RPO for each of those activities. The BCM strategy sets out an appropriate approach to recovering each activity. It selects a goal (eg, “If we lose access to building XX, we will relocate staff to YY”) and identifies, in general terms, how many staff, what skills and what resources need to be available at the chosen locations, as well as any necessary travel arrangements.

The BCM strategy describes what has to be done, not how it has to be done. It is therefore the selection of a high-level response such as:
> Replicate and restore: Keep copies in case the originals are lost or damaged (most IT recovery plans are based on this concept)
> Repair: Remedial work may be the quickest method of recovering key resources
> Replace: If supply is plentiful then key resources can be replaced quickly
> Reciprocity: Arrange to borrow another organisation’s facilities
> Relocate: Move the workforce and workload temporarily to another location
> Workaround: Temporarily adopt an alternative approach to a process
> Suspend: Adjourn the activity until normal service is restored.

Different activities require different solutions. Strategy selection is influenced by practicalities such as the cost of implementation and maintenance. Transferring staff and operations takes time and effort. Normally, a fast and seamless recovery entails a more costly solution. Therefore, it is important to ensure that realistic RTOs and RPOs are set. Is it essential to recover systems to the status they had when the failure occurred or, for example, will restoring yesterday’s back-ups be sufficient?

There are three main aspects to setting the BCM strategy to achieve the agreed RTO:
> Selecting the tactics for continuing the delivery of products and services
> Consolidating the resource requirements
> Sourcing these requirements.

The various options must be fully understood before selecting the appropriate tactics.
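The trade-off between recovery speed and cost can be made concrete with a small sketch: discard the options that miss the RTO or RPO, then pick the cheapest of what remains. Expressing the RPO as the maximum acceptable age of restored data, in days, is a simplifying assumption, and every option, figure and name below is hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Option:
    tactic: str
    recovery_days: float   # time needed to resume the activity
    data_age_days: float   # age of the data that would be restored
    annual_cost: float     # cost of keeping the option available


def cheapest_feasible(options: List[Option], rto_days: float,
                      rpo_days: float) -> Optional[Option]:
    """Return the lowest-cost option that meets both the RTO and the RPO."""
    feasible = [o for o in options
                if o.recovery_days <= rto_days and o.data_age_days <= rpo_days]
    return min(feasible, key=lambda o: o.annual_cost) if feasible else None


# Hypothetical options and costs, for illustration only.
options = [
    Option("Replicate and restore (live copy)", 0.5, 0.0, 120_000),
    Option("Relocate, restore yesterday's back-ups", 2.0, 1.0, 50_000),
    Option("Repair", 20.0, 0.0, 10_000),
]
choice = cheapest_feasible(options, rto_days=12, rpo_days=1)
print(choice.tactic if choice else "no option meets the RTO and RPO")
```

If no option is feasible, either the RTO and RPO are unrealistic or a more costly solution has to be considered. In practice guarantees and additional benefits would be weighed alongside cost.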

Activity continuity strategies

For each activity, the most appropriate tactics to meet the RTO must be selected based on cost, guarantees, additional benefits and other factors. Agreements may vary from verbal promises through to contractually committed service levels. The shorter the RTO, the more important the reliability of the delivery becomes.

People
Some of the following techniques should be considered:
> Process mapping: Allowing staff to undertake unfamiliar roles
> Multi-skill training: Of individuals
> Cross-training of skills: Across a number of individuals
> Succession planning.

Additional skills may arise from permanent or occasional use of third-party support. Alternatively, an inventory can be made of staff skills not used in existing roles. This might include previous experience in other roles, first-aid training, salvage or rescue experience or emergency management skills.

Many stakeholders (including customers, partners and contractors) may be affected by an incident. In a major fire at your site contractors may be injured, local residents evacuated and local businesses closed for safety reasons or because of reduced trade. The organisation’s level of responsibility (both legal and moral) for these groups should be understood.

You should ensure various stakeholders’ needs are satisfied or they may impede the recovery effort. For example, the local residents could press the local authorities to refuse you permission to rebuild on the site following a fire. For civil emergencies dialogue with local emergency responders may provide useful information, such as:
> Recommendations for assembly points and evacuation routes
> Notice of specific hazards in the vicinity
> Likely position of any traffic cordons
> Special access arrangements
> Participation in exercises.

The BC manager should be aware of threat reduction techniques, including:
> Physical security, where advice can be sought from security professionals
> Information security. ISO 27001, Information Security Management and ISO 17799, Code of Practice for information security management, provide useful guidelines.

Premises
The RTO is the principal determinant of worksite continuity tactics. Once the RTO parameter has been satisfied, cost and availability will guide the choice of tactics.

Premises tactics include:
> Do nothing: This may be acceptable for the least urgent activities identified in the BIA. Where the RTO exceeds a few months it allows time for buildings to be found and utilities installed post incident, all with minimal planning and preparation.
> Relocate your staff:
– Move up: Use existing accommodation such as a training facility or canteen to provide recovery space, or increase office density. This needs planning and preparation.
– Displacement: High priority activity personnel could temporarily displace some of those who are performing less urgent business processes. But beware of unmanageable backlogs.
– Remote working: This includes “working from home” and from other non-corporate locations such as hotels.
– Reciprocal agreements: Great care must be taken when establishing this type of agreement. It requires contracted regular testing.
> Use third-party premises: Third-party alternative site arrangements may be considered if they meet the RTO. Commercial services include fixed, mobile and prefabricated premises. Dedicated work areas provide exclusive use of the accommodation. “Syndicated” or “Subscription” options offer access, provided the accommodation is not already in use. This can be on a first come, first served or an equitable share basis whereby resources are allocated in proportion to the subscription.
> Use diverse location tactics: This option moves the activity and not the staff via dual-site operations or continuous availability solutions. In the event of an interruption at one site the business activities are transferred to alternative locations where staff and facilities are already prepared to handle it.

Equipment
With uninterruptible power supply (UPS) or back-up generators, some risks are acceptable. Risk reduction can use monitoring systems to warn against utility or equipment failures and destructive threats, eg, sprinkler and fire suppression systems in buildings with a high loading of flammable materials or expensive equipment. Possible recovery techniques to consider are:
> Maintenance contracts, preferably with local firms
> Salvage engineers can often restore equipment after damage by fire or water
> Asset restoration specialists can often minimise damage after fire and flood to equipment, buildings and papers, and they may offer useful advice, as well as being available on request
> Use of local subcontractors or competitors with similar equipment.

Technology
The loss of a data centre can have a major financial impact on a business. There are several options, including in-house resilience, recovery or third-party support. It is a complex and costly area in which technical expertise and a sound working knowledge of the critical systems are invaluable. The IT department, or the equivalent service provider, should investigate and recommend appropriate recovery options which include:
> Ship-in contracts for IT and specialist equipment, including telephone systems. Terms of contract vary from ‘best efforts’ to guaranteed delivery.
> Call redirection for telephony: Most telecommunications operators offer solutions for redirecting calls from one site to another. The logistics of handling redirected calls must be addressed.
> Convergence of telephony and data networks, VoIP (Voice over IP): This creates new opportunities and issues, since telephones and email are often used as alternatives if one fails; these issues need to be assessed and the risks and impacts thoroughly analysed.

Since business continuity incidents often involve denial of access, back-up copies of records should be kept at another location. There is no ‘correct’ separation distance, but one must consider denial of access factors such as loss of power or transport disruption. There may also be limits on the distances staff would be prepared to travel at short notice.

Note that after an incident the regulatory, statutory or business standards for information management still apply. Key issues to address are:
– Confidentiality
– Integrity
– Availability
– Currency.

Supplies
One must determine what supplies (including equipment) are needed, and how quickly, to meet the RTO of each activity. Replacement strategies include:
> Storing additional supplies at another location. If the supplies degrade over time they should be rotated with regular stock
> Changes in the core process may require stored supplies to be changed (eg, headed stationery may need new address or contact details)
> Delivery of stock at short notice
> Diversion of just-in-time deliveries to other locations
> Holding materials at warehouses or shipping sites
> Transferring sub-assembly operations to new locations
> Holding older equipment as emergency replacements or for spares
> Specific risk mitigation strategies are needed for unique or long lead time equipment: replacing outdated equipment with long lead time updated versions may impede recovery
> Geographical diversity of processes. Make sure that the RTO can be met by the alternative location.

Techniques for reducing the impact of supply interruptions include:
> Dual or multi-sourcing
> Inspection of supplier’s business continuity arrangements. This may include a requirement for certification to ISO22301
> Holding inventories off-site, at another site or at the supplier’s site
> Penalty clauses on supply contracts (no protection against bankruptcy)
> Pre-acceptance of alternative suppliers.

Resource level consolidation
The objective of resource level consolidation is to understand and locate the resources necessary to achieve the RTO and RPO. It is necessary for two reasons:
> Co-ordinating the acquisition and utilisation of resources can prevent conflicts, such as when more than one operation expects to use the same alternative workspace
> Bulk purchasing may be more efficient and cost-effective.

Resource consolidation includes the following stages:
> Aggregating resource requirements from the CRA
> Evaluating each option against the RTO and RPO and providing executive management with a strategic evaluation
> Obtaining sign-off for financial and resource provision
> Creating project and action plans
> Applying the agreed strategy.

Executive management must make a strategic evaluation and sign off the strategy, together with the requisite financial and resource provisions.

The result is a set of recovery resources and services for the restoration of business systems within their RTO and RPO.

Table 4 Recovery strategy for third party products

Vision Opticals – Direct to customer, third party products
RTO:
RPO:

Activity/Product Dependencies | People | Premises | Equipment | Technology | Supplies | Stock
Current provision | 20 | 1,000 m² office; 10,000 m² warehouse | Racking; Packing; Fork lifts | ERP; Email; File server | 2,500 units/day | 5,000 units on hand
Minimum requirement | 10 | 5,000 m² warehouse | Shrink wrapping equipment to process 2,000 units per day | Five users with server access within 24 hours | 2,000 units/day | 2,000 units to handle pending delivery

Recovery strategy rows (an X marks the strategy chosen for each dependency): Replicate/restore; Repair; Replace; Reciprocity; Relocate; Workaround; Suspend. The Premises column carries separate marks for the warehouse and the office.

In Table 4 we have addressed recovery strategies for the third-party products, part of a direct to customer business. The following issues were considered:
> We are going to relocate 10 staff. What skills are required, where will they go, how will they get there and what resources will they need?
> We will need to find alternative warehousing facilities. Who will do that and what information will they need to source this?
> We are going to replace any damaged equipment. Who will do that and identify potential suppliers in advance?
> We will need to identify alternative suppliers. Who are they and who will contact them?
> We may need to replace lost stock, so we are looking at a reciprocal fulfilment arrangement with another firm to supply products while we recover our operation.
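The first stage of resource level consolidation, aggregating requirements and spotting where more than one operation expects to use the same resource, can be sketched as follows. The tuple layout, the function names, the operations and the capacities are all assumptions made for this example.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# (operation, resource, quantity) rows as they might come out of the CRA.
Requirement = Tuple[str, str, int]


def consolidate(requirements: List[Requirement]) -> Dict[str, int]:
    """Total the quantity required of each resource across all operations."""
    totals: Dict[str, int] = defaultdict(int)
    for _operation, resource, quantity in requirements:
        totals[resource] += quantity
    return dict(totals)


def conflicts(requirements: List[Requirement],
              capacity: Dict[str, int]) -> Dict[str, List[str]]:
    """List the operations competing for any resource whose total demand exceeds capacity."""
    over = {}
    for resource, total in consolidate(requirements).items():
        if total > capacity.get(resource, 0):
            over[resource] = [op for op, res, _ in requirements if res == resource]
    return over


# Hypothetical requirements and recovery capacity, for illustration only.
requirements = [
    ("Direct to customer", "recovery desks", 10),
    ("Wholesale", "recovery desks", 8),
    ("Direct to customer", "warehouse m2", 5000),
]
capacity = {"recovery desks": 12, "warehouse m2": 5000}
print(consolidate(requirements))  # {'recovery desks': 18, 'warehouse m2': 5000}
print(conflicts(requirements, capacity))
# {'recovery desks': ['Direct to customer', 'Wholesale']}
```

Each conflict found this way is a decision for executive management: buy more capacity, stagger the recovery of the competing operations, or revisit their RTOs.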

5. Developing a response

This part of the process concerns the most detailed planning documents, which are also likely to be the most fluid. The aim is to identify the actions and resources required to manage an interruption, whatever its cause. Key requirements for an effective response are:

> A clear procedure for escalation and incident control
> Communication with stakeholders
> Business continuity plans (BCPs) to resume interrupted activities.

A BCP is a set of guidelines that require interpretation by the business continuity team (BCT) according to circumstances. It is not possible – or even desirable – to predict what might occur.

The Incident Management Plan (IMP) defines how strategic issues would be addressed and managed by the executive. This may include incidents where there is no physical disruption, right up to a national emergency. Media response to any incident is usually managed through an IMP.

At a tactical level the BCPs address business disruption from the initial response through to the point at which normal business operations are resumed. Based on the BCM strategy, they provide procedures and processes for the BCT, allocating roles and responsibilities. They must also give details regarding liaison with external agencies such as recovery services’ suppliers and emergency services. If the event falls outside the scope of the BCP, the situation should be escalated to the senior incident management team (IMT).

Operationally, Activity Resumption Plans (ARPs) provide detailed guidelines for the recovery teams to implement the resumption of normal business functions and support services.

Incident Management Plan (IMP)

The IMP provides a framework for managing any incident. The plan should contain initial prompts for action, such as a list of stakeholders to be contacted. The BIA will offer useful pointers to potential impacts which may need to be managed. Wherever a BCM response is required the IMT should be alerted. If no IMP exists it may be useful to run an exercise with the senior management team so that the many requirements become apparent (such as the need for a plan). All incidents differ and so the IMP is a framework of components and resources that may be useful, rather than a rigid procedure.

The roles of the team and specific individuals should be documented. Deputies should be identified for each role. Responsibilities may include:

> Managing communications (see section below)
> Ensuring IMTs and BCTs are properly staffed
> Liaising with the BCT to agree the resumption timetable
> Approving significant expenditure
> Monitoring recovery progress and personnel performance
> Identifying and maximising opportunities or advantages arising from the incident
> Looking at the strategic impact, which may require significant changes in direction or open up new opportunities
> Maintaining a decision log throughout the incident.

Clear invocation criteria should be set out, and the persons able to initiate the call-out decided. This should encourage action where there is doubt; it is easier to stand down a team than to activate them once the incident is out of control. The activation procedure should be documented so decisions are not delayed. A number of alternative meeting locations should be identified and, on invocation, the first person notified should select the most suitable, based on current information.


At least two locations should be predefined to act as an incident management centre (control room or command centre). One is likely to be on-site where the senior management team are based but the other should be off-site. The off-site location does not have to be owned by the organisation. By prior arrangement, a 24-hour hotel may provide all the facilities required. Consideration should be given to:

> Communication: inbound and outbound
> Recording events, actions and issues
> Monitoring the media
> Access control.

The following resources should be considered:

> Whiteboards or flip charts (and pens that work)
> Telephones, including an outgoing line and a recording facility
> Hotline/helpline facility
> TV and radio
> Stationery
> A means of logging all actions


Table 5 Incident Management Plan

Incident management framework

Team members and responsibilities
Columns: Role | Responsibility | Contact Details | Deputy | Contact Details
Rows:
> Site evacuation
> Personnel accountancy
> Communication (staff & others)
> Emergency services liaison
> Telephone reception for next of kin
> Media & external communication
> Transport assistance
> Translation services

Incident management centre locations:

Incident management centre access arrangements:

Incident management centre resources
Columns: Location | Desks/Chairs | Phones | PCs | Fax | Other Office Materials / Equipment



Other incident management centre resources to consider include:

> Refreshments and nearby or on-site sleeping facilities
> A locked trunk (often called a ‘battlebox’ or ‘recovery box’), in which hardware and information can be kept off-site at the alternative location.

All BCM strategies should take into account welfare issues during an incident and the recovery. Staff are more likely to co-operate if their welfare needs are met. Issues to consider include individual special needs during prolonged stay-in periods.

The IMP should be documented. The template in Table 5 gives an example of a suitable format. An IMP should be succinct and clear because it will be used under pressure in stressful circumstances. The outcomes of the process include:

> An IMP
> An incident communications plan
> Demonstration of preparedness
> Compliance with statutory, regulatory and ethical requirements.

Major incidents requiring an IMP can vary from those which threaten the continued existence of an organisation but have little impact outside of it, to those which, like the Buncefield oil depot explosion, can become a national emergency. The principles to be applied to the latter are exactly the same but there is increased emphasis on health and safety and liaison with emergency services. These are features which may have little or no prominence in a purely internal issue such as a failure in the supply chain.

Appendix 1 describes the incident response structure employed by the UK emergency services. The model is suitable for organisations with the potential for major health and safety incidents.

The BCPs and the ARPs are similar in structure but focus on different aspects of recovery:

> BCPs cover the management of common resources such as facilities, information technology, finance and personnel – in essence the organisation’s infrastructure
> ARPs focus on the recovery of specific activities, often customer facing, such as order taking, customer helpdesks or claims handling.

Both types of plan have similar considerations and structure, dealing with what has to be done by whom, when, where and how.

Business Continuity Plans (BCPs)

All BCPs should be ‘action orientated’, easy to reference at speed and exclude superfluous information. The BCP should document assumptions about the maximum scale of the incident. If these are exceeded, the situation should be escalated to the IMT.

Key steps in developing a BCP are:

> Appoint an owner for the BCP(s)
> Define objectives and scope based on the BCM policy and strategy
> Decide the structure, format, components and content
> Gather information to populate the plan and prepare a draft plan
> Circulate the draft plan for consultation and review
> Test/exercise the plan
> Gather feedback from consultation and amend the plan as appropriate.

All BCPs should be modular in design so that separate sections can be supplied to teams on a need-to-know basis. Each section could be printed on different coloured paper to provide ease of use and reference. Dynamic information, such as contact details, should be in appendices which can be amended easily, with job titles rather than names in the main body of the document.

Software products are available to help you build and maintain a BCP. However, normal office software may well suffice and does not require special training. Customised software, though, may prove helpful in plan maintenance.

Table 6 Business Continuity Plan (BCP)

Business Continuity Plan
Location:
Activity:
Recovery management centre:
Available facilities:
Alternative location:

Contact list
Columns: Name | Office tel. | Mobile | Role/Action

Actions



Business Continuity Plan – the contents

> Basic information
– Document owner and maintainer
– Team members and their roles along with named deputies. Responsibilities may include:
  – Liaising with emergency services
  – Obtaining information from response teams
  – Reporting to the IMT
  – Mobilising suppliers of salvage and recovery services
  – Allocating available resources to recovery teams
– Invocation/mobilisation instructions. There should be a number of possible meeting locations, favouring those with the required resources. On invocation the first person notified should identify the most suitable meeting place, plus a fallback based on current information.

> Actions
– Responding to an invocation
– Decision making
– Mobilising resources
– Receiving information from other teams
– Initiating activity recovery
– Reporting status to the IMT.

> Resource requirements
– Personnel
– Facilities and supplies
– Technology, communications and data
– Security
– Transportation and logistics
– Welfare requirements
– Emergency cash and payments
– Any additional resource requirements for specific activities
– Contact information to access required resources.

> Vital information
– Customer information
– Contact details
– Legal documents, such as contracts and insurance policies
– Service Level Agreements.

> Forms and annexes
– Checklists to assist recovery.

A function-specific BCP generally has two main sections:

– A list of key contacts and team members identifying the roles and responsibilities of each
– An outline of the specific actions necessary for recovery.

Examples in Table 6 provide a framework within which ARPs can operate. The BCP should be signed off by the executive.

DOs and DON’Ts

> DO remember that the BC response is a framework for a stressful situation
> DO ensure key roles and responsibilities are clearly defined
> DO check that staff holding BC response roles have been appropriately trained
> DO have a “quick reference” guide readily available summarising key information

Activity Resumption Plans (ARPs)

Introduction

ARPs cover the response by a department or business unit to an incident. Examples are:

> Procedures to assist the IMT, led by the facilities department, to deal with the incident and its physical impact
> An HR response to welfare issues
> A business department (eg, finance department) plan to resume its functions within a predefined timescale
> An IT department plan for the resumption of IT services to the business.

The complexity and urgency of the business processes may determine whether one plan covers a single activity or a department with several activities.

Process

Key steps in ARP development and planning include:

> Making someone responsible for overall plan development with representatives from each operational unit
> Providing a template to encourage standardisation but allow individual variations where appropriate
> Ensuring that business units nominate individuals to fulfil roles within their plans
> Circulating the draft plan for consultation, review and challenge within and, where necessary, outside the department
> Validating and amending the plan as appropriate through a unit test
> Documenting connections with the BCP and between the ARPs for each business unit
> Consolidating the various business unit plans and reviewing for consistency
> Conducting a resource requirements analysis across all plans to define the resource requirements for support functions.

Methods

Development of an ARP is similar to that for other plans. Specific ARPs may include the following:

> Facilities
> Staff welfare plans
> Business unit resumption
> IT disaster recovery

The above plans may include information on:

> Evacuation and stay-in plans
> Emergency services liaison
> Dispersal of staff and visitors
> Salvage resources and contracted assistance
> HR and welfare issues
> Procedures for contacting and accounting for staff
> Counselling and rehabilitation resources
> Escalation criteria and procedures
> Contacting team members
> Resumption plans for each process:
– Staff numbers
– Key contacts
– Procedure for activity resumption
– Activity priorities
– Special procedures
– Work in progress issues
– Consumables required.

Outcomes and deliverables

Outcomes of planning include:

> An ARP for each business activity or department
> Criteria for escalating issues to the BCT
> Clearly defined BCM roles within each department.

Table 7 is a simple example of some of the activities needed to recover financial management after a major incident.


Table 7 Activity Resumption Plan (ARP)

Business Recovery – Finance department responsibilities

Columns:
> Action: Outline here the actions that will need to be performed to put the chosen recovery plan into effect
> Primary considerations
> Primary responsibility: Who will be responsible for performing these actions?
> Deputy: Who will deputise if the primary responsibility holder is unavailable?
> Support team: Identify the support team(s) who will be responsible for these particular actions

Example actions and their primary considerations:

> Assess financial exposures related to legal and financial issues
1. Potential civil liabilities
2. Compliance with industry-specific regulations
3. Extent of financial penalties under Service Level Agreements

> Banking
1. Establish alternate means of accessing bank account balances and movements
2. Establish alternate means of making payments:
– Smart cards, ‘dongles’, passwords
– Authentication software/devices installed on alternate PCs
3. Payroll
4. Activating company credit cards
5. Emergency limits for company credit cards
6. Emergency overdraft/funding considerations

> Accounts payable
Emergency credit lines, renegotiating supplier payment terms

> Insurance
Liaison with insurers and loss adjusters

6. Exercise, maintain, review

All BC documents should be reviewed and the plans exercised at least annually. Reviews and exercises should also be carried out whenever there is a significant change to the business processes or environment. No plan is reliable if it has not been exercised, nor can the personnel involved be relied upon until they have had some form of practice. BC exercises are crucial because they develop the necessary competence, confidence and knowledge to act.

Five stages of exercise are recognised, as detailed in Table 8. The normal progression would be to start at the bottom with a desktop exercise and work up towards the full-scale exercise at the top. (Most organisations limit themselves to stages 1-4.)
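The two triggers described above (an annual interval, and any significant change since the last exercise) can be tracked with a simple register. The following is a minimal sketch only: the `PlanRecord` structure, the plan names and all dates are hypothetical, and "annually" is approximated as 365 days.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from typing import Optional

# Assumption: "at least annually" is treated as a 365-day interval.
REVIEW_INTERVAL = timedelta(days=365)


@dataclass
class PlanRecord:
    """Hypothetical register entry for one BC plan."""
    name: str
    last_exercised: date
    last_significant_change: Optional[date] = None


def needs_exercise(plan, today):
    """Return True if the plan is overdue or has changed since its last exercise."""
    if today - plan.last_exercised > REVIEW_INTERVAL:
        return True
    change = plan.last_significant_change
    return change is not None and change > plan.last_exercised


# Illustrative entries only.
plans = [
    PlanRecord("Finance ARP", date(2024, 1, 10)),
    PlanRecord("IT disaster recovery ARP", date(2024, 11, 1), date(2025, 2, 1)),
    PlanRecord("Facilities BCP", date(2024, 12, 1)),
]
today = date(2025, 3, 1)

due = [p.name for p in plans if needs_exercise(p, today)]
print(due)  # ['Finance ARP', 'IT disaster recovery ARP']
```

Here the first plan is due because more than a year has passed, and the second because the business changed after it was last exercised. What counts as a significant change remains a judgement for the organisation.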

DOs and DON’Ts

> DO include requirements for exercising, maintaining and reviewing in the BC policy
> DO ensure compliance is subject to independent assessment
> DO ensure there is regular confirmation of roles, contact details and availability of BC response resources
> DO ensure plans are subject to regular exercising, at least annually
> DO ensure a formal issues log is created, maintained and reviewed

Concepts and assumptions

For any test to be ‘useful’, it needs to meet the following criteria: stringency, realism and minimal exposure to additional risk. This may require some degree of compromise.

> Stringency: Ideally, tests should be as realistic as possible; however, it may not be practical to run certain tests without altering ‘live’ procedures. This applies especially to technical testing.
> Realism: This ensures that the audience engages in the event and ultimately gains more from it.
> Minimal exposure: Testing may increase exposure to risk. The designer of the test should ensure that the risk and impact of disruption are minimised. The business must understand and accept the risk.


Table 8 Types of exercise

Stage | Purpose | Style and focus
5 Full-scale exercise | Develop the capability; demonstrate competence | Simulation and realism; scenario based
4 Command post exercise | Acquire the skills; develop the techniques | Co-ordination and communication; scenario based
3 Active testing | Practical experience; test the components | Participation and interpretation; consequence based
2 Walk-through | Challenge the assumptions; verify the dependencies | Review and discussion; effect based
1 Desktop | Validate the logic; spot the weaknesses | Introduction and familiarisation; plan based

Process

All tests must be planned, the results captured and any remedial work monitored for successful implementation. A technical test may include the following steps:

> Agree the scope, objectives and budget
> Devise a simple scenario and a set of assumptions to put the test in context
> Conduct the test and record the results
> Assess and report the results
> Address any issues raised.

A scenario exercise will require similar steps, with some additional ones, such as:

> Prepare a realistic and suitably detailed scenario
> Brief observers and prepare questionnaires to capture lessons learned
> Pre-exercise information and briefing of participants
> Debrief participants after the exercise
> Evaluate results and prepare a report with recommendations
> Circulate report to participants and senior management.

Methods

> Participants
Possible participants, in addition to staff, in desktop or scenario exercises include:
– Facilitators
– Subject experts
– Suppliers of specialist resources, services or products
– Communications and PR
– Outsourced activity providers.

Table 9 Business Continuity Plan testing document

For each item the document records: Tested | Effective | Remedial action required

Incident discovery and notification
> The right people were notified/alerted effectively
> The required emergency services were identified and notified in a timely manner
> Effectiveness of assembling the Incident Management Team
> Effectiveness of locating and communicating with staff

Business recovery
> Recovery management organisation
> Working accommodation: relocated staff
> Staff working from home
> Insurance and disaster restoration services liaison
> Customer management activities established
> Handling and prioritisation of customer commitments performed effectively
> Telephony/voice communications successfully re-established
> IT systems fully and effectively restored
> Locating replacements for damaged or destroyed equipment/stock/raw materials
> Capacity and resources available at alternative working location
> Arrangements for relocating staff
> Order management/fulfilment resumption


> Outcomes and deliverables
The outcomes of the BC exercising process include:
– Validation of the BC strategies
– Familiarisation of participants in responding to an incident
– Testing of the plan(s) and the supporting infrastructure
– A post exercise report
– Increased awareness of BC
– An opportunity to improve preparedness.

Table 9 gives an example of part of a BCP testing document outlining the assurances required, whether they were tested and where remedial work is required. This would normally form part of a wider report and follow-up process where the results of the test were communicated and responsibilities for remedial work identified.

Maintenance

A BCM programme must be established to ensure all relevant stakeholders have the current and relevant parts of the BCP.

Review

There are several ways to review a BCM programme including:

> Internal audit
> External audit
> Self-assessment.

DOs and DON’Ts

> DO make sure staff can easily find information relating to the plan and the structure of the supporting BCM organisation
> DO remind staff regularly about BCM arrangements
> DO ensure BC arrangements are on board meeting agendas at least once a year


7. Embedding BCM

BCM is a holistic management process which identifies risks that may threaten an organisation. It provides a framework for building resilience and the capability for an effective response to safeguard the interests of its key stakeholders, reputation, brand and value creating activities. To be successful it has to be seen as a part of normal business management, regardless of the organisation’s size or sector.

At all points in the BCM process, opportunities exist to introduce and enhance an organisation’s BCM culture. Precisely how everybody is made aware of BC and its implications will depend to a large extent on the existing culture and ways of communicating ideas. Three stages can be envisaged:

(i) Assessing BCM awareness and training needs

Before planning and designing the components of an awareness campaign, it is critical to understand what level of awareness currently exists. Assessing awareness is just another aspect of “Understanding the Organisation” (see section 3) and the same techniques (workshops, questionnaires and interviews) can be used. These will identify the level of training required. Is training needed only in the practicalities of your organisation, or is an explanation of the basic concept also required?

(ii) Developing BCM in the organisation’s culture

This will build on the training needs identified above and lead to the design and delivery of a programme of education, training and awareness, which must:

– explain the need for business continuity plans within the organisation,
– provide access to details of the organisation’s specific business continuity arrangements,
– create quick reference resources and materials, and
– implement an enforceable policy.

(iii) Monitoring cultural change

The awareness assessment, stage (i), should be maintained as an ongoing task to identify any further requirements for education and training. The importance of a common understanding of the value of BCP should not be underestimated. It ranges from board-level support to staff commitment to exercises. The value will be clearly demonstrated in an incident.


Appendix 1 Incident Response

UK emergency services incident response structure

UK emergency services use a three-tier incident response structure (see Figure 1) with responsibilities and relationships.

In an incident the three levels of involvement address different issues during the various phases of the event, as the following diagram (Figure 2) shows.

Smaller organisations may elect for a single hands-on management group with both tactical and strategic responsibilities. However, it is still important that they address the strategic issues despite the pressing issues of a tactical response.

For geographically diverse organisations a variety of models may be appropriate, perhaps with additional tiers beyond the three named above. For example:

> A response team at each site backed up by a central BC team
> A BC team at major sites with a central IMT
> BCM at a national level with limited involvement from the international board unless global reputation is threatened.

All BCM strategies should recognise people issues but in major health and safety incidents they can be the dominating issue. During such an incident someone should assume responsibility for the activities listed in Table 5. Subsequently there may be additional needs, including:

> Temporary accommodation
> Counselling and rehabilitation services, perhaps within an employee health package
> Welfare needs at alternative locations:
– Refreshments
– Personal safety and security
– Transport and accessibility
– Appropriate training on replacement equipment.

Someone should be appointed to liaise with the emergency services as they arrive on site and subsequently as required. The emergency services need to be advised of the whereabouts of any casualties, the status of the situation and any hazards they may encounter. While on site, the emergency services’ instructions take precedence over all others. When they depart, the organisation resumes responsibility for the site.

The incident communications plan addresses communication with all stakeholders including:

> Staff, relatives, friends and emergency contacts
> Customers and suppliers
> Shareholders, partners or owners
> Informing and liaising with regulatory authorities (legal and compliance functions)
> Issues relating to serious injuries or fatalities (with the emergency services)
> Media: local and national newspapers, radio, TV, internet and other media.

Figure 1. Three-tier incident response structure

GOLD | Strategic | Senior (Incident) Management
SILVER | Tactical | Business Continuity Team
BRONZE | Operational | Incident Response & Business Unit Resumption Teams

(Arrows between the tiers in the figure are labelled CONTROL and ESCALATION.)

Figure 2. Phases of an incident

Overall objective: back to normal as soon as possible. The timeline runs from normal operations, through the incident, into three phases:

> Incident response (within minutes to hours): Account for people; deal with casualties; contain damage; assess damage; invoke business continuity
> Business continuity (within minutes to days): Contact staff, customers, suppliers etc.; recover critical processes; rebuild lost work-in-progress
> Recovery/resumption (within weeks to months): Repair/replace damage; relocate to permanent site; recover costs from insurers


When an incident or business discontinuity gets into the public domain, effective communication plays a key role in protecting an organisation’s reputation. Answers to the following questions need to be considered:

> What are the messages?
> Who will form the IMT?
> What resources and facilities are available?
> Are the IMT and spokespeople properly trained?


It is necessary to consider:

Ownership of the plan
> Everybody involved should agree beforehand about the who, how and what of communication.

Perception is reality
> Reputation is affected by perceptions
> Act fast

Reticence ruins reputations
> Be as open as you legally and practically can

Show you have nothing to hide
> Show you care

See it from your audiences’ point of view.

Appendix 2 Glossary of terms

Activity Resumption Plan (ARP)

Detailed guidelines for operational recovery teams to implement the resumption of normal business functions and services.

Back-up

A reserve copy of information which is deemed to be ‘Essential for Recovery’, including data and documentation.

Business Continuity (BC)

The capability to continue essential business functions under all circumstances.

Business Continuity Institute (BCI)

The world’s leading membership organisation for BC practitioners.

Business Continuity Management (BCM)

Those management disciplines, processes and techniques which seek to provide the means for continuous operation of the essential business functions.

Business Continuity Plan (BCP)

A set of procedures and processes to guide the Business Continuity Team in the tactical management of an incident.

Business Continuity Team (BCT)

Staff responsible for the tactical management of an incident.

Business Impact Analysis (BIA)

The process of identifying and quantifying the impacts on an enterprise of an incident, in both financial and non-financial terms.

Continuity Requirements Analysis (CRA)

An assessment of the resources required for a resumption of activities.

Disaster

Any event which threatens or disrupts normal operations, or services, for sufficient time to affect significantly, or to cause failure of, the enterprise.

Disaster Recovery (DR)

A term normally used to describe the process for restoration and recovery of IT equipment, functions and applications.

Incident

Any event which may be, or may lead to, a disaster.

Incident Management Plan (IMP)

A framework document to guide the Incident Management Team in the strategic management of any incident.

Incident Management Team (IMT)

Staff responsible for the strategic management of an incident.

Maximum Tolerable Period of Disruption (MTPD)

The maximum period of time for which the business can afford to be without a critical function or process.

Risk Assessment (RA)

An estimate of the likelihood of loss, interruption or disruption from known threats.

Recovery Point Objective (RPO)

The point in a process or function which must be restored to enable continuity of the business operation to be maintained, or achieved.

Recovery Time Objective (RTO)

The time scale within which a function or business unit must be restored, usually determined by means of a Business Impact Analysis.


Appendix 3 Further Information

Further information can be found on the following websites:

> www.gov.uk/government/policies/emergency-planning
Details of what the government is doing about emergency planning.

> www.thebci.org
The Business Continuity Institute (BCI): The world’s leading membership organisation for BC practitioners. The BCI’s Good Practice Guidelines are available from its website.

> www.continuitycentral.com
Continuity Central: a free source of news and information.

> www.noaa.gov
NOAA (National Oceanic & Atmospheric Administration): covers climate and weather patterns, including storm and hurricane forecasts.

> www.bankofengland.co.uk/financialstability
The Bank of England maintains monetary and financial stability of the United Kingdom.

> www.rothsteinpublishing.com/ppbc
J Burtles, Principles and Practice of Business Continuity: Tools and Techniques (ISBN 978-1-931332-39-2) Rothstein Associates Inc, 2007.

About BIFM

The British Institute of Facilities Management (BIFM) is the professional body for facilities management (FM). Founded in 1993, we promote excellence in facilities management for the benefit of practitioners, the economy and society. We support and represent over 16,000 members around the world, both individual FM professionals and organisations, and thousands more through qualifications and training. We promote and embed professional standards in facilities management. Committed to advancing the facilities management profession, we provide a suite of membership, qualifications, training and networking services designed to support facilities management practitioners in performing to the best of their ability.

BIFM Number One Building The Causeway Bishop’s Stortford Hertfordshire CM23 2ER T: +44 (0) 1279 712620 E: [email protected] www.bifm.org.uk ISBN: 978-1-909761-17-9

Price: £19.99