IT Operations Analytics: from rear view mirror to glass globe

Author: Lorin Davis

This white paper is intended for CIOs and IT managers who wish to make the complex landscape of IT more predictable and manageable with the help of IT Operations Analytics. This can be done by linking relevant IT data to business information and making both part of IT monitoring. Even organisations that already use Application Performance Management to achieve this goal can benefit from an orientation towards IT Operations Analytics. It is the next step towards control of the IT landscape. With the help of a maturity model, you can see where your organisation is now and what steps can be taken.

In 2016, the performance of IT resources is a crucial factor for virtually every organisation. The dependence on IT is so great that a failure frequently has immediate economic consequences and brings adverse publicity. Furthermore, many IT environments have become extremely complex in a relatively short time, so that system changes often result in performance problems without the immediate cause being identifiable. This leads to a heavy workload for the IT department, chronically frustrated end users, and loss of revenue because of customer dissatisfaction. Although organisations attempt to deal with these problems, in practice it remains a challenge to configure diverse systems and applications into a unified whole in which the parts work well together.

There are methods to identify choke points in the application landscape and to gain insights into the origin of problems. Application Performance Management (APM) monitors applications in real time, so that it quickly becomes clear when a problem has arisen. It is then possible to pinpoint precisely where in the chain the problem has arisen. Based on these insights, the IT department can solve problems, and the costs associated with downtime or reduced performance can be limited.

IT OPERATIONS ANALYTICS | 2

“We have taken a position that, by 2018, 25 percent of the Global 2000 will have deployed an IT Operations Analytics platform taking data feeds from a variety of performance and availability systems; that’s up from about 2 percent today.” Will Cappelli, VP & Research Analyst, Gartner

However, it remains difficult to detect issues in advance and take the necessary measures. After all, action is only taken after problems have already arisen. As a result, the costs and frustrations associated with deficient systems continue to exist, putting pressure on the relationship between business and IT.

Fortunately, there are new technological possibilities for approaching performance management in a different way. What if we could predict when an application is going to play up? Or if we could know in advance how an application will behave after a new release? And what if we could take preventive measures to avoid problems? Would we make major progress in, for example, customer satisfaction?

IT Operations Analytics (ITOA) provides answers to these questions. Various analysts, including Gartner, point to IT Operations Analytics as the most significant development in performance management for the years to come. Just imagine the advantages your organisation would have if there were ongoing work to produce more efficient and more effective support for business processes. It would make the IT organisation more versatile and adaptable. The availability of IT resources would be greatly improved, a major benefit in this era of ‘always on’. The performance of the resources would also increase: fewer malfunctions and fewer delays. Finally, your security policy would benefit if you could promptly detect and interpret the signs leading up to a security breach. In short, ITOA contributes to improved performance, agility, security and availability.

1

From APM to IT Operations Analytics

David versus Goliath

Many IT departments are reactive in their service provision. Most of their efforts are devoted to ‘keeping the train on the rails’, firefighting and meeting their obligations to the business side. The reactive approach frustrates both IT and the business side: the latter mainly sees the problems and the amount of time IT needs to resolve them. It is also frustrating to the organisation that so much time, money and energy is required to keep the environment running and to provide service, while the effect of this is not always visible. In turn, IT wrestles with the business side’s demand for more and faster innovation, and spends too much time on resolving problems and keeping IT processes on the right track. At present, IT departments seldom achieve the right balance between innovation and continuous improvement of the chain, which eventually leads to even bigger problems and bottlenecks. IT’s battle against performance problems is actually a David versus Goliath struggle.

Organisations have a need to improve the IT process and to lower the associated costs. In an ideal situation, the IT service provision would be calmer, more time would be available for optimising hardware, software and services, and IT would have strategic added value for the organisation. To achieve control, it is essential to have a balance, based on quality and business aims, between operation and innovation.

Measuring & monitoring

One of the strategies deployed by IT in this battle is Application Performance Management (APM). APM provides an overview of the performance and availability of software applications, and ensures that IT can respond in order to minimise disruptions. In this way, a major challenge is dealt with: the detection and location of bottlenecks. If applications for an organisation’s staff constantly involve excessive waiting time, what is the precise cause of that? If the website is so slow that it affects the behaviour of visitors, what modifications should then be made? These are key questions in APM.

Configuring APM tooling correctly yields valuable insights, derived by means of various monitoring processes. Combined, these monitoring processes yield powerful business information. By no means all organisations use this information to implement preventive management. That is a shame because, with all the data that is available, preventive IT management is possible. After all, there are even more data sources that give valuable information about the performance of an IT landscape and facilitate forecasts on how it will function. By combining information from multiple internal and external data sources, advance detection and remediation of bottlenecks becomes a realistic proposition. Not only IT but also the business side profits directly from such a situation: the IT chain performs better, in less time, which creates the opportunity for strategic cooperation between the IT organisation and the business.

IT Operations Analytics: Bringing together multiple sources of data to enable data-driven IT Operations, and delivering consistent, high-quality results for maximum digital performance, availability, security and agility.

The next step: ITOA

The key to successful IT management lies in prevention, making performance predictable, and the detection of possible problems before they grow to become an actual risk. That is ITOA: the combination of existing best practices with new, relevant data to produce information with which your IT becomes predictable. It is therefore time to rethink the monitoring processes and, based on APM, take the next step: what data sources would you like to combine to gain more insight into your IT? And what insight do you need in order to give optimal support to business aims?

Insight into IT performance and consequences for the business

ITOA revolves around having a 360° view of the entire environment. The needs of the business and the effect of various IT measures on business processes are central to this. This makes “What does a half-second delay in the system signify for the business?” an important question. What does it mean to a help desk employee to always have to wait several seconds when searching in the CRM system? Does this frustration promote shadow IT and threats to security? And what effect does IT performance have on the conversion and hence the turnover of a web shop?

That is what ITOA is about: not just a graph or dashboard that tells you whether applications are performing properly, but an ongoing dialogue with the business. A dialogue that helps to determine which IT components must take priority, and what the critical criteria are for this. To answer these questions, it is important to collect data from various sources at a central location and to combine it. A ‘single point of truth’ is essential to obtaining and sharing insights. Precisely which data must be collected depends on the nature of the organisation and its dependence on IT. For an energy company, that is different than for a financial service provider, and for an online retailer, for example, webcare data can be very interesting.

Data for the asking

Various departments in organisations have already begun to work in a more data-driven way. They often do this in isolation, however: separate silos each use their own data to operate more effectively, and the data collected by the various disciplines is not combined. The IT department uses APM tools to collect all kinds of information, but its interpretation remains internally oriented. The strength of ITOA is that it combines this information with data from, for example, marketing, the service desk or the purchasing department. In an organisation for which IT is a strategic resource, the important thing in the end is integration with other business processes. Some examples:

• If a manufacturer knows that product X will sell much better in a certain season, the purchasing process is adapted to take this into account. But why should IT not also be prepared for this? After all, the website, customer portals and other systems also have to deal with a higher workload. Predicting a seasonal rush can be helpful in preventing IT bottlenecks. Integrating the data from a logistical department with that of IT makes this possible.

• A marketing department has information on the conversion level of a specific page of a web shop. With the addition of IT data, it is possible to examine the exact activity of visitors to that page, and when they quit. What happens if you link that page’s performance to the conversion? Can IT contribute to increased turnover, and what is the impact of planned maintenance?

• How many calls does a call centre worker deal with per day? And how much time does the worker spend waiting for the application? It has been established that a major telecommunications provider can save thousands of euros on call centre costs by reducing the waiting time for each search in the CRM application by a single second. The integration between the IT process and the service desk’s business process thereby produces immediate monetary reward.

• Many communications or support departments have a webcare team which gives support via social media. The reports recorded by this team can offer the IT department relevant insights into the consequences of certain changes or new configurations. If it is properly organised, webcare information can even make predictive management possible.

• For a local authority, it can be very useful to set up a control room where all monitoring information is gathered at the same place. Various disciplines are represented in this control room, from the service desk to administration, and from analysts to process managers. They see the same information, and thus speak the same language. Can the public be served better if the application performs faster?
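
The web shop example above, linking a page’s performance to its conversion, can be illustrated with a small calculation. What follows is a minimal sketch, not a description of any particular ITOA product. It assumes that hourly samples of page response time (from APM) and conversion rate (from marketing) have already been joined on a common timestamp. The function name `pearson` and all sample values are hypothetical and serve only as illustration.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient between two equally long series."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two series of equal length (at least 2 points)")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)


# Hypothetical hourly samples for one web shop page (illustrative values only).
response_time_s = [0.8, 1.0, 1.5, 2.1, 2.8, 3.5]  # IT data, from APM
conversion_pct = [3.1, 3.0, 2.6, 2.2, 1.7, 1.2]   # business data, from marketing

r = pearson(response_time_s, conversion_pct)
print(f"correlation between response time and conversion: {r:.2f}")
```

A strongly negative coefficient would suggest that slower pages go together with lower conversion. It does not prove that the one causes the other, so in practice such a figure is a starting point for the dialogue between IT and the business rather than a conclusion.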

2

The ‘magic 4’ for ITOA

IT Operations Analytics is about valuable IT insights combined with relevant business data. By getting this combination right, the IT department can achieve better and more consistent performance, make the organisation more agile, achieve ‘always on’ availability, make cost-efficient IT investments, and strengthen the organisation in terms of IT security.

1. Availability

In the current economy, ‘always on’ is an absolute requirement; downtime often has serious consequences. It is therefore very important to obtain insight into the risks that downtime can bring, and to be able to take preventive measures against them. The IT train has to keep going, even when maintenance is being done, modifications are made, or other variables change. This is no easy task, but IT Operations Analytics makes it possible, because the effect of each modification to the environment is charted in advance.

2. Performance

Always on is not the only factor, however; the train also has to go faster, be able to increase its capacity, and look further ahead to prevent incidents. Predictive analyses can help prevent performance drops. The optimal IT environment achieves consistently high performance without deep troughs. By combining APM information with other relevant sources, performance is made measurable in detail and even predictable. More insight also means more clarity about the performance of individual components of the IT chain. This strengthens the IT department’s negotiating position with its suppliers. The blame game, in which suppliers pass the buck to other suppliers, is then a thing of the past.

3. Agility

If IT is stable and can guarantee a good level of performance, more breathing space is gained in order to innovate and thereby respond to new developments in IT and business. This is a self-reinforcing effect: the IT department constantly improves its knowledge of how technical components operate and are integrated into the total environment, which means innovations can be introduced more effectively. In short, IT is able to make the train faster even while it is moving, increase capacity and add new components so that it can move together with the business.

4. Security

By monitoring the behaviour of end users at application and workstation level, and recording all changes in the IT landscape, we contribute to the security of the IT environment. When incidents occur, it is possible to establish what went wrong, what changes preceded them, and which resources were used for these. The system recognises patterns, so it can indicate in subsequent situations which actions or changes are risky. By combining different data sources, a new, more complete view is created:
what end-user behaviour preceded the security breach? Were there any abnormalities in the website visit? Can we use the log files to foresee whether an outsider is trying to hack in? Asking questions of this type and linking the outcomes together enables us to set up a preventive security policy. In that way, security issues can not only be prevented but also promptly detected and remedied if, nevertheless, something or someone breaches the digital walls. Furthermore, by logging all relevant processes, the IT department can show what method was followed, thereby meeting the burden of proof for external audits. Given the stricter legislation and regulations of the EU and the Dutch government, this is a priority for many IT organisations.

Linking IT and business

The aforesaid result areas are especially relevant for the IT manager. The business, however, is looking for other results. What benefit does ITOA have for the organisation?

• Turnover and conversion: ITOA repays the investment. When systems are continuously available and perform well, the people working with them become more productive. In the case of customers dealing with an online system, lack of availability leads automatically to loss of revenue, whereas better performance can lead to more conversion and hence more revenue.

• Customer satisfaction and image: ITOA results in higher customer satisfaction for the business, thereby producing a good image. Not for nothing is it said that trust is hard to gain and easily lost. When customers and stakeholders can work with fast, secure and reliable IT systems, this has a positive effect on customer satisfaction (higher NPS) and prevents harm to the image. An organisation which can consistently meet customer and stakeholder demand, achieve agile IT, and keep its security under control, avoids damage to its reputation.

• Continuity: IT must never affect the continuity of business processes. What is more, as IT increasingly plays a central role in the business process, downtime is more damaging than ever. The business calls for unbroken continuity, regardless of the effort required. IT Operations Analytics is an instrument that can help meet this business requirement.

• Quality: Even when ‘always on’ is achieved, business processes can be disrupted by IT systems that do not function properly. It is possible for minor disruptions of short duration to have the same effect as complete failure. In IT Operations Analytics, it is particularly important to gain an insight into the effect of such problems on the bottom line. What do customers notice of it, and when do they reach the end of this chain? By continuously working on this, IT and the business jointly improve the quality of service provision.

• Efficiency: An organisation can cut costs by automating processes, shortening resolution times and reducing the number of incidents. Preventive management allows the IT department to spend less energy on firefighting and enables it to give optimal support to business processes with IT applications.

• Supplier management: In today’s complex IT landscape, multiple suppliers, service providers and collaborating partners are usually involved. Due to outsourcing and out-tasking, the organisation loses specialist expertise and in-depth knowledge of systems, while the strategic responsibility remains with the IT department. This does not make the dialogue with suppliers and service providers any easier. IT Operations Analytics is a valuable instrument for making matters measurable and engaging in dialogue with suppliers.

5

The 3 Ps

The form of the internal ITOA strategy will vary from organisation to organisation. After all, not all business processes depend on technology; conversely, there are some processes that cannot function without IT. When implementing ITOA, however, three elements must always be taken into account: Products, People and Processes. Each of these elements has to be aligned with the ITOA strategy if success is to be achieved.

Products

In terms of technology, it is essential to use APM systems to monitor multiple parameters and collect the output centrally in an overarching system which stores and combines the data from individual processes. The software in that system must have a certain degree of intelligence and incorporate machine learning capabilities, through which patterns and correlations become visible automatically. The system analyses not only one’s own systems and business processes, but also integration with systems of partners in the chain.

Depending on the organisation and its dependence on IT, additional monitoring processes can be incorporated into the system. Imagine, for example, a link with social media and the activity displayed there, a link with a website such as www.allestoringen.nl, or relevant business information, such as a forecast of peak seasonal business. In fact, all relevant data sources can be added, provided this information is suitable for combination with other (IT) data sources. Various products are available that use correlations to become predictive.

Processes

The monitoring processes must be embedded in the organisation. The APM processes can be calibrated by IT, but the business side has to be involved to define other relevant data sources. What information does the business side need to optimise results? And what insights does IT need in order to optimise the configuration of its systems?

Once the monitoring processes have been configured and linked to business processes, there is yet another step: the definition of dependencies between separate parameters. For this, you need practical experience and a system which makes connections. After all, the alarm bells need to sound not only when values exceed a critical limit, but also when certain sequential events take place or other specific situations occur. In short, when can it be said that there is a connection or interdependency?
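
The difference between an alarm on a critical limit and an alarm on sequential events can be shown in a few lines. The sketch below is illustrative only and assumes that monitoring data arrives as simple timestamped events. The `Event` type, the event names, the limit of 2.0 seconds and the 300-second window are all hypothetical and are not taken from any specific tool.

```python
from dataclasses import dataclass


@dataclass
class Event:
    timestamp: float   # seconds since the start of observation
    name: str          # hypothetical event name, e.g. "release_deployed"
    value: float = 0.0


def threshold_breached(event, limits):
    """True if the event's value exceeds the critical limit set for its name."""
    limit = limits.get(event.name)
    return limit is not None and event.value > limit


def sequence_occurred(events, pattern, window_s):
    """True if the names in `pattern` occur in that order within `window_s` seconds."""
    if not pattern:
        return False
    ordered = sorted(events, key=lambda e: e.timestamp)
    for i, first in enumerate(ordered):
        if first.name != pattern[0]:
            continue
        matched = 1
        for later in ordered[i + 1:]:
            if later.timestamp - first.timestamp > window_s:
                break  # everything after this point is outside the window
            if matched < len(pattern) and later.name == pattern[matched]:
                matched += 1
        if matched == len(pattern):
            return True
    return False


# Hypothetical data: no single value is critical, but the order of events is telling.
limits = {"response_time_s": 2.0}
events = [
    Event(0.0, "release_deployed"),
    Event(60.0, "response_time_s", 1.4),
    Event(120.0, "error_rate_rising"),
    Event(200.0, "queue_growing"),
]
pattern = ["release_deployed", "error_rate_rising", "queue_growing"]

threshold_alarm = any(threshold_breached(e, limits) for e in events)
sequence_alarm = sequence_occurred(events, pattern, window_s=300)
print("threshold alarm:", threshold_alarm)
print("sequence alarm:", sequence_alarm)
```

In this made-up example the response time stays below its limit, so a purely threshold-based monitor remains silent, while the sequence rule raises an alarm. Which sequences are meaningful is exactly the kind of dependency that IT and the business have to define together.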

Finally, it is important to set up a process in which the insights are delivered to both the business side and to IT: a single point of truth. This ensures that they both speak the same language, and agreements can be made on the actions to be taken. Only then does the added value of ITOA become visible.

People

The success of an ITOA strategy will always remain dependent on people. Both IT and the business have to invest time and resources in advance. In this respect, it is important to set up a balanced team. Specialists are needed who can administer the tooling and make links with other relevant data sources. They must be assisted by people who know the business and recognise the importance of IT, and who can embed the processes in the organisation. The team forms the bridge between the business and IT, and bears the responsibility for translating IT processes into primary business processes. The relationship between system failures and the behaviour of customers or end users is, after all, valuable information that must be acted on.

6

The ITOA-maturity model

In most organisations, the first steps towards IT Operations Analytics have already been taken. This means ITOA is not a replacement approach that changes the current monitoring and management processes at a stroke. ITOA is an addition to what you have already set up: it is the next step in IT monitoring. Almost every organisation is already engaged in ITOA to a greater or lesser degree, without being aware of it. Even an IT department which takes a very reactive approach to its work applies certain resources or has established certain processes which are an initial step on the road to ITOA. Organisations that use APM are already further along that road. A maturity model provides more clarity.

Reactive: Users unexpectedly report incidents. This causes unpredictable costs and downtime.

Aware: Application monitoring provides continuous insight into the performance and availability of the complete IT environment.

Proactive: Additional APM tools, test automation and processes are implemented to achieve a stable user experience.

Preventive: By combining APM and business KPIs, it is possible to measure the business impact of IT and support operations cost-efficiently.

Empowered: By bringing together multiple data sources, new, innovative insights are created that steer business objectives.

Reactive

Many organisations find themselves in the reactive phase and are mainly effective in firefighting. Over the years, under the influence of takeovers, reorganisations and other organisational changes, a highly complex IT landscape has arisen, containing separate silos. Various departments are each responsible for a small piece of the landscape. A central service organisation records whether applications and systems are meeting expectations and ensures that messages are converted to incidents and are remedied. This works until bigger, unforeseen issues arise. The service organisation does not see them coming, which leads to unpredictable situations. The monitoring it applies is frequently limited to hardware and infrastructure. It is also difficult for IT to innovate due to a lack of insight and high work pressure from malfunctions, and end users suffer ‘message fatigue’ because problems are not resolved, or are resolved too late. As a result of all these factors, a reactive IT department costs more than it produces.

Aware

The IT department in this phase approaches the IT landscape as a chain. There is awareness of the effect all kinds of factors have on the landscape’s performance. For example, it may be that all the individual technical domains function well, while the performance experienced by the end user does not match. The aware IT organisation makes a point of involving the end user in its analyses and views the functioning of applications and systems from a cross-domain perspective. As a result, there is more insight into performance and the effect on end users. IT Service Management is made possible, and the foundation is laid for improvement programmes.

When messages arrive from end users, they come as no surprise to the IT organisation because it has already been alerted by the monitoring processes. In this way, IT can detect and resolve issues faster, and is better able to perform capacity management. The IT department reports, agrees on KPIs and creates a single point of truth about the functioning of the system.

Despite this, problem resolution still often does not proceed fast enough, and the organisation pays the penalty in terms of productivity. The IT organisation sees problems coming but is not adequately equipped to take proactive measures to counter them. Quick fixes are frequently opted for, rather than structural solutions. IT is therefore consciously incompetent in this phase: it becomes clear where improvement is possible.

Proactive

The proactive IT department does take proactive measures based on monitoring processes, and thus resolves issues faster. The user experience of customers and end users is improved by this. Resolution times are shortened, downtime is reduced, and incidents are sometimes prevented. The department has more insight into the total IT environment and the impact of situations on the customer and the business.

As the old saying goes, prevention is better than cure. A proactive organisation ensures that maintenance is done at the right times, and that incidents are discovered at an early stage. In that way, end users are spared nasty surprises. An important aspect of proactive intervention is having a central place where the monitoring processes are brought together: a control room. Based on the insights arriving here, conclusions are drawn from a variety of perspectives.

The proactive IT department is good at testing, and uses specific APM tools to ensure that incidents are resolved energetically. Priorities are drawn up and issues are resolved. Patterns and correlations among incidents become apparent, so that incidents are more frequently detected in advance and possibly even prevented. All IT systems in the organisation are monitored with APM.

Preventive

In this phase of maturity, IT works constantly towards more efficient and effective support for business processes and, for that purpose, is constantly in dialogue with the business. The IT organisation operates preventively to avoid downtime and performance drops. Effectiveness and efficiency are priorities. The IT department has a lot of insight into the operation of the systems and the effect of routine and less routine factors on them.

The ‘preventive’ IT organisation also increasingly shares the obtained insights beyond the department’s own walls. The valuable insights are used by the entire organisation. IT communicates with HR, Marketing and Operations so they can jointly decide which APM and ITOA insights are of value. Together, the business and IT analyse trends to estimate their value. This is because it is not only useful for IT to have business information available, but also vice versa. An example: if a customer stops responding halfway through an ordering process, can a sales assistant step in and approach the customer so the deal can still be closed? The customer remains satisfied, and revenue is generated.

The applied preventive management is linked to business goals. In this way, IT makes a direct contribution to the primary business process, and priorities are even more strongly based on the situation on the shop floor.

Empowered

The ‘empowered’ IT organisation has taken the final step in the ITOA maturity model. With the addition of even more data sources, 360° insight becomes a reality. The added value and impact of APM and ITOA already extend into the entire organisation in the preventive phase, and in this phase new internal and external data sources are added to that. As a result, the empowered IT organisation is able to respond better to trends and developments in the market, and also to intervene preventively based on historical data and recurring patterns in data. Events, monitoring and other IT triggers are collected, and all the relevant information is automatically combined with big data technology.

Based on context and business goals, IT ensures that ITOA is used to support the business with more knowledge, quality and speed. IT translates the insights into something the business can use. Issues are prevented or quickly resolved, but IT also gives advice on whether more or less manpower and support are needed, and whether new systems are required or not. Requests for advice on these subjects can be answered quickly.

Eventually, the empowered IT department can deploy machine learning in the tooling. Self-learning systems are then occupied day and night in analysing dozens of data sources to discover relevant connections and patterns, and to give correlations a place in the monitoring processes. The system recognises exceptional situations in the normal run of affairs without a monitoring process having been actively applied to them. It draws the IT department’s attention to issues that were not yet identified as possibilities.
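
How a system can flag an exceptional situation without a fixed limit having been configured can be illustrated with a very simple statistical check. The sketch below is a deliberately basic stand-in for the self-learning tooling described above, not a machine learning implementation. It compares each new measurement with the mean and standard deviation of the measurements just before it. The function name `find_anomalies`, the window of five samples, the limit of three standard deviations and the response times are all hypothetical.

```python
from statistics import mean, stdev


def find_anomalies(values, window=5, z_limit=3.0):
    """Return the indices of values that deviate strongly from the preceding window."""
    if window < 2:
        raise ValueError("window must contain at least 2 values")
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu = mean(history)
        sigma = stdev(history)
        deviation = abs(values[i] - mu)
        if sigma == 0:
            # A perfectly flat history: any change at all is exceptional.
            is_anomaly = deviation > 0
        else:
            is_anomaly = deviation / sigma > z_limit
        if is_anomaly:
            anomalies.append(i)
    return anomalies


# Hypothetical response times in milliseconds; no explicit limit is configured.
response_ms = [120, 118, 125, 122, 119, 121, 124, 410, 123, 120]

anomalies = find_anomalies(response_ms)
print("exceptional measurements at positions:", anomalies)
```

The baseline here is learned from the data itself, which is the point of the illustration. A real deployment would need something more robust, for example because a single outlier also distorts the window used to judge the measurements that follow it.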

7

Conclusion: the future of IT management

IT Operations Analytics is the future of IT management, because IT is becoming ever more complex and existing resources are no longer adequate. In many organisations, furthermore, the role of IT is changing from a facilitating to a directing one. The organisation that deploys IT as a strategic asset for its business processes will reap the benefits of this. This means that it will have to streamline its monitoring processes in the coming years and supplement them with relevant additional data sources. System downtime can already be an enormous setback today; in five or ten years’ time it might well directly cause bankruptcy. Bearing that in mind, IT Operations Analytics is not a point on the horizon but an immediate priority.

Configuring ITOA is therefore an ongoing process. Existing monitoring processes must be optimised further, and IT has to be continuously engaged in adding extra data sources. In the meantime, the systems are becoming smarter and smarter, developing a high level of intelligence. The earlier an IT department can take the first steps in the ITOA process, the better it will be prepared for future technological developments that are going to change business processes even more. Preventive IT management is still an ideal concept for many today, but in the years to come it will be the norm.
