Journal of Theoretical and Applied Computer Science Vol. 6, No. 1, 2012

METHOD FOR DETECTING SOFTWARE ANOMALIES BASED ON RECURRENCE PLOT ANALYSIS
    Michał Mosdorf . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 3

GRAPHICS EDITORS IN CPDEV ENVIRONMENT
    Marcin Jamro . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 13

ARCHITECTURAL VIEW MODEL FOR AN INTEGRATION PLATFORM
    Tomasz Górski . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25

QUALITYSPY: A FRAMEWORK FOR MONITORING SOFTWARE DEVELOPMENT PROCESSES
    Marian Jureczko, Jan Magott . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35

CONTROLLABILITY AND OBSERVABILITY GRAMIANS PARALLEL COMPUTATION USING GPU
    Damian Raczyński, Włodzimierz Stanisławski . . . . . . . . . . . . . . . . . . . . . . 47

NEW APPROACH TO THE DECISION ANALYSIS IN CONDITIONS OF UNCERTAINTY – INFO-GAP THEORY
    Andrzej Piegat, Karina Tomaszewska . . . . . . . . . . . . . . . . . . . . . . . . . . 67

RUNTIME SOFTWARE ADAPTATION: APPROACHES AND A PROGRAMMING TOOL
    Jarosław Rudy . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 75

Journal of Theoretical and Applied Computer Science
Scientific quarterly of the Polish Academy of Sciences, The Gdańsk Branch, Computer Science Commission

Scientific advisory board:
Chairman: prof. dr hab. inż. Henryk Krawczyk, Corresponding Member of Polish Academy of Sciences, Gdansk University of Technology, Poland
Members:
Prof. dr hab. inż. Michał Białko, Member of Polish Academy of Sciences, Koszalin University of Technology, Poland
Prof. dr hab. inż. Ludosław Drelichowski, University of Technology and Life Sciences in Bydgoszcz, Poland
Prof. Gisella Facchinetti, Università del Salento, Italy
Prof. Constantin Gaindric, Corresponding Member of Academy of Sciences of Moldova, Institute of Mathematics and Computer Science, Republic of Moldova
Prof. dr hab. inż. Janusz Kacprzyk, Member of Polish Academy of Sciences, Systems Research Institute, Polish Academy of Sciences, Poland
Prof. dr hab. Jan Madey, University of Warsaw, Poland
Prof. Elisabeth Rakus-Andersson, Blekinge Institute of Technology, Karlskrona, Sweden
Prof. dr hab. inż. Leszek Rutkowski, Corresponding Member of Polish Academy of Sciences, Czestochowa University of Technology, Poland
Prof. dr hab. inż. Piotr Sienkiewicz, National Defence University, Poland
Prof. dr inż. Jerzy Sołdek, The West Pomeranian Business School, Poland
Prof. Stergios Stergiopoulos, University of Toronto, Canada
Prof. dr hab. inż. Andrzej Straszak, Systems Research Institute, Polish Academy of Sciences, Poland
Prof. dr hab. Maciej M. Sysło, University of Wrocław, Poland

Editorial board: Editor-in-chief: Antoni Wiliński, West Pomeranian University of Technology, Szczecin, Poland Deputy editor-in-chief: Dariusz Frejlichowski, West Pomeranian University of Technology, Szczecin, Poland Managing editor: Piotr Czapiewski, West Pomeranian University of Technology, Szczecin, Poland

Section editors:
Michaela Chocholata, University of Economics in Bratislava, Slovakia
Piotr Dziurzański, West Pomeranian University of Technology, Szczecin, Poland
Paweł Forczmański, West Pomeranian University of Technology, Szczecin, Poland
Przemysław Klęsk, West Pomeranian University of Technology, Szczecin, Poland
Radosław Mantiuk, West Pomeranian University of Technology, Szczecin, Poland
Jerzy Pejaś, West Pomeranian University of Technology, Szczecin, Poland
Izabela Rejer, West Pomeranian University of Technology, Szczecin, Poland

ISSN 2299-2634

The on-line edition of JTACS can be found at: http://www.jtacs.org. The printed edition is to be considered the primary one.

Publisher: Polish Academy of Sciences, The Gdańsk Branch, Computer Science Commission
Address: Waryńskiego 17, 71-310 Szczecin, Poland
http://www.jtacs.org, email: [email protected]

Journal of Theoretical and Applied Computer Science ISSN 2299-2634

Vol. 6, No. 1, 2012, pp. 3-12 http://www.jtacs.org

Method for detecting software anomalies based on recurrence plot analysis

Michał Mosdorf
Warsaw University of Technology, Institute of Computer Science, Poland
[email protected]

Abstract:

This paper evaluates a method for detecting software anomalies based on recurrence plot analysis of the trace log generated during software execution. The described method is based on windowed recurrence quantification analysis of selected measures (e.g. recurrence rate, RR, or determinism, DET). Initial results show that the proposed method is useful in detecting silent software anomalies that do not result in typical crashes (e.g. exceptions).

Keywords: anomaly detection, fault injection, recurrence plot, software dependability

1. Introduction

Detection of software anomalies caused by various failures is an important part of software dependability methods. It allows us to decide to undertake corrective actions when failures are detected. The literature mentions different methods for detecting software failures. Some of the methods for detecting software anomalies are based on building very accurate and often formal assertions that check program invariants. Those methods usually impose program execution overhead and the necessity to optimize the obtained assertion set. The authors of [1][2] present a technique for detecting software faults based on dynamic derivation of detectors that check discovered invariants. Those methods are based on analysis of the Dynamic Dependence Graph (DDG) [3][4], which represents dependencies of values observed during program execution. Another example of a similar approach is the DAIKON tool [5], which can be used for generating assertion sets. A different approach to failure detection is presented in [6][7][8], where the authors describe two techniques: EDDI (Error Detection by Duplicated Instructions) and CFC (Control Flow Checking). The EDDI technique is based on the idea of duplicating software instructions and inserting additional instructions that compare the results obtained from the original and the redundant instructions. CFC realizes control flow checking by generating run-time signatures for control changes that are verified by different program blocks. This paper evaluates an alternative method for detecting software execution anomalies, based on recurrence plot analysis. The presented method focuses on detecting anomalies in the dynamics of data flow rather than checking value invariants in the examined software. This approach was proposed for the purpose of detecting anomalies caused by software errors that do not result in typical application crashes, which are relatively easy to detect and compensate.
The proposed approach aims at detecting software errors that result in a change of the overall dynamical properties of the software data flow, characterized by a few recurrence plot quantification measures.


This paper is organized as follows. Section two gives a short overview of recurrence plot analysis and the Recurrence Quantification Analysis measures used. Section three describes the proposed method for software anomaly detection. Section four describes the architecture of the artificial software model used for the evaluation of the proposed method. Section five describes the software anomalies introduced in the software model and the events caused by those anomalies. Then the paper describes the proposed methodology and presents the obtained results. The paper ends with conclusions and future plans.

2. Recurrence Plot analysis

The recurrence plot is a technique for nonlinear data analysis that allows us to investigate the recurrent behavior of an m-dimensional phase space trajectory through a two-dimensional representation. The study of recurrences was initiated by Poincaré in 1890 [9]. Calculation of the recurrence plot starts with the reconstruction of the phase space of the dynamical system. For this purpose the time delay method can be applied, with the autocorrelation function used to calculate the time delay τ [10]. In the next step the Grassberger-Procaccia method can be applied to calculate the dimension required for the attractor reconstruction. The recurrence plot that visualizes recurrences is described by the matrix:

    R_{i,j} = \Theta(\varepsilon - \|x_i - x_j\|), \quad i, j = 1, \ldots, N,    (1)

where: N is the number of considered states x_i in the m-dimensional space, ε is the threshold distance, || · || a norm and Θ( · ) the Heaviside function.

The proposed method for detecting software anomalies is based on windowed Recurrence Quantification Analysis (RQA) for selected measures (e.g. recurrence rate, RR, or determinism, DET). Anomalies are reported based on changes of the selected RQA measures. The results of this research focus mainly on two parameters, RR and DET, which are obtained with the following equations [10]:

    RR = \frac{1}{N^2} \sum_{i,j=1}^{N} R_{i,j},    (2)

    DET = \frac{\sum_{l=l_{min}}^{N} l \, P(l)}{\sum_{i,j=1}^{N} R_{i,j}},    (3)

where: N is the number of points on the phase space trajectory, P(l) is the histogram of the lengths l of the diagonal lines, l_{min} is the minimal diagonal line length and ε the neighborhood size. RR measures the density of recurrence points in the recurrence plot. DET shows the ratio of recurrence points that form diagonal lines to all recurrence points.
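As a sketch of how these quantities can be computed, the following Python fragment builds the recurrence matrix of equation (1) and derives RR and DET from it. The function names and the decision to include the main diagonal in DET are illustrative choices, not part of the paper's implementation:

```python
import numpy as np

def recurrence_matrix(x, eps):
    """R[i, j] = 1 when ||x_i - x_j|| <= eps, i.e. Theta(eps - distance)."""
    x = np.asarray(x, dtype=float)
    if x.ndim == 1:                              # scalar time series
        d = np.abs(x[:, None] - x[None, :])
    else:                                        # embedded state vectors
        d = np.linalg.norm(x[:, None, :] - x[None, :, :], axis=-1)
    return (d <= eps).astype(int)

def rr(R):
    """Recurrence rate: density of recurrence points in the plot."""
    n = R.shape[0]
    return R.sum() / n ** 2

def det(R, l_min=2):
    """Determinism: share of recurrence points on diagonal lines of length
    >= l_min (the main diagonal is included here for simplicity; RQA tools
    usually exclude it)."""
    n = R.shape[0]
    diagonal_points = 0
    for k in range(-(n - 1), n):                 # scan every diagonal
        run = 0
        for v in list(np.diagonal(R, k)) + [0]:  # trailing 0 flushes last run
            if v:
                run += 1
            else:
                if run >= l_min:
                    diagonal_points += run
                run = 0
    return diagonal_points / R.sum() if R.sum() else 0.0
```

For a perfectly periodic series every recurrence point lies on a diagonal line, so DET equals 1, while noisy series produce isolated points and lower DET.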

3. Software anomaly detection method

The discussed method is based on the idea of comparing the results of windowed RQA analysis of trace data generated from a program execution without anomalies and from a program execution that may be influenced by anomalies. Figure 1 shows the steps of the proposed method. In the first step the examined software must be executed without anomalies to gather an undisturbed execution trace. The trace log contains a series of integer values that represent different transitions in the program state (e.g. function calls). In the next step the obtained execution trace is analyzed with the autocorrelation function and the Grassberger-Procaccia method to determine the delay and dimension required for the attractor reconstruction. With those quantities a windowed RQA analysis of the obtained trace log is performed. The time series obtained from this analysis describes the dynamical properties of the undisturbed software execution and is used as the comparison pattern for software anomaly detection.

Figure 1. Algorithm of proposed anomaly detection method

During the anomaly detection process the obtained RQA data is compared with the RQA data generated from the original software execution. At the current stage of development this comparison is performed offline, after the completion of software execution. This assumption was made to simplify the evaluation of the proposed approach. Future work will focus on developing a method allowing for real-time software anomaly detection and classification of different dynamical states of software.
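The comparison step can be illustrated with a minimal sketch: compute a windowed RR series over both traces and flag windows whose RR leaves the range seen in the undisturbed run. The windowing scheme and the decision rule below are assumptions made for illustration; the paper performs the comparison offline and does not fix a specific threshold rule:

```python
import numpy as np

def windowed_rr(trace, window, eps):
    """Compute the recurrence rate (RR) in consecutive windows of a trace."""
    rates = []
    for start in range(0, len(trace) - window + 1, window):
        w = np.asarray(trace[start:start + window], dtype=float)
        d = np.abs(w[:, None] - w[None, :])      # pairwise distances in window
        rates.append(float((d <= eps).sum()) / window ** 2)
    return rates

def detect_anomalies(baseline_rr, test_rr, margin=0.05):
    """Flag windows whose RR falls outside the range observed in the
    undisturbed run, widened by a small margin (an illustrative rule)."""
    lo, hi = min(baseline_rr) - margin, max(baseline_rr) + margin
    return [i for i, r in enumerate(test_rr) if not lo <= r <= hi]
```

A stalled receiver shows up clearly under this rule: a window in which the trace degenerates to a single repeated event code has RR close to 1, far above the baseline range.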

4. Architecture of the tested software

The proposed approach was verified in experiments performed on an artificial software model that simulated message flow between separate threads. The aim of this model was to simulate typical data flow between different modules of, e.g., real-time software divided into separate application threads, as can be found in typical software based on operating systems like FreeRTOS or RTEMS. The architecture of the tested software is shown in Figure 2.

Figure 2. Architecture of tested software


The prepared software consists of one sender thread that generates message events with a Poisson distribution. Each message contained a randomly generated destination designator and an additional number describing the amount of time required to process it by the receiver thread. This number was also generated randomly with a Poisson distribution. Each message was inserted into the first queue, which connected the sender thread with the router thread responsible for routing received messages to the correct destination queue according to the destination designator. In the presented model there were 6 different receiver threads grouped into 3 groups. Each thread group was responsible for receiving messages from a given group queue. For the purpose of creating the execution trace, selected program points were equipped with log generation procedures. For the whole program 13 points were selected, which represented message generation, send and receive events in different threads. Execution of each selected point resulted in the generation of a log entry containing a single integer number in the range of 1 to 13.
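A toy version of this trace generator can be sketched as follows. The mapping of event codes to program points is hypothetical (the paper only states that there are 13 logged points; this sketch uses 10), as is the simplified routing logic:

```python
import random

def generate_trace(n_messages=1000, seed=42):
    """Toy sender -> router -> receiver model emitting an integer event log.

    Hypothetical code mapping: 1 = message generated by the sender,
    2..4 = routed to group queue A/B/C, 5..10 = handled by one of two
    receiver threads per group.
    """
    rng = random.Random(seed)
    trace = []
    for _ in range(n_messages):
        trace.append(1)                          # sender generates a message
        group = rng.randrange(3)                 # router picks destination group
        trace.append(2 + group)                  # router enqueues to group queue
        receiver = rng.randrange(2)              # one of 2 receivers per group
        trace.append(5 + 2 * group + receiver)   # receiver handles the message
    return trace
```

Fixing the seed makes the generated trace reproducible, which is convenient when comparing an undisturbed run against runs with injected anomalies.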

5. Simulated anomalies

During the experiment 6 different execution traces were collected: one for the proper execution and 5 for different simulated software anomalies. Anomalies were introduced artificially and concerned the amount of time required to handle a message at the destination thread or the status of the thread (enabled or disabled; by default all threads were enabled). Each trace was collected for 3 minutes and contained about 14k reported events. The list below provides more details about the collected trace logs:

• Execution without anomalies
1. A1 thread requires 2 times more time to handle messages
2. A1 thread requires 4 times more time to handle messages
3. A1 is not working
4. A1 and B1 require 2 times more time to handle messages
5. A1 and B1 are not working

For all the experiments the assumption was made that if the router thread was not able to insert a message into the receiver queue (the queue was full), the message was lost. There was no particular trace entry for such an event. Figure 3 shows an example of the time series gathered for the execution without anomalies.

Figure 3. Example of a time series from an execution trace without anomalies


It is important to notice that the software test model was tuned in such a way that without anomalies the program worked in a stable way. The amount of messages in all queues was maintained at a low level and none of the messages were lost. Due to the introduced anomalies, special events caused by the anomalies were observed. The list below gives a short description of those events for anomalies 1 to 5:

(1) Queue A full at 1 minute 50 s
(2) Queue A full at 1 minute 10 s
(3) Queue A full at 1 minute
(4) Queue A full at 1 minute 40 s and queue B full at 2 minutes 40 s
(5) Queue A full at 55 s and queue B full at 1 minute

For an initial examination of the trace logs obtained from the different executions, all reported program points were counted. The results are presented in Figure 4. As is visible, this initial inspection does not show much difference between the gathered trace logs. Such an inspection can only show differences in the number of registered points associated with given thread operations. The total number of calls for thread A1 decreases for anomalies 1, 2 and 3, which is caused by the introduced anomalies that increase the time required to handle a message received from the queue (log number 6).

Figure 4. Number of occurrences of different program points in the analyzed execution trace logs

6. Analysis of obtained results

In the first stage the execution trace without anomalies was analyzed with the autocorrelation function and the Grassberger-Procaccia method to determine the delay and dimension required for the attractor reconstruction. The value of ε (required for recurrence plot calculation) was also selected based on the execution trace without errors. In the next step, for each of the execution traces with anomalies, many recurrence plots were created with a window size of 300 samples. For each of the resulting recurrence plots the selected RQA measures were calculated. Figure 5 shows an example of a recurrence plot calculated for the selected window size for the trace log collected from the execution without anomalies.


Figure 5. Example of a recurrence plot calculated for a window size of 300 samples of the trace log collected from the execution without anomalies

Figure 6 shows an example of a recurrence plot calculated for the trace log collected from the execution with anomaly 5. It is visible that the two presented recurrence plots differ in the number and structure of recurrence points.

Figure 6. Example of a recurrence plot calculated for a window size of 300 samples of the trace log collected from the execution with anomaly 5

Figures 7 and 8 show the calculated RR and DET measures for the trace logs without anomalies and with anomaly 5. It can be observed that the RR and DET series are noisy. It can be noticed in both figures that at about 30% of the experiment the time series associated with "Anomaly 5" drastically changes value. This is caused by the anomaly 5 event, when queues A and B become full. Additionally, the value of RR from the beginning of the experiment shows that the data from the "Anomaly 5" execution trace has a different dynamic character than the original data without anomalies.


Figure 7. RR measure calculated for trace logs without anomalies and with anomaly 5

Figure 8. DET measure calculated for trace logs without anomalies and with anomaly 5

Due to the presence of noise in the RR and DET series, some anomalies may be difficult to distinguish from the original series. Because of that, figures 9 and 10 show series obtained from the original series with a moving averaging window with a size of 500 samples. After that, the anomaly series can be easily distinguished from the original data obtained from the trace log of the system without anomalies.
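The smoothing used for figures 9 and 10 is a plain moving average; a minimal sketch of this filtering step (the function name is an illustrative choice) is:

```python
import numpy as np

def moving_average(series, window=500):
    """Smooth a noisy RQA measure series with a moving averaging window."""
    kernel = np.ones(window) / window
    return np.convolve(series, kernel, mode='valid')
```

With mode='valid' the output is shorter than the input by window - 1 samples, so the smoothed series starts only once a full window of data is available.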


Figure 9. RR measures calculated for all trace logs, containing data without anomalies and with all simulated anomalies. The original plot was filtered by a moving averaging window with a size of 500 samples.

Figure 10. DET measures calculated for all trace logs, containing data without anomalies and with all simulated anomalies. The original plot was filtered by a moving averaging window with a size of 500 samples.

The presented results show that the RR and DET measures from the trace log without anomalies maintain rather similar values in a relatively small range. This is caused by the stable character of the program execution without anomalies. In the case of all introduced anomalies, the RR measure value after averaging differed from the value computed from the trace log without anomalies. This property allows us to distinguish the executions with anomalies from the original one. Additionally, it can be observed that the values of both measures for the trace logs with anomalies change over a much greater range. This is caused by the effect of the anomalies, which caused the affected queues to maintain a higher amount of data and eventually become full. This effect is especially visible in the case of "Anomaly 5", which causes a very rapid increase in the amount of messages maintained in queues A and B and queue blockage in a relatively short time.

7. Conclusions

This paper proposed a method for software anomaly detection. The described approach is based on the idea of performing windowed RQA analysis on software execution trace logs and making decisions about anomaly detection based on a comparison with RQA measures calculated for the original, undisturbed software execution. For evaluation purposes, the method was applied to a very simple, artificial software model that simulated message flow between different program threads. For that model 5 different anomalies were introduced that influenced the performance of the threads responsible for handling messages. The created anomalies disturbed the stable character of the model and caused the affected queues to maintain a higher level of messages. The results obtained in the performed tests showed that the RQA measures allowed the executions with anomalies to be distinguished from the original execution. The results of this initial study show that recurrence plot analysis can be a useful tool for detecting anomalies in software execution. The results show that this approach can help us detect silent software errors that do not result in typical application crashes (e.g. exceptions). This type of error may result in a change of the system's statistical behavior or in performance degradation. In the future this method can be applied to anomaly detection in more complex systems, such as an operating system kernel. A drawback of this solution is the high computational power required to perform recurrence plot analysis. Due to this, the applicability of the method to real-time applications will be investigated in future research. Additionally, due to the presence of noise, data obtained from RQA analysis may be difficult to read. In the presented paper an additional windowed average was used to show the differences between the anomaly series and the original series. Due to that fact, making a reliable and rapid decision about a possible anomaly may be difficult. This issue will be investigated in future work.

References

[1] Pattabiraman K., Kalbarczyk Z., Iyer R. K., Application-Based Metrics for Strategic Placement of Detectors, Proceedings of the 11th Pacific Rim International Symposium on Dependable Computing, 12-14 Dec. 2005.
[2] Pattabiraman K., Saggese G. P., Chen D., Kalbarczyk Z., Iyer R. K., Dynamic Derivation of Application-Specific Error Detectors and their Implementation in Hardware, Sixth European Dependable Computing Conference (EDCC '06), 18-20 Oct. 2006.
[3] Austin T. M., Sohi G. S., Dynamic Dependency Analysis of Ordinary Programs, Proceedings of the 19th Annual International Symposium on Computer Architecture (ISCA '92), 1992.
[4] Tip F., A Survey of Program Slicing Techniques, Journal of Programming Languages, 3(3), 1995.
[5] Ernst M., Cockrell J., Griswold W., Notkin D., Dynamically Discovering Likely Program Invariants to Support Program Evolution, IEEE Transactions on Software Engineering, 27(2), 2001.


[6] Reis G. A., Chang J., Vachharajani N., Rangan R., August D. I., SWIFT: Software Implemented Fault Tolerance, Proceedings of the 3rd International Symposium on Code Generation and Optimization, 2005.
[7] Oh N., Shirvani P. P., McCluskey E. J., Control-flow checking by software signatures, IEEE Transactions on Reliability, 51(1), pp. 111-122, March 2002.
[8] Oh N., Shirvani P. P., McCluskey E. J., Error detection by duplicated instructions in super-scalar processors, IEEE Transactions on Reliability, 51(1), pp. 63-75, March 2002.
[9] Poincaré H., Sur le problème des trois corps et les équations de la dynamique, Acta Mathematica, 13, pp. 1-271, 1890.
[10] Marwan N., Romano M. C., Thiel M., Kurths J., Recurrence plots for the analysis of complex systems, Physics Reports, 438(5-6), pp. 237-329, January 2007.

Journal of Theoretical and Applied Computer Science ISSN 2299-2634

Vol. 6, No. 1, 2012, pp. 13-24 http://www.jtacs.org

Graphics editors in CPDev environment

Marcin Jamro
Rzeszow University of Technology, Department of Computer and Control Engineering, Poland
[email protected]

Abstract:

According to the IEC 61131-3 standard, controllers and distributed control systems can be programmed in textual and graphical languages. In many scenarios a graphical language is preferred by the user, because diagrams can be more legible and easier to understand or modify, also for people who do not have strong programming skills. What is more, they can be attached to the documentation to present part of a system implementation. CPDev is an engineering environment that makes it possible to program PLCs, PACs, softPLCs and distributed control systems using the languages defined in the IEC 61131-3 standard. In earlier versions it supported only the textual languages ST and IL. Currently, graphics editors for the FBD, LD and SFC languages are also available, so users can choose a suitable language depending on their skills and the specificity of the program they have to prepare. The article presents the implementation of the graphics editors, made by the author, which support creating program organization units in all graphical languages defined in the IEC 61131-3 standard. They are equipped with a set of basic and complex functionalities to provide an easy and intuitive way of creating programs, function blocks and functions with visual programming. The article describes the project structure and some important mechanisms, including automatic connection finding (with the A* algorithm), translation to ST code, conversion to and from XML format, and an execution mode supporting multiple data sources and breakpoints.

Keywords: IEC 61131-3, graphical languages, visual programming, control systems

1. Introduction

CPDev (Control Program Developer) [12] is an engineering environment which can be used for programming PLCs (Programmable Logic Controllers), PACs (Programmable Automation Controllers), softPLCs (PCs used as controllers) and distributed control systems according to the IEC 61131-3 standard [10]. It has been developed in the Department of Computer and Control Engineering at Rzeszow University of Technology for a few years. CPDev is used not only by lecturers and students during didactic activities but also by companies, like LUMEL S.A. (Poland) [8][9] and Praxis Automation Technology (Netherlands) [11], which uses it in a ship control and monitoring system. Cooperation with industry has an important impact on improving CPDev, because it reveals required new features and areas where the software can be improved. After the creation of the first version, some disadvantages were found that should be removed in the next version. One of the necessary features is support for developing programs in the FBD, LD and SFC languages [7], which was designed, implemented and connected with the CPDev environment by the author. In the current version users can create program organization units (POUs) in all languages defined in the IEC 61131-3 standard, i.e. ST (Structured Text), IL (Instruction List), FBD (Function Block Diagram), LD (Ladder Diagram) and SFC (Sequential Function Chart). It


makes it possible for the user to choose a programming language depending on their skills and the problem specificity. Programming in graphical languages has many advantages. One of them is diagram legibility, which makes understanding an implementation easier and faster. Users can also modify it more easily. Using graphical languages can require less work and be more comfortable for programmers than using textual languages, especially for those who do not have strong programming skills. What is more, diagram printouts can be used as part of the project documentation. Some integrated development environments with editors of graphical languages already exist, e.g. CoDeSys [4], Beckhoff TwinCAT [3], Control Builder F [1] or even an open-source environment [14]. However, these solutions cannot be easily integrated with the CPDev software and its modules, like CPSim. A solution which meets this requirement is necessary for the further development of the CPDev environment. This led to the implementation of new graphics editors equipped with many advanced capabilities not always available in other IEC 61131-3 IDEs, like automatic connection finding or an execution mode with support for multiple data sources, tracing variable values or even breakpoints (both conditional and unconditional). Designing and implementing graphics editors for the FBD, LD and SFC languages, dedicated to the CPDev environment, made it possible to obtain many benefits resulting from strict integration with the existing software and its modules. The second chapter presents the structure of the graphics editor projects with information about common features, classes and interfaces. The main mechanisms implemented in all graphics editors are explained in the third chapter.

2. Graphics editor projects

Adding support for graphical languages [6] caused some modifications in the concept of the whole CPDev environment (Figure 1). FBD, LD and SFC diagrams are translated to ST code, which is compiled to VMASM. This is assembled to virtual machine code, which is run on target platforms containing the CPDev virtual machine. The addition of an intermediate stage (translation to the textual ST language) is motivated by the availability of a well-tested ST compiler in the CPDev environment. With this design, programs created in the graphics editors can be run on all platforms supported by CPDev. It significantly increases the software's functionality and eliminates the chance of incorrect operation of separate FBD, LD and SFC compilers.

Figure 1. A concept of CPDev environment
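To make the intermediate translation stage concrete, a toy translator might turn FBD blocks into ST statements as sketched below. The diagram representation (a dict per block) and the emitted ST are simplified assumptions, not CPDev's actual data model or output:

```python
def translate_block(block):
    """Emit one ST assignment for an FBD function block (toy model).

    `block` is a hypothetical dict such as
    {'out': 'Q', 'op': 'AND', 'inputs': ['A', 'B']}.
    """
    operators = {'AND': ' AND ', 'OR': ' OR ', 'ADD': ' + ', 'MUL': ' * '}
    return "{} := {};".format(block['out'],
                              operators[block['op']].join(block['inputs']))

def translate_diagram(blocks):
    """Translate blocks (assumed already sorted in data-flow order) to ST."""
    return "\n".join(translate_block(b) for b in blocks)
```

A real translator must also handle execution order, feedback connections and function block instances; the sketch only shows why emitting ST lets the existing, well-tested ST compiler do the rest of the work.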

Apart from translation to ST code, it is also necessary to prepare a mechanism to save the exact state of a diagram as a string, which can be used to load the diagram without losing any data. To achieve this goal, the XML format is used. Its structure is based on the PLCopen standard [13], with modifications required by additional features implemented in the graphics editors.


The mechanisms of translation and conversion cooperate to create a complete solution for saving a diagram, which can then be translated to ST code, compiled and run on any target platform supported by CPDev (Figure 2).

Figure 2. A cooperation between mechanisms of diagram conversion and translation
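A minimal round-trip sketch of the conversion mechanism, using a PLCopen-flavored structure; the element and attribute names below are illustrative, not the exact schema used by CPDev:

```python
import xml.etree.ElementTree as ET

def diagram_to_xml(elements):
    """Serialize diagram elements to a PLCopen-style XML string (simplified)."""
    root = ET.Element('FBD')
    for el in elements:
        block = ET.SubElement(root, 'block',
                              typeName=el['type'], localId=str(el['id']))
        ET.SubElement(block, 'position', x=str(el['x']), y=str(el['y']))
    return ET.tostring(root, encoding='unicode')

def xml_to_diagram(xml_text):
    """Restore the element list from the XML string without losing data."""
    return [{'type': b.get('typeName'), 'id': int(b.get('localId')),
             'x': int(b.find('position').get('x')),
             'y': int(b.find('position').get('y'))}
            for b in ET.fromstring(xml_text).findall('block')]
```

The key property is lossless round-tripping: loading the serialized string must reproduce the same diagram state that was saved.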

The FBD, LD and SFC editors are written in the C# language using .NET Framework 2.0 (to keep consistency with the main part of the CPDev software) and the Microsoft Visual Studio 2010 integrated development environment. Every editor is written as a separate project (FBDEditor, LDEditor, and SFCEditor) connected with a general library called GraphicEditor. It contains the implementation common to all graphics editors, including the mechanism of automatic connection finding, the main editor window class, and classes and interfaces representing diagram elements.

2.1. Features

All editors of graphical languages in CPDev are equipped with a set of common functionalities necessary for the creation of program organization units (POUs) using the elements available in a given programming language. An important assumption is to prepare graphics editors which allow users to create diagrams as fast as possible and in an intuitive way. This also requires adding solutions dedicated to specific editors. The set of basic functionalities contains:
− adding, removing and moving elements,
− copying, pasting and cutting elements,
− saving and loading diagrams,
− automatic connection finding between outputs and inputs, with an automatic update after changing the position or size of an element,
− translation of the diagram to ST code,
− conversion of the diagram to and from XML format,
− printing the diagram according to the printout template,
− setting element properties,
− automatic adjustment of element widths,
− adding branch points,
− detection of line intersections,
− operation history (undo and redo commands),
− adjusting display settings for the diagram, e.g. by showing or hiding grid and subsidiary lines or changing the scale,
− simple diagram verification,
− informing the user of an attempt to place an element in an incorrect location,
− an execution mode for running programs with support for tracing variable values and breakpoints (conditional and unconditional).
Particular editors also contain a set of dedicated features that make creating solutions with them easier and faster, like:
− drawing lines on FBD diagrams in different colors depending on their types and on settings (line color, width and style) defined by the user,
− automatic generation of rungs on LD diagrams,


− the possibility of adding elements directly on existing rungs on LD diagrams,
− creating actions in the SFC language with the usage of the ST, IL, FBD and LD editors,
− automatic generation of vertical lines on SFC diagrams.
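The automatic connection finding mentioned above can be sketched as A* search over the diagram grid, treating cells covered by elements as obstacles. The grid model and unit cost function are assumptions for illustration; CPDev's router presumably also optimizes bends and line crossings:

```python
import heapq

def route_connection(start, goal, obstacles, width, height):
    """A* on a grid: find a shortest orthogonal path for a connection
    line, avoiding cells occupied by diagram elements (obstacles)."""
    def h(p):                                    # Manhattan distance heuristic
        return abs(p[0] - goal[0]) + abs(p[1] - goal[1])

    best_g = {start: 0}
    open_set = [(h(start), start, [start])]      # (f = g + h, cell, path)
    while open_set:
        f, cur, path = heapq.heappop(open_set)
        if cur == goal:
            return path
        g = best_g[cur]
        if f > g + h(cur):                       # stale queue entry, skip it
            continue
        x, y = cur
        for nxt in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if not (0 <= nxt[0] < width and 0 <= nxt[1] < height):
                continue
            if nxt in obstacles:
                continue
            ng = g + 1
            if ng < best_g.get(nxt, float('inf')):
                best_g[nxt] = ng
                heapq.heappush(open_set, (ng + h(nxt), nxt, path + [nxt]))
    return None                                  # no route exists
```

The Manhattan heuristic is admissible and consistent on a 4-connected grid, so the first time the goal is popped the returned path is shortest; this is why the found connections stay as direct as the surrounding elements allow.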

2.2. Common classes and interfaces

Many elements and mechanisms are similar across the editors of the different graphical languages available in the CPDev environment. They have been extracted into common classes and interfaces in the additional GraphicEditor library, which is referenced by the projects of the FBD, LD and SFC editors. Each editor has a window used to present a single diagram (Figure 3). It is represented by an instance of a class deriving from FormDiagram, which in turn derives from CPDIChild, enabling cooperation between the editor and the main CPDev window [6]. The window with a single diagram has two main parts:
− left – containing a tree whose nodes represent all elements which can be placed on the board (like variables, comments, functions or instances of function blocks), and a textbox used to filter elements by a substring typed by the user,
− central – with the board.
Both parts are developed as controls named ItemsTree and DrawingPanel. The window appearance can be adjusted, which makes it possible to add elements required by a given language.

Figure 3. Main CPDev window

In the window a diagram is shown; it is represented by an instance of a class deriving from GraphicDiagram. This class contains properties to set and get diagram data, including the list of drawn elements (Elements), the list of all types available in the project (Types), the content type (program, function or function block – ContentType), and the data of the global variables existing in the project (GlobalVariables). It also contains methods that perform operations on the diagram (created in any graphical language), like adding or removing an element (AddElement and DeleteElement) and getting the selected elements (GetSelectedElements). An important group of classes and interfaces is related to the elements which can be placed on the diagram, including functions, instances of function blocks, and lines (Figure 4). These elements can be grouped by abilities, which is clearly visible in the implementation of particular interfaces and makes it possible to perform operations on different elements in the same way whenever they share the same ability.
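The diagram data and operations listed above can be summarized in a short sketch. Only the member names come from the paper; all types, bodies and the simplified IDrawable stand-in are assumptions, since the class's source is not shown here:

```csharp
using System.Collections.Generic;
using System.Linq;

// Simplified stand-ins; the paper defines richer versions of these types.
public interface IDrawable { bool IsSelected { get; set; } }
public enum ContentTypeEnum { Program, Function, FunctionBlock }

// Hypothetical sketch of the GraphicDiagram class described above.
public class GraphicDiagram
{
    public List<IDrawable> Elements { get; } = new List<IDrawable>();
    public List<string> Types { get; } = new List<string>();
    public ContentTypeEnum ContentType { get; set; }
    public List<string> GlobalVariables { get; } = new List<string>();

    public void AddElement(IDrawable element) => Elements.Add(element);
    public void DeleteElement(IDrawable element) => Elements.Remove(element);
    public List<IDrawable> GetSelectedElements() =>
        Elements.Where(e => e.IsSelected).ToList();
}
```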

Graphics editors in CPDev environment


Figure 4. A structure of common classes and interfaces

All elements which can be placed on the diagram implement the IDrawable interface (Listing 1). It defines basic properties and methods, including the rectangle which represents an element (Rectangle), a value indicating whether the element is currently selected (IsSelected), and a method to draw it on the board (Draw).

public interface IDrawable
{
    Rectangle Rectangle { get; set; }
    Rectangle CollisionRectangle { get; }
    bool IsSelected { get; set; }
    bool IsInvalid { get; set; }
    ElementContextMenu Menu { get; }
    void Draw(Graphics graphics, double scale, DiagramModeEnum mode);
    void Deselect();
    List<IDrawable> GetAllDrawableElements();
    IDrawable GetElement(Point point);
    IDrawable CloneElement(Clones clones);
}

Listing 1. IDrawable interface

The IParent interface is implemented by elements which contain inputs or outputs and are distinguished by an identifier (LocalId, an integer value). These elements include functions, instances of function blocks and branch points. Classes that implement IParent have to possess the members of both the IDrawable and IParent interfaces. The latter defines methods to dynamically adjust the size of an element (PrepareRectangle) and to set a required margin around it (GetMapModifiers). Another group consists of elements which are the start or end parts of a line forming a connection. Classes representing elements from this group implement the IConnectable interface, which defines properties pointing to the previous and next element (PreviousElement and NextElement) and a method returning the location of the starting point of the next line (GetConnectionPoint). A connection does not always consist of only one line; if direction changes are required, the connection contains more lines (Figure 5). Classes implementing the IConnectable interface represent, for instance, inputs and outputs of elements placed on the diagram (Input and Output) and lines (Line). An important element on diagrams created in the FBD and LD languages is the branch point, represented by an instance of the BranchPoint class implementing the IConnectable interface. This element makes it possible to split or join lines, which is necessary in many scenarios, for instance to perform the OR operation on LD diagrams (Figure 7). An additional class related to branch points is BranchPointInputOrOutput, representing its input or output. Four instances of this class are connected with every instance of the BranchPoint class. They represent the places where connections can be added and are located above, below, to the left and to the right of the branch point (Figure 6).

Figure 5. Differences between line and connection

Figure 6. Graphical representation of a branch point
Figure 7. An example of LD program which contains OR operation
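The element interfaces described in this section can be sketched as follows. The member names come from the paper; the exact signatures and the MapModifiers placeholder are assumptions, with System.Drawing types standing in for the real ones:

```csharp
using System.Collections.Generic;
using System.Drawing;

// Placeholder for the margin data returned by GetMapModifiers (assumption).
public struct MapModifiers { public int Left, Right, Top, Bottom; }

// Elements with inputs or outputs, distinguished by an integer identifier.
public interface IParent
{
    int LocalId { get; }
    void PrepareRectangle(Graphics graphics);  // adjust element size dynamically
    MapModifiers GetMapModifiers();            // required margin around the element
}

// Start or end parts of the lines that form a connection.
public interface IConnectable
{
    IConnectable PreviousElement { get; set; }
    IConnectable NextElement { get; set; }
    Point GetConnectionPoint();                // start of the next line of the connection
}
```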

3. Mechanisms available in graphics editors

The graphics editors available in the CPDev environment are equipped with many mechanisms performing operations related either to the diagram or to the window in which it is presented. These mechanisms allow to:
− find connections between elements placed on the diagram,
− convert the diagram to and from the XML format,
− translate the diagram to ST code,
− verify the diagram,
− create a copy of elements,
− create instances representing different diagram elements,
− support operation history, including undo and redo operations,
− execute a program, either in simulation or in commissioning,
− perform operations on the elements map,
− show useful information to the user,
− show tooltips for elements placed on the diagram,
− print the diagram according to the printout template,
− analyze frequently used elements.
Instances representing all the mechanisms mentioned above are values of properties of the FormDiagram class. This solution hides implementation details, improves code readability and makes modifications easier.

3.1. Automatic connections finding

One of the assumptions of the graphics editors is to make it possible to create program organization units without unnecessary work. This is achieved, among others, by the implementation of the mechanism of automatic connections finding, for instance between an output of a function block instance and an input of a variable. It is important to be certain that a connection:
− can be found every time (if elements are placed on the diagram correctly),
− passes around elements added earlier,
− rarely changes direction,
− limits the number of intersections with other connections (Figure 8).

Figure 8. Examples of connections
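The paper's conclusion mentions that connection finding uses the A* algorithm. A grid-based sketch meeting the requirements above might look as follows; the grid model, the turn penalty and all names are assumptions (CPDev's actual routing code is not shown in the paper). Cells occupied by elements are blocked, and every change of direction adds a penalty so that found connections rarely turn:

```csharp
using System;
using System.Collections.Generic;

class ConnectionRouter
{
    static readonly (int dx, int dy)[] Dirs = { (1, 0), (-1, 0), (0, 1), (0, -1) };

    // Returns the cheapest routing cost from start to goal, or -1 if no route exists.
    public static int PathCost(bool[,] blocked, (int x, int y) start, (int x, int y) goal,
                               int turnPenalty = 5)
    {
        int w = blocked.GetLength(0), h = blocked.GetLength(1);
        // A* state: position plus incoming direction (-1 at the start point).
        var best = new Dictionary<(int x, int y, int dir), int> { [(start.x, start.y, -1)] = 0 };
        var open = new PriorityQueue<(int x, int y, int dir), int>();
        open.Enqueue((start.x, start.y, -1), 0);

        while (open.TryDequeue(out var s, out _))
        {
            int g = best[s];
            if (s.x == goal.x && s.y == goal.y) return g;
            for (int d = 0; d < 4; d++)
            {
                int nx = s.x + Dirs[d].dx, ny = s.y + Dirs[d].dy;
                if (nx < 0 || ny < 0 || nx >= w || ny >= h || blocked[nx, ny]) continue;
                // Step cost 1, plus a penalty whenever the direction changes.
                int ng = g + 1 + (s.dir != -1 && s.dir != d ? turnPenalty : 0);
                var next = (nx, ny, d);
                if (!best.TryGetValue(next, out int old) || ng < old)
                {
                    best[next] = ng;
                    int heuristic = Math.Abs(goal.x - nx) + Math.Abs(goal.y - ny);
                    open.Enqueue(next, ng + heuristic);
                }
            }
        }
        return -1; // elements placed so that no connection can be routed
    }
}
```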


− ToDiagram – converting the XML format to the object representing the diagram.
The Converter class also implements the methods used to convert the parts common to all languages, like WriteFileHeader, WritePositionTag and ReadComments. Implementation details for specific languages are placed in classes deriving from Converter, located in the projects of the particular editors.




Listing 2. Generated nodes of XML document: (a) inVariable (FBD), (b) step (SFC)

3.3. Translation to ST code

According to the assumption described in the second chapter, all diagrams are translated into ST code, which is then compiled into VMASM code with the ST compiler available in the CPDev environment (Figure 1). This approach requires a mechanism of translating all diagrams to ST code in the same way. An additional interface named ITranslator was created to solve this problem. It contains only the Translate method, with one parameter of the string type (diagram data in XML format), which returns code in the ST language prepared in the translation process. The translation process depends on the language; for this reason, every project has a class implementing the ITranslator interface with the logic of diagram translation to ST code.
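The contract described above can be sketched as follows; the interface shape follows the text, while FbdTranslatorStub and its output are purely illustrative (the real translators implement the full diagram-to-ST logic):

```csharp
// The ITranslator contract described above: XML diagram data in, ST code out.
public interface ITranslator
{
    string Translate(string diagramXml);
}

// Toy stand-in for a language-specific translator (illustrative only).
public class FbdTranslatorStub : ITranslator
{
    public string Translate(string diagramXml) =>
        "PROGRAM EXAMPLE\n(* translated from an FBD diagram *)\nEND_PROGRAM";
}
```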

Figure 10. Translation of the FBD diagram to ST code


In the case of an FBD program, information about global variables is placed in the VAR_EXTERNAL part (Figure 10, area 1). Local variables, including instances of function blocks and variables representing element outputs (with added connections), are located in the VAR part (areas 2 and 3). The main part of the ST code contains instructions calling instances of function blocks and functions, and also setting the values of output variables (area 4).

3.4. Drawing diagrams

All graphics editors need to draw elements on the board. Drawing is implemented with the methods available in the Windows Forms technology [15], which makes it possible to create user interfaces in a fast and easy way and contains many classes representing window elements like buttons, textboxes and drop-down lists. When the board is being refreshed, a new bitmap and a Graphics object are created. The mechanism then prepares a rectangle representing the currently visible part of the diagram according to the current scrollbar positions. The next step consists of drawing all necessary elements on the diagram and is performed by the DrawDiagram method. At the end, the bitmap is drawn on the board and presented to the user. The DrawDiagram method is defined as virtual, with a common implementation in the FormDiagram class (from the GraphicEditor library). It calls other methods that draw the following diagram parts: grid, subsidiary lines, elements, connections, a border around the currently added element, temporary connections during moving of elements, the rectangle representing a selection, incorrect elements, the additional diagram map, and symbols representing breakpoints. In the case of specific languages it is necessary to draw some dedicated elements on the board, like the left and right power rails on LD diagrams. Moreover, all graphics editors use a double-buffering mechanism to prevent undesired effects from showing while the board is being refreshed.

3.5. Printing diagrams

One of the important advantages of graphical languages is the legibility of diagrams and the possibility of attaching their printouts directly to the project documentation, providing engineers with important information about the implementation of a specific system part. The FBD, LD and SFC editors are equipped with a printing mechanism based on a printout template defined by the user in an XML file. It makes it possible to adjust page margins and add a table with information such as the company name, project or printout date (Listing 3). (...)

Listing 3. A part of printout template file


The printout template allows users to define the content shown in the table in three ways: from resources, from variables (where the following keywords are available: PROGRAM, PROJECT, VERSION, COMPANY and CURRENT_TIME), or directly in the printout template file (with support for many languages).
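Since Listing 3 is not reproduced here, a purely hypothetical template may help illustrate the three content sources; every element and attribute name below is invented, and only the keywords and the margin/table capabilities come from the text:

```xml
<!-- Hypothetical printout template (illustrative only, not CPDev's format). -->
<PrintoutTemplate>
  <Margins left="20" top="25" right="20" bottom="25" />
  <Table>
    <Cell source="resource" value="CompanyNameLabel" />
    <Cell source="variable" value="PROJECT" />
    <Cell source="variable" value="CURRENT_TIME" />
    <Cell source="text" value="Approved for documentation" />
  </Table>
</PrintoutTemplate>
```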

3.6. Simple diagram verification

The FBD, LD and SFC editors are equipped with a mechanism for finding basic errors on the diagram. The main concept is to connect a set of tests for each language and run them before every project build and on user demand (Figure 11). The tests analyze different diagram parts and record information in the form of errors and warnings. If errors exist, the project cannot be built, and messages with information about the error source are shown to the user.

Figure 11. Concept of the verification mechanism

An additional interface, IValidator, is implemented by classes from the FBDEditor, LDEditor and SFCEditor projects. The interface has two lists with error and warning data, represented by instances of the ValidationMessage class. With this additional class it is possible not only to show the information that an error or warning occurred during diagram verification, but also to point at the elements that caused it. The IValidator interface (Listing 4) contains the Validate method, which runs all required tests. The mechanism of simple diagram verification cannot find all errors, but it is useful for informing users about operations which could be undesired and cause problems while the program is running.

public interface IValidator
{
    List<ValidationMessage> Errors { get; }
    List<ValidationMessage> Warnings { get; }
    bool IsValid { get; }
    void Validate(GraphicDiagram diagram);
}

Listing 4. IValidator interface

3.7. Programs execution and testing

All graphics editors contain an execution mode which allows users to check the execution of the created program either on a simulator or on a real device (commissioning) by tracing variable values. The mechanism supports many data sources (Figure 12), i.e. systems providing current variable values, which can be retrieved for instance from a simulator (local virtual machine), via the Modbus protocol (used by the Lumel S.A. company), or from an FPGA platform.


Figure 12. Execution mechanism with the common data source provider

Presentation of the received data on the diagram depends on its type and value (Figure 13). For instance, for a variable of the BOOL type with the value FALSE, a dashed line is drawn; for the value TRUE, the line is solid. It is also possible to show the values of variables as tooltips or as text above the lines representing connections.

Figure 13. Breakpoints in graphics editors

The execution mechanism supports breakpoints, both unconditional and conditional. They can be placed next to diagram elements (e.g. function block instances) and stop execution of the program just before performing the operation related to the element (e.g. setting the value of a variable or calling a function). In the case of conditional breakpoints, the execution of the program is stopped only if the expression given as the condition evaluates to TRUE.
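The stop rule for the two breakpoint kinds can be sketched in a few lines; the class and member names are illustrative, not CPDev's:

```csharp
using System;

// Hypothetical sketch of a breakpoint attached to a diagram element.
public class Breakpoint
{
    // Null means an unconditional breakpoint.
    public Func<bool> Condition { get; set; }

    // Checked just before the element's operation is performed.
    public bool ShouldStop() => Condition == null || Condition();
}
```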

4. Conclusion

The IEC 61131-3 norm defines languages that make it possible to program controllers and distributed control systems in a convenient way, using both textual and graphical languages. One of the most significant advantages is that users can choose a suitable language and combine different languages in the same project, e.g. write a function block in FBD and the main program in ST. One of the integrated development environments used to program in the languages of the IEC 61131-3 norm is CPDev, which has been developed in the Department of Computer and Control Engineering at Rzeszow University of Technology for several years. Earlier it supported programming only in textual languages. That lack of functionality was removed after the author added support for the graphical languages defined in the IEC 61131-3 norm. Currently the CPDev engineering environment can be used to create program organization units in any language of this norm. This significantly increases the range of its applications, because users can choose a suitable language depending on their skills and the specifics of the problem.


There are many similarities between the FBD, LD and SFC editors available in CPDev, which led to the extraction of their common part into a separate GraphicEditor library containing a set of classes and interfaces used by the other projects. It decreases the amount of code and the number of potential errors, and also makes modifications easier. The graphics editors are equipped with a set of mechanisms performing specific operations on the diagrams, like automatic connections finding (with the usage of the A* algorithm), translation to ST code, conversion to and from the XML format, operation history, simple diagram verification, and an advanced execution mode with support for tracing variable values and breakpoints.

References
[1] ABB Engineering – Freelance (Sterowniki i systemy sterowania) [online], http://www.abb.pl/product/seitp334/ee37d357581192adc12571ca00431c6e.aspx [access: 2012].
[2] Amit's A* Pages, Introduction to A* [online], http://theory.stanford.edu/~amitp/GameProgramming/AStarComparison.html [access: 2012].
[3] BECKHOFF New Automation Technology, http://www.beckhoff.de/ [access: 2012].
[4] CoDeSys, http://www.3s-software.com/ [access: 2012].
[5] Dijkstra E.: A note on two problems in connexion with graphs. Numerische Mathematik, 1959, pp. 269-271.
[6] Jamro M., Sadolewski J.: Edytor diagramów FBD jako moduł zintegrowanego środowiska CPDev. [w:] Trybus L., Samolej S. (red.): Projektowanie, Analiza i Implementacja Systemów Czasu Rzeczywistego. WKŁ, Warszawa, 2011.
[7] Jamro M., Rzońca D., Sadolewski J., Stec A., Świder Z., Trybus B., Trybus L.: Rozwój środowiska inżynierskiego CPDev do programowania systemów sterowania. [w:] Trybus L., Samolej S. (red.): Projektowanie, Analiza i Implementacja Systemów Czasu Rzeczywistego. WKŁ, Warszawa, 2011.
[8] Jamro M., Rzońca D., Sadolewski J., Stec A., Świder Z., Trybus B., Trybus L.: Uruchamianie rozproszonego systemu kontrolno-pomiarowego. XVII Krajowa Konferencja Automatyki KKA'2011, 19-22.06.2011, Kielce-Cedzyna.
[9] Lubuskie Zakłady Aparatów Elektrycznych LUMEL S.A., Zielona Góra, http://www.lumel.com.pl.
[10] PN-EN 61131-3 – Sterowniki programowalne. Część 3: Języki programowania. Warszawa, 2004.
[11] PRAXIS Automation Technology, http://www.praxis-automation.nl.
[12] Stec A., Świder Z., Trybus L.: Charakterystyka funkcjonalna prototypowego systemu do programowania systemów wbudowanych według normy IEC 61131-3. [w:] Huzar Z., Mazur Z. (red.): Systemy Czasu Rzeczywistego. Metody i zastosowania. WKŁ, Warszawa, 2007.
[13] PLCopen Technical Committee 6: Technical Paper – XML Formats for IEC 61131-3, Version 2.01 – Official Release, 2009.
[14] Tisserant E., Bessard L., de Sousa M. – An Open Source IEC 61131-3 Integrated Development Environment. Industrial Informatics, 2007 5th IEEE International Conference on, Vol. 1 (2007), pp. 183-187.
[15] The Official Microsoft WPF and Windows Forms Site [online], http://windowsclient.net/ [access: 2012].

Journal of Theoretical and Applied Computer Science
ISSN 2299-2634, Vol. 6, No. 1, 2012, pp. 25-34
http://www.jtacs.org

Architectural view model for an integration platform

Tomasz Górski
Military University of Technology, Faculty of Cybernetics, Poland
[email protected]

Abstract:

The most common architectural view model is the "4+1" model by Philippe Kruchten, which presents the views required for a full description of a computer system architecture. However, this model seems insufficient to describe the architecture of an integration platform: it definitely lacks a view of integrated business processes. In the service-oriented approach, one of the basic elements is the contract, which should also be included in the description of the architecture. Moreover, the integration mechanisms and mediation flows are very important and should be presented in the description of the architecture. Hence the need for an integrated services view, showing the manner of service integration on the enterprise service bus. The use case view should also be extended by stereotypes required for presenting functionality exposed to other computer systems. The article therefore proposes the "1+5" architectural view model for an integration platform, with the following architectural views: Integrated processes, Use cases, Logical, Integrated services, Contracts, Deployment. Furthermore, the article presents a new UML profile, "UML Profile for Integration Flows", containing stereotypes corresponding to integration patterns and mediation mechanisms. Importantly, the UML activity diagram was extended and its special form was obtained to model mediation flows on an integration platform. Thus a new UML diagram was proposed: the mediation flows diagram.

Keywords: Information systems integration, architecture of information system, modeling

1. Introduction

Service-oriented architecture is a concept of creating IT systems based on defining the services that a system should offer. A service, in this context, is a software component acting completely independently and having a clearly defined interface. The basis for the specification of the modeling method is the SOA reference model, with particular emphasis on the integration layer. In this layer, the basic element is an enterprise service bus (ESB)¹. It is software which enables efficient and standardized communication between connected applications. An ESB allows applications and services to be connected regardless of implementation, technology, operating systems and data types. Services interact with each other using XML (eXtensible Markup Language). An integration platform consists of an enterprise service bus and the connected information systems. The most common architectural view model is the "4+1" model by Philippe Kruchten². This model presents the views required for a full description of an information system architecture, but it seems insufficient to fully describe the architecture of an integration platform. The "1+5" model takes into account the business process layer and the layer of integration of services between systems. A completely new element is the mediation flow diagram, which shows the mediations required to complete service calls between different systems. This model is definitely better suited to the specifics of integration solutions.

¹ Keen M., Achraya A., Implementing an SOA Using an Enterprise Service Bus, IBM, 2010.
² Kroll P., Rational Unified Process Made Easy – A Practitioner's Guide to the RUP, Addison-Wesley, 2003.

2. "1+5" architectural view model

The "4+1" view model definitely lacks a view of integrated business processes. In the service-oriented approach, one of the basic elements is the notion of a contract, which should also be included in the architectural description. Moreover, the integration mechanisms and mediation flows are very important and should be presented in the description of the architecture. Hence the need for an integrated services view, showing the manner of service integration on the enterprise service bus. The use case view should also be extended by stereotypes required for presenting functionality exposed to other computer systems. A redefined architectural view model was therefore proposed, tailored to the needs of designing integration platforms. This model was called "1+5" and its first version was presented and published in an earlier article³. The model was refined and now consists of the following architectural views:
• Integrated processes,
• Use cases,
• Logical,
• Integrated services,
• Contracts,
• Deployment.
Figure 1 shows the "1+5" architectural view model.

Figure 1. Architectural view model „1+5”

The basic architectural view is the integrated processes view. The next four views are used to present the design of an integration platform. All design products of the integration platform should be deployed on a common runtime environment, which is presented within the deployment view. Table 1 compares the "4+1" architectural view model with the "1+5" architectural view model proposed by the author of the article. The main difference is the proposal of two additional views: Integrated processes and Contracts. In the Integrated processes view, the business processes which should be automated on the integration platform are modeled. The following views are present in both approaches: Use cases, Logical and Deployment. In the "1+5" model, the Use cases view contains an additional stereotype for an integrated information system connected by the integration platform. The name of the Implementation view was changed to the Integrated services view. This view presents the services exposed from information systems, the way they are included on the enterprise service bus, and the mediation flows defined for the services.

³ Górski T., Metoda modelowania architektury platformy integracyjnej „1+5”, Inżynieria Oprogramowania w Procesach Integracji Systemów Informatycznych, PWNT Gdańsk, 2011.

Table 1. Architectural view models comparison
„4+1” architectural views: Use cases, Logical, Implementation, Deployment, Processes.
„1+5” architectural views: Integrated processes, Use cases, Logical, Contracts, Integrated services, Deployment.

2.1. Integrated processes view

The purpose of this view is the identification of the business processes defined across all the analyzed organizations which will require integration. The view is presented in the Processes model. Business processes are shown on BPMN business process diagrams, where the organizations (systems) analyzed in this model appear as swimlanes. This is the base view for the remaining architectural views. In this view, all services which require support by an information system are identified. Those services encompass both human and automated tasks. The following types of BPMN tasks are identified in a business process:
• Manual task – represents work which is performed entirely by humans and does not require contact with electronic devices,
• Human task – represents work performed by a human whose results must be entered into an electronic device,
• Automated task – work performed entirely by electronic devices, requiring no human intervention.

2.2. Use cases view

The use case view defines the scope and expected functionality of information systems in the form of use case diagrams. This view is presented in the Use case model. The functionality of each system is presented on use case diagrams; a separate diagram is built for each integrated system. One of the main tasks of this view is the presentation of the system's use cases provided by the platform to other systems, without indicating specific technical solutions. A use case diagram for an integrated system uses the same abstractions as standard UML notation. It is therefore possible to define actors which are computer systems using the services exposed by the enterprise service bus. In order to distinguish such actors (integrated systems), a new stereotype with a defined shape has been proposed as part of the newly created UML profile "UML Profile for Integration Platform"⁴. The flow of events within each use case can be presented on a UML activity diagram.

⁴ Górski T., Profil „UML Profile for Integration Platform” do modelowania architektury platformy integracyjnej, Inżynieria Oprogramowania w Procesach Integracji Systemów Informatycznych, PWNT Gdańsk, 2011.


2.3. Logical view

On the one hand, this view presents the realizations of the use cases identified in the information systems. For this purpose the following diagrams are used: sequence, communication and class diagrams. On the other hand, an important task of this view is to present the structure of the business entities defined in the integrated processes view. Class diagrams present the structures of the classes needed for the realization of human task requests. Interaction with the user of a human task can be presented on sequence or communication diagrams. It is also important to present the classes implementing service calls from external systems. In addition, this view shows an executable business process in the form of BPEL (Business Process Execution Language). Elements of the Logical view are presented in the Design model.

2.4. Contracts view

This view is presented in the Services model. It is important to present the participants of the integration, i.e. the systems connected to the enterprise service bus, and to define the contracts that will be implemented on the integration platform. The view illustrates the cooperation of components in order to realize a contract, and presents all contracts that occur between systems connected to the integration platform. The definition of a contract presents two parties: the component implementing the service and the component that uses the service. Contracts are presented on a UML composite structure diagram, where a contract is represented as a collaboration.

2.5. Integrated services view

The purpose of this view is to show all the services included in the integration platform. This view is presented in the Services model. The view presents service consumers and providers through appropriate interfaces and references. For this purpose a component diagram is used, which illustrates the services with relevant stereotypes. On the same diagram, the main elements of the platform are presented without describing the details of their operation and implementation. The central part of the diagram is the enterprise service bus, presented as a component with a dedicated stereotype which is part of the created UML profile "UML Profile for Integration Platform". Other elements of this view are the services which should be included in the integration platform; these are components stereotyped according to the SoaML notation⁵. They can be either single services, such as SCA components, or whole composites. The condition for including a composite is the need to expose a common interface and a reference to receive the results from the service executed on the ESB. Each of these elements should have its role on the platform clarified by using stereotypes; such an element can be a provider or a consumer of a service. All logical structures defined in the individual components should be hidden in this view; their analysis is the subject of the logical view. In addition, this view shows the structure of the enterprise service bus on a component diagram. An important aim of this view is to present the integration flows on the enterprise service bus. For this purpose an activity diagram is used, with the UML profile "UML Profile for Integration Flows" applied. Note that in this way the UML activity diagram was extended, and its special form was obtained for modeling mediation flows on an integration platform. Thus a new UML diagram was proposed: the mediation flows diagram.

⁵ Casanave C., Service Oriented Architecture Using the OMG SoaML Standard, Model Driven Solution, 2009.


2.6. Deployment view

This view is represented in the Deployment model. System specifications at the physical level should include a description of the deployment architecture, i.e. the equipment required for the proper operation of the proposed integration platform, listing all the necessary hardware and the connections between the nodes. The UML diagram which realizes these tasks is the deployment diagram, which specifies the location of the designed applications on the infrastructure nodes in the organization. The central element is the integration server, which communicates with the computer systems connected to the platform. The main objective is to analyze the equipment necessary for exposing services on the enterprise service bus. The view for the integration platform is a combination of the deployment views of each of the analyzed systems; however, only those elements are analyzed which are necessary to provide services on the enterprise service bus. The diagram can also describe the connection protocols between the integrated systems.

3. Architecture modelling elements of an integration platform

In the presented approach, models and diagrams of the BPMN, BPEL and UML languages were proposed for modeling the architecture of an integration platform (Table 2).

Table 2. Architecture modelling elements of an integration platform

Model       View                  Diagram
Processes   Integrated processes  (BPMN) Business process
Use cases   Use cases             (UML) Use case, (UML) Activity
Design      Logical               (UML) Sequence, (UML) Communication, (UML) Class, (BPEL) Business process in XML format
Services    Integrated services   (UML) Component, (UML) Activity
            Contracts             (UML) Component, (UML) Composite structure
Deployment  Deployment            (UML) Deployment

4. UML Profile for Integration Flows

The UML profile "UML Profile for Integration Flows" contains the stereotypes needed to show mediation patterns and mechanisms⁶. The UML activity diagram was extended, and thereby a specific form of this diagram was obtained for modeling mediation flows on an integration platform. Thus a new UML diagram was proposed: the mediation flows diagram. This diagram shows the flow of actions to be taken to transfer a call from the service consumer to the service provider's system. In particular, these actions may include data format conversion, message enrichment or message filtering. In the existing tools, modeling of mediation flows was limited due to the small range of offered mediation icons; repeated modeling icons confused both the modeler and the reader of the models. In the profile, a unique icon was proposed for each mediation pattern, providing full transparency of the modeled mediations. The profile in its present form can be very useful for an integration architect. The profile was created in IBM Rational Software Architect; the manner of UML profile design and its application to a modeling project are described in detail in the literature⁷. This profile can be used to model the mediation flows on the enterprise service bus. Table 3 shows selected stereotypes defined in the described profile.

⁶ Hohpe G., Woolf B., Enterprise Integration Patterns: Designing, Building, and Deploying Messaging Solutions.

Table 3. Description of selected stereotypes in the profile "UML Profile for Integration Flows" (pattern icons not reproduced)

Pattern's name                Pattern's description
ContentEnricher               Enrichment of message content.
ContentFilter                 Message content filter.
Endpoint (Message Endpoint)   Point of sending or receiving messages.
EnvelopeWrapper               Wraps the data to be sent in accordance with the requirements of the messaging system.
Translator                    Transformation of data formats.

5. Integration platform design for electronic circulation of prescription

The issue under consideration is the implementation of electronic circulation of prescriptions in the National Health Services. The basic processes in the circulation of a prescription are:
• writing a prescription by a licensed physician (doctor),
• realization of a prescription at a pharmacy.
These processes are closely related. In the present situation, prescriptions are written by hand by a doctor and handed to the patient. The patient goes to the pharmacy, where the pharmacist dispenses drugs based on the paper prescription. These processes also involve the National Health Fund (NHF), which acts in Poland as the agency responsible for reimbursement of medicines. Pharmacies and doctors are required to regularly provide the NHF with information about prescriptions issued and realized. Replacing the paper prescription with its electronic equivalent would automate the NHF control process and reduce the requirements for entities issuing and realizing prescriptions. In addition, the problem of damaged, unreadable or incorrectly filled prescriptions would be eliminated. The e-Prescription (pol. e-Recepta8) project involves the preparation of an integration platform for exchanging the electronic version of a prescription between a doctor, a pharmacy and the National Health Fund. During a visit, the doctor would issue a prescription for the patient in the e-Prescription system. Then, at the pharmacy, the patient would give his/her Social Security Number (pol. PESEL) and the pharmacist would have insight into all of the patient's issued and unrealized prescriptions. Selected architectural views from the "1+5" model were used in the design of the integration platform. The Unified Modeling Language9 and the UML profiles "UML Profile for Integration Platform"10 and "UML Profile for Integration Flows" were used for modeling.

7 Górski T., Profil „UML Profile for Integration Platform” do modelowania architektury platformy integracyjnej, Inżynieria Oprogramowania w Procesach Integracji Systemów Informatycznych, PWNT Gdańsk, 2011

5.1. Use cases view

In this business case there are two parties who carry out their activities. The first is the doctor, who must be able to write prescriptions and view them. The second feature will also be available to a pharmacist, in order to find a prescription to be realized. The application which implements this functionality has been called "e-Prescription". Figure 2 shows the use case diagram for the application "e-Prescription" with a separate system, integrated by the integration platform. This is the "e-Pharmacy" system, and the stereotype from the profile "UML Profile for Integration Platform" has been applied to it. This diagram is a part of the use cases view from model "1+5". The basic functionality of the application "e-Prescription" is the use case "Write prescription".

Figure 2. Use case diagram for „e-Prescription” with integrated system „e-Pharmacy”

The other side is the pharmacist, who realizes prescriptions issued by doctors. The pharmacist must be able to realize prescriptions and view realizations of prescriptions. The second feature will also be available to the doctor, to find the realization of a prescription that has previously been issued. The application which implements this functionality has been called "e-Pharmacy" (Figure 3). The basic functionality of the application "e-Pharmacy" is the use case "Realize prescription".

8 http://www.e-recepta.gov.pl
9 Fowler M., UML Distilled, Third Edition, Addison-Wesley, 2005
10 Górski T., Profil „UML Profile for Integration Platform” do modelowania architektury platformy integracyjnej, Inżynieria Oprogramowania w Procesach Integracji Systemów Informatycznych, PWNT Gdańsk, 2011

Figure 3. Use case diagram for „e-Pharmacy” with an integrated system „e-Prescription”

5.2. Integrated services view

Use cases "Get prescriptions" and "Get prescription's realization" are implemented as services and are exposed on the integration platform using WSDL11. Services exposed by individual systems and required by individual systems are shown in the UML component diagram (Figure 4). This diagram is a representation of the integrated services architectural view of model "1+5".

Figure 4. Component diagram in Integrated services architectural view

Data exchange involves the selection of a format for the documents which will be sent over the integration platform. In the applications, the XML12 document format was chosen for writing a prescription. However, the use of XML implies a lot of overhead in the size of the files exchanged between systems. In the realization of the target system, a data format that does not introduce such a large overhead in file size should be considered. In this view, the above-mentioned mediation flow diagram is used. The mediation flow for getting prescriptions is shown in Figure 5.
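To illustrate the overhead discussed above: an XML prescription document wraps every data value in opening and closing tags. The fragment below is a hypothetical sketch only; the element names and sample values are assumptions, not the actual e-Prescription document schema.

```xml
<!-- Hypothetical e-prescription document; element names and values are
     illustrative only, not the actual e-Prescription schema. -->
<prescription id="2012/000123">
  <patient>
    <!-- patient's Social Security Number (pol. PESEL) -->
    <pesel>85010112345</pesel>
  </patient>
  <physician licenseNumber="1234567"/>
  <drug>
    <name>Amoxicillinum</name>
    <dose>500 mg</dose>
    <quantity>20</quantity>
  </drug>
  <issued>2012-03-01</issued>
</prescription>
```

The markup roughly doubles the byte count relative to the bare field values, which is why a more compact format is worth considering for the target system.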

11 Web Services Description Language (WSDL) Version 2.0, W3C 2007, http://www.w3.org/TR/wsdl20/
12 eXtensible Markup Language, http://www.w3.org/XML/



Figure 5. Mediation flow diagram for getting prescriptions for patient

5.3. Implementation of applications and integration

The applications were implemented in JavaServer Faces. The implementation involved several tools: the Eclipse IDE, the Tomcat application server, the MS SQL Server database server and IBM Enterprise Service Bus. Both considered applications were implemented. Then the enterprise service bus was configured to integrate the applications for doctors and pharmacists. The configuration set up the listener endpoint for the SOAP over HTTP protocol, and queues were created for incoming and outgoing calls. Subsequently, two services were registered on the enterprise service bus: the service made available by the "e-Pharmacy" system, allowing for the realization of prescriptions, and the service made available by the "e-Prescription" system, allowing getting issued prescriptions. Incoming services were also registered: the first processes requests for access to services provided by the "e-Prescription" system, the second forwards requests to the "e-Pharmacy" application. The "e-Prescription" application allows writing a prescription and searching for prescriptions which have already been issued. The "e-Pharmacy" application offers functionality for finding a prescription for realization; the enterprise service bus is involved in this search. After a prescription is found, it can be realized in the "e-Pharmacy" application. Then, in the "e-Prescription" application, one can search for realized prescriptions, the realization having previously been done in the "e-Pharmacy" application. The enterprise service bus also participates in searching for realized prescriptions. In this way, a circulation of the electronic prescription was made available between the physician issuing a prescription and the pharmacist realizing it. Given the Social Security Number (pol. PESEL) specified by the patient, the pharmacist is able to read the prescriptions of that person and realize them.

6. Summary

The article presents the refined architectural view model "1+5" for designing integration platforms. The "1+5" model takes into account the business process layer and the layer of integration of services between systems; this model is therefore much better suited to the specifics of integration solutions. In the Integrated services view a new type of UML diagram was proposed: the mediation flows diagram. This diagram shows the flow of actions needed to transfer a call from the service consumer to the service provider's system; in particular, these may include data format conversion, message enrichment or message filtering. Since various systems usually have different data structures, protocols and message formats, the mediation flows diagram can be very useful for an Integration Architect. In order to fully describe the integration platform architecture, the previously defined "UML Profile for Integration Platform" and the new UML profile "UML Profile for Integration Flows" were included. Moreover, the following languages were used in the architectural description: SoaML13, BPMN and BPEL. The proposed model was applied in the design of the integration platform for electronic circulation of prescriptions. The proposed architectural approach is dedicated to the design of integration platforms, with a defined set of architectural views, a modeling language, a design process and tool support. Modeling and design of the integration platform were done in IBM Rational Software Architect version 8.0. There are many environments for building integration platforms on the market, and the proposed method can be used regardless of the type of such an environment; a wide range of enterprise service buses can be used: WebSphere ESB, ServiceMix (FuseESB), Mule and webMethods. Further studies are moving in the direction of design automation of integration platforms. Other important aspects of designing integration platforms in which studies are ongoing are: a simulation model of the integration platform, performance analysis of the integration platform14, and configuration of the development process for integration platforms.

References
[1] Arsanjani A. et al., SOMA: A method for developing service-oriented solutions, IBM Systems Journal, Vol. 47, No. 3, 2008, pp. 377-396.
[2] Casanave C., Service Oriented Architecture Using the OMG SoaML Standard, Model Driven Solution, 2009.
[3] Fowler M., UML Distilled, Third Edition, Addison-Wesley, 2005.
[4] Górski T., Metoda modelowania architektury platformy integracyjnej „1+5”, Inżynieria Oprogramowania w Procesach Integracji Systemów Informatycznych, PWNT Gdańsk, 2011.
[5] Górski T., Profil „UML Profile for Integration Platform” do modelowania architektury platformy integracyjnej, Inżynieria Oprogramowania w Procesach Integracji Systemów Informatycznych, PWNT Gdańsk, 2011.
[6] Górski T., Badanie wydajności wybranych środowisk budowy platform integracyjnych, Biuletyn Wojskowej Akademii Technicznej, Vol. LXI, Nr 1, 2012.
[7] Keen M., Acharya A., Implementing an SOA Using an Enterprise Service Bus, IBM, 2010.
[8] Kroll P., The Rational Unified Process Made Easy: A Practitioner's Guide to the RUP, Addison-Wesley, 2003.
[9] Web Services Description Language (WSDL) Version 2.0, W3C 2007, http://www.w3.org/TR/wsdl20/

13 Casanave C., Service Oriented Architecture Using the OMG SoaML Standard, Model Driven Solution, December 2009
14 Górski T., Badanie wydajności wybranych środowisk budowy platform integracyjnych, Biuletyn Wojskowej Akademii Technicznej, Vol. LXI, Nr 1, 2012

Journal of Theoretical and Applied Computer Science ISSN 2299-2634

Vol. 6, No. 1, 2012, pp. 35-45 http://www.jtacs.org

QualitySpy: a framework for monitoring software development processes

Marian Jureczko, Jan Magott
Wrocław University of Technology, Faculty of Electronics, Poland
[email protected], [email protected]

Abstract:

The growing popularity of highly iterative, agile processes creates an increasing need for automated monitoring of the quality of software artifacts, focused on short terms (in the case of the eXtreme Programming process, an iteration can be limited to one week). This paper presents a framework that calculates software metrics and cooperates with development tools (e.g. the source version control system and the issue tracking system) to describe the current state of a software project with regard to its quality. The framework is designed to support a high level of automation of data collection and to be useful for researchers as well as for industry. The framework is currently being developed, hence the paper reports already implemented features as well as future plans. The first release is scheduled for July.

Keywords: framework, quality assurance, software metrics, software engineering

1. Introduction

This paper presents a framework for monitoring the software development process. Failing to deliver a requested feature on an acceptable schedule and with acceptable quality may cause serious loss (e.g. financial). Therefore, it is important to keep an eye on the work progress and identify problems early. There are a number of tools employed in most software development environments. We believe that these tools constitute a valuable source of metrics that can be used to evaluate the project's current state and predict its further development. The QualitySpy framework is designed to integrate with the issue tracking system, the version control system, the continuous integration system and the source code itself by collecting raw data as well as software metrics. The collected data is used for research (empirical experimentation, model creation) and for project evaluation.

Before further discussion, some important distinctions should be made. We refer to the information collected by the QualitySpy framework as metrics or as raw data. The term metrics refers to measurements that were conducted on at least a nominal scale and describe various attributes of the investigated artifacts (e.g. the sum of modified lines in a certain class in a set of SubVersion revisions), whereas the term raw data is adequate in the case of a direct description of the artifacts (e.g. the content of a file or a class committed in a certain SubVersion revision). Furthermore, we identify two types of metrics. There are product metrics, which describe the state of an artifact at a certain time (e.g. the number of lines of code in a certain class), and there are historical metrics (sometimes called process metrics), which refer to the changes in an artifact over a time period (e.g. the number of SubVersion revisions committed between two dates).
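The distinction above can be made concrete with a small sketch: raw data is a direct record of what happened to the artifacts, and a historical metric is computed on top of it afterwards. The code below is a minimal illustration only; the record and method names are assumptions for the example, not QualitySpy's actual API.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// A minimal sketch of the raw-data / historical-metric distinction.
// Type and field names are illustrative, not QualitySpy's actual API.
public class HistoricalMetricSketch {

    // Raw data: one modification of one Java class in one SubVersion revision.
    record Modification(long revision, String className, int modifiedLines) {}

    // Historical metric: the sum of modified lines per class over a set of
    // revisions (the example used in the text above).
    static Map<String, Integer> modifiedLinesPerClass(
            List<Modification> rawData, long fromRevision, long toRevision) {
        Map<String, Integer> metric = new HashMap<>();
        for (Modification m : rawData) {
            if (m.revision() >= fromRevision && m.revision() <= toRevision) {
                // Accumulate modified lines for the class.
                metric.merge(m.className(), m.modifiedLines(), Integer::sum);
            }
        }
        return metric;
    }

    public static void main(String[] args) {
        List<Modification> rawData = List.of(
                new Modification(100, "org.example.Foo", 10),
                new Modification(101, "org.example.Foo", 5),
                new Modification(101, "org.example.Bar", 7),
                new Modification(200, "org.example.Foo", 3)); // outside range
        // Revisions 100..150: Foo = 10 + 5 = 15, Bar = 7.
        System.out.println(modifiedLinesPerClass(rawData, 100, 150));
    }
}
```

Because the raw modification records are kept, a different metric (say, the number of revisions per class) can later be defined over the same data without collecting anything again.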



The rest of the paper is organized as follows: in the next section related work is discussed; the third section presents our motivation; the fourth and fifth sections describe the QualitySpy framework, the fourth being focused on functionalities whereas the fifth discusses the framework architecture; the sixth section presents a simple case of system usage; the final section presents future plans, specifically the schedule of forthcoming releases.

2. Related work

There are several tools that are similar to our framework. First of all, we would like to mention our own solutions that laid the foundation for the QualitySpy framework. These are:
• CKJM extended (http://gromit.iiar.pwr.wroc.pl/p_inf/ckjm/) – it calculates 19 different product metrics, including the CK [2], QMOOD [1] and Tang's [14] metric suites as well as Efferent Coupling, Afferent Coupling, Lack of Cohesion in Methods 3 and Lines Of Code;
• BugInfo (http://kenai.com/projects/buginfo/) – it calculates the number of defects and six other historical metrics from SubVersion and CVS repositories;
• Metrics Repository (http://purl.org/MarianJureczko/MetricsRepo) – a web site where a number of collected metrics are publicly available.
All the aforementioned tools have been used in earlier studies [11, 12, 13].

2.1. Software developer perspective

There are tools that are focused on low-level measurements (close to the source code); hence, such tools are specifically interesting for developers. The basic set of features of the QualitySpy framework overlaps with the main functionalities of tools like Sonar (http://www.sonarsource.org/). The main purpose of such tools is to calculate quality-related metrics. Most of the metrics concern the product (e.g. lines of code); however, there are more sophisticated ones as well. Specifically, there are metrics related to test coverage and code smells (symptoms in the source code that possibly indicate a deeper problem). Both are useful for developers since they show possibilities for improving software quality.

2.2. Manager perspective

SPR KnowledgePLAN (http://www.spr.com/spr-knowledgeplanr.html) is a tool designed to help plan software projects. It supports estimating work, schedule and defects, as well as evaluating project strengths and weaknesses to determine their impact on quality and productivity. The tool is focused on upfront estimations (expressed in figures and a Gantt chart) that are based on a questionnaire filled in by the user. Collecting metrics and monitoring project progress are not supported. Similar features are available in SEER for Software (http://www.galorath.com/index.php/products/software/C5/), although the focus is moved toward statistical methods. ProjectCodeMeter (http://www.projectcodemeter.com/cost_estimation/index.html) is a tool to measure and estimate the time, cost, complexity, quality and maintainability of software projects, as well as development team productivity, by analyzing their source code. The tool supports a number of estimation models (e.g. WMFP, COCOMO) and collecting metrics from source code. Software-Cockpit (http://www.de.capgemini.com/capgemini/forschung/research/software/) is focused on monitoring software projects in order to identify potential schedule or quality problems early. The tool cooperates with other systems from the development environment (e.g. the version control system). Nevertheless, it only slightly overlaps QualitySpy features, since Software-Cockpit's focus is moved toward the perspective of high-level management, and hence many important details regarding the employed measurement and evaluation methods are not presented.

The manager perspective is focused on high-level project features and goals, like the overall cost estimation or the return on investment. Therefore, it is hardly possible to draw conclusions regarding maintainability or quality issues of low-level project modules or artifacts (e.g. classes). There is also no room for research, since the tools that target this perspective offer ready-to-use reports but limited data for user-defined analysis.

2.3. Researcher perspective

Unfortunately, the aforementioned tools have limited utility for researchers. Tools from the developer perspective provide many interesting metrics; however, there is little support for further analysis, and the available metrics are almost exclusively calculated from source code (other data sources are usually ignored). On the other hand, the tools from the manager perspective implement complex models, but in a black-box approach, and hence there is no room for self-invented analyses. In consequence, researchers usually write their own scripts to collect the necessary data (e.g. [12, 15]). Nevertheless, the metric collection process with regard to the software development environment has already been considered in several studies. Fisher et al. [6] presented a toolset for collecting data from CVS and Bugzilla that was based on Perl scripts and the wget tool. Furthermore, heuristics for identifying merges and bugfix comments were suggested. The solution was validated on the Mozilla project. Subsequently the authors (in [7]) suggested a method for combining the data collected from CVS and Bugzilla with the program call graph. The aforementioned experiences resulted in a tool called Evolizer (https://www.evolizer.org/), which is successfully employed in recent studies, e.g. [9]. Methods of extracting data from Bugzilla and CVS were also considered by D'Ambros et al. [3], Zimmermann and Weißgerber [16] and German [8].

3. Motivation

Software metrics are widely employed in areas such as effort estimation, defect prediction and maintainability assessment. All of them require empirical data that is subsequently transformed into metrics and used to construct models. Unfortunately, the data collection process may be costly and laborious (Fenton and Neil estimated the cost of a software measurement program at 4% of the development budget [4]). Therefore, it is vital to have good tool support and to limit the costs.

All three mentioned areas of software engineering (i.e. effort estimation, defect prediction and maintainability assessment) require large data sets in order to construct models with a satisfactory level of external validity. The community works on repositories that may satisfy such a requirement (e.g. promisedata.org). Nevertheless, the available data is not uniform, since different contributors choose different metrics. Recent studies showed the great value of process and historical metrics [5, 10, 11]. Such metrics can usually be extracted from issue tracking systems and version control systems. Unfortunately, the collection process is difficult and has unsatisfactory tool support; researchers usually build their own solutions, e.g. [15]. Furthermore, there is no so-called industrial standard for process and historical metrics. Therefore, different researchers use slightly different metric definitions, and hence the results are difficult to compare and to replicate on other data sets. It can also be challenging to validate the results, since even when the input data is published it may not be obvious how it was extracted from the source code or other artifacts. The publicly available data sets contain already calculated metrics, and the mapping onto software processes and artifacts may be vague. (This doubt does not regard metric definitions but missing details. E.g. let's assume that a researcher is investigating the number of distinct authors per Java class. The researcher says that he is investigating project P in version V and that he is collecting the number of distinct authors from a SubVersion repository. The situation seems to be clear... However, what if there are two branches in the SubVersion repository? Did the researcher investigate both? Moreover, it is possible that some changes have been made after the investigation was finished. Specifically, the source code visibility could have changed, e.g. a new branch is added to the public domain or the whole repository is shut down. In such a case it may not be possible to recreate the original state.)

The QualitySpy framework attempts to address the aforementioned issues. The framework is designed to collect raw data and to allow the user (e.g. a researcher) to define metrics. Therefore, it will be possible to collect uniform data (i.e. the raw data; it will be a detailed copy of the project state, and hence when the project evolves it is not necessary to recreate a previous state for investigation purposes, since the previous state is stored in the QualitySpy repository), define various metrics on top of it and conduct a variety of experiments.
Furthermore, the results can be compared among different experiments (even experiments that use different metrics), since the data source (the raw data) will be in the same format. The QualitySpy approach increases the external validity of experiments by providing uniform data. Let's assume that there is a researcher who investigates a certain phenomenon and collects empirical data in order to conduct an experiment. Since the collected data is stored in a uniform format, another researcher who is investigating another phenomenon (maybe using different metrics) can employ the already collected data in his experiment to extend its external validity.

There are two main goals for the QualitySpy framework. The more important one (mentioned above) regards collecting uniform data for research. The second one is connected with industry: the QualitySpy framework should also be of assistance in assuring quality in software development processes. The tool will cooperate with the most common development tools and systems in order to deliver valuable information with regard to project estimation and planning or software quality and maintainability. Nevertheless, we believe that accomplishing the second goal requires further research. Therefore, the first release of QualitySpy corresponds only to the first goal. The first version will be used to collect empirical data. Further investigation (based on the collected data) will show how the data can be effectively used by industry. The investigation results will be employed to define reports with regard to different project aspects.
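The "distinct authors per Java class" example discussed above shows the benefit of keeping raw data: the metric can be recomputed, or redefined, from stored commit records at any time. The sketch below illustrates this; the Commit record and method names are hypothetical, not QualitySpy's actual API.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Sketch of a metric defined on top of raw commit data: the number of
// distinct authors per class. All names are illustrative assumptions.
public class DistinctAuthorsMetric {

    // Raw data: one commit, its author, and the classes it touched.
    record Commit(String author, List<String> touchedClasses) {}

    static Map<String, Integer> distinctAuthorsPerClass(List<Commit> commits) {
        // class name -> set of authors who touched it
        Map<String, Set<String>> authors = new HashMap<>();
        for (Commit c : commits) {
            for (String cls : c.touchedClasses()) {
                authors.computeIfAbsent(cls, k -> new HashSet<>())
                       .add(c.author());
            }
        }
        // The metric is the size of each author set.
        Map<String, Integer> metric = new HashMap<>();
        authors.forEach((cls, set) -> metric.put(cls, set.size()));
        return metric;
    }

    public static void main(String[] args) {
        List<Commit> commits = List.of(
                new Commit("alice", List.of("p.A", "p.B")),
                new Commit("bob",   List.of("p.A")),
                new Commit("alice", List.of("p.A")));
        // p.A was touched by two distinct authors, p.B by one.
        System.out.println(distinctAuthorsPerClass(commits));
    }
}
```

If only the final numbers were published, questions such as "were both branches counted?" could not be answered; with the raw commit records stored, the computation can simply be repeated under different assumptions.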

4. Features

There are two main groups of features. The first is designed for data acquisition, while the second is designed for data analysis and reporting (Fig. 1).



4.1. Acquisition

The acquisition-related features are encapsulated in a user interface where the configuration for all connectors can be set. The user can define several different configurations; specifically, there can be many software projects under investigation and a separate configuration for each of them. Using the connectors, the user can collect data from different sources:
• software metrics are calculated from Java classes,
• raw data regarding the history of the source code is collected from the source version control system (e.g. SubVersion),
• raw data regarding the project history is extracted from the issue tracking system (e.g. Jira),
• raw data regarding the project history is collected from the continuous integration system (e.g. Hudson).
Three of the four aforementioned connectors collect raw data. This means that the data is stored in a textual form, without transformation into metrics. However, the user later has the possibility to define metrics on top of the raw data using the reporting interface. All the collected data is stored in a central repository and is available for further investigation. Specifically, the collected data will be available through the acquisitor's interface.

4.2. Reporting

The framework offers a publicly available web interface. All the collected data is available through this interface. Furthermore, it will also be possible to define new metrics on top of the available data and generate reports that are based on these metrics. There will be a set of predefined report templates (based on research results); however, the user will have the possibility to define his own. The reporting interface of the first QualitySpy release will be mainly used for research. We are going to formulate the research results into report templates that will be used to reflect the current state and estimations for the software project. The reports are scheduled for one of the future releases.



Figure 1. Use case diagram. Use cases with a grey background are scheduled for the second release



5. Architecture

The QualitySpy framework consists of several modules. There is a central repository with two user interfaces, one for data acquisition and one for reports (Fig. 2), and a set of connectors, which are designed to collaborate with development tools and artifacts in order to collect the data.

[Figure 2 shows the Acquisitor application and the Inspector application communicating with the Central repository over the Internet.]

Figure 2. QualitySpy high level architecture

5.1. Repository

The repository is built on a relational database, where all the collected data is stored. There is an interface for data acquisition, where the user can configure how the framework should collaborate with the development tools to collect the necessary data. There is also an independent web interface that gives access to the collected data through the internet. The latter is designed to replace our earlier service, i.e. Metrics Repository (http://purl.org/MarianJureczko/MetricsRepo).

5.2. Issue tracker connector

The issue tracker connector is designed to collect data from an issue tracker called Jira (http://www.atlassian.com/software/jira). We are going to develop support for other issue trackers (Bugzilla, Mantis and Redmine) in the future. Currently it is possible to connect to a running Jira instance and collect detailed data about selected issues (including history). QualitySpy collects data through the user interface (web page) using Selenium. Therefore, the process does not depend on the Jira configuration (i.e. the Jira CLI or Jira Web Service, which could be disabled).

5.3. Source version controller connector

The source version controller connector is designed to collect data from one of the most commonly used version control systems, namely SubVersion (http://subversion.tigris.org). Support for CVS and GIT will be implemented in one of the next versions. This connector is derived from our earlier tool, BugInfo (http://kenai.com/projects/buginfo). It is possible to connect to a SubVersion server and read detailed data about specified revisions (modified files, modified lines within those files, authors of modifications, and so on). Specifically, the connector can identify which files represent Java classes; therefore, the collected data can be easily combined with data obtained from other connectors.

5.4. Metrics connector

The metrics connector wraps the CKJM extended tool (http://gromit.iiar.pwr.wroc.pl/p_inf/ckjm/) in order to calculate software metrics from the byte code of compiled Java classes. The connector supports all 19 metrics calculated by CKJM extended and connects them with particular Java classes and methods.

5.6. Continuous integration connector

The continuous integration connector is scheduled for the second release. The connector is designed to collaborate with the Hudson continuous integration system (http://hudson-ci.org/) and to collect detailed data regarding build successes and failures, as well as other available data. The scope of available data depends on the Hudson configuration. Most common are test results; however, others (e.g. test coverage) are sometimes available as well. We are considering integration with other continuous integration systems, e.g. Atlassian Bamboo, Apache Continuum, CruiseControl.

5.7. Reporting module

The collected data and reports are available in the reporting module. We are going to implement some predefined report templates according to research results. However, the tool is intended to be flexible, and hence the user can define his own templates as well. The reports will be based on metrics, and the metrics will be defined on top of the raw data. We are going to suggest a dedicated language (or an extension to an existing one) that will be used by the user to define metrics and report templates. A report is an instance of a report template that is filled with a combination of metric values. The reporting module is implemented as a light web client, which communicates with the server (central repository) using Representational State Transfer (REST). This module is operated through a web browser, and hence it will become a part of the public domain once it has been deployed on an application container.

6. Instead of evaluation
The deadline for this paper falls a couple of weeks before the QualitySpy release, and hence some refactoring and integration activities are still in progress. In consequence, it is not possible to conduct a formal evaluation. Instead, a simple case of system usage is presented. QualitySpy was executed on the version control system of the Apache Ivy project (http://ant.apache.org/ivy/) with the following settings:
• SubVersion repository url: http://svn.apache.org/repos/asf/ant/ivy/core/trunk/
• Java classes prefix (necessary to identify whether a file represents the object of study, e.g. a Java class, and to cut off the part of the file path that does not belong to the Java packages): /(ant|incubator)/ivy/(core/)?trunk/src/java/
• Java classes postfix: .java
• start revision: 734618

• end revision: 811778

As a result, a substantial amount of data was collected and stored in the repository. For the sake of readability we present only a small subset of those data, namely the description of revision number 737330 (Tab. 1). The content of Tab. 1 differs slightly from the real commit 737330: four files from this commit are not present in the table, namely an xml file, a txt file and two JUnit test classes located in the directory test/java. Those files were excluded since they do not match the given Java classes prefix and postfix.

Table 1. Data collected with regard to Apache Ivy revision 737330

Revision  Date                 Author  Comment                                        Added lines  Removed lines  Tags  Class name
737330    2009-01-24 12:00:40  xavier  FIX: TTL does not work as expected (IVY-1012)  2            6              -     org.apache.ivy.core.cache.DefaultRepositoryCacheManager
737330    2009-01-24 12:00:40  xavier  FIX: TTL does not work as expected (IVY-1012)  3            3              -     org.apache.ivy.core.cache.RepositoryCacheManager
737330    2009-01-24 12:00:40  xavier  FIX: TTL does not work as expected (IVY-1012)  8            0              -     org.apache.ivy.plugins.resolver.AbstractResolver
737330    2009-01-24 12:00:40  xavier  FIX: TTL does not work as expected (IVY-1012)  7            0              -     org.apache.ivy.plugins.resolver.BasicResolver

For every row, the New content and Old content columns (omitted above) hold the complete committed file and its previous state; each begins with the same license header:
/* * Licensed to the Apache Software Foundation (ASF) under one or more * contributor license agreements. See the NOTICE file.....

The collected data consist of:
• Revision number – the revision number used in the SubVersion repository.
• Date – the date when the revision was committed.
• Author – identifier of the developer who committed the revision.
• Comment – the commentary that was assigned to the commit.
• Added lines – the number of lines that were added to the committed file (Java class).
• Removed lines – the number of lines that were removed from the committed file (Java class).
• Tags – the version control system tags that are assigned to the revision.
• New content – the whole content of the committed file; Tab. 1 presents only the beginning of the file.
• Old content – the whole content of the previous state of the committed file (i.e. the state before the commit); Tab. 1 presents only the beginning of the file. Old content together with New content may be used to track down every single change in the committed file.
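As the last item notes, the stored Old content and New content are enough to recompute line-level changes. A sketch using Python's difflib (purely illustrative — the Added/Removed counts in Tab. 1 come from the version control data itself):

```python
import difflib

def count_changed_lines(old_content, new_content):
    """Recompute added/removed line counts from the stored file contents."""
    diff = list(difflib.ndiff(old_content.splitlines(), new_content.splitlines()))
    added = sum(1 for line in diff if line.startswith("+ "))
    removed = sum(1 for line in diff if line.startswith("- "))
    return added, removed
```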

7. Schedule and future work
The QualitySpy framework is developed by undergraduate computer science students at the Faculty of Electronics. The students formed a 16-member team divided into 4 subteams and worked for one semester. The software development process is based on Scrum. The first release will be made after two semesters of development (a new group of students takes over for the second semester) and is scheduled for July 2012. The first release contains only the basic data collection features, namely integration with CKJM extended (software metrics), a version control system (i.e. SubVersion) and an issue tracking system (i.e. Jira). There is a database where all the collected data is stored and a very simple reporting module that gives read-only access to the collected data. All the other mentioned features are scheduled for future releases. Specifically, we are going to use the first release to collect data for experiments that will constitute a solid basis for the reporting module. The first release, as well as all forthcoming ones, will be available through the project web page: http://purl.org/MarianJureczko/QualitySpy. We believe that we will be able to make a release each year. In consequence, the second release is scheduled for July 2013.

Acknowledgement
We would like to thank all the students that were involved in the QualitySpy development process. A complete list of the students is available online: http://purl.org/MarianJureczko/QualitySpy/Acknowledgement.

References
[1] Bansiya J. and Davis C.G. A Hierarchical Model for Object-Oriented Design Quality Assessment. IEEE Transactions on Software Engineering, No 1(28), 2002, 4-17.
[2] Chidamber S. and Kemerer C. A metrics suite for object oriented design. IEEE Transactions on Software Engineering, No 20(6), 1994, 476-493.
[3] D'Ambros M., Lanza M., Pinzger M. "A Bug's Life" Visualizing a Bug Database. 4th IEEE International Workshop on Visualizing Software for Understanding and Analysis, 2007, 113-120.
[4] Fenton N. and Neil M. Software Metrics: Successes, Failures and New Directions. Journal of Systems and Software, No 47(2-3), 1999, 149-157.

[5] Fenton N., Neil M., Marsh W., Hearty P., Radliński Ł. and Krause P. Project Data Incorporating Qualitative Factors for Improved Software Defect Prediction. Proceedings of the 29th International Conference on Software Engineering Workshops, 2007.
[6] Fischer M., Pinzger M. and Gall H. Populating a Release History Database from Version Control and Bug Tracking Systems. International Conference on Software Maintenance (ICSM '03), Washington, DC, USA, 2003.
[7] Fischer M., Pinzger M. and Gall H. Analyzing and Relating Bug Report Data for Feature Tracking. 10th Working Conference on Reverse Engineering (WCRE '03), Washington, DC, USA, 2003.
[8] German D.M. Mining CVS repositories, the softChange experience. IEE Seminar Digest, No 917, 2004, 17-21.
[9] Giger E., Pinzger M. and Gall H. Using the gini coefficient for bug prediction in eclipse. 12th International Workshop on Principles of Software Evolution and the 7th annual ERCIM Workshop on Software Evolution (IWPSE-EVOL '11), New York, USA, 2011.
[10] Illes-Seifert T. and Paech B. Exploring the relationship of a file's history and its fault-proneness: An empirical method and its application to open source programs. Information and Software Technology, No 52(5), 2010, 539-558.
[11] Jureczko M. Significance of Different Software Metrics in Defect Prediction. Software Engineering: An International Journal, No 1(1), 2011, 86-95.
[12] Jureczko M. and Madeyski L. Towards identifying software project clusters with regard to defect prediction. Sixth International Conference on Predictive Models in Software Engineering, Timisoara, Romania, September 12-13, 2010.
[13] Jureczko M. and Spinellis D. Using object-oriented design metrics to predict software defects. Fifth International Conference on Dependability and Complex Systems DepCoS-RELCOMEX 2010, pp. 69-81.
[14] Tang M.-H., Kao M.-H. and Chen M.-H. An Empirical Study on Object-Oriented Metrics. Sixth IEEE International Symposium on Software Metrics, 1999.
[15] Weiss C., Premraj R., Zimmermann T. and Zeller A. How Long Will It Take to Fix This Bug? Proceedings of the 4th International Workshop on Mining Software Repositories, 2007.
[16] Zimmermann T. and Weißgerber P. Preprocessing CVS data for fine-grained analysis. IEE Seminar Digests, No 917, 2004, 2-6.

Journal of Theoretical and Applied Computer Science ISSN 2299-2634

Vol. 6, No. 1, 2012, pp. 47-66 http://www.jtacs.org

Controllability and observability gramians parallel computation using GPU Damian Raczyński, Włodzimierz Stanisławski Opole University of Technology, Faculty of Electrical Engineering, Automatic Control and Computer Science, Poland [email protected]

Abstract:

Algorithms and parallel programs for computing the controllability and observability gramians of Linear Time Invariant (LTI) systems from the Lyapunov equation on NVIDIA general-purpose Graphics Processing Units (GPUs) are presented in the paper. Parallel computing of the gramians on the basis of the Lyapunov equation is justified for large scale systems (n > 10^4) due to the computational cost O(n^3). The parallel performance of controllability gramian computation on NVIDIA GTX 465 graphics hardware has been compared with the performance obtained in the MATLAB environment with analogous algorithms, as well as with the lyap function provided by the MATLAB environment. The maximum computing acceleration reached 20. The computations have been made on the basis of linearized models of the one-phase zone of a once-through boiler obtained with the finite element method, with model orders ranging from 30 to 4200.

Key words: Controllability and observability gramians, Lyapunov equation, GPU, parallel computing

1. Controllability and observability gramians concept
For LTI systems, described by the state equations:

\dot{x}(t) = A x(t) + B u(t),
y(t) = C x(t) + D u(t)                                  (1)

a controllability gramian P and an observability gramian Q are square symmetric matrices determined in accordance with [20]:

P = \int_0^{\infty} e^{At} B B^* e^{A^*t} \, dt,
Q = \int_0^{\infty} e^{A^*t} C^* C e^{At} \, dt          (2)

In practice, the matrices P and Q are determined on the basis of the Lyapunov equations [9]:

A P + P A^* + B B^* = 0,
A^* Q + Q A + C^* C = 0                                  (3)

One of the main methods of linear model reduction by means of Singular Value Decomposition (SVD) is based on the gramian matrices P and Q [1].
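To make the role of the gramians in SVD-based reduction concrete: for small systems, (3) can be solved directly through the Kronecker-product form vec(AP + PA^T) = (I ⊗ A + A ⊗ I) vec(P), and the Hankel singular values σ_i = sqrt(λ_i(PQ)) then rank the states for truncation. A numpy sketch for real matrices (O(n^6) work, so usable only for toy models — the whole point of the paper's algorithms is to avoid this for large n):

```python
import numpy as np

def lyap_gramian(A, B):
    """Solve A P + P A^T + B B^T = 0 through the Kronecker-product
    linear system; column-major (Fortran-order) vec convention."""
    n = A.shape[0]
    K = np.kron(np.eye(n), A) + np.kron(A, np.eye(n))
    vec_p = np.linalg.solve(K, -(B @ B.T).reshape(-1, order="F"))
    return vec_p.reshape((n, n), order="F")

def hankel_singular_values(A, B, C):
    """sigma_i = sqrt(lambda_i(P Q)), the basis of SVD (balanced) reduction."""
    P = lyap_gramian(A, B)        # controllability gramian
    Q = lyap_gramian(A.T, C.T)    # observability: A^T Q + Q A + C^T C = 0
    eigs = np.linalg.eigvals(P @ Q)
    return np.sort(np.sqrt(np.abs(eigs.real)))[::-1]
```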


Since both Lyapunov equations (3) have similar forms, all algorithms and programs in the remainder of the paper are based on the procedure for determining the controllability gramian P.

2. Methods of solving the Lyapunov equation
The basic methods for solving the Lyapunov equation A P + P A^* + B B^* = 0 are as follows [20, 22]:
• Bartels-Stewart method,
• Smith method,
• Alternating Direction Implicit (ADI) method,
• Sign Function method.
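The sign function method from the list above has a particularly compact classical form (due to Roberts): for stable A, Newton's iteration for sign(A) is run with the constant term carried along, and the gramian is half the limit of the carried term. A numpy illustration — a dense CPU sketch, not the authors' GPU implementation:

```python
import numpy as np

def lyap_sign(A, W, tol=1e-12, max_iter=100):
    """Solve A X + X A^T + W = 0 for stable A with the sign-function
    iteration  A_{k+1} = (A_k + A_k^{-1}) / 2,
               W_{k+1} = (W_k + A_k^{-1} W_k A_k^{-T}) / 2;
    then X = lim W_k / 2 (A_k converges to sign(A) = -I)."""
    Ak, Wk = A.astype(float).copy(), W.astype(float).copy()
    for _ in range(max_iter):
        Ainv = np.linalg.inv(Ak)
        Wk = 0.5 * (Wk + Ainv @ Wk @ Ainv.T)
        A_next = 0.5 * (Ak + Ainv)
        if np.linalg.norm(A_next - Ak) < tol:
            Ak = A_next
            break
        Ak = A_next
    return 0.5 * Wk
```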

2.1. Bartels-Stewart method
The algorithm includes the following steps. Firstly, Schur decomposition is applied to transform the matrices A and A^* into upper triangular form [22]:

R_1 = U A U^*, \quad R_2 = V A^* V^*                     (4)

and then matrix \tilde{D} is formed according to the following relation:

\tilde{D} = U^* (B B^*) V                                (5)

After these transformations, the Lyapunov equation takes the following form:

R_1 \tilde{X} + \tilde{X} R_2 + \tilde{D} = 0            (6)

Due to the triangular form of the matrices R_1 and R_2, equation (6) may be solved efficiently by applying the following relation:

(R_1 + r_{kk}^{(2)} I) \tilde{x}_k = -\tilde{d}_k - \sum_{i=1}^{k-1} \tilde{x}_i r_{ik}^{(2)}     (7)

where r_{ik}^{(2)} denotes an element of matrix R_2 in the row and column defined by the subscripts, \tilde{d}_k denotes the k-th column of matrix \tilde{D}, and \tilde{x}_i denotes the i-th column of matrix \tilde{X}. In the final step, the matrix which is the solution of the Lyapunov equation is determined from the following relation:

gramian = U \tilde{X} V^*                                (8)

Figure 1 presents a program for the MATLAB environment determining the controllability gramian by means of the Bartels-Stewart method.

2.2. Smith method
The Smith method is based on a conversion of the Lyapunov equation, by means of a bilinear transformation, into the discrete form presented below [9]:

V P V^* - P + W = 0                                      (9)

where:


V = (qI - A^*)^{-1} (qI + A^*),                          (10)

W = 2q (qI - A^*)^{-1} B B^* (qI - A)^{-1}.              (11)

Assuming that matrix A is asymptotically stable, the consecutive approximations of the gramian for a parameter q > 0 are obtained from the following relation:

P_{K+1} = P_K + V^{2^K} P_K (V^{2^K})^*,                 (12)

with the initial value

P_0 = 2q (qI - A^*)^{-1} B B^* (qI - A)^{-1}.            (13)
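Equations (12)–(13) translate almost line for line into code. A numpy stand-in for the paper's MATLAB program (note that the listing in Figure 2 first executes A = A', so the conjugations in (10)–(13) act on the transposed matrix; the sketch below instead forms V directly from a real A and solves A P + P A^T + B B^T = 0):

```python
import numpy as np

def smith_gramian(A, B, q=0.1, tol=1e-12, max_iter=60):
    """Controllability gramian of a stable A by the squared Smith iteration:
    P_{K+1} = P_K + V^(2^K) P_K (V^(2^K))^T with P_0 = W, where V and W come
    from the bilinear transformation with parameter q > 0."""
    n = A.shape[0]
    I = np.eye(n)
    M = np.linalg.inv(q * I - A)           # (qI - A)^{-1}
    V = M @ (q * I + A)
    P = 2.0 * q * (M @ (B @ B.T) @ M.T)    # P_0 = W
    Vk = V                                 # holds V^(2^K)
    for _ in range(max_iter):
        N = Vk @ P @ Vk.T
        P = P + N
        if np.max(np.abs(N)) < tol:        # stop when the increment is negligible
            break
        Vk = Vk @ Vk                       # doubling: square the propagator
    return P
```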

The number of iterations required for a desired accuracy depends on an appropriate selection of the parameter q (as shown experimentally, the most favourable value is q = 0.1). Figure 2 presents a program for the MATLAB environment determining the controllability gramian by means of the Smith method.

[rows, cols] = size(A);
[Q1, R1] = m_schur(A);     % Schur decomposition of A
[Q2, R2] = m_schur(A');    % Schur decomposition of A*
D = Q1' * B * Q2;
X = zeros(rows, cols);
for counter = 1:rows
    result = zeros(rows, 1);
    if counter > 1
        R(1:counter-1, 1) = R2(1:counter-1, counter);
        result = X(:, 1:counter-1) * R;
    end
    % column recursion (7); note the minus sign on D(:, counter)
    X(:, counter) = (R1 + R2(counter, counter) * eye(rows, cols)) \ (-D(:, counter) - result);
end
P = Q1 * X * Q2';

Figure 1. The controllability gramian computation by means of Bartels-Stewart method
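The core of the listing above is the column recursion (7). A numpy sketch of just that step — solving R_1 X + X R_2 + D = 0 for upper-triangular R_1 and R_2, with the surrounding Schur decomposition (4) and back-transformation (8) assumed to have been done already:

```python
import numpy as np

def solve_triangular_sylvester(R1, R2, D):
    """Solve R1 X + X R2 + D = 0, with R1 and R2 upper triangular, by the
    column recursion (7):
    (R1 + r_kk^(2) I) x_k = -d_k - sum_{i<k} x_i r_ik^(2)."""
    n, m = R1.shape[0], R2.shape[1]
    X = np.zeros((n, m))
    for k in range(m):
        rhs = -D[:, k] - X[:, :k] @ R2[:k, k]
        X[:, k] = np.linalg.solve(R1 + R2[k, k] * np.eye(n), rhs)
    return X
```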

A = A';
q = 0.1;
[rows, cols] = size(A);
V = (q*eye(rows, cols) - A')^-1 * (q*eye(rows, cols) + A');
W = 2*q * (q*eye(rows, cols) - A')^-1 * Q * (q*eye(rows, cols) - A)^-1;
P = W;
futher = true;
while futher
    N = V*P*V';
    P = P + N;
    if (max(max(abs(N))) ...

if (i > length(parameters))
    [parameters, lambdas] = adi_parameters3(A, parameters, iterations);
end
if (mod(i, iterations) == 0)
    P_old = P;
end
P1_2 = (A + parameters(i)*eye(rows)) \ (-B - P*(A' - parameters(i)*eye(rows)));
P = (A + parameters(i)*eye(rows)) \ (-B - P1_2'*(A' - parameters(i)*eye(rows)));
if (mod(i, iterations) == 0)
    if (max(max(abs(P_old - P))) ...
