Modular Flow Analysis for Concurrent Software


Matthew B. Dwyer
Department of Computing and Information Sciences
Kansas State University
[email protected]

Abstract

Modern software systems are designed and implemented in a modular fashion by composing individual components. Early validation of individual module designs and implementations offers the potential to detect and correct defects that might otherwise go undetected until system-level validation. This is particularly true for errors related to interactions between system components. In this paper, we describe a static analysis approach that allows validation of components, or groups of components, of sequential or concurrent software systems. This work builds on an existing approach, FLAVERS, that uses program flow analysis to verify explicitly stated correctness properties of software systems. We illustrate our modular analysis approach and some of its benefits by describing part of a case study with a realistic concurrent multi-component system.

1. Introduction

Users and developers of large, complex software systems require cost-effective techniques that can be applied to gain confidence in system correctness. For sequential software, determining whether a design or implementation satisfies an explicitly stated correctness property is a difficult problem. For concurrent software, the problem is significantly more difficult both theoretically and in practice. A wide variety of approaches have been developed to address this problem for whole programs. It is well known that early life-cycle validation of system components offers the potential to reduce overall development cost. In this paper, we present an extension to the FLAVERS analysis approach [8, 10] that can be applied to component implementations at the unit level, so that the analysis results can be leveraged effectively for integration and system-level analyses. FLAVERS applied to program components is referred to as modular FLAVERS.

Top-down, bottom-up, and mixed software development strategies all require that testing and analysis approaches be applicable to partially implemented programs. In the case of top-down development, modular analysis can be applied to reason about a partial program by incorporating stub specifications for unimplemented portions of the program. In the case of bottom-up development, modular analysis can be applied to reason about a partial program by incorporating interface specifications for implemented portions of the program and performing unit-level analysis. Such specifications capture information about the context in which the partial program will execute, and as such they can be crucial to enabling precise analysis.

While the approach presented in this paper is applicable to these inter-procedural settings, we focus on the modular analysis of concurrently executing components. The FLAVERS approach and the extensions described in this paper are not limited to a specific programming model; the examples in the paper, however, are Ada-83 programs. To support empirical evaluation of our approach we have implemented our modular analysis for programs written in Ada-83 that do not use dynamic task creation and do not include exceptional control flow. We describe some experiences with applying modular FLAVERS analysis to a realistic concurrent Ada-83 program.

The next section describes related work. We go on to provide a high-level overview of FLAVERS in the following section, and then we describe the extensions to FLAVERS that enable modular analysis. We report on the application of FLAVERS to the modular analysis of a selected application in Section 5. In the final section, we conclude and discuss some directions for future work.

This work was supported in part by NSF and DARPA under grants CCR-9633399 and CCR-9703094.

2. Related Work

There is a large body of research on flow analysis for sequential programs, e.g., [14]. Recently these techniques have begun to be adapted to explicitly concurrent programs, e.g., [12, 15, 17]. The modular structure of sequential programs has been the focus of much work on inter-procedural analysis. Conceptually, we can think of a modular analysis of concurrent software in much the same way as inter-procedural analysis. Both include the notion of a summary of the behavior of the context in which a module executes. This summary is used to sharpen the precision of the analysis of the module itself. There is an important difference between the two: in concurrent software we must interpret the summarized behavior as executing in parallel, rather than in sequence, with the module under analysis. To the best of our knowledge, this has not been done by other program flow analyses.

FLAVERS is a flow analysis designed to solve a class of specification verification problems over concurrent programs. In recent years, researchers have developed a number of concurrency analysis approaches that employ flow analyses. Masticola and Ryder [16] developed an analysis approach for checking deadlock freedom in Ada tasking programs that uses flow analyses [17] to improve the accuracy of the analysis results. Cheung and Kramer [3] developed a flow analysis that is capable of detecting a class of anomalous behaviors in concurrent programs. Flow analysis is also used in other static concurrency analyses for refining the models on which analysis is based rather than as the primary analysis algorithm.

The field of concurrency analysis is rich with different approaches. As Corbett [6] discusses, despite exponential worst-case bounds on their running time, many of these approaches are capable of providing cost-effective analysis of non-trivial programs. Nevertheless, there have been numerous efforts intended to further reduce the cost of analysis. Most relevant to our work are the compositional approaches. These apply a divide-and-conquer strategy to decompose a program into subsystems, analyze the subsystems in isolation, then recombine subsystem analysis results to infer properties of the whole program. Compositional variants of state reachability analyses, e.g., [20, 4], model checking, e.g., [5], and integer necessary conditions analysis [7] have all been shown to provide lower cost analysis than their non-compositional counterparts for selected systems. The work described in this paper contributes another analysis technique that could be applied as part of a compositional analysis, although our focus is on analyzing modules in isolation.

Our work shares much in common with the recent work of Avrunin et al. [1]. Both approaches consider partially defined systems, but each applies a different analysis technique to the problem. Their work focuses on real-time aspects of system behavior, but could also be applied to the correctness properties considered in this paper.

The notion of capturing the behavior of the context in which a system executes has a long history in concurrency analysis and is fundamental to the compositional approaches mentioned above. Perhaps the best known example of incorporating the behavior of a system's environment into the verification process is the use of fairness constraints. Notions of fairness are useful, but limited to expressing requirements related to the progress of concurrently executing system components. In general, we wish to incorporate partial knowledge about the behavior of an open system's environment for a broad class of environments. This knowledge effectively converts an open system to a closed system that can be safely verified. Recent work by Kupferman and Vardi [13] develops complexity results for model checking of partial systems in the presence of temporal logic formulae that describe environment assumptions. Dwyer and Schmidt [11] show that either temporal logic formulae or finite-state automata can be used to filter an imprecise description of a system, for example one that leaves the environment completely unconstrained, to enable precise model checking. In this paper we introduce environment automata to achieve a similar filtering; these automata are closely related to the interface constraints Cheung and Kramer [4] use in their compositional reachability analysis.

3. Background

To address the fundamental complexity of reasoning about the execution behavior of concurrent programs, FLAVERS takes a novel approach. Rather than constructing a precise representation of program behavior and applying costly analysis algorithms that attempt to produce highly accurate results, FLAVERS takes an incremental approach to producing accurate analysis results. Analysis begins by constructing an imprecise, conservative model of program behavior. Conservative flow analysis algorithms are then applied to compare the modeled program behaviors to a specification of intended program behavior. Subsequent to this initial analysis, the precision of both the program model and the flow analysis algorithms can be increased. This flexibility allows users to begin with a low-cost analysis and then incorporate only as much information as is necessary to obtain precise analysis results for the specification in question.

FLAVERS is described in detail in [8, 10]. Our goal here is to provide sufficient background so that the reader can understand how we have modified the analysis approach for modular analysis. FLAVERS consists of three major components.

3.1. Trace Flow Graph

The trace flow graph (TFG) is a model of the set of feasible program executions; it is conservative in the sense that all such executions are included in the model. Conceptually, the TFG is a forest of task control flow graphs with additional nodes and edges that represent inter-task communication, synchronization and execution ordering information.

Users designate an alphabet of program events, denoted by $\Sigma$, such that each symbol corresponds to the execution of a specific program activity, such as the definition of a variable or a procedure call. Symbols from this alphabet are used to label the nodes of the TFG; thus program executions are modeled as strings over $\Sigma$.

[Figure 1. Subsystem Interfaces. Diagram labels: P, E, Interface.]

3.2. Property Automaton

The property automaton (PA) is a deterministic finite state automaton defined over an alphabet $\Sigma$. The language recognized by such an automaton is conservative in the sense that it accepts all relevant behaviors, over $\Sigma$, that the program is intended to exhibit. Typically, these specifications only partially describe the intended behavior of a program. Individual specifications can focus on key aspects of program behavior yet remain small; program behavior can be more completely specified by combining individual specifications.
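As a minimal sketch of such a specification, the Python fragment below encodes a two-state deterministic automaton for the informal property "whenever an a occurs, a b must follow". The state numbering, the transition table `DELTA`, and the function `accepts` are assumptions made for illustration; the automaton in the paper's own figures may be structured differently.

```python
# Assumed two-state automaton: state 1 = no pending a, state 2 = a awaiting b.
START = 1
ACCEPTING = {1}
DELTA = {
    (1, "a"): 2,
    (1, "b"): 1,
    (2, "a"): 2,
    (2, "b"): 1,
}


def accepts(events):
    """Run the automaton over a sequence of event symbols."""
    state = START
    for symbol in events:
        state = DELTA[(state, symbol)]
    return state in ACCEPTING
```

For instance, `accepts("ab")` holds while `accepts("aba")` does not, because the final a is never followed by a b.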

3.3. State Propagation Analysis

State propagation analysis is a monotone, bounded flow analysis algorithm that performs a conservative test for whether the sequences of program actions modeled by the TFG are contained in the language of the PA. If this containment test is true, then the analysis has conclusively established that the program satisfies the specification. If the test is false, then either the program fails to satisfy the specification or, due to imprecision in the analysis, a false negative result was produced; in either case we say the result is inconclusive.

As described above, the intention is that this analysis be efficient at the expense, at least initially, of inconclusive results. Efficiency comes in part from the fact that the state propagation algorithm has a bound on its execution time that is linear in the number of PA states and cubic in the number of TFG nodes. In order to eliminate false negative results, FLAVERS provides mechanisms for the analyst to increase the precision of state propagation analysis by incorporating feasibility constraints (FCs) into the flow analysis algorithm.

These techniques have been implemented and applied to the analysis of a number of complete concurrent Ada-83 applications [2, 8, 18]. This experience indicates that FLAVERS is capable of producing conclusive analysis results in verifying a variety of specifications. Furthermore, empirical evidence suggests that the cost of FLAVERS analyses grows sub-cubically in the number of TFG nodes.

4. Modular FLAVERS

FLAVERS was originally defined for the analysis of whole programs. A TFG was built to represent all possible behaviors of the program. To enable modular analysis we must be able to formulate FLAVERS analyses over partially defined applications, such as collections of tasks or even an individual program task. This requires the definition of a TFG for incomplete applications. To assure the conservativeness of a modular analysis, this incomplete TFG must be combined with a representation of all possible behaviors of the rest of the application. We use a new type of feasibility constraint to form such a representation and incorporate it into FLAVERS analysis.

4.1. Partial Programs

Before we can describe how to model and analyze a partial program, we must define what we mean by a partial program. We view a concurrent program as being composed of a collection of tasks. Consider a partition of the set of program tasks into two subsets: $P$, the partial program of interest, and $E$, the rest of the program, which is referred to as $P$'s environment; we refer to these subsets as subsystems. We require that the alphabet of program actions be decomposed into two subsets, $\Sigma_P$ and $\Sigma_E$, such that the intersection of these alphabets represents the points at which the subsystems interact, as illustrated in Figure 1. We call this the interface alphabet, $\Sigma_{int} = \Sigma_P \cap \Sigma_E$.

In this paper, we restrict the actions modeled by symbols in the interface alphabet to be inter-task communication requests, either sends or receives, such that all sends to a specified channel lie on one side of the interface and all receives lie on the other. This restriction is not severe and can be lifted, allowing arbitrary actions to be modeled by interface symbols; while this generalization is not difficult, we do not consider it in this paper.

Our modular analysis will use the TFG representation of a subsystem, such as $P$, and a feasibility constraint that conservatively represents the possible sequences of program actions that lie in the subsystem interface.
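The alphabet decomposition of Section 4.1 can be sketched with ordinary set operations. In the Python fragment below the symbol names and the sample execution are hypothetical; the sketch only shows how an interface alphabet arises as an intersection and how an execution looks when restricted to the symbols of the partial program.

```python
# Hypothetical alphabets: a and b are communications across the interface.
SIGMA_P = {"a", "b", "log"}       # actions of the partial program P
SIGMA_E = {"a", "b", "compute"}   # actions of its environment E

SIGMA_INT = SIGMA_P & SIGMA_E     # interface alphabet
SIGMA = SIGMA_P | SIGMA_E         # alphabet of the whole program


def project(trace, alphabet):
    """Keep only the symbols of trace that belong to alphabet."""
    return [symbol for symbol in trace if symbol in alphabet]


# One execution of the whole program and its view from inside P.
trace = ["compute", "a", "log", "compute", "b"]
visible_to_p = project(trace, SIGMA_P)
```

Here `SIGMA_INT` is `{"a", "b"}` and `visible_to_p` drops the environment-internal `compute` actions, leaving `["a", "log", "b"]`.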

4.2. Partial TFGs

Before describing TFGs for partial programs we discuss TFGs for whole programs as described in [8]. TFGs are constructed from a collection of control flow graphs, one for each task in the program. The first phase of TFG construction involves identifying pairs of matching communication statements, e.g., Ada entry calls and accept statements for the same task entry. Figure 2 illustrates the TFG structure that is built to represent the joint flow of data and control that is present when pairwise inter-task communication is modeled. These jointly executed operations are denoted as diamonds. Similar structures are constructed to model the passing of data between communication partners and the execution of fragments of code while synchronized.

[Figure 2. Modeling inter-task Communication. Panels: Uncombined Task Flow Graphs; Trace Flow Graph. Task T1 executes accept E; task T2 executes T1.E.]

To faithfully model all feasible sequences of statements the TFG incorporates an additional class of edges; these may immediately precede (MIP) edges represent the pairwise interleavings of the program actions that label their source and destination nodes. Interleavings of strings of actions in asynchronously executing tasks are represented in the transitive closure of control flow and MIP edges. Figure 3 illustrates how a TFG path including MIP edges, drawn as double-headed dashed lines, represents the asynchronous execution of actions in independent tasks. The dotted line depicts a path through the TFG that represents a program execution where the scheduler suspends T1 after the assignment to x and runs T2 for two statements before suspending it and switching back to T1. The paths in a TFG model all possible scheduling decisions.

[Figure 3. Modeling Asynchrony. Statements shown for tasks T1 and T2: x := 17; Put(x); y := y + 1; done := y >= 10.]

Our goal is to construct a TFG, for a partial program $P$, that models all possible sequences of program actions whose labels are in $\Sigma_P$. To construct such a TFG, we apply the same algorithm that is used for building TFGs for complete programs with one modification. If a task in $P$ includes a request for communication, either a send or receive, whose associated symbol is in $\Sigma_{int}$, we do not construct inter-task communication edges or nodes. Instead we model the communication statement as any other non-communication statement. For the example shown in Figure 2, if task T1 were not included in the partial system then the call on the E entry would be modeled by a single node, just as it is in the uncombined task flow graph. The resulting trace flow graph says nothing about those tasks not present in the partial program, nor does it constrain the order or number of times that interface actions may be performed.

As long as we are only reasoning about the actions in $\Sigma_P$, we do not need to model the asynchronous execution of actions that are internal to the environment of $P$. Conceptually, we can view program executions as strings over $\Sigma$ and state propagation analysis as the exhaustive enumeration of those strings and checking their acceptance by the PA. By restricting our consideration to $\Sigma_P$, we need only check the projection of those strings over $\Sigma_P$ for acceptance. Symbols in $\Sigma - \Sigma_P$ are internal to the environment and cannot appear in the projected strings; thus they can be ignored for the purpose of modular analysis. This means that in addition to the nodes and control flow edges that are internal to the environment, we can also eliminate MIP edges that cross the interface from the TFG of a partial program. Note that we can build such a TFG directly and do not need to construct the TFG for the complete program first. The result is a TFG for a partial program that is conservative in modeling all feasible sequences of program actions from $\Sigma_P$. A formal definition and proofs of the conservativeness of such partial TFGs for FLAVERS analyses are given in [9].
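The scheduling decisions that a whole-program TFG covers can be illustrated by enumerating interleavings directly. The Python sketch below is not how a TFG is built (MIP edges represent interleavings implicitly, without enumeration); it simply lists every order-preserving merge of two statement sequences. The assignment of the four Figure 3 statements to T1 and T2 is an assumption based on the description in the text.

```python
def interleavings(xs, ys):
    """All merges of xs and ys that preserve the order within each list."""
    if not xs:
        return [list(ys)]
    if not ys:
        return [list(xs)]
    first_from_xs = [[xs[0]] + rest for rest in interleavings(xs[1:], ys)]
    first_from_ys = [[ys[0]] + rest for rest in interleavings(xs, ys[1:])]
    return first_from_xs + first_from_ys


# Assumed task bodies, taken from the statements visible in Figure 3.
T1 = ["x := 17", "Put(x)"]
T2 = ["y := y + 1", "done := y >= 10"]

schedules = interleavings(T1, T2)

# The schedule described in the text: T1 runs one statement, T2 runs two,
# then control switches back to T1.
described = ["x := 17", "y := y + 1", "done := y >= 10", "Put(x)"]
```

Two tasks of two statements each yield six schedules, and `described` is one of them. The count grows combinatorially with task length, which is one reason a compact graph representation is preferable to enumeration.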

4.3. Environment Automata

While the unconstrained behavior of the environment that is modeled in the TFG for a partial program is conservative, it may include many infeasible sequences of interface actions. This can have a deleterious effect on the precision of FLAVERS analyses and lead to inconclusive results. Our goal is to construct feasibility constraints that allow us to restrict the environment's behavior to be more precise, yet remain conservative.

First, a few words on how feasibility constraints are used in FLAVERS analyses. Feasibility constraints are built to encode necessary conditions for path feasibility. During state propagation analysis these conditions are tested, and if violations of the conditions are detected, analysis data can be ignored, since it was produced by considering an infeasible path. The key observation that enables FCs to be cost-effective is that we need not enforce complete information about path feasibility in a constraint. Instead, FCs model selected components of control and data states of the program that can be used to eliminate some infeasible paths. Combining multiple constraints in a single analysis allows increasingly higher levels of precision to be obtained. Constraining the flow of values to be consistent with program execution semantics is the key benefit of feasibility constraints.

Previous work with FLAVERS incorporated constraints that capture task control flow information and variable state information. We add a new type of feasibility constraint that restricts the sequences of actions that correspond to interface symbols of a partially specified program. These constraints either encode known properties of the environment, derived by previous analyses, or specify intended environment behavior. We refer to such constraints as environment automata (EA). A formal definition of the structure of environment automata parallels that for other feasibility constraints and is given in [9].

4.4. Example

Figure 4 illustrates the TFG for the program consisting of Server and Client tasks, with ovals denoting local task operations and diamonds denoting joint communication operations. It also gives the PA specification for the property "on all program executions, if an a occurs then a b must follow". Note that the unlabeled nodes are to be interpreted as having no label for state propagation flow analysis, in which case an identity transfer function is applied. While this example is very simple, it allows us to illustrate the principles and some details of modular flow analysis.

Based on the alphabet of the PA we can divide the system in half with $\Sigma_{int} = \{a, b\}$. Figure 5 illustrates the partial TFGs for both Client and Server. Note that the communication nodes from the whole program TFG have been converted to be local nodes in the partial TFGs. We could attempt a modular analysis of the specification for either partial TFG, but we only illustrate an analysis of the Server. To do this we use the EA from Figure 5, which approximates the behavior of the environment, i.e., Client.

Using the partial TFG we could conduct a state propagation flow analysis without using the EA as a constraint. Figure 6 illustrates the final values computed for each TFG node; the values are elements of the powerset of PA states. Note that the $\{1, 2, 3\}$ value at the final node is not a subset of the accepting PA states. Thus, the analysis cannot conclusively verify the specification for all executions.

When the EA is incorporated into analysis the picture becomes more complicated. Values at TFG nodes are elements of the powerset of the Cartesian product of PA and EA states. In general, a constraint enters its violation state v when a sequence of symbols associated with a path through the TFG is inconsistent with respect to the information encoded in the constraint. In the case of an EA that means that the path violates the known behavior of the environment. Whenever a tuple of the form $(\cdot, v)$ is encountered the tuple is discarded, as depicted in the figure. Tuples may flow to the final TFG node, such as $(2, 2)$, that do not correspond to accepting states of the EA. All such tuples are discarded prior to comparison of the PA state values to the accepting state. To perform this comparison we collect only the PA components of the value tuples, ignoring all feasibility constraint values. In our example, this leaves states $\{1, 3\}$, which is a subset of the accepting PA states. Thus, for our example, the addition of the EA for the Client task eliminated enough infeasible paths from the analysis so that conclusive verification of the specification was achieved.

4.5. Sources of EAs

Environment automata can be constructed from a variety of different sources. Existing code, design and architectural descriptions can be reduced and abstracted to form an EA. This approach involves application of a partition-refinement algorithm to a given description which has all non-interface information abstracted away. We took this approach in our example, where we abstracted and reduced the structure of the Client task to build the EA. This can also be done when the environment consists of multiple tasks. Users may provide design and interface specifications for subsystems that can be used to construct a collection of EAs. These EAs can be thought of as assumptions that must be subsequently verified with respect to the environment.

We note that all partial programs have a default EA. It consists of a single accepting state, which is also the start state, where for all symbols in $\Sigma_{int}$ there is a self-loop transition; this EA accepts the language $\Sigma_{int}^{*}$. The extension of TFGs for partial programs, described above, effectively enforces this default environment constraint.

In modular FLAVERS analysis these interface specifications serve two purposes: as specifications of the intended behavior of components which FLAVERS can validate, and as abstract descriptions of component behavior that can be incorporated into analysis of other system components. In this way, the accumulated results of the unit-level analyses for a component can serve as a reservoir of validated abstractions of interface behavior that can be used in integration and system-level analyses.
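The Server example of Section 4.4 can be replayed with a small worklist analysis. The Python sketch below is a simplification under stated assumptions: the partial TFG encoding, the node names, the two-state property automaton (the paper's figure uses three states), and the functions `propagate` and `verify` are all hypothetical, and the loop is a generic fixed-point iteration rather than the FLAVERS implementation. It does follow the steps described in the text: values are sets of (PA state, EA state) tuples, tuples in the violation state v are discarded, and at the final node only tuples in accepting EA states contribute their PA component.

```python
from collections import deque

# Assumed partial TFG for Server: node -> label (None means unlabeled).
LABELS = {"start": None, "head": None, "acc_a": "a", "acc_b": "b", "final": None}
EDGES = {
    "start": ["head"],
    "head": ["acc_a", "acc_b", "final"],
    "acc_a": ["head"],
    "acc_b": ["head"],
    "final": [],
}

# Assumed PA for "if an a occurs then a b must follow" (state 1 accepting).
PA = {(1, "a"): 2, (1, "b"): 1, (2, "a"): 2, (2, "b"): 1}
PA_ACCEPT = {1}

# EA abstracted from Client (alternating a, b); "v" is the violation state.
EA_CLIENT = {(1, "a"): 2, (2, "b"): 1, (1, "b"): "v", (2, "a"): "v"}
# Default EA: a single accepting state with a self-loop on every symbol.
EA_DEFAULT = {(1, "a"): 1, (1, "b"): 1}
EA_ACCEPT = {1}


def propagate(ea):
    """Compute, for every node, the set of (PA, EA) tuples that reach it."""
    out = {node: set() for node in LABELS}
    out["start"] = {(1, 1)}
    work = deque(EDGES["start"])
    while work:
        node = work.popleft()
        preds = [p for p in EDGES if node in EDGES[p]]
        incoming = set().union(*(out[p] for p in preds))
        label = LABELS[node]
        if label is None:
            new = incoming  # identity transfer function
        else:
            new = {(PA[(p, label)], ea[(e, label)]) for p, e in incoming}
            new = {t for t in new if t[1] != "v"}  # drop infeasible paths
        if new - out[node]:
            out[node] |= new
            work.extend(EDGES[node])
    return out


def verify(ea):
    """True means conclusive; False means inconclusive."""
    final = propagate(ea)["final"]
    pa_states = {p for p, e in final if e in EA_ACCEPT}
    return pa_states <= PA_ACCEPT
```

Under these assumptions `verify(EA_DEFAULT)` is inconclusive, because the unconstrained environment may issue an a with no following b, while `verify(EA_CLIENT)` is conclusive: the only tuple left at the final node with an accepting EA state is `(1, 1)`.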

[Figure 4. Concurrent Program and Specification. Panels: Source Code; TFG; Property Automaton (states 1, 2, 3 with transitions labeled a and b).]

Source code shown in Figure 4:

    task body Server is
    begin
      while not Done loop
        select
          accept a;
        or
          accept b;
        end select;
      end loop;
    end Server;

    task body Client is
    begin
      while not Done loop
        Server.a;
        Server.b;
      end loop;
    end Client;

[Figure 5. Partial TFGs and Environment Automaton. Panels: Partial TFG for Client; Environment Automaton for Client (states 1, 2, v with transitions labeled a and b); Partial TFG for Server.]

[Figure 6. State Propagation. Values annotating the six TFG nodes, in the order they appear in the figure:]

    w/ default EA    w/ EA for T2
    {1,2,3}          {(1,1),(2,2),(3,1),(3,2)}
    {1,2,3}          {(1,1),(2,2),(3,1),(3,2)}
    {2,3}            {(2,2),(2,v),(3,2),(3,v)}
    {1,3}            {(1,v),(3,1),(3,v)}
    {1,2,3}          {(2,2),(3,1),(3,2)}
    {1,2,3}          {(1,1),(2,2),(3,1),(3,2)}

[Figure 7. Host-buoy System Architecture. Components: Loc_Sampler, Water_Sampler, Wind_Sampler, Air_Sampler, Semaphore, Database, Reporter, Receiver, Xmitter, Ship1, Sailor1. Operation labels: put, rel, acq, curr, iter, next, hist, weather, send, emerg, reset, up, down, sos. Subsystem boundaries: Reporting Subsystem, Buoy Subsystem.]

5. Applying Modular Analyses

In this section, we discuss the application of modular FLAVERS analyses to Sanden's Ada-83 design and implementation of the host-buoy at sea problem [19]. We focus primarily on how the structure of this system was exploited in the modular FLAVERS analysis. We also present data on the performance improvements that we were able to attain relative to whole-program FLAVERS analysis.

5.1. Tools and Data Collection

We used the FLAVERS/Ada toolset to analyze specifications of Ada-83 tasking programs. The toolset was enhanced to construct partial TFGs for incomplete Ada-83 programs and to allow EAs as feasibility constraints. Performance of the toolset applied to specific analysis problems was measured in user plus system time on a DEC Alpha 200 4/233 with 64 MB of RAM; the machine was networked and was being used by 2-3 users during the time the data was collected. We ran each analysis at least 4 times and report the average; this is, admittedly, insufficient for characterizing analysis times. Our goal here, however, is to understand the potential benefits of modular analysis rather than to provide a statistically sound empirical evaluation of the performance of such an analysis.

In reporting our data we provide measures of the TFG size, in terms of nodes and edges, the number of states in the PA for the specification being checked, the number of states in constraint automata, total analysis time in seconds, and an indication of whether the results are conclusive or inconclusive.

5.2. Host-Buoy At Sea

We analyzed a version of the host-buoy at sea system developed by Sanden using the entity-life modeling approach [19]. This system simulates the operation of a weather data collector/provider and emergency transmitter resident on a sea buoy. The architecture of the system is illustrated in Figure 7 and consists of Ada packages, depicted as rectangles, and tasks, depicted as rhombuses. Lines with arrows represent either subprogram invocation or calls to a task entry, depending on the type of node at the destination of the arrow. Shaded semicircles represent interface operations for program components. The system we analyzed has single Sailor and Ship tasks and four data Sampler tasks.

This system can be decomposed in a number of different ways. Sanden decomposes the problem in a manner consistent with the dashed lines in Figure 7, which is reflective of his design method. We use these same subsystem boundaries to guide our analysis. We refer to all tasks and packages as the full system. If we separate the Ship and Sailor tasks from the full system, we refer to what remains as the buoy subsystem. If we separate the Semaphore and Sampler tasks and the Database package from the buoy subsystem, we refer to what remains as the reporting subsystem. Figure 7 places these subsystem names in the appropriate module boundary. Our modular analysis of the host-buoy system works by analyzing the partial programs contained within these subsystem boundaries together with environment automata that capture the behavior of the rest of the full system.

5.3. Persistent SOS

We analyzed the system to determine whether it is always the case that when a sailor flips the emergency switch to the up position, an SOS will be continually sent and no subsequent weather updates will be processed until the switch is flipped down. For a FLAVERS analysis, we can specify such a property as a finite state automaton. To do this, we must identify the program actions that should be included in the alphabet of events for the specification; we include symbols for the three communications that can occur between Ships, Sailors and the buoy subsystem, i.e., weather, up and down, and a symbol representing the transmission of an SOS by the Xmitter package. In FLAVERS/Ada the communication symbols are constructed automatically; the SOS symbol is inserted as a comment into the Xmitter code that will be extracted by FLAVERS/Ada.

Figure 8. Persistent SOS Specification (property automaton)

Skeletal Sailor Task

   task body Sailor is
   begin
      while not done loop
         buoy.receiver.up;
         ...  -- QREa[waiting]
         buoy.receiver.down;
      end loop;
   end Sailor;

Figure 9. No Weather During Rescue Specification (panels: Property Automaton; Skeletal Sailor Task)

The buoy subsystem analysis used the default conservative environment automaton for the Ship and Sailor tasks. It ran in approximately half the time and provided the same level of precision as the full analysis. Analysis of the reporting subsystem also used the default EA. In addition to being 6 times faster than an analysis of the full system, this modular analysis promises to scale much better than the analysis of the full system. In fact, modular analysis of the reporting subsystem will be constant with increasing numbers of Ship, Sailor and Sampling tasks, while the cost of analyzing the full system can grow rapidly.

Module     Nodes/Edges  PA  TA  Time  Result
full       147/1929     3   -   21.0  inc
full       147/2057     3   17  60.8  con
buoy       130/1477     3   -   15.2  inc
buoy       130/1569     3   17  31.5  con
reporting  79/276       3   -    5.1  inc
reporting  79/288       3   17   9.5  con

Table 1. Analysis Data for persist Specification
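The speedups quoted for the buoy and reporting subsystems can be rechecked from the Time column of Table 1. The sketch below assumes that the comparison refers to the conclusive runs and that all times share one unit (the table does not state it); the names conclusive_time and speedup are illustrative only.

```python
# Times of the conclusive runs, copied from Table 1 (unit as reported there).
conclusive_time = {"full": 60.8, "buoy": 31.5, "reporting": 9.5}


def speedup(module: str) -> float:
    """Ratio of the full-system analysis time to the given module's time."""
    return conclusive_time["full"] / conclusive_time[module]


for module in ("buoy", "reporting"):
    print(f"{module}: {speedup(module):.1f}x faster than the full analysis")
```

Under these assumptions the buoy subsystem comes out at a little under 2x and the reporting subsystem at a little over 6x, which is consistent with the figures in the text.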

Using this alphabet, the persist specification is given in Figure 8 as a property automaton. The specification says that "on all program executions, once the buoy switch is up the buoy only transmits SOS until the switch is moved down". We shortened the names of the symbols for readability (for example, up means buoy.receiver.up), labeled the crash state with a c, and use the symbol "." on transitions leaving a state to denote all symbols in the alphabet that do not appear explicitly on some other transition leaving that state.

Table 1 provides data on a series of FLAVERS analyses run for the persistent SOS specification. Analysis of the full system is simply a whole-program FLAVERS analysis. There are many different ways we could choose to decompose the system for analysis. We use the alphabet of the specification as guidance. Intuitively, we want to select subsystems that have all symbols in the specification either internal or in the environment's interface. We have two choices here: the buoy subsystem and the reporting subsystem. Note that because we are interested in the calls to sos of the Xmitter package, we cannot consider the Receiver task in isolation.

We began with an initial unconstrained conservative analysis. For all three system decompositions, the initial analysis produced inconclusive results. Knowing that the Receiver task is directly related to the events used in the specification, we chose to constrain the analysis with information about the control flow structure of the Receiver task. This was done using a type of feasibility constraint called a task automaton (TA) [8]. For all three system decompositions this eliminated enough imprecision to allow a conclusive analysis. Thus, the strategy of choosing the smallest subsystem containing the symbols in the specification worked well for this analysis.
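As a minimal sketch of how a property automaton such as persist can be replayed over a single event sequence, consider the following. The transition table is an assumption reconstructed from the prose (three states, a crash state c, and "." as the default symbol); it is not a transcription of Figure 8, and the names PERSIST, run and satisfies_persist are hypothetical. FLAVERS itself reasons over all paths of a TFG by data flow analysis, whereas this sketch only checks one trace.

```python
CRASH = "c"

# Assumed transition table. "." stands for every alphabet symbol that has
# no explicit transition out of the state.
PERSIST = {
    1: {"up": 2, ".": 1},                        # switch is down
    2: {"xmit_sos": 2, "down": 1, ".": CRASH},   # switch is up
    CRASH: {".": CRASH},                         # crash state is absorbing
}


def run(automaton, events, start=1):
    """Return the state reached after consuming the event sequence."""
    state = start
    for event in events:
        row = automaton[state]
        state = row.get(event, row["."])
    return state


def satisfies_persist(events):
    """True if the trace never drives the automaton into the crash state."""
    return run(PERSIST, events) != CRASH


ok_trace = ["weather", "up", "xmit_sos", "xmit_sos", "down", "weather"]
bad_trace = ["up", "xmit_sos", "weather", "down"]
print(satisfies_persist(ok_trace), satisfies_persist(bad_trace))
```

The second trace is rejected because a weather update is processed while the switch is still up.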

5.4. No Weather During Rescue

The persist specification described intended system behavior in terms of sequences of events. An alternative to this is to specify information about states of the system. We can do this by introducing events into the description of the software whose sole purpose is to indicate entry or exit from some state of interest. For example, in Figure 9 the Sailor task has the waiting event which indicates that the sailor has flipped the switch up and has yet to switch it down. By looking at the symbols that can occur immediately before or after the waiting event in a program execution we can reason about what the software can do while the Sailor is in the waiting state. Informally, we want to verify that no ship can succeed in accessing weather information while a sailor is waiting. Figure 9 gives a PA for this nwdr specification. Note that unlike the persist specification this PA should hold on none of the program's executions.

As before we can look at the alphabet of the property to guide selection of the subsystem to analyze. We select the collection of Ship and Sailor tasks. Given this selection the buoy subsystem is our environment. Since we have already analyzed the persist property on that subsystem, we can use that PA as the basis for our environment automaton. Note that this is guaranteed to be a conservative environment automaton since we have already proven that it contains all of the possible sequences of executable behavior over the interface alphabet, i.e., up, down and weather.
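The interplay between a negative specification and an environment automaton can be sketched as follows. This is an illustration under stated assumptions, not the five-state PA of Figure 9: the violation check is a simplified stand-in ("a weather event occurs between waiting and the next down"), the environment automaton is an assumed encoding of the persist property restricted to the interface alphabet with its crash state marking infeasible behavior, and all identifiers are hypothetical.

```python
CRASH = "c"
INTERFACE = {"up", "down", "weather"}

# Assumed environment automaton for the buoy subsystem over the interface
# alphabet; "." is the default for symbols without an explicit transition.
ENVIRONMENT = {
    1: {"up": 2, ".": 1},
    2: {"down": 1, ".": CRASH},
    CRASH: {".": CRASH},
}


def feasible(events):
    """True if the environment automaton never reaches its crash state."""
    state = 1
    for event in events:
        if event not in INTERFACE:
            continue  # events outside the interface do not move the EA
        row = ENVIRONMENT[state]
        state = row.get(event, row["."])
    return state != CRASH


def weather_while_waiting(events):
    """True if a weather event occurs between waiting and the next down."""
    waiting = False
    for event in events:
        if event == "waiting":
            waiting = True
        elif event == "down":
            waiting = False
        elif event == "weather" and waiting:
            return True
    return False


def violates_nwdr(events):
    """Report a violation only on traces the environment permits."""
    return feasible(events) and weather_while_waiting(events)


trace = ["up", "waiting", "weather", "down"]
print(weather_while_waiting(trace), feasible(trace), violates_nwdr(trace))
```

Viewed in isolation, the Ship and Sailor tasks appear to allow the offending trace, but the environment automaton rules it out. This is the sense in which the environment constraint removes imprecision.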

Module  Nodes/Edges  PA  EA  TA  Time  Result
ss      12/28        5   -   -   1.0   inc
ss      12/30        5   3   7   1.2   con

Table 2. Analysis Data for nwdr Specification

The EA is identical to the PA illustrated in Figure 8, except that the crash state is considered a violation state; it represents executions that are known to be infeasible for the buoy subsystem. Table 2 provides data on the cost of modular analysis of the ss subsystem. In order to obtain conclusive results we needed to incorporate both the environment automaton and a task automaton for the Sailor task. We note that analyzing the full system with respect to the nwdr specification would be at least as expensive as the analysis of the persist specification, which took 50 times longer than our modular analysis. Thus, leveraging the history of previous analysis results for a given system appears to offer great potential for reducing the cost of flow analysis of concurrent software.

5.5. Discussion

The data presented in this section illustrate how modular flow analysis can reduce analysis time. The execution times given above do not account for the translation of programs from source to TFGs; for modular analyses this translation can be significantly shorter, thereby widening the performance gap between whole-program and modular analyses.

In [9] we report on experiments where modular FLAVERS analyses without additional constraints produced more precise results than whole-program analyses. We also report on problems where the cost of modular analysis grows more slowly with problem size, by over an order of magnitude, than the cost of whole-program analysis. We now describe how a modular analysis can produce more precise results and why its cost can grow so slowly.

Infeasible paths in the TFG can arise from infeasible control flow choices, communications, or event orders. In discussing partial TFGs, we described how feasible subpaths through a partial program's environment need not be considered when analyzing specifications over the partial program. Eliminating these feasible subpaths has the beneficial effect of also eliminating a number of infeasible paths. In a whole-program TFG these infeasible paths can introduce opportunities for inconclusive results, and their elimination can improve the precision of analysis results.

FLAVERS encourages users to focus on small aspects of system correctness and encode a collection of individual specifications that cover the behavior of interest. For a given specification it is likely that there exist system components that do not influence the pattern of behavior described in the specification. For this reason, when modular FLAVERS analysis of a specification is applied to a partial system where the environment has no influence, the size and structure of that environment have little or no effect on analysis time. In fact, if the environment is completely independent of a specification we can perform constant-time modular analyses when scaling the environment, e.g., by adding additional tasks.

There are a number of ways to produce environment automata, as discussed in Section 4. From our experience with different kinds of feasibility constraints we know that FLAVERS runs faster when the constraints are weaker, i.e., when they are encoded as smaller and structurally simpler constraint automata. Developing weak constraints that capture sufficient information for producing conclusive FLAVERS analyses would enable further reduction in analysis cost. We have developed a number of transformations that can be applied to weaken a constraint automaton while preserving its conservativeness. These may serve as the basis of an automated approach to producing families of EA that encode abstractions of environment behavior with varying degrees of accuracy.

6. Conclusion

Modular analysis enables exploitation of the structure of large complex systems to broaden the context in which FLAVERS analyses can be applied. During development, modular FLAVERS can be applied as soon as components of the system have been implemented. For example, unit-level analyses can validate properties of implemented subsystems in isolation. To enable precise analysis, the remainder of the system can be incorporated into analysis as conservative environment constraints. In addition to extending the applicability of analysis, modular FLAVERS can lead to significant reductions in analysis cost and increases in the precision of analysis results.

Application of this approach is ongoing and we are gathering empirical data over a range of specifications and programs. This promises to yield a broad understanding of a number of practical issues related to modular FLAVERS analysis, including the most effective kinds of decompositions for analysis and the strength of environment constraints required for conclusive results.

In summary, we have extended an existing low-cost flow analysis to reduce analysis cost, increase the precision of analysis results, and enable its application in realistic software development processes. We believe that modular FLAVERS is a step toward making flow analysis of specifications useful to practicing developers of concurrent software.

Acknowledgments

The author would like to thank members of the LASER analysis group at the University of Massachusetts at Amherst for helpful discussions. In particular, thanks to Gleb Naumovich for helping with the continuing development of the FLAVERS/Ada toolset.

References

[1] G. Avrunin, J. Corbett, and L. Dillon. Analyzing partially-implemented real-time systems. In Proceedings of the 19th International Conference on Software Engineering, May 1997.
[2] A. Chamillard. An Empirical Comparison of Static Concurrency Analysis Techniques. PhD thesis, University of Massachusetts at Amherst, May 1996.
[3] S. C. Cheung and J. Kramer. Tractable flow analysis for distributed systems. IEEE Transactions on Software Engineering, 20(9), Aug. 1994.
[4] S. C. Cheung and J. Kramer. Checking subsystem safety properties in compositional reachability analysis. In Proceedings of the 18th International Conference on Software Engineering, Berlin, Mar. 1996.
[5] E. Clarke, D. Long, and K. McMillan. Compositional model checking. In Proc. 4th Annual Symposium on Logic in Computer Science, pages 353–362, June 1989.
[6] J. Corbett. Evaluating deadlock detection methods for concurrent software. IEEE Transactions on Software Engineering, 22(3), Mar. 1996.
[7] J. Corbett and G. Avrunin. Towards scalable compositional analysis. Software Engineering Notes, 19(5):53–61, Dec. 1994. Proceedings of the ACM SIGSOFT Symposium on the Foundations of Software Engineering.
[8] M. Dwyer. Data Flow Analysis for Verifying Correctness Properties of Concurrent Programs. PhD thesis, University of Massachusetts at Amherst, Sept. 1995.
[9] M. Dwyer. Experiments with modular FLAVERS analysis. Technical Report CIS-96-5, Department of Computing and Information Sciences, Kansas State University, Apr. 1996.
[10] M. Dwyer and L. Clarke. Data flow analysis for verifying properties of concurrent programs. Software Engineering Notes, 19(5):62–75, Dec. 1994. Proceedings of the ACM SIGSOFT Symposium on the Foundations of Software Engineering.
[11] M. Dwyer and D. Schmidt. Limiting state explosion with filter-based refinement. In Proceedings of the ILPS workshop on Verification, Abstract Interpretation and Model Checking, Oct. 1997. to appear.
[12] D. Grunwald and H. Srinivasan. Efficient computation of precedence information in parallel programs. In Proceedings of the Sixth Annual Workshop on Languages and Compilers for Parallel Computing, Portland, OR, Aug. 1993.
[13] O. Kupferman and M. Vardi. Module checking revisited. In Proceedings of the Eighth International Workshop on Computer Aided Verification, July 1997.
[14] T. Marlowe and B. Ryder. Properties of data flow frameworks. Acta Informatica, 28:121–163, 1990.
[15] S. Masticola, T. Marlowe, and B. Ryder. Lattice frameworks for multisource and bidirectional data flow problems. ACM Transactions on Programming Languages and Systems, 17(5):777–803, Sept. 1995.
[16] S. Masticola and B. Ryder. A model of Ada programs for static deadlock detection in polynomial time. In Proceedings of Workshop on Parallel and Distributed Debugging. ACM, May 1991.
[17] S. Masticola and B. Ryder. Non-concurrency analysis. In Proceedings of the 4th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, May 1993.
[18] G. Naumovich, L. Clarke, and L. Osterweil. Verification of communication protocols using data flow analysis. In Proceedings of the ACM SIGSOFT Symposium on the Foundations of Software Engineering, Oct. 1996.
[19] B. Sanden. Software Systems Construction with Examples in Ada. Prentice-Hall, Englewood Cliffs, NJ, 1994.
[20] W. Yeh and M. Young. Compositional reachability analysis using process algebra. In Proceedings of the ACM SIGSOFT Symposium on Testing, Analysis and Verification, pages 49–59, Victoria, Canada, Oct. 1991.